
Elasticsearch Docker image with EEA RDF River installed



This image is based on the official Elasticsearch Docker image. In addition, it has the following plugins installed:

  • Analysis ICU
  • eea.elasticsearch.river.rdf
  • head

Useful configurations

To configure an Elasticsearch cluster using this image, you can pass parameters to the elasticsearch command so that each node takes a specific role in the cluster.

Running a master-only node

This ensures that the node can’t hold data and can’t be reached through the HTTP API. This node also can’t run river processes.

    image: eeacms/elastic
    command: # No data, no HTTP; master-eligible only
        - elasticsearch
        - -Des.node.data=false
        - -Des.http.enabled=false
        - -Des.node.master=true
        - -Des.node.river=_none_

Running a client

This ensures that the node can’t be elected as master and is only in charge of answering clients’ HTTP requests. This node also can’t run river processes.

    image: eeacms/elastic
    command: # No data, HTTP enabled, can't be master
        - elasticsearch
        - -Des.node.data=false
        - -Des.http.enabled=true
        - -Des.node.master=false
        - -Des.node.river=_none_

Running a data node

This ensures that the node only holds data, without being accessible through the REST API and without being eligible for master election.

    image: eeacms/elastic
    command: # Data only, no HTTP, can't be master
        - elasticsearch
        - -Des.http.enabled=false
        - -Des.node.master=false
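
Putting the three roles together, a docker-compose sketch might look like the following. The service names, the exposed port mapping, and the single-node-per-role topology are illustrative assumptions, not prescribed by the image:

```yaml
# Hypothetical docker-compose.yml combining the three node roles above
master:
  image: eeacms/elastic
  command: # No data, no HTTP; master-eligible only
    - elasticsearch
    - -Des.node.data=false
    - -Des.http.enabled=false
    - -Des.node.master=true
    - -Des.node.river=_none_

client:
  image: eeacms/elastic
  command: # No data, HTTP enabled, can't be master
    - elasticsearch
    - -Des.node.data=false
    - -Des.http.enabled=true
    - -Des.node.master=false
    - -Des.node.river=_none_
  ports: # expose the HTTP API only on the client node (assumed setup)
    - "9200:9200"

data:
  image: eeacms/elastic
  command: # Data only, no HTTP, can't be master
    - elasticsearch
    - -Des.http.enabled=false
    - -Des.node.master=false
```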

Increasing java heap space for the node

Follow the heap-sizing instructions in the official Elasticsearch documentation.

You can add the ES_HEAP_SIZE environment variable using the -e Docker parameter. Alternatively, you can pass the -Xmx and -Xms options directly to the command that runs inside the container.
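
For example, a sketch of both options (the 2g heap size is an illustrative value):

```shell
# Option 1: set the heap via the ES_HEAP_SIZE environment variable
docker run -d -e ES_HEAP_SIZE=2g eeacms/elastic

# Option 2: pass -Xms/-Xmx directly to the command run inside the container
docker run -d eeacms/elastic elasticsearch -Xms2g -Xmx2g
```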

Keeping the data persistent

By default the data is stored inside a volume. The volume is mounted inside the container at /usr/share/elasticsearch/data.

You can also create a proper data volume, for example by running a busybox container that has /usr/share/elasticsearch/data as a volume.

In order to create backups of the data, just run another container with --volumes-from, e.g. to make a backup on your local system at /path/to/backup:

    docker run --volumes-from myelasticsearch -v /path/to/backup/:/backup busybox cp -r /usr/share/elasticsearch/data /backup

Alternatively, you can mount that directory onto a local path.
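
For example, where /path/on/host is a placeholder for a directory on the Docker host:

```shell
# Bind-mount a host directory over the container's data directory
docker run -d -v /path/on/host:/usr/share/elasticsearch/data eeacms/elastic
```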

Monitoring the Elastic cluster health

When running the Elastic cluster in production, you will often want to monitor its health status and be notified when the cluster is down or experiences stability issues.

A cluster health API is accessible at http://<elasticserver-ip>:9200/_cluster/health?pretty=true and returns a JSON response like:

    {
      "cluster_name" : "SearchServices",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 4,
      "number_of_data_nodes" : 2,
      "active_primary_shards" : 16,
      "active_shards" : 32,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "number_of_pending_tasks" : 0
    }

The status field tells you how healthy the cluster is: it can be green, yellow or red. More info at the Elastic cluster health API documentation.
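
In a monitoring script you might fetch this endpoint with curl and extract the status field. A minimal sketch, where the JSON literal stands in for the curl output:

```shell
# Sample cluster-health JSON; in practice this would come from:
#   curl -s http://<elasticserver-ip>:9200/_cluster/health
health='{ "cluster_name" : "SearchServices", "status" : "green", "timed_out" : false }'

# Pull out the value of the "status" field
status=$(printf '%s' "$health" | sed -n 's/.*"status" *: *"\([a-z]*\)".*/\1/p')

# Alert (here just a message on stderr) when the cluster is not green
if [ "$status" != "green" ]; then
  echo "cluster status is $status" >&2
fi
echo "$status"
```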

The cluster stats API call gives even more details about memory usage, CPU usage, open file descriptors and so on.
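
For example, against the same host as above:

```shell
curl http://<elasticserver-ip>:9200/_cluster/stats?pretty=true
```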


If you want to integrate a local build of the RDF River plugin into this image for development, you can:

    pushd /your/work/dir
    git clone
    # modify the code there
    docker build -t eeacms/elastic:dev .
    # Now you have a local image called eeacms/elastic:dev that you can use
    # locally with your latest build of the river plugin
    popd

As a general practice, if you want to build the image locally: docker build -t eeacms/elastic:dev .
