ElasticSearch + Facetview complete Docker Stack Orchestration
This repo is DEPRECATED: We now deploy via EEA Rancher catalog templates.
More details on the source repository
1.2 pam [repo] - Node.js frontend to an ElasticSearch cluster
More details on the source repository
1.3 aide [repo] - Node.js frontend to an ElasticSearch cluster
More details on the source repository
1.7 dataw[1|2] - Data Volume Containers
1.8 datam - Data Container for Master node
- More information about elasticsearch node roles can be found here
- More information about elasticsearch node discovery can be found here
git clone --recurse https://github.com/eea/eea.docker.searchservices
cd eea.docker.searchservices
docker-compose up -d
To see all the commands an elastic app supports, type:
docker-compose run --rm
Troubleshooting: Data is not indexed? Sometimes, during indexing and even after it has finished, queries on the new index throw an error. Restarting the elasticsearch workers solves the problem:
# Restart the elastic workers if the index is not built
docker-compose restart esworker1
docker-compose restart esworker2
Now go to <serverip>:9200/_plugin/head/ to see if the index is being built.
You can also try increasing ES_HEAP_SIZE for the clients in docker-compose.yml.
All elastic search apps create their index at startup if they have no index yet or the index holds no data.
You can disable this feature by adding AUTO_INDEXING=false to the environment section of docker-compose.yml:
...
environment:
  - AUTO_INDEXING=false
...
Afterwards, you can run the following steps to index the data:
# Wait a while for the elastic cluster to get initialized
# Start indexing data
docker-compose run --rm eeasearch create_index
# Check the logs
docker-compose logs
# If the river is not indexing just perform a couple of reindex commands
docker-compose run --rm eeasearch reindex
# Go to this host:3000 to see that data is being harvested
# And the same for PAM
# Start indexing data
docker-compose run --rm pam create_index
# And the same for AIDE
# Start indexing data
docker-compose run --rm aide create_index
# Check the logs
docker-compose logs
# If the river is not indexing just perform a couple of reindex commands
docker-compose run --rm pam reindex
# Go to this host:3010 to see that data is being harvested for pam
# Go to this host:3020 to see that data is being harvested for aide
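The "wait a while" step can be scripted instead of guessed. The following is a minimal sketch, assuming the cluster answers on http://localhost:9200 (substitute your server IP) and that the Elasticsearch version in use exposes the `_cat/health` API. `wait_for_cluster` and `ES_URL` are hypothetical names introduced here, not part of this repo.

```shell
# Assumption: the cluster is reachable here; override ES_URL for other hosts.
ES_URL="${ES_URL:-http://localhost:9200}"

wait_for_cluster() {
  # $1 = maximum number of attempts, $2 = seconds to sleep between attempts
  attempts="${1:-30}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # _cat/health?h=status prints only the status column (green/yellow/red)
    status="$(curl -s "$ES_URL/_cat/health?h=status" | tr -d '[:space:]')"
    case "$status" in
      green|yellow)
        echo "cluster is $status"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "cluster not ready after $attempts attempts" >&2
  return 1
}
```

A possible use is `wait_for_cluster 30 5 && docker-compose run --rm eeasearch create_index`, so indexing only starts once the cluster reports yellow or green.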
The data is kept persistent by using two explicit data containers. The data is
stored under /usr/share/elasticsearch/data. Follow the steps from the “Backup,
restore, or migrate data volumes” section in the Docker documentation.
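As an illustration of that backup pattern, the sketch below only prints the `docker run` command that would archive a data container's volume, so it can be reviewed before being executed. It assumes the `busybox` image is acceptable for running `tar`, and the container name `dataw1` used in the example is an assumption (check the real name with `docker ps -a`). `backup_volume_cmd` and `DATA_PATH` are hypothetical names.

```shell
# Path of the Elasticsearch data inside the data containers (from the text above).
DATA_PATH="/usr/share/elasticsearch/data"

backup_volume_cmd() {
  # $1 = name of the data container (assumption: e.g. dataw1)
  # $2 = host directory that receives the archive (defaults to the current one)
  container="$1"
  dest="${2:-$PWD}"
  # Print the command instead of running it, so it can be inspected first.
  printf 'docker run --rm --volumes-from %s -v %s:/backup busybox tar cf /backup/%s.tar %s\n' \
    "$container" "$dest" "$container" "$DATA_PATH"
}
```

Running `backup_volume_cmd dataw1 /srv/backups | sh` would then execute the printed command. Printing first is a deliberate choice here, because a wrong container name would otherwise produce an empty archive without any obvious error.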
Change the tags in this repo to match the image version you want to upgrade to. Then push the changes to this repo. On the host running this compose file, do:
docker-compose stop # stop the running containers
git pull origin master # get the docker-compose-prod.yml containing the latest tags
# Before the next step, back up the data containers in case the update procedure fails
docker-compose pull # get the images and their tags
docker images | grep eeacms # check that the new images have been downloaded
docker-compose rm -vf eeasearch aide pam # remove the old containers before starting
docker-compose up -d --no-recreate # start the containers
In some cases the containers cannot be stopped because for some reason they have no names. This happens mostly for the elastic containers. Running
docker ps -a
displays the list of containers, some of which have no names. First, remove these containers with
docker rm --force <container_id>
Second, recreate the containers with
docker-compose up -d --no-recreate
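Picking out the nameless containers by eye is error-prone, so the filtering step can be sketched as a small helper. This assumes a Docker version whose `docker ps` supports `--format`, and that a nameless container shows up with an empty `{{.Names}}` field. `unnamed_container_ids` is a hypothetical name.

```shell
unnamed_container_ids() {
  # Reads lines of the form "ID|NAMES" on stdin and prints the IDs
  # whose name field is empty.
  awk -F'|' '$2 == "" { print $1 }'
}

# Example (not executed here):
#   docker ps -a --format '{{.ID}}|{{.Names}}' | unnamed_container_ids
```

The printed IDs can then be passed one by one to `docker rm --force`, for example with a `while read -r id; do docker rm --force "$id"; done` loop.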
If you can reach esclient from your office, you can reindex the data of a webapp or force a sync remotely.
Assuming that esclient:9200 is available at
staging:80/elasticsearch/ and you have permission to perform PUT, POST and
DELETE over that endpoint from your office, you can run this one-liner to
reindex the data of a given app.
docker run --rm -e elastic_host=some-staging -e elastic_path=/elasticsearch/ -e elastic_port=80 eeacms/eeasearch reindex
To see a list of all available commands run:
docker run --rm -e elastic_host=some-staging -e elastic_path=/elasticsearch/ -e elastic_port=80 eeacms/eeasearch help
These variables have default values, so you can omit them if esclient is accessible on port 9200.
TL;DR - it won’t work with docker-compose scale because the overhead is in worker nodes which need additional ops to be scaled.
By default, ElasticSearch breaks an index into 5 shards (holding different parts of the data). Each shard will have one replica. If we have 4 workers with this setup, then shards could be distributed as such:
If Node3 and Node4 are scaled down, Shard 4 will get lost and it would be hard to recover.
Scaling up will not automatically move shards to other nodes in order to better distribute the jobs.
Scaling down will not move shards to remaining nodes to keep availability.
Running several workers on the same host increases the number of parallel disk accesses, which can thrash the cache and result in poor performance.
Worker nodes perform most of the work. If something runs slowly, there is a high chance that something is taking too long on the workers, not on the client or the master.
Maintaining a more complex ElasticSearch cluster means distributing it over more hosts and performing careful operations when scaling, so that data is not lost. Just don’t use docker-compose scale on elastic nodes.
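To make the scale-down risk concrete, the sketch below reports which shards would lose both their primary and their replica when a set of nodes is removed. The placement is hypothetical (the real one depends on the cluster) and was chosen so that both copies of shard 4 sit on node3 and node4, matching the example above. `PLACEMENT` and `lost_shards` are illustrative names.

```shell
# Hypothetical placement, one entry per shard: "shard:nodeA,nodeB"
# (the two nodes holding the primary and the replica).
PLACEMENT="0:node1,node2 1:node2,node3 2:node1,node3 3:node2,node4 4:node3,node4"

lost_shards() {
  # Arguments: the names of the nodes that are scaled down.
  removed=" $* "
  lost=""
  for entry in $PLACEMENT; do
    shard="${entry%%:*}"
    nodes="${entry#*:}"
    first="${nodes%%,*}"
    second="${nodes#*,}"
    # A shard survives as long as one of its two copies stays on a live node.
    case "$removed" in *" $first "*) ;; *) continue ;; esac
    case "$removed" in *" $second "*) lost="$lost $shard" ;; esac
  done
  echo "lost shards:${lost:- none}"
}
```

With this placement, `lost_shards node3 node4` reports shard 4 as lost, while `lost_shards node4` reports none. On a live cluster, the actual placement can be inspected with the `_cat/shards` endpoint before removing any node.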
The provided docker-compose-prod.yml in this repo is already configured to run within Rancher PaaS.
Make sure you have the appropriate labels on the docker hosts in your Rancher cluster. See docker-compose-prod.yml and look for labels io.rancher.scheduler.affinity:host_label.
Go to your Rancher Web interface and generate your API key (API & Keys for “…” Environment):
$ export RANCHER_URL=<(Endpoint URL)>
$ export RANCHER_ACCESS_KEY=<(ACCESS KEY)>
$ export RANCHER_SECRET_KEY=<(SECRET KEY)>
$ git clone https://github.com/eea/eea.docker.searchservices.git
$ cd eea.docker.searchservices
$ rancher-compose up
The above will automatically create a stack named eea-docker-searchservices and run it. Now look at the exposed rancher loadbalancer and configure your DNS/proxy to point to it.
Perform these steps to be able to easily make changes to any of the EEA-maintained parts of this stack.
sudo apt-get install maven and a Java environment
This repository glues together all the components of the stack and also offers a template for a development docker-compose file. Change directory to your home or working folder and clone the project using:
user@host ~/ $ git clone --recursive git@github.com:eea/eea.docker.searchservices.git
Building the elastic containers from sources is rarely needed and takes a lot of time, so there are two options:
docker-compose -f docker-compose-dev.yml up to start all services.
Check http://localhost:9200 or http://localhost:9200/_plugin/head/ to see if elastic is up and running. When it’s up, you can go to http://localhost:3000, http://localhost:3010 and http://localhost:3020. Then make yourself a coffee; everything works now.
river plugin from sources
docker-compose -f docker-compose-dev-elastic.yml up to start all
docker-compose -f docker-compose-dev.yml run --rm eeasearch create_index
to create the index for EEASearch
docker-compose -f docker-compose-dev.yml run --rm pam create_index to
create the index for PAM
docker-compose -f docker-compose-dev.yml run --rm aide create_index to
create the index for AIDE
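The three index commands differ only in the app name, so they can be wrapped in a loop. This is a sketch under the assumption that the apps are named `eeasearch`, `pam` and `aide` as above. `DOCKER_COMPOSE` and `create_all_indexes` are hypothetical names, and the variable exists only so the command can be swapped out for a dry run.

```shell
# Command used to talk to compose; set DOCKER_COMPOSE="echo" for a dry run.
DOCKER_COMPOSE="${DOCKER_COMPOSE:-docker-compose}"

create_all_indexes() {
  for app in eeasearch pam aide; do
    # Stop at the first failure so a broken app is noticed immediately.
    $DOCKER_COMPOSE -f docker-compose-dev.yml run --rm "$app" create_index || return 1
  done
}
```

Calling `create_all_indexes` after the dev stack is up creates the indexes in the same order as the manual steps.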
Assuming you have tested locally and implemented the needed features, depending on the code you changed, perform the following steps to make the changes available in Docker Registry.
You can also use repo specific docker-compose.yml files if the changes affect only a part of the stack.
Note: make sure that all the applications using this package work with your new changes before publishing anything.
First, you need to publish the new version of the package.
This repository will not automatically build the eeacms/eeasearch (and other apps) Docker images.
Note: make sure that all the applications using the river work with your new changes before publishing anything.
First, you need to add a new release of the river.
mvn clean install to make a new build
eea.elasticsearch.river.rdf/target/releases/eea-rdf-river-plugin-version.zip as a binary release
This repository will not automatically build the eeacms/elastic Docker images.
Pushing to master will automatically trigger a build with the :latest tag. Make sure that you are building with the correct tags and wait for the builds to complete before performing these steps.
All elastic applications will display in the page footer information about the current index and container, like below:
Application data last refreshed 05 April 2016 12:52 PM. Version info eeacms/pam:v2.7.3 and git tag number v2.8 on 718b1e09d6a0.
eeacms/pam:v2.7.3 - current image version used; this is an optional value that can be specified in the docker compose file like below:
environment:
  - VERSION_INFO=eeacms/pam:v2.7.3