I have a problem with my Elasticsearch nodes running in a Docker environment. I start them with docker-compose, and after a few minutes they report:
flood stage disk watermark [95%] exceeded
I'm running it on a cluster with rather high storage capacity and I already tried to increase the watermark settings in the elasticsearch.yml file, but I still get the error. Maybe it has to do with the size of the docker containers.
Does anyone know what could be the problem? Any help is much appreciated.
The docker-compose.yml for reference:
version: '3.4'
services:
  es01:
    image:
    container_name: es01
    environment:
      #- discovery.type=single-node
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image:
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  es03:
    image:
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic
  kib01:
    image:
    container_name: kib01
    depends_on:
      - es01
      - es02
      - es03
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL:
      ELASTICSEARCH_HOSTS:
    networks:
      - elastic
  client:
    image: appropriate/curl:latest
    depends_on:
      - es01
      - es02
      - es03
    networks:
      - elastic
    command: sh -c "curl es01:9200 && curl kib01:5601"
  dash_app:
    build: .
    ports:
      - 0.0.0.0:8050:8050
    depends_on:
      - es01
      - es02
      - es03
      - kib01
    networks:
      - elastic
#mapping:
# image: appropriate/curl:latest
# depends_on:
# - es01
# - es02
# - es03
# networks:
# - elastic
# command: "curl -v -XPUT 'es01:9200/urteile' -H 'Content-Type: application/json' -d '
# {
# 'mappings': {
# 'properties': {
# 'date': {
# 'type': 'date'
# }
# }
# }
# }
# '"
#web:
# build: .
# ports:
# - 8000:8000
#depends_on:
# - es01
# - es02
# - es03
#networks:
# - elastic
volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local
networks:
  elastic:
    driver: bridge

And docker info:
Server:
 Containers: 6
  Running: 3
  Paused: 0
  Stopped: 3
 Images: 185
 Server Version: 19.03.12
 Storage Driver: overlay
  Backing Filesystem: extfs
  Supports d_type: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc nvidia
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.7.2-kd-cluster
 Operating System: Debian GNU/Linux 9 (stretch)
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 125.8GiB
 Name: dpl01
 ID: KBGO:2E6L:NIHR:UQAL:K5CN:XWBI:R7TK:WWZF:MZBT:BCHE:HUQW:UKKM
 Docker Root Dir: /data/docker
 Debug Mode: false
 Registry:
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

5 Answers
I found the solution. The problem is the total disk usage, as described in the answer from sastorsl here: low disk watermark [??%] exceeded on
The storage of the cluster I worked on was 98% used. Although 400GB were still free, Elasticsearch's default watermarks are percentages, so it marked the indices read-only and blocked all writes to them.
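To make the arithmetic concrete, here is a minimal sketch of the percentage check. The function names and the 20 TB total are illustrative assumptions (98% used with 400GB free implies roughly that size); this is not Elasticsearch's own code, only the same comparison in shell:

```sh
# used_pct TOTAL AVAIL -> integer percentage of the disk in use (rounded down).
# Both arguments must be in the same unit (KB, GB, ...).
used_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# over_flood_stage TOTAL AVAIL [THRESHOLD_PCT]
# Succeeds (exit 0) when usage exceeds the threshold, 95 by default.
over_flood_stage() {
  threshold=${3:-95}
  [ "$(used_pct "$1" "$2")" -gt "$threshold" ]
}

# Hypothetical 20000 GB disk with 400 GB free
used_pct 20000 400        # prints 98
over_flood_stage 20000 400 && echo "flood stage exceeded despite 400 GB free"
```

Real numbers for the Docker root directory can be taken from columns 2 and 4 of `df -Pk /data/docker`, and `_cat/allocation?v` on any node shows the disk usage as Elasticsearch itself sees it.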
The solution is to manually set the watermarks after the nodes have started (setting them in the elasticsearch.yml didn't work for me for some reason):
curl -XPUT -H "Content-Type: application/json" -d '{ "transient": { "cluster.routing.allocation.disk.threshold_enabled": false } }'
curl -XPUT -H "Content-Type: application/json" -d '{"index.blocks.read_only_allow_delete": null}'
Of course, you have to put in your index names. After that, the indices will be writable again.
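As an alternative to switching the threshold off entirely, the watermark settings also accept absolute byte values, which are read as the free space that must remain. A minimal sketch, assuming es01 is reachable from the host at localhost:9200 (the compose file in the question publishes 9200:9200); ES_URL, the function names and the three sizes are illustrative and not taken from the answer:

```sh
# Assumption: Elasticsearch answers on this URL; override ES_URL if it does not.
ES_URL=${ES_URL:-http://localhost:9200}

# Prints a cluster-settings body with absolute watermarks.
# The sizes are placeholders, not recommendations.
watermark_payload() {
  cat <<'EOF'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "100gb",
    "cluster.routing.allocation.disk.watermark.high": "50gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb"
  }
}
EOF
}

# Sends the payload to the cluster settings API ("-d @-" reads the body from stdin).
apply_watermarks() {
  watermark_payload | curl -sS -XPUT "$ES_URL/_cluster/settings" \
    -H "Content-Type: application/json" -d @-
}

# Clears the read-only block; _all targets every index, narrow it if needed.
unblock_indices() {
  curl -sS -XPUT "$ES_URL/_all/_settings" \
    -H "Content-Type: application/json" \
    -d '{"index.blocks.read_only_allow_delete": null}'
}
```

Running `apply_watermarks` and then `unblock_indices` once the nodes are up should have the same effect as the two commands above, while keeping a safety margin in place. "persistent" is used here so the values survive a full cluster restart, which "transient" settings do not. Percentages and byte values cannot be mixed across the three watermark settings.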
In my case I was getting this with elasticsearch 8.7 and ELK in docker-compose.
Adding cluster.routing.allocation.disk.threshold_enabled=false to the environment of the Elasticsearch service in docker-compose.yml fixed the problem for me:
environment:
  - network.host=0.0.0.0
  - http.port=9200
  - transport.host=localhost
  - cluster.name=docker-cluster
  - bootstrap.memory_lock=true
  - xpack.security.enabled=false
  - cluster.routing.allocation.disk.threshold_enabled=false # <---

@Stimmot's solution also worked for me:
curl -XPUT -H "Content-Type: application/json" -d '{ "transient": { "cluster.routing.allocation.disk.threshold_enabled": false } }'
Running the curl in a healthcheck did not work for me:
elasticsearch:
  healthcheck:
    test: ["CMD", "curl", "-XPUT", "-H", "'Content-Type: application/json'", "", "-d", "'{ \"transient\": { \"cluster.routing.allocation.disk.threshold_enabled\": false } }'"]
    interval: 10s
    timeout: 10s
    retries: 1
kivana:
  depends_on:
    elasticsearch:
      condition: service_healthy

Using docker compose:
- Copy /usr/share/elasticsearch/config/elasticsearch.yml to your host machine
- Add cluster.routing.allocation.disk.threshold_enabled: false to elasticsearch.yml
- Add to docker-compose.yml:
elasticsearch:
  ...
  volumes:
    - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro,Z
  ...

I came across this error while developing in a local environment. It turned out that the disk image size set for Docker was indeed 98% used. The solution was simply to increase the disk allocation for containers (macOS Docker Desktop > Resources > Advanced > Disk image size). Hope it helps someone.
You can just set cluster.routing.allocation.disk.threshold_enabled to false anywhere in the config/elasticsearch.yml file:
cluster.routing.allocation.disk.threshold_enabled: false
and then restart your Elasticsearch cluster.
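After the restart it may be worth confirming that the node actually picked the value up. A rough sketch, assuming the node answers at localhost:9200; ES_URL and both function names are hypothetical helpers, and the sed expression simply pulls the value out wherever the key appears in the response:

```sh
# Assumption: Elasticsearch answers on this URL; override ES_URL if it does not.
ES_URL=${ES_URL:-http://localhost:9200}

# Reads a cluster-settings JSON document on stdin and prints the value of
# cluster.routing.allocation.disk.threshold_enabled, if the key is present.
threshold_enabled_from() {
  sed -n 's/.*"cluster\.routing\.allocation\.disk\.threshold_enabled"[[:space:]]*:[[:space:]]*"\{0,1\}\([a-z]*\)"\{0,1\}.*/\1/p'
}

# Fetches the settings in flat form, including values that were not set through the API.
show_threshold_enabled() {
  curl -sS "$ES_URL/_cluster/settings?include_defaults=true&flat_settings=true" \
    | threshold_enabled_from
}
```

Calling `show_threshold_enabled` should print `false` once the change is active; if it still prints `true`, the edited elasticsearch.yml is probably not the file the container is reading.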