Fix watermark errors

When a data node is critically low on disk space and has reached the flood-stage disk usage watermark, the following error is logged: Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block.

To prevent a full disk, when a node reaches this watermark, Elasticsearch blocks writes to any index with a shard on the node. If the block affects related system indices, Kibana and other Elastic Stack features may become unavailable.
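To see which indices currently carry the block, you can inspect their index settings. The request below is illustrative; `my-index` is a placeholder for an affected index, and `filter_path` simply trims the response to the block settings:

```
GET my-index/_settings?filter_path=*.settings.index.blocks
```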

Elasticsearch will automatically remove the write block when the affected node’s disk usage goes below the high disk watermark. To achieve this, Elasticsearch automatically moves some of the affected node’s shards to other nodes in the same data tier.
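To check how much disk space each data node is using while this rebalancing happens, one option is the cat allocation API:

```
GET _cat/allocation?v=true
```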

To verify that shards are moving off the affected node, use the cat shards API.

GET _cat/shards?v=true

If shards remain on the node, use the cluster allocation explanation API to get an explanation for their allocation status.

GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false,
  "current_node": "my-node"
}

To immediately restore write operations, you can temporarily increase the disk watermarks and remove the write block.

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.low.max_headroom": "100GB",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.high.max_headroom": "20GB",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "5GB",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": "5GB"
  }
}

PUT */_settings?expand_wildcards=all
{
  "index.blocks.read_only_allow_delete": null
}

As a long-term solution, we recommend that you add nodes to the affected data tiers or upgrade existing nodes to increase disk space. To free up additional disk space, you can delete unneeded indices using the delete index API.

DELETE my-index

When a long-term solution is in place, reset or reconfigure the disk watermarks.

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.low.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.high": null,
    "cluster.routing.allocation.disk.watermark.high.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": null
  }
}