Diagnose unassigned shards

Simplify monitoring with AutoOps

AutoOps is a monitoring tool that simplifies cluster management through performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. Learn more about AutoOps.

An unassigned shard is a shard that exists in the cluster metadata but is not currently allocated to any node, which means its data is unavailable for both search and indexing operations.

Shards can become unassigned for many reasons, such as node failures, cluster or indices configuration, insufficient resources, or allocation rules that prevent Elasticsearch from placing the shard on any available node.

Unassigned shards directly affect the cluster health status:

  • If at least one replica shard is unassigned, the cluster health becomes yellow. The cluster can still serve all data, but redundancy is reduced.
  • If at least one primary shard is unassigned, the cluster health becomes red. In this state, some data is unavailable, and affected indices cannot fully operate.
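The two rules above can be sketched as a small function over rows shaped like the output of the cat shards API. This is a simplified illustration only: the function name `health_from_shards` and the sample rows are assumptions, and the authoritative status always comes from Elasticsearch itself, which also accounts for states this sketch ignores.

```python
from typing import Iterable, Mapping


def health_from_shards(rows: Iterable[Mapping[str, str]]) -> str:
    """Derive a health color from cat-shards-style rows.

    Any unassigned primary means red; otherwise any unassigned
    replica means yellow; otherwise green.
    """
    status = "green"
    for row in rows:
        if row.get("state") != "UNASSIGNED":
            continue
        if row.get("prirep") == "p":
            return "red"  # an unassigned primary outweighs everything else
        status = "yellow"  # an unassigned replica; keep scanning for primaries
    return status


# Hypothetical sample: the primary is started, its replica is unassigned.
rows = [
    {"index": "my-index-000001", "shard": "0", "prirep": "p", "state": "STARTED"},
    {"index": "my-index-000001", "shard": "0", "prirep": "r", "state": "UNASSIGNED"},
]
print(health_from_shards(rows))
```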

To diagnose the unassigned shards in your deployment, use the following steps. You can use either the API console or direct Elasticsearch API calls.

  1. View the unassigned shards using the cat shards API.

    GET _cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state&format=json
    		

    The response looks like this:

    [
      {
        "index": "my-index-000001",
        "shard": "0",
        "prirep": "p",
        "state": "UNASSIGNED",
        "node": null,
        "unassigned.reason": "INDEX_CREATED"
      }
    ]
    		

    Unassigned shards have a state of UNASSIGNED. The prirep value is p for primary shards and r for replicas.

    In this example, the index has one unassigned primary shard.

  2. To understand why an unassigned shard is not being assigned and what action you must take to allow Elasticsearch to assign it, use the cluster allocation explanation API.

    GET _cluster/allocation/explain
    {
      "index": "my-index-000001",
      "shard": 0,
      "primary": true
    }
    		
    1. index: The index we want to diagnose.
    2. shard: The unassigned shard ID.
    3. primary: Indicates that we are diagnosing a primary shard.

    The response looks like this:

    {
      "index" : "my-index-000001",
      "shard" : 0,
      "primary" : true,
      "current_state" : "unassigned",
      "unassigned_info" : {
        "reason" : "INDEX_CREATED",
        "at" : "2022-01-04T18:08:16.600Z",
        "last_allocation_status" : "no"
      },
      "can_allocate" : "no",
      "allocate_explanation" : "Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.",
      "node_allocation_decisions" : [
        {
          "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
          "node_name" : "node-0",
          "transport_address" : "127.0.0.1:9401",
          "roles": ["data_content", "data_hot"],
          "node_attributes" : {},
          "node_decision" : "no",
          "weight_ranking" : 1,
          "deciders" : [
            {
              "decider" : "filter",
              "decision" : "NO",
              "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
            }
          ]
        }
      ]
    }
    		
    1. current_state: The current state of the shard.
    2. unassigned_info.reason: The reason the shard originally became unassigned.
    3. can_allocate: Whether the shard can be allocated.
    4. node_decision: Whether the shard can be allocated to this particular node.
    5. decider: The decider that led to the no decision for the node.
    6. explanation: Why the decider returned a no decision, with a hint pointing to the setting that led to the decision.
  3. In this example, the explanation indicates that the index allocation settings are incorrect. To review your allocation settings, use the get index settings and cluster get settings APIs.

    GET my-index-000001/_settings?flat_settings=true&include_defaults=true
    GET _cluster/settings?flat_settings=true&include_defaults=true
    		
  4. Change the settings using the update index settings and cluster update settings APIs to the correct values to allow the index to be allocated.
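The diagnosis steps above can also be scripted once the API responses have been fetched. The following is a minimal sketch that works on already-parsed JSON, not an official client: the function names are hypothetical, the sample data is trimmed from the responses shown in the steps, and the `index_settings` mapping is an assumed example of what a flat settings response might contain for this index.

```python
from typing import Any, Dict, List


def unassigned_shards(cat_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the rows of a cat shards response that are unassigned."""
    return [row for row in cat_rows if row.get("state") == "UNASSIGNED"]


def blocking_deciders(explain: Dict[str, Any]) -> List[Dict[str, str]]:
    """List every decider that returned NO, one entry per node and decider."""
    blockers = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.append({
                    "node": node.get("node_name", ""),
                    "decider": decider.get("decider", ""),
                    "explanation": decider.get("explanation", ""),
                })
    return blockers


def allocation_settings(flat_settings: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the routing allocation keys out of a flat settings mapping."""
    return {k: v for k, v in flat_settings.items() if ".routing.allocation." in k}


# Sample data trimmed from the responses shown in the steps above.
cat_rows = [
    {"index": "my-index-000001", "shard": "0", "prirep": "p",
     "state": "UNASSIGNED", "node": None, "unassigned.reason": "INDEX_CREATED"},
]
explain = {
    "index": "my-index-000001",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {"node_name": "node-0", "node_decision": "no", "deciders": [
            {"decider": "filter", "decision": "NO",
             "explanation": "node does not match index setting "
                            "[index.routing.allocation.include] filters "
                            "[_name:\"nonexistent_node\"]"},
        ]},
    ],
}
# Assumed example of flat index settings; not taken from a real response.
index_settings = {
    "index.number_of_replicas": "1",
    "index.routing.allocation.include._name": "nonexistent_node",
}

for row in unassigned_shards(cat_rows):
    print(row["index"], row["shard"], row["prirep"], row["unassigned.reason"])
for blocker in blocking_deciders(explain):
    print(blocker["node"], blocker["decider"])
print(allocation_settings(index_settings))
```

Keeping the parsing separate from the HTTP calls means the same functions work whether the responses come from the API console, a saved file, or a client library.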

For more guidance on fixing the most common causes of unassigned shards, see Fix a red or yellow cluster status in Red or yellow cluster health status, refer to Using the cluster allocation API for troubleshooting, or contact Elastic Support.

Watch this video for a walkthrough of monitoring allocation health.

The following sections provide advice for resolving some of the common causes for unassigned shards.

A primary shard might be unassigned due to conflicting settings. View this video for a walkthrough of troubleshooting a node and index setting mismatch.

When Elasticsearch is unable to allocate a shard, it retries allocation up to the maximum number of retries allowed. After that, Elasticsearch stops trying to allocate the shard, because retrying indefinitely might impact cluster performance. Once you have resolved the issue that prevented allocation, use the cluster reroute API to retry the failed allocation. For example:

    POST _cluster/reroute?retry_failed
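If you call the API from a script instead of the console, the same request can be built with the Python standard library. This is only a sketch: the helper name `build_retry_failed_request` and the base URL are placeholders, and a real deployment would normally also need authentication headers and TLS settings that are not shown here.

```python
import urllib.request


def build_retry_failed_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that retries failed allocations."""
    url = base_url.rstrip("/") + "/_cluster/reroute?retry_failed"
    return urllib.request.Request(url, method="POST")


# "http://localhost:9200" is a placeholder endpoint, not taken from this guide.
request = build_retry_failed_request("http://localhost:9200")
print(request.get_method(), request.full_url)

# To send it, pass the request to urllib.request.urlopen(request)
# after adding whatever credentials your cluster requires.
```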
		

If a shard is unassigned with an allocation status of no_valid_shard_copy, you should make sure that all nodes are in the cluster. If all the nodes containing in-sync copies of a shard are lost, then you can recover the data for the shard.
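As a rough illustration, both checks can be expressed over data you have already collected. The sketch below assumes that the status appears in the `can_allocate` or `unassigned_info.last_allocation_status` field of a cluster allocation explanation response, and that you maintain your own list of expected node names; the function names and the sample values are hypothetical.

```python
from typing import Any, Dict, Iterable, Set


def has_no_valid_shard_copy(explain: Dict[str, Any]) -> bool:
    """Report whether an allocation explanation shows no_valid_shard_copy."""
    last_status = explain.get("unassigned_info", {}).get("last_allocation_status")
    return "no_valid_shard_copy" in (explain.get("can_allocate"), last_status)


def missing_nodes(expected: Iterable[str], present: Iterable[str]) -> Set[str]:
    """Return the expected node names that are not currently in the cluster."""
    return set(expected) - set(present)


# Hypothetical, trimmed explanation response and node lists.
explain = {
    "index": "my-index-000001",
    "shard": 0,
    "primary": True,
    "can_allocate": "no_valid_shard_copy",
}
print(has_no_valid_shard_copy(explain))
print(missing_nodes(["node-0", "node-1"], ["node-0"]))
```

If a node is reported as missing, bringing it back into the cluster is usually the first thing to try before considering data recovery.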

View this video for a walkthrough of troubleshooting no_valid_shard_copy.