Peer recovery is the process used to build a new copy of a shard on a node by copying data from the primary. Elasticsearch uses this peer recovery process to rebuild shard copies that were lost if a node has failed, and uses the same process when migrating a shard copy between nodes to rebalance the cluster or to honor any changes to the shard allocation settings.
The following expert setting can be set to manage the resources consumed by peer recoveries:
Limits the total inbound and outbound peer recovery traffic on each node.
Since this limit applies on each node, but there may be many nodes
performing peer recoveries concurrently, the total amount of peer recovery
traffic within a cluster may be much higher than this limit. If you set
this limit too high then there is a risk that ongoing peer recoveries will
consume an excess of bandwidth (or other resources) which could destabilize
the cluster. Defaults to
Controls the number of file chunk requests that can be sent in parallel per recovery.
As multiple recoveries are already running in parallel (controlled by
cluster.routing.allocation.node_concurrent_recoveries), increasing this expert-level
setting might only help in situations where peer recovery of a single shard is not
reaching the total inbound and outbound peer recovery traffic as configured by
indices.recovery.max_bytes_per_sec, but is CPU-bound instead, typically when using
transport-level security or compression. Defaults to
This setting can be dynamically updated on a live cluster with the cluster-update-settings API.