Indexing documents into Elasticsearch introduces system load in the form of memory and CPU load. Each indexing operation includes coordinating, primary, and replica stages. These stages can be performed across multiple nodes in a cluster.
Indexing pressure can build up through external operations, such as indexing requests, or internal mechanisms, such as recoveries and cross-cluster replication. If too much indexing work is introduced into the system, the cluster can become saturated. This can adversely impact other operations, such as search, cluster coordination, and background processing.
To prevent these issues, Elasticsearch internally monitors indexing load. When the load exceeds certain limits, new indexing work is rejected
External indexing operations go through three stages: coordinating, primary, and replica. See Basic write model.
indexing_pressure.memory.limit node setting restricts the number of bytes
available for outstanding indexing requests. This setting defaults to 10% of
At the beginning of each indexing stage, Elasticsearch accounts for the bytes consumed by an indexing request. This accounting is only released at the end of the indexing stage. This means that upstream stages will account for the request overheard until all downstream stages are complete. For example, the coordinating request will remain accounted for until primary and replica stages are complete. The primary request will remain accounted for until each in-sync replica has responded to enable replica retries if necessary.
A node will start rejecting new indexing work at the coordinating or primary stage when the number of outstanding coordinating, primary, and replica indexing bytes exceeds the configured limit.
A node will start rejecting new indexing work at the replica stage when the number of outstanding replica indexing bytes exceeds 1.5x the configured limit. This design means that as indexing pressure builds on nodes, they will naturally stop accepting coordinating and primary work in favor of outstanding replica work.
indexing_pressure.memory.limit setting’s 10% default limit is generously
sized. You should only change it after careful consideration. Only indexing
requests contribute to this limit. This means there is additional indexing
overhead (buffers, listeners, etc) which also require heap space. Other
components of Elasticsearch also require memory. Setting this limit too high can deny
operating memory to other operations and components.
You can use the node stats API to retrieve indexing pressure metrics.
Indexing pressure settingsedit
- Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.