Create or update an autoscaling policy
Added in 7.11.0
NOTE: This feature is designed for indirect use by Elasticsearch Service, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported.
Path parameters
-
name
string Required The name of the autoscaling policy.
Query parameters
-
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Body
Required
-
roles
array[string] Required
-
deciders
object Required Decider settings.
curl \
--request PUT 'http://api.example.com/_autoscaling/policy/{name}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"roles\": [],\n \"deciders\": {\n \"fixed\": {\n }\n }\n}"'
{
"roles": [],
"deciders": {
"fixed": {
}
}
}
{
"roles" : [ "data_hot" ],
"deciders": {
"fixed": {
}
}
}
{
"acknowledged": true
}
Get node statistics
Get statistics for nodes in a cluster. By default, all stats are returned. You can limit the returned information by using metrics.
Path parameters
-
node_id
string | array[string] Required Comma-separated list of node IDs or names used to limit returned information.
Query parameters
-
completion_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata and suggest statistics.
-
fielddata_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata statistics.
-
fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in the statistics.
-
groups
boolean Comma-separated list of search groups to include in the search statistics.
-
include_segment_file_sizes
boolean If true, the call reports the aggregated disk usage of each one of the Lucene index files (only applies if segment stats are requested).
-
level
string Indicates whether statistics are aggregated at the cluster, index, or shard level.
Values are cluster, indices, or shards.
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
-
types
array[string] A comma-separated list of document types for the indexing index metric.
-
include_unloaded_segments
boolean If true, the response includes information from segments that are not loaded into memory.
curl \
--request GET 'http://api.example.com/_nodes/{node_id}/stats' \
--header "Authorization: $API_KEY"
Get node statistics
Get statistics for nodes in a cluster. By default, all stats are returned. You can limit the returned information by using metrics.
Query parameters
-
completion_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata and suggest statistics.
-
fielddata_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata statistics.
-
fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in the statistics.
-
groups
boolean Comma-separated list of search groups to include in the search statistics.
-
include_segment_file_sizes
boolean If true, the call reports the aggregated disk usage of each one of the Lucene index files (only applies if segment stats are requested).
-
level
string Indicates whether statistics are aggregated at the cluster, index, or shard level.
Values are cluster, indices, or shards.
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
-
types
array[string] A comma-separated list of document types for the indexing index metric.
-
include_unloaded_segments
boolean If true, the response includes information from segments that are not loaded into memory.
curl \
--request GET 'http://api.example.com/_nodes/{node_id}/stats/{metric}' \
--header "Authorization: $API_KEY"
Activate the connector draft filter
Technical preview
Activates the valid draft filtering for a connector.
Path parameters
-
connector_id
string Required The unique identifier of the connector to be updated
curl \
--request PUT 'http://api.example.com/_connector/{connector_id}/_filtering/_activate' \
--header "Authorization: $API_KEY"
Pause an auto-follow pattern
Added in 7.5.0
Pause a cross-cluster replication auto-follow pattern. When the API returns, the auto-follow pattern is inactive. New indices that are created on the remote cluster and match the auto-follow patterns are ignored.
You can resume auto-following with the resume auto-follow pattern API. When it resumes, the auto-follow pattern is active again and automatically configures follower indices for newly created indices on the remote cluster that match its patterns. Remote indices that were created while the pattern was paused will also be followed, unless they have been deleted or closed in the interim.
Path parameters
-
name
string Required The name of the auto-follow pattern to pause.
Query parameters
-
master_timeout
string The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never time out.
curl \
--request POST 'http://api.example.com/_ccr/auto_follow/{name}/pause' \
--header "Authorization: $API_KEY"
{
"acknowledged" : true
}
Get an enrich policy
Query parameters
-
master_timeout
string Period to wait for a connection to the master node.
curl \
--request GET 'http://api.example.com/_enrich/policy' \
--header "Authorization: $API_KEY"
Delete component templates
Added in 7.8.0
Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.
Path parameters
-
name
string | array[string] Required Comma-separated list or wildcard expression of component template names used to limit the request.
Query parameters
-
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request DELETE 'http://api.example.com/_component_template/{name}' \
--header "Authorization: $API_KEY"
Get index templates
Get information about one or more index templates.
IMPORTANT: This documentation is about legacy index templates, which are deprecated and will be replaced by the composable templates introduced in Elasticsearch 7.8.
Path parameters
-
name
string | array[string] Required Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported. To return all index templates, omit this parameter or use a value of _all or *.
Query parameters
-
flat_settings
boolean If true, returns settings in flat format.
-
local
boolean If true, the request retrieves information from the local node only.
-
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
curl \
--request GET 'http://api.example.com/_template/{name}' \
--header "Authorization: $API_KEY"
Force a merge
Added in 2.1.0
Perform the force merge operation on the shards of one or more indices. For data streams, the API forces a merge on the shards of the stream's backing indices.
Merging reduces the number of segments in each shard by merging some of them together and also frees up the space used by deleted documents. Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.
WARNING: We recommend force merging only a read-only index (meaning the index is no longer receiving writes). When documents are updated or deleted, the old version is not immediately removed but instead soft-deleted and marked with a "tombstone". These soft-deleted documents are automatically cleaned up during regular segment merges. But force merge can cause very large (greater than 5 GB) segments to be produced, which are not eligible for regular merges. So the number of soft-deleted documents can then grow rapidly, resulting in higher disk usage and worse search performance. If you regularly force merge an index receiving writes, this can also make snapshots more expensive, since the new documents can't be backed up incrementally.
Blocks during a force merge
Calls to this API block until the merge is complete (unless the request contains wait_for_completion=false).
If the client connection is lost before completion then the force merge process will continue in the background.
Any new requests to force merge the same indices will also block until the ongoing force merge is complete.
Running force merge asynchronously
If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to get the status of the task.
However, you cannot cancel this task as the force merge task is not cancelable.
Elasticsearch creates a record of this task as a document at _tasks/<task_id>.
When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
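For example, a force merge can be started asynchronously and its progress checked later through the tasks API (the index name is illustrative; the task ID comes from the response of the first request):
curl \
--request POST 'http://api.example.com/my-index-000001/_forcemerge?max_num_segments=1&wait_for_completion=false' \
--header "Authorization: $API_KEY"
curl \
--request GET 'http://api.example.com/_tasks/<task_id>' \
--header "Authorization: $API_KEY"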
Force merging multiple indices
You can force merge multiple indices with a single request by targeting:
- One or more data streams that contain multiple backing indices
- Multiple indices
- One or more aliases
- All data streams and indices in a cluster
Each targeted shard is force-merged separately using the force_merge threadpool.
By default each node only has a single force_merge thread, which means that the shards on that node are force-merged one at a time.
If you expand the force_merge threadpool on a node then it will force merge its shards in parallel.
Force merge makes the storage for the shard being merged temporarily increase, as it may require free space up to triple its size if the max_num_segments parameter is set to 1, to rewrite all segments into a new one.
Data streams and time-based indices
Force-merging is useful for managing a data stream's older backing indices and other time-based indices, particularly after a rollover. In these cases, each index only receives indexing traffic for a certain period of time. Once an index receives no more writes, its shards can be force-merged to a single segment. This can be a good idea because single-segment shards can sometimes use simpler and more efficient data structures to perform searches. For example:
POST /.ds-my-data-stream-2099.03.07-000001/_forcemerge?max_num_segments=1
Path parameters
-
index
string | array[string] Required A comma-separated list of index names; use _all or an empty string to perform the operation on all indices.
Query parameters
-
allow_no_indices
boolean Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes the _all string or when no indices have been specified.)
-
expand_wildcards
string | array[string] Whether to expand wildcard expression to concrete indices that are open, closed or both.
Supported values include:
- all: Match any data stream or index, including hidden ones.
- open: Match open, non-hidden indices. Also matches any non-hidden data stream.
- closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
- hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
- none: Wildcard expressions are not accepted.
-
flush
boolean Specify whether the index should be flushed after performing the operation (default: true)
-
max_num_segments
number The number of segments the index should be merged into (default: dynamic)
-
only_expunge_deletes
boolean Specify whether the operation should only expunge deleted documents
-
wait_for_completion
boolean Should the request wait until the force merge is completed.
curl \
--request POST 'http://api.example.com/{index}/_forcemerge' \
--header "Authorization: $API_KEY"
Get mapping definitions
Retrieves mapping definitions for one or more fields. For data streams, the API retrieves field mappings for the stream’s backing indices.
This API is useful if you don't need a complete mapping or if an index mapping contains a large number of fields.
Path parameters
-
index
string | array[string] Required Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.
-
fields
string | array[string] Required Comma-separated list or wildcard expression of fields used to limit returned information. Supports wildcards (*).
Query parameters
-
allow_no_indices
boolean If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.
-
expand_wildcards
string | array[string] Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.
Supported values include:
- all: Match any data stream or index, including hidden ones.
- open: Match open, non-hidden indices. Also matches any non-hidden data stream.
- closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
- hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
- none: Wildcard expressions are not accepted.
-
include_defaults
boolean If true, return all default settings in the response.
curl \
--request GET 'http://api.example.com/{index}/_mapping/field/{fields}' \
--header "Authorization: $API_KEY"
{
"publications": {
"mappings": {
"title": {
"full_name": "title",
"mapping": {
"title": {
"type": "text"
}
}
}
}
}
}
{
"publications": {
"mappings": {
"author.id": {
"full_name": "author.id",
"mapping": {
"id": {
"type": "text"
}
}
},
"abstract": {
"full_name": "abstract",
"mapping": {
"abstract": {
"type": "text"
}
}
}
}
}
}
{
"publications": {
"mappings": {
"author.name": {
"full_name": "author.name",
"mapping": {
"name": {
"type": "text"
}
}
},
"abstract": {
"full_name": "abstract",
"mapping": {
"abstract": {
"type": "text"
}
}
},
"author.id": {
"full_name": "author.id",
"mapping": {
"id": {
"type": "text"
}
}
}
}
}
}
Get index recovery information
Get information about ongoing and completed shard recoveries for one or more indices. For data streams, the API returns information for the stream's backing indices.
All recoveries, whether ongoing or complete, are kept in the cluster state and may be reported on at any time.
Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or creating a replica shard from a primary shard. When a shard recovery completes, the recovered shard is available for search and indexing.
Recovery automatically occurs during the following processes:
- When creating an index for the first time.
- When a node rejoins the cluster and starts up any missing primary shard copies using the data that it holds in its data path.
- Creation of new replica shard copies from the primary.
- Relocation of a shard copy to a different node in the same cluster.
- A snapshot restore operation.
- A clone, shrink, or split operation.
You can determine the cause of a shard recovery using the recovery or cat recovery APIs.
The index recovery API reports information about completed recoveries only for shard copies that currently exist in the cluster. It only reports the last recovery for each shard copy and does not report historical information about earlier recoveries, nor does it report information about the recoveries of shard copies that no longer exist. This means that if a shard copy completes a recovery and then Elasticsearch relocates it onto a different node then the information about the original recovery will not be shown in the recovery API.
Path parameters
-
index
string | array[string] Required Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.
Query parameters
-
active_only
boolean If true, the response only includes ongoing shard recoveries.
-
detailed
boolean If true, the response includes detailed information about shard recoveries.
curl \
--request GET 'http://api.example.com/{index}/_recovery' \
--header "Authorization: $API_KEY"
{
"index1" : {
"shards" : [ {
"id" : 0,
"type" : "SNAPSHOT",
"stage" : "INDEX",
"primary" : true,
"start_time" : "2014-02-24T12:15:59.716",
"start_time_in_millis": 1393244159716,
"stop_time" : "0s",
"stop_time_in_millis" : 0,
"total_time" : "2.9m",
"total_time_in_millis" : 175576,
"source" : {
"repository" : "my_repository",
"snapshot" : "my_snapshot",
"index" : "index1",
"version" : "{version}",
"restoreUUID": "PDh1ZAOaRbiGIVtCvZOMww"
},
"target" : {
"id" : "ryqJ5lO5S4-lSFbGntkEkg",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"index" : {
"size" : {
"total" : "75.4mb",
"total_in_bytes" : 79063092,
"reused" : "0b",
"reused_in_bytes" : 0,
"recovered" : "65.7mb",
"recovered_in_bytes" : 68891939,
"recovered_from_snapshot" : "0b",
"recovered_from_snapshot_in_bytes" : 0,
"percent" : "87.1%"
},
"files" : {
"total" : 73,
"reused" : 0,
"recovered" : 69,
"percent" : "94.5%"
},
"total_time" : "0s",
"total_time_in_millis" : 0,
"source_throttle_time" : "0s",
"source_throttle_time_in_millis" : 0,
"target_throttle_time" : "0s",
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time" : "0s",
"total_time_in_millis" : 0
},
"verify_index" : {
"check_index_time" : "0s",
"check_index_time_in_millis" : 0,
"total_time" : "0s",
"total_time_in_millis" : 0
}
} ]
}
}
{
"index1" : {
"shards" : [ {
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time" : "2014-02-24T12:38:06.349",
"start_time_in_millis" : "1393245486349",
"stop_time" : "2014-02-24T12:38:08.464",
"stop_time_in_millis" : "1393245488464",
"total_time" : "2.1s",
"total_time_in_millis" : 2115,
"source" : {
"id" : "RGMdRc-yQWWKIBM4DGvwqQ",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"target" : {
"id" : "RGMdRc-yQWWKIBM4DGvwqQ",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"index" : {
"size" : {
"total" : "24.7mb",
"total_in_bytes" : 26001617,
"reused" : "24.7mb",
"reused_in_bytes" : 26001617,
"recovered" : "0b",
"recovered_in_bytes" : 0,
"recovered_from_snapshot" : "0b",
"recovered_from_snapshot_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 26,
"reused" : 26,
"recovered" : 0,
"percent" : "100.0%",
"details" : [ {
"name" : "segments.gen",
"length" : 20,
"recovered" : 20
}, {
"name" : "_0.cfs",
"length" : 135306,
"recovered" : 135306,
"recovered_from_snapshot": 0
}, {
"name" : "segments_2",
"length" : 251,
"recovered" : 251,
"recovered_from_snapshot": 0
}
]
},
"total_time" : "2ms",
"total_time_in_millis" : 2,
"source_throttle_time" : "0s",
"source_throttle_time_in_millis" : 0,
"target_throttle_time" : "0s",
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 71,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time" : "2.0s",
"total_time_in_millis" : 2025
},
"verify_index" : {
"check_index_time" : 0,
"check_index_time_in_millis" : 0,
"total_time" : "88ms",
"total_time_in_millis" : 88
}
} ]
}
}
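For example, the response can be restricted to ongoing recoveries with per-file detail (the index name is illustrative):
curl \
--request GET 'http://api.example.com/index1/_recovery?active_only=true&detailed=true' \
--header "Authorization: $API_KEY"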
Shrink an index
Added in 5.0.0
Shrink an index into a new index with fewer primary shards.
Before you can shrink an index:
- The index must be read-only.
- A copy of every shard in the index must reside on the same node.
- The index must have a green health status.
To make shard allocation easier, we recommend you also remove the index's replica shards. You can later re-add replica shards as part of the shrink operation.
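For example, a source index could be prepared for shrinking with a settings update along these lines, removing replicas, relocating all shard copies to one node, and blocking writes (the index and node names are illustrative):
curl \
--request PUT 'http://api.example.com/my_source_index/_settings' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "settings": {
    "index.number_of_replicas": 0,
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}'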
The requested number of primary shards in the target index must be a factor of the number of shards in the source index. For example, an index with 8 primary shards can be shrunk into 4, 2, or 1 primary shards, and an index with 15 primary shards can be shrunk into 5, 3, or 1. If the number of shards in the index is a prime number, it can only be shrunk into a single primary shard. Before shrinking, a (primary or replica) copy of every shard in the index must be present on the same node.
The current write index on a data stream cannot be shrunk. In order to shrink the current write index, the data stream must first be rolled over so that a new write index is created and then the previous write index can be shrunk.
A shrink operation:
- Creates a new target index with the same definition as the source index, but with a smaller number of primary shards.
- Hard-links segments from the source index into the target index. If the file system does not support hard-linking, then all segments are copied into the new index, which is a much more time-consuming process. Also, if using multiple data paths, shards on different data paths require a full copy of segment files if they are not on the same disk, since hard links do not work across disks.
- Recovers the target index as though it were a closed index which had just been re-opened. Recovers shards to the index.routing.allocation.initial_recovery._id index setting.
IMPORTANT: Indices can only be shrunk if they satisfy the following requirements:
- The target index must not exist.
- The source index must have more primary shards than the target index.
- The number of primary shards in the target index must be a factor of the number of primary shards in the source index.
- The index must not contain more than 2,147,483,519 documents in total across all shards that will be shrunk into a single shard on the target index as this is the maximum number of docs that can fit into a single shard.
- The node handling the shrink process must have sufficient free disk space to accommodate a second copy of the existing index.
Query parameters
-
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
-
wait_for_active_shards
number | string The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).
curl \
--request PUT 'http://api.example.com/{index}/_shrink/{target}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"settings\": {\n \"index.routing.allocation.require._name\": null,\n \"index.blocks.write\": null\n }\n}"'
{
"settings": {
"index.routing.allocation.require._name": null,
"index.blocks.write": null
}
}
Get an inference endpoint
curl \
--request GET 'http://api.example.com/_inference' \
--header "Authorization: $API_KEY"
Create a Hugging Face inference endpoint
Added in 8.12.0
Create an inference endpoint to perform an inference task with the hugging_face service.
You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL.
Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section.
Create the endpoint and copy the URL after the endpoint initialization has finished.
The following models are recommended for the Hugging Face service:
all-MiniLM-L6-v2
all-MiniLM-L12-v2
all-mpnet-base-v2
e5-base-v2
e5-small-v2
multilingual-e5-base
multilingual-e5-small
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
After creating the endpoint, wait for the model deployment to complete before using it.
To verify the deployment status, use the get trained model statistics API.
Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count".
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
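For example, the deployment status of the model backing an endpoint can be checked with the get trained model statistics API (the model ID is illustrative):
curl \
--request GET 'http://api.example.com/_ml/trained_models/my-model/_stats' \
--header "Authorization: $API_KEY"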
Path parameters
-
task_type
string Required The type of the inference task that the model will perform.
Value is text_embedding.
-
huggingface_inference_id
string Required The unique identifier of the inference endpoint.
Body
-
chunking_settings
object
-
service
string Required Value is hugging_face.
-
service_settings
object Required
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{huggingface_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"service\": \"hugging_face\",\n \"service_settings\": {\n \"api_key\": \"hugging-face-access-token\", \n \"url\": \"url-endpoint\" \n }\n}"'
{
"service": "hugging_face",
"service_settings": {
"api_key": "hugging-face-access-token",
"url": "url-endpoint"
}
}
Get GeoIP statistics
Added in 7.13.0
Get download statistics for GeoIP2 databases that are used with the GeoIP processor.
curl \
--request GET 'http://api.example.com/_ingest/geoip/stats' \
--header "Authorization: $API_KEY"