Azure repository

You can use Azure Blob storage as a repository for Snapshot and restore.

Elasticsearch uses an internal client module to connect to Azure Blob storage, referred to in this document as the Azure client or the Azure repository client. Clients are configured through a combination of secure settings defined in the Elasticsearch keystore, and standard settings defined in the elasticsearch.yml configuration file.

Setup

To enable Azure repositories, first configure an Azure repository client by specifying one or more settings of the form azure.client.CLIENT_NAME.SETTING_NAME. By default, Azure repositories use a client named default, but you may specify a different client name when registering each repository.

The only mandatory setting for an Azure repository client is account, which is a secure setting defined in the Elasticsearch keystore. To provide this setting, use the elasticsearch-keystore tool on each node:

bin/elasticsearch-keystore add azure.client.default.account

If you adjust this setting after a node has started, call the Nodes reload secure settings API to reload the new value.

You may define more than one client by setting their account values. For example, to set the default client and another client called secondary, run the following commands on each node:

bin/elasticsearch-keystore add azure.client.default.account
bin/elasticsearch-keystore add azure.client.secondary.account

The key and sas_token settings are also secure settings and can be set using commands like the following:

bin/elasticsearch-keystore add azure.client.default.key
bin/elasticsearch-keystore add azure.client.secondary.sas_token

Other Azure repository client settings must be set in elasticsearch.yml before the node starts. For example:

azure.client.default.timeout: 10s
azure.client.default.max_retries: 7
azure.client.default.endpoint_suffix: core.chinacloudapi.cn
azure.client.secondary.timeout: 30s

In this example, repositories that use the default client have a client-side timeout of 10s per try, make 7 retries before failing, and use an endpoint suffix of core.chinacloudapi.cn. Repositories that use the secondary client have a timeout of 30s per try, but use the default endpoint and fail after the default number of retries.
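The following Python sketch illustrates how setting names of the form azure.client.CLIENT_NAME.SETTING_NAME map onto individual clients. It is not how Elasticsearch parses its configuration; the function name group_client_settings and the dictionary representation are assumptions made for illustration only.

```python
from collections import defaultdict

PREFIX = "azure.client."


def group_client_settings(flat_settings):
    """Group flat 'azure.client.CLIENT_NAME.SETTING_NAME' keys by client name.

    Hypothetical helper for illustration; not part of Elasticsearch.
    """
    clients = defaultdict(dict)
    for key, value in flat_settings.items():
        if not key.startswith(PREFIX):
            continue  # not an Azure client setting
        rest = key[len(PREFIX):]
        # The client name is everything up to the first dot; the remainder
        # is the setting name.
        client_name, sep, setting_name = rest.partition(".")
        if not client_name or not sep or not setting_name:
            raise ValueError(f"malformed client setting: {key!r}")
        clients[client_name][setting_name] = value
    return dict(clients)


# The same values as the elasticsearch.yml example above.
settings = {
    "azure.client.default.timeout": "10s",
    "azure.client.default.max_retries": 7,
    "azure.client.default.endpoint_suffix": "core.chinacloudapi.cn",
    "azure.client.secondary.timeout": "30s",
}
clients = group_client_settings(settings)
```

Here clients["secondary"] contains only timeout, which mirrors the text above: any setting not given for a client keeps its default value.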

Once an Azure repository client is configured correctly, register an Azure repository as follows, providing the client name using the client repository setting:

PUT _snapshot/my_backup
{
  "type": "azure",
  "settings": {
    "client": "secondary"
  }
}

If you are using the default client, you may omit the client repository setting:

PUT _snapshot/my_backup
{
  "type": "azure"
}

Note

In-progress snapshot or restore jobs are not preempted by a reload of the storage secure settings. They complete using the client as it was built when the operation started.

Client settings

You can configure Azure client settings for authentication, service connectivity, request handling, and network proxy behavior. For a complete list of all Azure client settings, refer to Azure repository client settings.

Obtaining credentials from the environment

If you specify neither the key nor the sas_token setting for a client, Elasticsearch attempts to obtain credentials automatically from the environment in which it is running, using mechanisms built into the Azure SDK. This is ideal when running Elasticsearch on the Azure platform.

When running Elasticsearch on an Azure Virtual Machine, you should use Azure Managed Identity to provide credentials to Elasticsearch. To use Azure Managed Identity, assign a suitably authorized identity to the Azure Virtual Machine on which Elasticsearch is running.

When running Elasticsearch in Azure Kubernetes Service, for instance using Elastic Cloud on Kubernetes, you should use Azure Workload Identity to provide credentials to Elasticsearch. To use Azure Workload Identity, mount the azure-identity-token volume as a subdirectory of the Elasticsearch config directory and set the AZURE_FEDERATED_TOKEN_FILE environment variable to point to a file called azure-identity-token within the mounted volume.
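As a rough aid for checking the Workload Identity layout described above, the following Python sketch tests whether AZURE_FEDERATED_TOKEN_FILE names a file called azure-identity-token inside a subdirectory of the config directory. The helper token_file_problems and the example paths are hypothetical and are not part of Elasticsearch or the Azure SDK. The check is purely lexical: it does not resolve symlinks or ".." components and does not touch the filesystem.

```python
from pathlib import PurePosixPath

TOKEN_FILE_NAME = "azure-identity-token"


def token_file_problems(config_dir, env):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration only.
    """
    token_file = env.get("AZURE_FEDERATED_TOKEN_FILE")
    if not token_file:
        return ["AZURE_FEDERATED_TOKEN_FILE is not set"]

    problems = []
    token_path = PurePosixPath(token_file)
    config_path = PurePosixPath(config_dir)

    if token_path.name != TOKEN_FILE_NAME:
        problems.append(f"token file should be named {TOKEN_FILE_NAME}")

    # The volume is mounted as a subdirectory of the config directory, so
    # the directory holding the token must sit strictly below config_dir.
    if config_path not in token_path.parent.parents:
        problems.append("token file is not under a subdirectory of the config directory")

    return problems


# Example paths are assumptions; substitute the paths of your deployment.
config_dir = "/usr/share/elasticsearch/config"
good_env = {
    "AZURE_FEDERATED_TOKEN_FILE":
        "/usr/share/elasticsearch/config/azure/tokens/azure-identity-token"
}
print(token_file_problems(config_dir, good_env))
```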

The Azure SDK has several other mechanisms to automatically obtain credentials from its environment, but the two methods described above are the only ones that are tested and supported for use in Elasticsearch.

Repository settings

The Azure repository supports a number of settings to customize how data is stored, which may be specified when creating the repository.

Repository settings cover storage location, data layout, transfer behavior, throughput limits, and cleanup tuning. For a complete list of all Azure repository settings, refer to Azure repository settings.

Repository validation rules

According to the containers naming guide, a container name must be a valid DNS name, conforming to the following naming rules:

Container names must start with a letter or number, and can contain only letters, numbers, and the dash (-) character.
Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not permitted in container names.
All letters in a container name must be lowercase.
Container names must be from 3 through 63 characters long.
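The naming rules above can be expressed as a short check. The following Python sketch is one possible encoding of those four rules, assuming that "letters" means the ASCII letters a to z; the function name is_valid_container_name is hypothetical, and Azure itself remains the authority on which names it accepts.

```python
import re

# One or more runs of lowercase letters or digits, separated by single
# dashes. This covers the first three rules: the name starts (and ends)
# with a letter or number, every dash has a letter or number on both
# sides, and uppercase letters are rejected.
_CONTAINER_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name: str) -> bool:
    """Check a container name against the rules listed above."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


for candidate in ["my-backup", "my--backup", "-backup", "MyBackup", "ab"]:
    print(candidate, is_valid_container_name(candidate))
```

Only the first candidate passes; the others violate the consecutive-dash, leading-dash, lowercase, and minimum-length rules respectively.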

Supported Azure Storage Account types

The Azure repository type works with all Standard storage accounts:

Standard Locally Redundant Storage - Standard_LRS
Standard Zone-Redundant Storage - Standard_ZRS
Standard Geo-Redundant Storage - Standard_GRS
Standard Read Access Geo-Redundant Storage - Standard_RAGRS

Premium Locally Redundant Storage (Premium_LRS) is not supported as it is only usable as VM disk storage, not as general storage.

Linearizable register implementation

The linearizable register implementation for Azure repositories is based on Azure’s support for strongly consistent leases. Each lease may only be held by a single node at any time. The node presents its lease when performing a read or write operation on a protected blob. Lease-protected operations fail if the lease is invalid or expired. To perform a compare-and-exchange operation on a register, Elasticsearch first obtains a lease on the blob, then reads the blob contents under the lease, and finally uploads the updated blob under the same lease. This process ensures that the read and write operations happen atomically.
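The acquire, read, and write sequence can be illustrated with a small in-memory model. The Python sketch below is not Elasticsearch's implementation and does not call the Azure SDK: the LeasedBlob class, its method names, and the explicit release of the lease at the end are assumptions made for illustration, and lease expiry is not modeled.

```python
import uuid


class LeaseError(Exception):
    """Raised when a lease cannot be acquired or is not valid."""


class LeasedBlob:
    """In-memory stand-in for a blob protected by an exclusive lease."""

    def __init__(self, contents=b""):
        self._contents = contents
        self._lease_id = None

    def acquire_lease(self):
        # Only one holder at a time: a second acquire fails.
        if self._lease_id is not None:
            raise LeaseError("lease already held")
        self._lease_id = uuid.uuid4().hex
        return self._lease_id

    def release_lease(self, lease_id):
        self._check(lease_id)
        self._lease_id = None

    def _check(self, lease_id):
        if lease_id is None or lease_id != self._lease_id:
            raise LeaseError("invalid or expired lease")

    def read(self, lease_id):
        self._check(lease_id)
        return self._contents

    def write(self, lease_id, contents):
        self._check(lease_id)
        self._contents = contents


def compare_and_exchange(blob, expected, updated):
    """Write `updated` only if the blob holds `expected`.

    Returns the value that was read under the lease.
    """
    lease_id = blob.acquire_lease()                # 1. obtain a lease
    try:
        current = blob.read(lease_id)              # 2. read under the lease
        if current == expected:
            blob.write(lease_id, updated)          # 3. write under the same lease
        return current
    finally:
        blob.release_lease(lease_id)               # assumption: release when done


register = LeasedBlob(b"1")
witnessed = compare_and_exchange(register, b"1", b"2")  # expectation matches
stale = compare_and_exchange(register, b"1", b"3")      # expectation is stale
```

Because no other caller can hold the lease between the read and the write, the second call observes the value written by the first and leaves the register unchanged.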