Nodes orchestration
NodeSets overview
The Elasticsearch cluster is specified using a list of NodeSets. Each NodeSet represents a group of Elasticsearch nodes sharing the same specification (both Elasticsearch configuration and Kubernetes Pod configuration).
```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.19.8
  nodeSets:
  # Dedicated master-eligible nodes with small volumes
  - name: master-nodes
    count: 3
    config:
      node.roles: ["master"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: standard
  # Dedicated data nodes with large volumes
  - name: data-nodes
    count: 10
    config:
      node.roles: ["data"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1000Gi
        storageClassName: standard
```
The Elasticsearch resource above defines two NodeSets: one for master nodes, using 10Gi volumes, and one for data nodes, using 1000Gi volumes. The Elasticsearch cluster is composed of 13 nodes: 3 master nodes and 10 data nodes.
Upgrading the cluster
ECK handles smooth upgrades from one cluster specification to another. You can apply a new Elasticsearch specification at any time.
Here are a few examples based on the Elasticsearch specification above:
- To add five additional Elasticsearch data nodes: change count: 10 to count: 15 in the data-nodes NodeSet.
- To increase the memory limit of data nodes to 32Gi: set different resource limits in the PodTemplate of the existing data-nodes NodeSet.
- To replace dedicated master and dedicated data nodes with nodes having both master and data roles: replace the two existing NodeSets with a single one that has a different name and the corresponding Elasticsearch configuration settings.
- To upgrade Elasticsearch from version 7.2.0 to 7.3.0: change the value in the version field.
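As a sketch of the first two changes, the data-nodes NodeSet from the specification above might be edited as follows. The podTemplate layout (a container named elasticsearch carrying resources.limits.memory) follows the usual ECK convention. The matching memory request is an assumption added for illustration, and the rest of the NodeSet (config, volumeClaimTemplates) is assumed to stay unchanged.

```yaml
# Fragment of spec.nodeSets; other NodeSets and fields are left as they were.
- name: data-nodes
  count: 15                     # was 10: adds five data nodes
  podTemplate:
    spec:
      containers:
      - name: elasticsearch     # must match the name of the container ECK generates
        resources:
          requests:
            memory: 32Gi        # assumption: request set equal to the limit
          limits:
            memory: 32Gi        # new memory limit for each data node Pod
```

Both changes can be applied in a single update of the Elasticsearch resource, or one at a time.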
ECK orchestrates NodeSet changes with no downtime and makes sure that:
- Before a node is removed, the relevant data is migrated to other nodes.
- When a cluster topology changes, the Elasticsearch orchestration settings discovery.seed_hosts, cluster.initial_master_nodes, discovery.zen.minimum_master_nodes, and _cluster/voting_config_exclusions are adjusted accordingly.
- Rolling upgrades are performed safely, reusing the PersistentVolumes of the upgraded Elasticsearch nodes.
StatefulSets orchestration
Behind the scenes, ECK translates each NodeSet specified in the Elasticsearch resource into a StatefulSet in Kubernetes. The StatefulSet specification is based on the NodeSet specification:
- count corresponds to the number of replicas in the StatefulSet. Each replica leads to the creation of a Pod, which corresponds to a single Elasticsearch node.
- podTemplate can be used to specify custom settings for the Elasticsearch Pod, overriding the default ones set by ECK on the generated StatefulSet specification.
- The StatefulSet name is built from the Elasticsearch resource name and the NodeSet name. Each Pod is assigned the StatefulSet name suffixed by an ordinal. The corresponding Elasticsearch node has the same name as the Pod.
The actual Pod creation is handled by the StatefulSet controller in Kubernetes. ECK relies on the OnDelete StatefulSet update strategy since it needs full control over when and how Pods get upgraded to a new revision.
When a Pod is removed and recreated (possibly at a newer revision), the StatefulSet controller makes sure that the PersistentVolumes attached to the original Pod are attached to the new Pod.
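To make the mapping concrete, here is an abridged sketch of the StatefulSet that ECK could generate for the data-nodes NodeSet of the quickstart cluster. The exact name pattern (cluster name, then -es-, then NodeSet name) is an assumption based on typical ECK deployments. Required fields such as selector, serviceName, and the Pod template are omitted, so this fragment is illustrative rather than something to apply.

```yaml
# Abridged, illustrative view of a generated StatefulSet (not meant to be applied).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quickstart-es-data-nodes   # assumed pattern: <cluster>-es-<nodeSet>
spec:
  replicas: 10                     # taken from the NodeSet count
  updateStrategy:
    type: OnDelete                 # Pods are only replaced once they are deleted
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data     # carried over from the NodeSet volumeClaimTemplates
# Under the assumed pattern, Pods and Elasticsearch nodes would be named
# quickstart-es-data-nodes-0 through quickstart-es-data-nodes-9.
```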
Cluster upgrade patterns
Depending on how the NodeSets are updated, ECK handles the reconciliation of Kubernetes resources in various ways.
- When a new NodeSet is added to the Elasticsearch resource, ECK creates the corresponding StatefulSet. It also sets up Secrets and ConfigMaps to hold the TLS certificates and Elasticsearch configuration files.
- When the node count of an existing NodeSet is increased, ECK increases the replicas of the corresponding StatefulSet.
- When the node count of an existing NodeSet is decreased, ECK migrates data away from the Elasticsearch nodes to be removed, then decreases the replicas of the corresponding StatefulSet once data migration is over. The corresponding PersistentVolumeClaims are automatically removed.
- When an existing NodeSet is removed, ECK migrates data away from the Elasticsearch nodes to be removed, decreases the StatefulSet replicas accordingly, then removes the corresponding StatefulSet.
- When the specification of an existing NodeSet is updated (for example, the Elasticsearch configuration or the PodTemplate resource requirements), ECK performs a rolling upgrade of the corresponding Elasticsearch nodes. It follows Elasticsearch rolling upgrade best practices, upgrading Pods to the newest revision gradually while keeping the Elasticsearch cluster available. In most cases, this means restarting Elasticsearch nodes one by one and reusing the same PersistentVolume data. Note that some cluster topologies may cause the cluster to be unavailable during the upgrade.
- When an existing NodeSet is renamed, ECK creates a new NodeSet with the new name and removes the old NodeSet, according to the NodeSet creation and removal patterns described above. Elasticsearch data is migrated away from the old NodeSet before removal. The update strategy of the Elasticsearch resource controls how many nodes can exist above or below the target node count during the upgrade.
In all these cases, ECK handles StatefulSet operations according to the Elasticsearch orchestration best practices, by adjusting the orchestration settings discovery.seed_hosts, cluster.initial_master_nodes, discovery.zen.minimum_master_nodes, and _cluster/voting_config_exclusions accordingly.
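The update strategy mentioned for renamed NodeSets is set on the Elasticsearch resource itself. A minimal sketch, assuming the spec.updateStrategy.changeBudget fields of the ECK API; the values below are illustrative, not recommendations:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 8.19.8
  updateStrategy:
    changeBudget:
      maxSurge: 1          # at most one node above the target node count
      maxUnavailable: 1    # at most one node below the target node count
  # nodeSets omitted for brevity; they stay as defined in the overview
```

A larger maxSurge lets a renamed NodeSet come up faster at the cost of temporarily running more Pods. A smaller budget keeps resource usage closer to the target during the change.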
Limitations
Based on how Kubernetes and StatefulSets operate, ECK orchestration has the following limitations:
- Storage requirements (including volume size) of an existing NodeSet cannot be updated. StatefulSet volume expansion is not available in Kubernetes yet. To change the storage size, you can create a new NodeSet or rename an existing one. Renaming a NodeSet automatically creates a new StatefulSet with the specified storage size. The original StatefulSet is removed once the Elasticsearch data is migrated to the nodes of the new StatefulSet.
- Cluster availability is not guaranteed in the following cases:
  - During the rolling upgrade of single-node clusters
  - For clusters that have indices with no replicas
  If an Elasticsearch node holds the only copy of a shard, this shard becomes unavailable while the node is upgraded. Clusters with more than one node and at least one replica per index are considered best practice.
- Elasticsearch Pods may stay Pending during a rolling upgrade if the Kubernetes scheduler cannot reschedule them. This is especially important when using local PersistentVolumes. If the Kubernetes node bound to a local PersistentVolume does not have enough capacity to host an upgraded Pod which was temporarily removed, that Pod will stay Pending.
- Rolling upgrades can only make progress if the Elasticsearch cluster health is green. It is risky to attempt upgrading a cluster in the yellow state, as some shards could become completely unavailable and degrade the cluster health to red. ECK takes the cautious approach of waiting for green health before progressing, but advanced users may force an upgrade by manually deleting Pods themselves. The deleted Pods will be automatically recreated at the latest revision. There are two exceptions to this rule:
  - If all the Elasticsearch nodes of a NodeSet are unavailable, probably because of a misconfiguration, the operator ignores the cluster health and upgrades the nodes of the NodeSet.
  - If an Elasticsearch node to upgrade is not healthy and not part of the Elasticsearch cluster, the operator ignores the cluster health and upgrades the Elasticsearch node.
- Elasticsearch versions cannot be downgraded. For example, it is impossible to downgrade an existing cluster from version 7.3.0 to 7.2.0. This is not supported by Elasticsearch.
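The storage limitation above can be worked around by renaming the NodeSet. A sketch, assuming the data-nodes NodeSet from the overview should grow from 1000Gi to 2000Gi; the new name data-nodes-2000gi and the target size are hypothetical choices for this example:

```yaml
# Fragment of spec.nodeSets: this entry replaces the original data-nodes entry.
- name: data-nodes-2000gi      # hypothetical new name; the rename leads to a new StatefulSet
  count: 10
  # config: unchanged from the original data-nodes NodeSet
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 2000Gi      # hypothetical new size (was 1000Gi)
      storageClassName: standard
```

Because the old and new NodeSets coexist while data is migrated, the Kubernetes cluster needs enough spare capacity to host the additional Pods and volumes during the transition.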