Discovery is the process by which the cluster formation module finds other nodes with which to form a cluster. This process runs when you start an Elasticsearch node or when a node believes the master node failed and continues until the master node is found or a new master node is elected.
This process starts with a list of seed addresses from one or more seed hosts providers, together with the addresses of any master-eligible nodes that were in the last-known cluster. The process operates in two phases: First, each node probes the seed addresses by connecting to each address and attempting to identify the node to which it is connected and to verify that it is master-eligible. Secondly, if successful, it shares with the remote node a list of all of its known master-eligible peers and the remote node responds with its peers in turn. The node then probes all the new nodes that it just discovered, requests their peers, and so on.
If the node is not master-eligible then it continues this discovery process
until it has discovered an elected master node. If no elected master is
discovered then the node will retry after
If the node is master-eligible then it continues this discovery process until
it has either discovered an elected master node or else it has discovered
enough masterless master-eligible nodes to complete an election. If neither of
these occur quickly enough then the node will retry after
discovery.find_peers_interval which defaults to
Once a master is elected, it will normally remain as the elected master until it is deliberately stopped. It may also stop acting as the master if fault detection determines the cluster to be faulty. When a node stops being the elected master, it begins the discovery process again.
In most cases, the discovery process completes quickly, and the master node remains elected for a long period of time. If the cluster has no master for more than a few seconds or the master is unstable, the logs for each node will contain information explaining why:
All nodes repeatedly log messages indicating that a master cannot be
discovered or elected using a logger called
org.elasticsearch.cluster.coordination.ClusterFormationFailureHelper. By default, this happens every 10 seconds.
If a node wins the election, it logs a message containing
elected-as-master. If this happens repeatedly, the master node is unstable.
When a node discovers the master or believes the master to have failed, it
logs a message containing
master node changed.
- If a node is unable to discover or elect a master for several minutes, it starts to report additional details about the failures in its logs. Be sure to capture log messages covering at least five minutes of discovery problems.
If your cluster doesn’t have a stable master, many of its features won’t work correctly. The cluster may report many kinds of error to clients and in its logs. You must fix the master node’s instability before addressing these other issues. It will not be possible to solve any other issues while the master node is unstable.
The logs from the
ClusterFormationFailureHelper may indicate that a master
election requires a certain set of nodes and that it has not discovered enough
nodes to form a quorum. If so, you must address the reason preventing Elasticsearch from
discovering the missing nodes. The missing nodes are needed to reconstruct the
cluster metadata. Without the cluster metadata, the data in your cluster is
meaningless. The cluster metadata is stored on a subset of the master-eligible
nodes in the cluster. If a quorum cannot be discovered then the missing nodes
were the ones holding the cluster metadata. If you cannot bring the missing
nodes back into the cluster, start a new cluster and restore data from a recent
snapshot. Refer to Quorum-based decision making for more information.
The logs from the
ClusterFormationFailureHelper may also indicate that it has
discovered a possible quorum of master-eligible nodes. If so, the usual reason
that the cluster cannot elect a master is that one of the other nodes cannot
discover a quorum. Inspect the logs on the other master-eligible nodes and
ensure that every node has discovered a quorum.
By default the cluster formation module offers two seed hosts providers to
configure the list of seed nodes: a settings-based and a file-based seed
hosts provider. It can be extended to support cloud environments and other
forms of seed hosts providers via discovery plugins.
Seed hosts providers are configured using the
setting, which defaults to the settings-based hosts provider. This setting
accepts a list of different providers, allowing you to make use of multiple
ways to find the seed hosts for your cluster.
Each seed hosts provider yields the IP addresses or hostnames of the seed
nodes. If it returns any hostnames then these are resolved to IP addresses
using a DNS lookup. If a hostname resolves to multiple IP addresses then Elasticsearch
tries to find a seed node at all of these addresses. If the hosts provider does
not explicitly give the TCP port of the node by then, it will implicitly use the
first port in the port range given by
transport.profiles.default.port, or by
transport.profiles.default.port is not set. The number of
concurrent lookups is controlled by
discovery.seed_resolver.max_concurrent_resolvers which defaults to
the timeout for each lookup is controlled by
which defaults to
5s. Note that DNS lookups are subject to
JVM DNS caching.
Settings-based seed hosts provideredit
The settings-based seed hosts provider uses a node setting to configure a static list of the addresses of the seed nodes. These addresses can be given as hostnames or IP addresses; hosts specified as hostnames are resolved to IP addresses during each round of discovery.
The list of hosts is set using the
static setting. For example:
The port will default to
If a hostname resolves to multiple IP addresses, Elasticsearch will attempt to connect to every resolved address.
File-based seed hosts provideredit
The file-based seed hosts provider configures a list of hosts via an external file. Elasticsearch reloads this file when it changes, so that the list of seed nodes can change dynamically without needing to restart each node. For example, this gives a convenient mechanism for an Elasticsearch instance that is run in a Docker container to be dynamically supplied with a list of IP addresses to connect to when those IP addresses may not be known at node startup.
To enable file-based discovery, configure the
file hosts provider as follows
Then create a file at
$ES_PATH_CONF/unicast_hosts.txt in the format described
below. Any time a change is made to the
unicast_hosts.txt file the new
changes will be picked up by Elasticsearch and the new hosts list will be used.
Note that the file-based discovery plugin augments the unicast hosts list in
elasticsearch.yml: if there are valid seed addresses in
discovery.seed_hosts then Elasticsearch uses those addresses in addition to those
unicast_hosts.txt file contains one node entry per line. Each node entry
consists of the host (host name or IP address) and an optional transport port
number. If the port number is specified, is must come immediately after the
host (on the same line) separated by a
:. If the port number is not
specified, Elasticsearch will implicitly use the first port in the port range given by
transport.profiles.default.port, or by
transport.profiles.default.port is not set.
For example, this is an example of
unicast_hosts.txt for a cluster with four
nodes that participate in discovery, some of which are not running on the
10.10.10.5 10.10.10.6:9305 10.10.10.5:10005 # an IPv6 address [2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
Host names are allowed instead of IP addresses and are resolved by DNS as described above. IPv6 addresses must be given in brackets with the port, if needed, coming after the brackets.
You can also add comments to this file. All comments must appear on their lines
# (i.e. comments cannot start in the middle of a line).
EC2 hosts provideredit
Azure Classic hosts provideredit
The Azure Classic discovery plugin adds a hosts provider that uses the Azure Classic API find a list of seed nodes.
Google Compute Engine hosts provideredit
The GCE discovery plugin adds a hosts provider that uses the GCE API find a list of seed nodes.