The X-Pack monitoring features let you monitor Elasticsearch through Kibana. You can view your cluster’s health and performance in real time and analyze past cluster, index, and node metrics. In Elasticsearch versions before 5.0, Marvel provides similar monitoring functionality.
In Elasticsearch 5.0 and later, the monitoring features of Marvel became part of X-Pack. If you are using an Elasticsearch version before 5.0, think Marvel whenever you read about the X-Pack monitoring features.
Monitoring consists of two components:
- A Monitoring agent that is installed on each node in your Elasticsearch cluster. The Monitoring agent collects and indexes metrics from Elasticsearch, either on the same cluster or by sending metrics to an external monitoring cluster. Elasticsearch Service manages the installation and configuration of the monitoring agent for you, and you should not modify any of the settings.
- The Monitoring (formerly Marvel) application plugin in Kibana that visualizes the monitoring metrics through a dashboard.
The steps in this section cover only the enablement of the monitoring features. For more information on how to use the monitoring features, refer to the following documentation:
For production use, you should log your Elasticsearch cluster metrics to a dedicated monitoring cluster. Monitoring indexes metrics into Elasticsearch, and these indices consume storage, memory, and CPU cycles like any other index. By using a separate monitoring cluster, you avoid affecting your other production clusters.
You should also create a dedicated user for the clusters sending metrics and the monitoring cluster receiving them. For more information on creating a user with the right privileges, see Monitoring and Security (for version 5.0 and later) and Using Marvel with Shield (for versions before 5.0).
How many monitoring clusters you use depends on your requirements:
- You can ship metrics for many clusters to a single monitoring cluster, if your business requirements permit it.
- While monitoring will work with a cluster running a single node, you need a minimum of three monitoring nodes to make monitoring highly available.
You might need to create dedicated monitoring clusters for isolation purposes in some cases. For example:
- If you have many clusters and some of them are much larger than others, creating separate monitoring clusters prevents a large cluster from potentially affecting monitoring performance for smaller clusters.
- If you need to silo Elasticsearch data for different business departments, creating separate monitoring clusters keeps the data apart. This matters because clusters that have been configured to ship metrics to a target monitoring cluster have access to indexing data and can manage monitoring index templates on that cluster.
Monitoring data that gets sent to a dedicated monitoring Elasticsearch cluster is not cleaned up automatically and might require some additional steps to remove excess data periodically.
To avoid compatibility issues between versions, the cluster sending monitoring metrics and the monitoring cluster receiving them should be at the same Elasticsearch version. If using the same version is not feasible, check for breaking changes in the X-Pack Release Notes or the Marvel Release Notes to make sure that your versions are compatible.
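As a rough illustration of the version check described above, the following Python sketch compares the version of a cluster that sends metrics with the version of the monitoring cluster that receives them. The function names and the returned labels are hypothetical and exist only for this example. A result other than "same" is only a prompt to read the release notes, not a verdict on compatibility.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a version string such as '5.2.1' into integer parts."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def compare_versions(sending: str, receiving: str) -> str:
    """Classify how the sending cluster's version relates to the monitoring cluster's.

    The labels are hypothetical names chosen for this sketch.
    """
    s, r = parse_version(sending), parse_version(receiving)
    if s == r:
        return "same"             # the recommended setup
    if s[0] == r[0]:
        return "same-major"       # check the release notes for breaking changes
    return "different-major"      # check the release notes especially carefully
```

For example, `compare_versions("5.1.2", "5.2.0")` returns `"same-major"`, which signals that the release notes should be reviewed before relying on this pairing.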
When you enable monitoring in Elasticsearch Service by configuring your cluster to send monitoring data to itself, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. The retention period when a cluster sends monitoring data to itself depends on the version of Elasticsearch:
- For Elasticsearch 5.x clusters: Monitoring data is retained for three days by default or as specified by the
- For Elasticsearch 2.x clusters: Monitoring data is retained for seven days.
When monitoring for production use, where you configure your clusters to send monitoring data to a dedicated monitoring cluster for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring cluster are retained until you remove them. There are two options open to you:
- To enable the automatic deletion of monitoring indices from dedicated monitoring clusters, enable monitoring on your dedicated monitoring cluster in Elasticsearch Service to send monitoring data to itself. When an Elasticsearch cluster sends monitoring data to itself, all monitoring indices are deleted automatically after the retention period, regardless of the origin of the monitoring data.
- To retain monitoring indices on a dedicated monitoring cluster as is without deleting them automatically, no additional steps are required other than making sure that you do not enable the monitoring cluster to send monitoring data to itself. You should also monitor the cluster for disk space usage and upgrade your cluster periodically, if necessary.
For long-term index management, you can run Curator on-premises to manage indices on Elasticsearch Service. For example, you can use Curator to clean up monitoring indices like any other time-based index.
Tips for using Curator with Elasticsearch Service:
- Be sure to configure your curator.yml file to point to your Elasticsearch Service cluster.
- To clean up monitoring indices, use the delete_indices action. You can refer to this delete_indices example as a guide or use our Metricbeat example in this section.
- Connections to Elasticsearch Service can run into proxy-level timeouts if you do not set the timeout_override parameter. When that happens, Curator treats the timeout as a cancellation of the task, even though the task is still running on the cluster itself. Alternatively, you could script cURL commands that use the Delete Index API to connect to your Elasticsearch Service cluster.
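The scripted alternative to Curator mentioned in the tips above could look roughly like the following Python sketch, which uses only the standard library. It assumes that index names end in a date of the form YYYY.MM.DD (the same timestring as in the Curator example in this section) and that the cluster accepts HTTP basic authentication. The function names, the endpoint, and the credentials are placeholders invented for this illustration.

```python
import base64
import re
import urllib.request
from datetime import date, datetime, timedelta
from typing import Iterable, List

# Assumption: the index name ends in a date such as 2019.03.01.
TRAILING_DATE = re.compile(r"(\d{4}\.\d{2}\.\d{2})$")


def indices_older_than(names: Iterable[str], prefix: str, days: int, today: date) -> List[str]:
    """Return the names that start with `prefix` and carry a date older than `days` days."""
    cutoff = today - timedelta(days=days)
    expired = []
    for name in names:
        match = TRAILING_DATE.search(name)
        if not name.startswith(prefix) or match is None:
            continue  # not a time-based index with this prefix; leave it alone
        try:
            stamp = datetime.strptime(match.group(1), "%Y.%m.%d").date()
        except ValueError:
            continue  # digits that do not form a valid date
        if stamp < cutoff:
            expired.append(name)
    return expired


def build_delete_request(endpoint: str, index: str, user: str, password: str) -> urllib.request.Request:
    """Build, but do not send, a Delete Index API request (DELETE /<index>)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{index}",
        method="DELETE",
        headers={"Authorization": f"Basic {token}"},
    )
```

To actually delete an index, pass the request to `urllib.request.urlopen` with a generous `timeout` value. Keeping the request construction separate from sending makes it possible to print and review the list of indices before anything is removed.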
On Linux, you could use cron to execute a Curator task on a schedule:
    crontab -l
    00 8 * * * linux_user curator /etc/curator/delete_ess_indices.yml --config /etc/curator/curator.yml

    cat delete_ess_indices.yml
    actions:
      1:
        action: delete_indices
        description: >-
          Delete metricbeat-* indices older than 30 days (based on index name).
          Ignore the error if the filter does not result in an actionable list of
          indices (ignore_empty_list) and exit cleanly. Adjust timeout_override in
          the event of proxy timeout issues.
        options:
          ignore_empty_list: True
          # While disable_action is True, Curator skips this action.
          # Set it to False when you are ready to delete indices.
          disable_action: True
          timeout_override: 21600
          continue_if_exception: False
          #[other option configurations here]
        filters:
        - filtertype: pattern
          kind: prefix
          value: metricbeat-
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: days
          unit_count: 30

    cat curator.yml
    client:
      hosts:
        - ELASTICSEARCH_SERVICE_ENDPOINT_URL:PORT
      http_auth: user:pass
      timeout: 30
- delete_ess_indices.yml: The Curator action file
- curator.yml: The Curator configuration file
- ELASTICSEARCH_SERVICE_ENDPOINT_URL: The Elasticsearch cluster endpoint URL that is unique to your cluster, as obtained from the Elasticsearch Service Console
- PORT: The port for the RESTful API, typically
Elasticsearch Service manages the installation and configuration of the Monitoring agent (formerly Marvel) for you. When you enable Monitoring on an Elasticsearch cluster, you are configuring where the monitoring agent for your current cluster should send its metrics.
To enable the Monitoring agent:
- Log in to the Elasticsearch Service Console.
- On the Deployments page, select your deployment.
  Narrow your deployments by name, ID, or choose from several other filters. To customize your view, use a combination of filters, or change the format from a grid to a list.
- From your deployment menu, go to the Elasticsearch page.
- In the Monitoring panel, click Enable.
- Choose where to send your metrics.
If a cluster is not listed, make sure that it is running a compatible version and is configured to use the Elastic Stack security features (X-Pack for Elasticsearch 5.0 and later or Shield for versions before 5.0).
Remember to send metrics for production clusters to a dedicated monitoring cluster, so that your production clusters are not impacted by the overhead of indexing and storing monitoring data. A dedicated monitoring cluster also gives you more control over the retention period for monitoring data.
With monitoring enabled for your cluster, you can access the Monitoring (formerly Marvel) application through Kibana. The application is a plugin that runs in Kibana.
To access the Monitoring application:
- Open Kibana on the cluster that is receiving monitoring metrics.
  For example, if you have a monitoring and a production cluster, and the production cluster is shipping metrics to the monitoring cluster, then you need to open the Monitoring application on the monitoring cluster to see the metrics.
  If you are not sure where to access Kibana on the cluster, log in to the Cloud UI on the cluster that is receiving the metrics, and look up the Kibana endpoint URL.
- In Kibana, open the Monitoring application:
  - In Kibana 5, click Monitoring in the sidebar on the left.
  - In Kibana 4.5, first click the App Switcher icon and then click the Marvel app icon.
After you open the application in Kibana, you see a list of the clusters that you are monitoring.
Start exploring monitoring data:
Detailed monitoring information is available through the X-Pack monitoring features (called Marvel in versions before 5.0), but some monitoring information reports only host-level statistics and not container statistics. Because your Elasticsearch cluster nodes run within containers on Elasticsearch Service, some system metrics do not reflect the utilization for your cluster nodes but rather the utilization of the host that they’re running on. We have begun to make improvements to X-Pack monitoring to show accurate metrics and, starting in version 5.2, the CPU usage and throttle times are accurately captured over time.
For now, the following system metrics in the Monitoring app should not be used to monitor your clusters hosted on Elasticsearch Service:
- IP address: Reports the internal private IP address, which is not usable externally.
- Free disk space: Reports the total free space on the host, not the storage assigned to your cluster.
- CPU statistics:
  - For versions before 5.2: Reports CPU utilization for the host, not the container that cluster nodes run in.
  - For version 5.2 and later: Can be used, as CPU utilization accurately reflects the CPU resources assigned to your cluster nodes, based on the reported cgroups statistics. The CPU utilization and throttle information can be found on the Nodes tab and the details view when you click on a node.
- System load average: Reports the load average for the host, not the container that cluster nodes run in.
For accurate system metrics within the last 24 hours, you can always use the cluster performance metrics available directly from the Elasticsearch page for each deployment in the Elasticsearch Service Console.
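To make the difference between host-level and container-level CPU figures more concrete, the following Python sketch shows one way that utilization can be derived from cgroup counters. It is not necessarily how the Monitoring application performs the calculation. The parameter names mirror the cgroup CPU accounting counters (usage in nanoseconds, CFS quota and period in microseconds) and are assumptions made for this example, as is the function name.

```python
def cgroup_cpu_percent(
    usage_nanos_start: int,
    usage_nanos_end: int,
    interval_seconds: float,
    cfs_quota_micros: int,
    cfs_period_micros: int,
) -> float:
    """Estimate CPU utilization relative to the CPU share assigned to a container.

    The quota divided by the period gives the number of cores the container
    may use. Utilization is the CPU time consumed during the interval divided
    by the CPU time that this allotment made available.
    """
    if interval_seconds <= 0 or cfs_period_micros <= 0:
        raise ValueError("interval and period must be positive")
    if cfs_quota_micros <= 0:
        # A non-positive quota means no CFS limit is set, so there is
        # no container allotment to measure against.
        raise ValueError("no CPU quota is set for this container")
    allotted_cores = cfs_quota_micros / cfs_period_micros
    used_core_seconds = (usage_nanos_end - usage_nanos_start) / 1e9
    return 100.0 * used_core_seconds / (interval_seconds * allotted_cores)
```

For instance, a container with a quota of 200000 and a period of 100000 microseconds is allotted two cores. If it consumes 5 seconds of CPU time over a 10-second interval, the function reports 25 percent, regardless of how busy the rest of the host is.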