How to centralize infrastructure metrics and planning for scale with the Elastic Stack | Elastic Blog
Engineering

Centralizing infrastructure metrics and planning for scale with the Elastic Stack

This post is the second in our series on system metrics.

In the previous post, we went through some built-in tools and methods for identifying key metrics and values on your systems. In this post, we'll provide a tutorial on how to use Metricbeat to consolidate your metrics and store them for long-term analysis, and we'll discuss some of the benefits of a centralized metric store.

Four steps to monitor system and infrastructure metrics using Elastic

Elastic Observability takes the guesswork out of gathering, viewing, and acting on your system metrics — providing a single tool to gather your system metrics: Metricbeat. Metricbeat is easy to set up and will have you analyzing your system in a few minutes. For this tutorial, I am going to assume that you already have Elasticsearch and Kibana deployed. If you don't already have the Elastic Stack up and running, the easiest way to get started is with a free trial of the Elasticsearch Service on Elastic Cloud, or you can download the Elastic Stack.

Getting started collecting system metrics with Elastic Stack is easy, and the instructions are right in Kibana.

If it's your first time logging into Kibana, you'll be welcomed and given an opportunity to load some sample data; feel free to do that, as it's really handy when experimenting with visualizations. If you're not already there, click on the Kibana "K" logo in the top left corner of Kibana and you'll land on the Kibana home screen, which provides tools to help you add, manage, and visualize your data.

We're interested in system metrics, so we'll click on the "Add metric data" button.

We now have several choices of metric types to ingest, so scroll down to the System metrics option and click on it.

This takes us right to the instructions for how to ingest system metrics. Under the "Getting Started" section are instructions based on what type of system we want to send the metrics from. I am on a Mac, so those are the steps that I will follow, but I'll also provide some alternative approaches for Linux environments. As a note, the steps for most of the other metric choices on the Add metric data page are pretty similar to what we're going to do for the system metrics.

Each step explains what to do, and provides a Copy snippet button to copy the commands so you can paste them into your terminal.  Let's go through the steps one by one.

Step 1: Download and install Metricbeat

The system will provide the download link for the version of Metricbeat that matches your version of the Elastic Stack — in my case, 7.7.0.  In this example the steps from the snippet are:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.7.0-darwin-x86_64.tar.gz 
tar xzvf metricbeat-7.7.0-darwin-x86_64.tar.gz 
cd metricbeat-7.7.0-darwin-x86_64/

These commands download the software for the selected environment, uncompress the archive, and change into the newly created directory.

Note that if you're on Linux you'll have two options based on the flavor of Linux you're running: RPM or DEB. If you'd prefer, you can still follow the same download steps by clicking on the "Download page" button on the respective tabs:

(Although the buttons mention 32-bit packages, the download page offers multiple package options.)
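For reference, the package-based installs follow the same pattern as the curl-and-tar steps above. Here's a sketch for Linux, assuming the artifact naming convention used in Step 1 (the install commands are shown as comments since they require root and a network connection; adjust the version to match your deployment):

```shell
# Sketch of Step 1 for Linux package managers; the URLs follow the
# artifacts.elastic.co naming convention used above for version 7.7.0.
BEAT_VERSION="7.7.0"
BASE_URL="https://artifacts.elastic.co/downloads/beats/metricbeat"

# RPM-based distributions (RHEL, CentOS, SUSE) would use:
#   curl -L -O "${BASE_URL}/metricbeat-${BEAT_VERSION}-x86_64.rpm"
#   sudo rpm -vi "metricbeat-${BEAT_VERSION}-x86_64.rpm"

# DEB-based distributions (Debian, Ubuntu) would use:
#   curl -L -O "${BASE_URL}/metricbeat-${BEAT_VERSION}-amd64.deb"
#   sudo dpkg -i "metricbeat-${BEAT_VERSION}-amd64.deb"

echo "${BASE_URL}/metricbeat-${BEAT_VERSION}-amd64.deb"
```

Note that package installs place the configuration under /etc/metricbeat/ rather than in a local directory, so the later steps reference files there instead.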

Once you've completed step one, it's time to start setting up.

Step 2: Edit the configuration

Metricbeat, like the other Beats, allows configuration via command line flags and/or a YAML configuration file. For Metricbeat this file is called metricbeat.yml. In my case, Kibana detects that I am running on Elasticsearch Service, so it provides context-aware instructions. When I first created my deployment, I was provided with a random password for the elastic user. We'll substitute really-long-generated-password for the actual password.

We need to edit the metricbeat.yml file, and find the area described in the instructions. If I search for cloud.id in my YAML file I can jump to the right spot:

#============================= Elastic Cloud ==================================
# These settings simplify using Metricbeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

I simply need to remove the # from the beginning to enable the cloud.id and cloud.auth lines, paste in the cloud.id from Kibana, and specify the username and password.

Our config now looks like this:

#============================= Elastic Cloud ==================================
# These settings simplify using Metricbeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
cloud.id: "Logging_Blog:ZWFzdHVzMi5zdGFnaW5...
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
cloud.auth: "elastic:really-long-generated-password"

(note that in the above example I have truncated the cloud.id so it doesn't wrap)
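If you'd rather script this edit than open a text editor, a sed one-liner can toggle those two lines. The sketch below operates on a small stand-in file so it's self-contained, and both the cloud.id value and the password are placeholders, not real credentials:

```shell
# Scripted version of the edit above, run against a small stand-in file.
# On a real system you would point sed at metricbeat.yml instead; the
# cloud.id and password values here are placeholders.
cat > metricbeat-cloud-example.yml <<'EOF'
#cloud.id:
#cloud.auth:
EOF

# -i.bak edits in place and keeps a backup; the suffix form works with
# both GNU and BSD (macOS) sed.
sed -i.bak \
  -e 's|^#cloud.id:.*|cloud.id: "My_Deployment:placeholder-cloud-id"|' \
  -e 's|^#cloud.auth:.*|cloud.auth: "elastic:really-long-generated-password"|' \
  metricbeat-cloud-example.yml

grep '^cloud\.' metricbeat-cloud-example.yml
```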

If you are not running on Elasticsearch Service, but rather a local instance, you'll notice that step two in Kibana is slightly different because it detects that you are running from a download rather than in Elastic Cloud:

If you’re running from a download, configure metricbeat.yml with the information for your cluster, including:

  • Where to find the Elasticsearch Server
  • The user name (default is elastic)
  • The password that you created when setting up
  • The Kibana URL

If you're running Elasticsearch locally in a default setup, it will look like this:

output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "<the-password-you-created>"
setup.kibana:
  host: "localhost:5601"
  username: "elastic"
  password: "<the-password-you-created>"

Using a text editor, search for output.elasticsearch and setup.kibana in the metricbeat.yml config file to find their respective sections. Note that the setup.kibana section is before output.elasticsearch in the file.
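A quick grep can confirm you've located both sections. The snippet below writes a tiny stand-in for metricbeat.yml so it's self-contained; on your own system, run the grep against the real file instead:

```shell
# Locate the two sections to edit. This writes a small stand-in for
# metricbeat.yml so the snippet is runnable on its own; point the grep
# at your real metricbeat.yml in practice.
cat > metricbeat-local-example.yml <<'EOF'
setup.kibana:
  host: "localhost:5601"
output.elasticsearch:
  hosts: ["localhost:9200"]
EOF

# -n prints line numbers so you can jump straight to each section.
grep -n -E '^(setup\.kibana:|output\.elasticsearch:)' metricbeat-local-example.yml
```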

Step 3: Enable and configure the system module

If you've been following along, you should still be in the unzipped Metricbeat directory in your terminal or command prompt. Metricbeat has built-in support for several different services, though most modules are disabled by default. You can get a full list by running ./metricbeat modules list, which will show you which modules are enabled and which are disabled:

Enabled: 
system 
Disabled: 
activemq 
aerospike 
apache 
appsearch 
aws 
azure 
(...)

As you can see, the system module in my environment has already been enabled. If it wasn’t enabled, I would run ./metricbeat modules enable system. Some services may require additional configuration, especially when things aren't set up in default locations. For the system data, it will work out of the box.
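For reference, each module's settings live in its own file under the modules.d/ directory. Below is an abridged sketch of what the default system module configuration looks like in 7.x (the exact metricset list varies by version); it's written to a local example file here so the snippet is runnable on its own:

```shell
# Abridged sketch of modules.d/system.yml as shipped with Metricbeat 7.x;
# the exact metricsets vary by version. Written to an example file here
# so the snippet is self-contained; compare it against your own copy.
cat > system-module-example.yml <<'EOF'
- module: system
  period: 10s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
EOF

# Count the metricsets in the example file.
grep -c '^    - ' system-module-example.yml
```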

Step 4: Start Metricbeat

We're ready to push the proverbial button — almost ready to see our data!

There are two commands for this step:

./metricbeat setup 
./metricbeat -e

The very first time that we send metric data to our cluster we'll need to run that ./metricbeat setup step. This loads the index template, the default dashboards, and the default index lifecycle policies. The second command, ./metricbeat -e, is what starts the data flowing (the -e flag sends Metricbeat's own logs to stderr so you can watch them in the terminal). In the long run you'll want to make sure that Metricbeat is set up to start on your hosts automatically, but for now this is enough to start looking around.
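If something doesn't seem to be working, Metricbeat ships with test subcommands that sanity-check your setup. They need the Metricbeat binary and a reachable cluster, so they appear here as a commented sketch:

```shell
# Optional sanity checks, run from the Metricbeat directory. Both
# subcommands exist in Metricbeat 7.x; they are commented out here
# because they require the Metricbeat binary and a live cluster.
#
#   ./metricbeat test config   # validates the metricbeat.yml syntax
#   ./metricbeat test output   # checks connectivity to Elasticsearch
#
CHECKS="test config, test output"
echo "pre-flight checks: ${CHECKS}"
```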

Navigating your infrastructure metrics

Go ahead and click that "System metrics dashboard" button at the bottom of the instructions page. This takes us to the [Metricbeat System] Overview ECS dashboard. In my example below, I am actually monitoring four different hosts, but you can see how this dashboard provides a bird's-eye view of an environment.

Clicking on one of the host names drills down to the individual host's detailed information.

The best part is that each and every one of the charts (and metrics) on these dashboards can be reused on custom dashboards.

Going beyond dashboards, we can navigate to the Metrics app in the Kibana sidebar for an even more interactive experience. Here we can browse our data, change views, drill down to see specifics and trends, and even set up custom multi-dimensional threshold alerts right from the user interface:

Wrapping up

In this post we've gone through the steps for getting your system metrics into the Elastic Stack for long-term storage and analysis using Metricbeat. We've shown how to find the default dashboards, and touched on some of the capabilities of the Metrics app in Kibana. By following the simple instructions right in Kibana we were able to quickly set up and start shipping our data to our Elasticsearch deployment.

You may have noticed in this process that we used the elastic user, which is a superuser, and that's overkill. In our next segment we'll address that and create custom roles and users that leverage Elastic role-based access controls (RBAC) for that "just right" level of permissions for metric ingest.

Start monitoring your hosts and infrastructure today

While there are native tools to display system metrics on most platforms, they really only give you a point-in-time glimpse of how your hosts, servers, and desktops are spending their valuable resources. The more information you can get from your infrastructure, the better equipped you are to make decisions. Adopting a centralized infrastructure metrics solution like the Elastic Stack allows you to not only look at what is going on right now, but also to see how "now" compares to yesterday, and how this week or month compares to last. Powerful features such as machine learning can automatically plot trends and detect when something is abnormal, while alerting lets you know when something notable happens.

You can get started monitoring your systems and infrastructure today. Sign up for a free trial of Elasticsearch Service on Elastic Cloud, or download the Elastic Stack and host it yourself. Once you are up and running, monitor the availability of your hosts with uptime monitoring, and instrument the applications running on your hosts with Elastic APM. You'll be on your way to a fully observable system, completely integrated with your new metrics cluster. If you run into any hurdles or have questions, jump over to our Discuss forums — we're here to help.

Next: Ingesting metrics securely using role-based access control (RBAC)