View infrastructure metrics by resource type

Serverless Stack

The Infrastructure inventory page provides a metrics-driven view of your entire infrastructure, grouped by the resources you are monitoring. It displays every monitored resource that emits a core set of infrastructure metrics, giving you a quick view of the overall health of your infrastructure.

To open the Infrastructure inventory page:

In Elastic Stack, find Infrastructure in the main menu or use the global search field.
In Serverless, go to Infrastructure inventory in your Observability Serverless project.

To learn more about the metrics shown on this page, refer to the Metrics reference.

Note

Don’t see any metrics?

If you haven’t added data yet, click Add data to search for and install an Elastic integration.

Need help getting started? Follow the steps in Get started with system metrics.

Filter the Inventory view

To get started with your analysis, select the type of resources you want to show in the high-level view. From the Show menu, select one of the following:

Hosts: the default view. Refer to Host metrics for more on supported data types for this view.
Kubernetes Pods: Refer to Kubernetes pod metrics for more on supported data types for this view.
Docker Containers: shows all containers, not only Docker containers. Refer to Container metrics for more on supported data types for this view.
AWS: includes EC2 instances, S3 buckets, RDS databases, and SQS queues. Refer to AWS metrics for more on supported data types for this view.

When you hover over each resource in the waffle map, the metrics specific to that resource are displayed.

You can choose the metric to display for each resource, group resources by specific fields related to them, and sort by either name or metric value. For example, you can filter the view to display the memory usage of your Kubernetes pods, grouped by namespace, and sorted by the memory usage value.
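The grouping and sorting described above can be pictured with a minimal Python sketch. The pod names, namespaces, and the `memory_usage` key are hypothetical sample data, not fields read from the Inventory page, and the code only mirrors the idea of "group by one field, sort by metric value"; it is not how the UI is implemented.

```python
from collections import defaultdict

# Hypothetical sample data: pod name, namespace, and memory usage (0.0-1.0).
pods = [
    {"name": "api-1", "namespace": "prod", "memory_usage": 0.62},
    {"name": "api-2", "namespace": "prod", "memory_usage": 0.81},
    {"name": "job-1", "namespace": "batch", "memory_usage": 0.35},
]


def group_and_sort(resources, group_field, metric_field):
    """Group resources by one field, then sort each group by metric value, highest first."""
    groups = defaultdict(list)
    for resource in resources:
        groups[resource[group_field]].append(resource)
    # Groups are ordered by name; members are ordered by descending metric value.
    return {
        key: sorted(members, key=lambda r: r[metric_field], reverse=True)
        for key, members in sorted(groups.items())
    }


grouped = group_and_sort(pods, "namespace", "memory_usage")
for namespace, members in grouped.items():
    print(namespace, [(p["name"], p["memory_usage"]) for p in members])
```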

You can also use the search bar to create structured queries using Kibana Query Language. For example, enter host.hostname : "host1" to view only the information for host1.
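If you generate search bar queries from scripts or runbooks, a small helper can keep the quoting consistent. The sketch below is an illustration only: `kql_term` and `kql_and` are hypothetical helper names, and the `cloud.provider` filter is an assumed extra example that does not come from this page.

```python
def kql_term(field, value):
    """Return a KQL `field : "value"` clause, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field} : "{escaped}"'


def kql_and(filters):
    """Join several field/value pairs into one KQL query with `and`."""
    return " and ".join(kql_term(field, value) for field, value in filters.items())


# The first clause is the example from the text; the second is an assumed extra filter.
query = kql_and({"host.hostname": "host1", "cloud.provider": "aws"})
print(query)
```

Paste the resulting string into the search bar to narrow the view to matching resources.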

To examine the metrics for a specific time, use the time filter to select the date and time.

View host metrics

Note

Refer to Host metrics for more on supported data types for this view.

By default, the Infrastructure inventory page displays a waffle map that shows the hosts you are monitoring and the current CPU usage for each host. Alternatively, you can click the Table view icon to switch to a table view.

Without leaving the Infrastructure inventory page, you can view enhanced metrics relating to each host running in your infrastructure. On the waffle map, select a host to display the host details overlay.

Note

Serverless Stack 9.2.0 When showing Hosts, the Schema dropdown menu shows the available data collection schemas for the current query. If data from both the Elastic System integration and OpenTelemetry is available, the schema defaults to OpenTelemetry. Select Elastic System Integration to see data collected by the Elastic System integration.

Tip

To expand the overlay and view more detail, click Open as page in the upper-right corner.

The host details overlay contains the following tabs:

Note

Serverless Stack 9.2.0 To view processes for OpenTelemetry hosts, you need to configure the EDOT collector to send process metrics. Refer to Process metrics for more information.

The Processes tab lists the total number of processes (system.process.summary.total) running on the host, along with the number of processes in each of the following states:

Running (system.process.summary.running)
Sleeping (system.process.summary.sleeping)
Stopped (system.process.summary.stopped)
Idle (system.process.summary.idle)
Dead (system.process.summary.dead)
Zombie (system.process.summary.zombie)
Unknown (system.process.summary.unknown)
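To inspect the same summary fields outside the UI, you could build an Elasticsearch search request body for the most recent summary document of one host. The following is a sketch under assumptions: the field names come from the list above, but the `host.hostname` filter, the request shape, and the function name `process_summary_query` are illustrative choices, and the index or data stream to send the request to depends on your setup.

```python
import json

# Process states listed on the Processes tab.
STATES = ["running", "sleeping", "stopped", "idle", "dead", "zombie", "unknown"]


def process_summary_query(hostname):
    """Build a search body that returns the latest process summary for one host."""
    fields = ["system.process.summary.total"] + [
        f"system.process.summary.{state}" for state in STATES
    ]
    return {
        "size": 1,  # only the most recent document
        "sort": [{"@timestamp": {"order": "desc"}}],
        "_source": fields,
        "query": {
            "bool": {
                "filter": [
                    # Assumed host filter; adjust to the field your data uses.
                    {"term": {"host.hostname": hostname}},
                    {"exists": {"field": "system.process.summary.total"}},
                ]
            }
        },
    }


body = process_summary_query("host1")
print(json.dumps(body, indent=2))
```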

The processes listed in the Top processes table are based on an aggregation of the top CPU and the top memory consuming processes. The number of top processes is controlled by process.include_top_n.by_cpu and process.include_top_n.by_memory.


Command	Full command line that started the process, including the absolute path to the executable, and all the arguments (`system.process.cmdline`).
PID	Process ID (`process.pid`).
User	User name (`user.name`).
CPU	The percentage of CPU time spent by the process since the last event (`system.process.cpu.total.pct`).
Time	The time the process started (`system.process.cpu.start_time`).
Memory	The percentage of main memory (RAM) occupied by the process (`system.process.memory.rss.pct`).
State	The current state of the process and the total number of processes (`system.process.state`). Expected values are: `running`, `sleeping`, `dead`, `stopped`, `idle`, `zombie`, and `unknown`.
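One way to read "an aggregation of the top CPU and the top memory consuming processes" is as the union of two top-N lists. The sketch below illustrates that reading; it is an assumption about the semantics, not the collector's actual implementation. The sample processes and the `cpu_pct` and `mem_pct` keys are hypothetical, and the `by_cpu` and `by_memory` parameters stand in for process.include_top_n.by_cpu and process.include_top_n.by_memory.

```python
# Hypothetical sample processes with CPU and memory usage as fractions.
processes = [
    {"pid": 101, "command": "java", "cpu_pct": 0.40, "mem_pct": 0.10},
    {"pid": 102, "command": "postgres", "cpu_pct": 0.05, "mem_pct": 0.30},
    {"pid": 103, "command": "nginx", "cpu_pct": 0.20, "mem_pct": 0.02},
    {"pid": 104, "command": "cron", "cpu_pct": 0.01, "mem_pct": 0.01},
]


def top_processes(procs, by_cpu, by_memory):
    """Return the union of the top `by_cpu` CPU consumers and top `by_memory` memory consumers."""
    top_cpu = sorted(procs, key=lambda p: p["cpu_pct"], reverse=True)[:by_cpu]
    top_mem = sorted(procs, key=lambda p: p["mem_pct"], reverse=True)[:by_memory]
    selected = {}
    for proc in top_cpu + top_mem:
        selected[proc["pid"]] = proc  # a process in both lists is kept once
    return list(selected.values())


result = top_processes(processes, by_cpu=2, by_memory=1)
for proc in result:
    print(proc["pid"], proc["command"])
```

Under this reading, a process can appear in the table because of high memory use even when its CPU usage is low, as `postgres` does in the sample data.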

The Universal Profiling tab shows CPU usage down to the application code level. From here, you can find the sources of resource usage, and identify code that can be optimized to reduce infrastructure costs. The Universal Profiling tab has the following views.


Flamegraph	A visual representation of the functions that consume the most resources. Each rectangle represents a function. The rectangle width represents the time spent in the function. The number of stacked rectangles represents the stack depth, or the number of functions called to reach the current function.
Top 10 Functions	A list of the most expensive lines of code on your host. See the most frequently sampled functions, broken down by CPU time, annualized CO2, and annualized cost estimates.

For more on Universal Profiling, refer to the Universal Profiling docs.
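To make the flamegraph description concrete, the sketch below folds a handful of sampled call stacks into per-rectangle sample counts. It shows the general flamegraph idea (width proportional to samples, vertical position equal to stack depth) and is not Universal Profiling's implementation. The function names in `samples` are hypothetical.

```python
from collections import Counter

# Hypothetical stack samples, each listed from the outermost to the innermost function.
samples = [
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "query_db"],
    ["main", "gc"],
]


def fold_stacks(stack_samples):
    """Count samples per call path; each distinct path is one flamegraph rectangle."""
    widths = Counter()
    for stack in stack_samples:
        for depth in range(1, len(stack) + 1):
            widths[tuple(stack[:depth])] += 1
    return widths


widths = fold_stacks(samples)
total = len(samples)
for path, count in sorted(widths.items()):
    depth = len(path) - 1  # number of callers above this function
    print(f"{'  ' * depth}{path[-1]}: {count / total:.0%} of samples")
```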

The Logs tab displays logs relating to the selected host. By default, the Logs tab displays the following columns.


Timestamp	The timestamp of the log entry from the `timestamp` field.
Message	The message extracted from the document. The content of this field depends on the type of log message. If no special log message type is detected, the Elastic Common Schema (ECS) base field, `message`, is used.

To view the logs in the Logs app for a detailed analysis, click Open in Logs.

Note

These metrics are also available when viewing hosts on the Hosts page.

View container metrics

Note

Refer to Container metrics for more on supported data types for this view.

When you select Docker Containers, the Infrastructure inventory page displays a waffle map that shows the containers you are monitoring and the current CPU usage for each container. Alternatively, you can click the Table view icon to switch to a table view.

Without leaving the Infrastructure inventory page, you can view enhanced metrics relating to each container running in your infrastructure.

Note

Why do some containers report 0% or null (-) values in the waffle map?

The waffle map shows all monitored containers, including containerd containers, provided that the data collected from the container has the container.id field. However, the waffle map currently displays metrics only for Docker fields, so containers from other runtimes can show 0% or null values. This display problem will be resolved in a future release.

On the waffle map, select a container to display the container details overlay.

Tip

To expand the overlay and view more detail, click Open as page in the upper-right corner.

The container details overlay contains the following tabs:

The Logs tab displays logs relating to the selected container. By default, the Logs tab displays the following columns.


Timestamp	The timestamp of the log entry from the `timestamp` field.
Message	The message extracted from the document. The content of this field depends on the type of log message. If no special log message type is detected, the Elastic Common Schema (ECS) base field, `message`, is used.

To view the logs in the Logs app for a detailed analysis, click Open in Logs.

View metrics for other resources

Note

Refer to Kubernetes pod metrics and AWS metrics for more on supported data types for this view.

When you have searched and filtered for a specific resource, you can drill down to analyze the metrics relating to it. For example, when viewing Kubernetes Pods in the high-level view, click the pod you want to analyze and select Kubernetes Pod metrics to see detailed metrics.

Add custom metrics

If the predefined metrics displayed on the Inventory page for each resource are not sufficient for your specific use case, you can add and define custom metrics.

Select your resource, and from the Metric filter menu, click Add metric.

Integrate with Logs, Uptime, and APM

Depending on the features you have installed and configured, you can view logs, traces, or uptime information relating to a specific resource. For example, in the high-level view, when you click a Kubernetes Pod resource, you can choose:

Kubernetes Pod logs to view corresponding logs in the Logs app.
Kubernetes Pod APM traces to view corresponding APM traces in the APM app.
Kubernetes Pod in Uptime to view related uptime information in the Uptime app.