Metrics

Metrics measure the state of a system by gathering information at regular intervals. There are two types of APM metrics:

  • System metrics: Basic infrastructure and application metrics.
  • Calculated metrics: Aggregated trace event metrics used to power visualizations in the APM app.

System metrics

APM agents automatically pick up basic host-level metrics, including system and process-level CPU and memory metrics. Agent-specific metrics are also available, such as JVM metrics in the Java Agent and Go runtime metrics in the Go Agent.

Infrastructure and application metrics are important sources of information when debugging production systems, which is why we’ve made it easy to filter metrics for specific hosts or containers in the Kibana metrics overview.

Metrics have the processor.event property set to metric.

Most agents limit keyword fields (e.g. processor.event) to 1024 characters and non-keyword fields (e.g. system.memory.total) to 10,000 characters.

Metrics are stored in metric indices.

For a full list of tracked metrics, see the relevant agent documentation.

Calculated metrics

APM agents and APM Server calculate metrics from trace events to power visualizations in the APM app. These metrics are described below.

Breakdown metrics

To power the Time spent by span type graph, agents collect summarized metrics about the timings of spans and transactions, broken down by span type.

span.self_time.count and span.self_time.sum.us

These metrics measure the "self-time" for a span type, and optional subtype, within a transaction group. Together these metrics can be used to calculate the average duration and percentage of time spent on each type of operation within a transaction group.
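
The calculation described above can be sketched in a few lines of Python. This is only an illustration: the `SelfTime` class, the `breakdown` function, and the sample values are hypothetical, and the sketch assumes the two metrics have already been summed per span type for a single transaction group.

```python
from dataclasses import dataclass


@dataclass
class SelfTime:
    """Summed breakdown metrics for one span type within a transaction group."""
    span_type: str
    count: int   # total of span.self_time.count
    sum_us: int  # total of span.self_time.sum.us, in microseconds


def breakdown(rows):
    """Return {span_type: (average_us, percent_of_total)} for the given rows."""
    total_us = sum(r.sum_us for r in rows)
    result = {}
    for r in rows:
        # Average self-time per span of this type.
        average_us = r.sum_us / r.count if r.count else 0.0
        # Share of the transaction group's total self-time.
        percent = 100.0 * r.sum_us / total_us if total_us else 0.0
        result[r.span_type] = (average_us, percent)
    return result


# Hypothetical values, not taken from a real deployment.
rows = [
    SelfTime("app", count=10, sum_us=6000),
    SelfTime("db", count=20, sum_us=3000),
    SelfTime("template", count=5, sum_us=1000),
]
stats = breakdown(rows)
```

With these sample values, `stats["app"]` holds an average of 600 microseconds and a 60 percent share of the total self-time.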

These metric documents can be identified by searching for metricset.name: span_breakdown.

You can filter and group by these dimensions:

  • transaction.name: The name of the enclosing transaction group, for example GET /
  • transaction.type: The type of the enclosing transaction, for example request
  • span.type: The type of the span, for example app, template or db
  • span.subtype: The sub-type of the span, for example mysql (optional)

Transaction metrics

To power APM app visualizations, APM Server aggregates transaction events into latency distribution metrics.

transaction.duration.histogram

This metric measures the latency distribution of transaction groups, used to power visualizations and analytics in Elastic APM.

These metric documents can be identified by searching for metricset.name: transaction.

You can filter and group by these dimensions (some of which are optional, for example container.id):

  • transaction.name: The name of the transaction, for example GET /
  • transaction.type: The type of the transaction, for example request
  • transaction.result: The result of the transaction, for example HTTP 2xx
  • transaction.root: A boolean flag indicating whether the transaction is the root of a trace
  • event.outcome: The outcome of the transaction, for example success
  • agent.name: The name of the APM agent that instrumented the transaction, for example java
  • service.name: The name of the service that served the transaction
  • service.version: The version of the service that served the transaction
  • service.node.name: The name of the service instance that served the transaction
  • service.environment: The environment of the service that served the transaction
  • service.language.name: The language name of the service that served the transaction, for example Go
  • service.language.version: The language version of the service that served the transaction
  • service.runtime.name: The runtime name of the service that served the transaction, for example jRuby
  • service.runtime.version: The runtime version that served the transaction
  • host.hostname: The hostname of the service that served the transaction
  • host.os.platform: The platform name of the service that served the transaction, for example linux
  • container.id: The container ID of the service that served the transaction
  • kubernetes.pod.name: The name of the Kubernetes pod running the service that served the transaction
  • cloud.provider: The cloud provider hosting the service instance that served the transaction
  • cloud.region: The cloud region hosting the service instance that served the transaction
  • cloud.availability_zone: The cloud availability zone hosting the service instance that served the transaction
  • cloud.account.id: The cloud account id of the service that served the transaction
  • cloud.account.name: The cloud account name of the service that served the transaction
  • cloud.machine.type: The cloud machine type or instance type of the service that served the transaction
  • cloud.project.id: The cloud project identifier of the service that served the transaction
  • cloud.project.name: The cloud project name of the service that served the transaction
  • cloud.service.name: The cloud service name of the service that served the transaction
  • faas.coldstart: Whether the serverless service that served the transaction had a cold start
  • faas.trigger.type: The type of trigger that invoked the lambda function that served the transaction
  • faas.id: The unique identifier of the invoked serverless function
  • faas.name: The name of the lambda function
  • faas.version: The version of the lambda function

The @timestamp field of these documents holds the start of the aggregation interval.
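
As a rough sketch of how these dimensions might be used, the following Python function builds an Elasticsearch search body that selects transaction metric documents for one service and groups them by transaction.name. The function name and the default arguments are hypothetical. The sketch also assumes that the dimension fields are mapped as keywords and that transaction.duration.histogram is mapped as a histogram field that the percentiles aggregation can read; check your own mappings before relying on it.

```python
def transaction_metrics_query(service_name, transaction_type="request"):
    """Build a search body that selects transaction metric documents for one service."""
    return {
        "size": 0,  # only aggregation results are needed, not individual hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"metricset.name": "transaction"}},
                    {"term": {"service.name": service_name}},
                    {"term": {"transaction.type": transaction_type}},
                ]
            }
        },
        "aggs": {
            "by_transaction": {
                # Group by the transaction.name dimension.
                "terms": {"field": "transaction.name", "size": 10},
                "aggs": {
                    "latency": {
                        # Assumption: the field supports percentiles aggregations.
                        "percentiles": {
                            "field": "transaction.duration.histogram",
                            "percents": [50, 95, 99],
                        }
                    }
                },
            }
        },
    }


body = transaction_metrics_query("opbeans-java")
```

The resulting dictionary can be serialized to JSON and sent to the search endpoint of the data stream that holds the transaction metrics.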

Service-destination metrics

To power APM app visualizations, APM Server aggregates span events into service-destination metrics.

span.destination.service.response_time.count and span.destination.service.response_time.sum.us

These metrics measure the count and total duration of requests from one service to another service. These are used to calculate the throughput and latency of requests to backend services such as databases in Service maps.

These metric documents can be identified by searching for metricset.name: service_destination.

You can filter and group by these dimensions:

  • span.destination.service.resource: The destination service resource, for example mysql
  • event.outcome: The outcome of the operation, for example success
  • agent.name: The name of the APM agent that instrumented the operation, for example java
  • service.name: The name of the service that made the request
  • service.environment: The environment of the service that made the request

The @timestamp field of these documents holds the start of the aggregation interval.
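
To illustrate how throughput and latency could be derived from the two service-destination metrics, here is a minimal Python sketch. The `destination_stats` function and its `interval_seconds` parameter are hypothetical; the length of the aggregation interval is not stated here, so the caller must supply it.

```python
def destination_stats(count, sum_us, interval_seconds):
    """Return (average latency in ms, throughput in requests per minute).

    count and sum_us correspond to
    span.destination.service.response_time.count and
    span.destination.service.response_time.sum.us for one interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Average latency: total duration divided by number of requests.
    avg_latency_ms = (sum_us / count) / 1000.0 if count else 0.0
    # Throughput: requests observed in the interval, scaled to one minute.
    throughput_per_min = count * 60.0 / interval_seconds
    return avg_latency_ms, throughput_per_min


# Hypothetical values: 120 requests taking 2.4 seconds in total over one minute.
latency_ms, rpm = destination_stats(count=120, sum_us=2_400_000, interval_seconds=60)
```

For these sample values the average latency is 20 ms and the throughput is 120 requests per minute.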

Data streams

Metrics are stored in the following data streams:

  • APM internal metrics: metrics-apm.internal-<namespace>
  • APM profiling metrics: metrics-apm.profiling-<namespace>
  • Application metrics: metrics-apm.app.<service.name>-<namespace>
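
The naming patterns above can be expanded with a small helper like the following. This is a sketch only: `metric_data_streams` is a hypothetical function, and it substitutes the values as given without any of the normalization that may be applied to service names or namespaces in practice.

```python
def metric_data_streams(namespace, service_name):
    """Expand the metric data stream naming patterns for one namespace and service."""
    return {
        "internal": f"metrics-apm.internal-{namespace}",
        "profiling": f"metrics-apm.profiling-{namespace}",
        "app": f"metrics-apm.app.{service_name}-{namespace}",
    }


# Hypothetical namespace and service name.
streams = metric_data_streams("default", "checkout")
```

Here `streams["app"]` is `metrics-apm.app.checkout-default`.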

See Data streams to learn more.

Example metric document

This example shows what metric documents can look like when indexed in Elasticsearch.

This example contains JVM metrics produced by the Elastic APM Java agent. It contains two related metrics: jvm.gc.time and jvm.gc.count. These are accompanied by various fields describing the environment in which the metrics were captured: service name, host name, Kubernetes pod UID, container ID, process ID, and more. These fields make it possible to search and aggregate across various dimensions, such as by service, host, and Kubernetes pod.

{
  "container": {
    "id": "a47ed147c6ee269400f7ea4e296b3d01ec7398471bb2951907e4ea12f028bc69"
  },
  "kubernetes": {
    "pod": {
      "uid": "b0cb3baa-4619-4b82-bef5-84cc87b5f853",
      "name": "opbeans-java-7c68f48dc6-n6mzc"
    }
  },
  "process": {
    "pid": 8,
    "title": "/opt/java/openjdk/bin/java",
    "ppid": 1
  },
  "agent": {
    "name": "java",
    "ephemeral_id": "29a27947-ed3a-4d87-b2e6-28f7a940ec2d",
    "version": "1.25.1-SNAPSHOT.UNKNOWN"
  },
  "jvm.gc.time": 11511,
  "processor": {
    "name": "metric",
    "event": "metric"
  },
  "labels": {
    "name": "Copy"
  },
  "metricset.name": "app",
  "observer": {
    "hostname": "3c5ac040e8f9",
    "name": "instance-0000000002",
    "id": "6657d6e6-f3e8-4ce4-aa22-e7fe2ad77b5e",
    "type": "apm-server",
    "ephemeral_id": "b7f21735-d283-4945-ab80-ce8df494a207",
    "version": "7.15.0"
  },
  "@timestamp": "2021-09-14T09:52:49.454Z",
  "ecs": {
    "version": "1.11.0"
  },
  "service": {
    "node": {
      "name": "a47ed147c6ee269400f7ea4e296b3d01ec7398471bb2951907e4ea12f028bc69"
    },
    "environment": "production",
    "name": "opbeans-java",
    "runtime": {
      "name": "Java",
      "version": "11.0.11"
    },
    "language": {
      "name": "Java",
      "version": "11.0.11"
    },
    "version": "2021-09-08 03:55:06"
  },
  "jvm.gc.count": 2224,
  "host": {
    "os": {
      "platform": "Linux"
    },
    "ip": ["35.240.52.17"],
    "architecture": "amd64"
  },
  "event": {
    "ingested": "2021-09-14T09:53:00.834276431Z"
  }
}
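
The following Python sketch shows one way to separate the metric values from the descriptive fields in a document like the one above. It uses a shortened copy of the example document, and the `get_path` helper is hypothetical. Note that the document mixes literal dotted keys (such as jvm.gc.time) with nested objects (such as service), so the helper tries both forms.

```python
import json

# Shortened copy of the example document.
doc = json.loads("""
{
  "jvm.gc.time": 11511,
  "jvm.gc.count": 2224,
  "labels": {"name": "Copy"},
  "metricset.name": "app",
  "service": {"name": "opbeans-java", "environment": "production"},
  "processor": {"name": "metric", "event": "metric"}
}
""")


def get_path(document, dotted):
    """Look up a dotted field, trying the literal key before nested objects."""
    if dotted in document:
        return document[dotted]
    current = document
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Metric values are stored under literal dotted keys at the top level.
metrics = {key: value for key, value in doc.items() if key.startswith("jvm.")}

# Dimensions describe where the metrics were captured.
dimensions = {
    field: get_path(doc, field)
    for field in ("service.name", "service.environment", "labels.name")
}
```

After running this, `metrics` holds the two JVM garbage collection values and `dimensions` holds the service name, environment, and label that identify the source of those values.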