Log correlation

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Log correlation allows you to navigate from a trace to all logs belonging to it, and vice versa: for a specific log, you can see the context in which it was logged and which parameters the user provided.

The Agent provides integrations with both the default Python logging library and structlog.

`ecs-logging`

The easiest way to integrate your logs with APM is to use the ecs-logging library, which is also provided by Elastic. It provides formatters for both logging and structlog that create ECS-compatible logs and include the tracing information required for log correlation in Kibana. Coupled with something like Filebeat, it is the easiest way to get logs into Elasticsearch.

Logging integrations

`logging`

For Python 3.2+, we use logging.setLogRecordFactory() to decorate the default LogRecordFactory to automatically add new attributes to each LogRecord object:

elasticapm_transaction_id
elasticapm_trace_id
elasticapm_span_id

This factory also adds these fields to a dictionary attribute, elasticapm_labels, keyed by the official ECS tracing field names.

You can disable this automatic behavior by using the disable_log_record_factory setting in your configuration.
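To make the mechanism concrete, here is a minimal standard-library sketch of decorating the record factory. It is not the agent's implementation: the DEMO_IDS values and the install_demo_factory helper are hypothetical stand-ins for the IDs the agent would read from the active transaction.

```python
import logging

# Hypothetical stand-ins for the IDs the agent would take from the active trace.
DEMO_IDS = {
    "elasticapm_transaction_id": "tx-123",
    "elasticapm_trace_id": "trace-456",
    "elasticapm_span_id": None,
}


def install_demo_factory():
    """Wrap the current record factory so every LogRecord gains ID attributes."""
    previous_factory = logging.getLogRecordFactory()

    def factory(*args, **kwargs):
        record = previous_factory(*args, **kwargs)
        for name, value in DEMO_IDS.items():
            setattr(record, name, value)
        return record

    logging.setLogRecordFactory(factory)
    return previous_factory  # returned so the caller can restore it later


previous = install_demo_factory()
# makeRecord() goes through the installed factory, like any normal log call.
record = logging.getLogger("demo").makeRecord(
    "demo", logging.INFO, "demo.py", 0, "hello", (), None
)
logging.setLogRecordFactory(previous)  # undo the demo wrapper
```

Because the wrapper calls the previous factory first, it composes with any other factory that was already installed.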

For Python versions <3.2, we also provide a filter which will add the same new attributes to any filtered LogRecord:

import logging
from elasticapm.handlers.logging import LoggingFilter

console = logging.StreamHandler()
console.addFilter(LoggingFilter())
# add the handler to the root logger
logging.getLogger("").addHandler(console)

Because filters are not propagated to descendant loggers, you should add the filter to each of your log handlers rather than to a logger: handlers, along with their attached filters, also process records from descendant loggers.
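The following standard-library sketch illustrates why the filter belongs on the handler. DemoIdFilter is a hypothetical filter that stamps a fixed ID; it stands in for the agent's LoggingFilter so the example runs without the agent installed.

```python
import io
import logging


class DemoIdFilter(logging.Filter):
    """Hypothetical filter that stamps a fixed trace ID on each record."""

    def filter(self, record):
        record.elasticapm_trace_id = "trace-456"
        return True  # never drop the record, only annotate it


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(DemoIdFilter())
handler.setFormatter(
    logging.Formatter("%(name)s %(elasticapm_trace_id)s %(message)s")
)

parent = logging.getLogger("demo_app")
parent.addHandler(handler)
parent.setLevel(logging.INFO)
parent.propagate = False  # keep the demo output away from the root logger

# The child has no handler or filter of its own; its record propagates to
# the parent's handler, where the handler-level filter runs.
child = logging.getLogger("demo_app.db")
child.info("query done")
output = stream.getvalue()
```

Had the filter been attached to the parent logger instead, the child's record would reach the handler without the extra attribute, and the format string would fail to resolve it.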

`structlog`

We provide a processor for structlog which will add three new keys to the event_dict of any processed event:

transaction.id
trace.id
span.id

from structlog import PrintLogger, wrap_logger
from structlog.processors import JSONRenderer
from elasticapm.handlers.structlog import structlog_processor

wrapped_logger = PrintLogger()
logger = wrap_logger(wrapped_logger, processors=[structlog_processor, JSONRenderer()])
log = logger.new()
log.msg("some_event")
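For intuition about what such a processor does, recall that a structlog processor is a callable that receives the logger, the method name, and the event_dict, and returns the (possibly modified) event_dict. The sketch below uses only the standard library; demo_tracing_processor and its fixed ID values are hypothetical and are not the agent's structlog_processor.

```python
import json


def demo_tracing_processor(logger, method_name, event_dict):
    """Hypothetical processor: add fixed tracing keys to the event_dict."""
    event_dict["transaction.id"] = "tx-123"
    event_dict["trace.id"] = "trace-456"
    event_dict["span.id"] = None  # assumed: no span is active in this demo
    return event_dict


# Call the processor directly, the way a processor chain would.
event = demo_tracing_processor(None, "msg", {"event": "some_event"})
rendered = json.dumps(event, sort_keys=True)
```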

Use structlog for agent-internal logging

The Elastic APM Python agent uses logging to log internal events and issues. By default, it will use a logging logger. If your project uses structlog, you can tell the agent to use a structlog logger by setting the environment variable ELASTIC_APM_USE_STRUCTLOG to true.

Log correlation in Elasticsearch

In order to correlate logs from your app with transactions captured by the Elastic APM Python Agent, your logs must contain one or more of the following identifiers:

transaction.id
trace.id
span.id

If you’re using structured logging, either with a custom solution or with structlog (recommended), then this is fairly easy. Throw the JSONRenderer in, and use Filebeat to pull these logs into Elasticsearch.
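As an illustration of the "custom solution" route, here is a minimal sketch of a JSON formatter for the standard logging module. DemoJsonFormatter and FIELD_MAP are hypothetical names, and the extra= argument only simulates the elasticapm attributes that the agent would normally attach to each LogRecord.

```python
import io
import json
import logging

# Maps output keys to the LogRecord attributes described in the logging section.
FIELD_MAP = {
    "transaction.id": "elasticapm_transaction_id",
    "trace.id": "elasticapm_trace_id",
    "span.id": "elasticapm_span_id",
}


class DemoJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON document per log line."""

    def format(self, record):
        doc = {
            "message": record.getMessage(),
            "log.level": record.levelname.lower(),
        }
        for key, attr in FIELD_MAP.items():
            value = getattr(record, attr, None)
            if value is not None:  # omit identifiers that are not set
                doc[key] = value
        return json.dumps(doc)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(DemoJsonFormatter())
logger = logging.getLogger("demo_json")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

# extra= simulates the attribute the agent would normally add.
logger.info("order placed", extra={"elasticapm_trace_id": "trace-456"})
line = stream.getvalue().strip()
```

Each emitted line is a standalone JSON document, which a shipper such as Filebeat can decode without a Grok pattern.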

Without structured logging, the task gets a little trickier. Here we recommend first making sure your LogRecord objects have the elasticapm attributes (see logging), and then combining some specific formatting with a Grok pattern, either in Elasticsearch using the grok processor, or in Logstash with a plugin.

Say you have a Formatter that looks like this:

import logging

fh = logging.FileHandler('spam.log')
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
fh.setFormatter(formatter)

You can add the APM identifiers by simply switching out the Formatter object for the one that we provide:

import logging
from elasticapm.handlers.logging import Formatter

fh = logging.FileHandler('spam.log')
formatter = Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
fh.setFormatter(formatter)

This will automatically append APM-specific fields to your format string:

formatstring = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
formatstring = formatstring + " | elasticapm " \
                              "transaction.id=%(elasticapm_transaction_id)s " \
                              "trace.id=%(elasticapm_trace_id)s " \
                              "span.id=%(elasticapm_span_id)s"

Then, you could use a grok pattern like this (for the Elasticsearch Grok Processor):

{
  "description" : "...",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{GREEDYDATA:msg} \\| elasticapm transaction.id=%{DATA:transaction.id} trace.id=%{DATA:trace.id} span.id=%{DATA:span.id}$"]
      }
    }
  ]
}
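To check the line format locally before configuring an ingest pipeline, the same extraction can be sketched with Python's re module. This is an analogue of the Grok pattern, not a translation produced by Elasticsearch: the pipe is escaped because it is a regex metacharacter, the group names use underscores because Python group names cannot contain dots, and the sample line with its ID values is made up.

```python
import re

# Mirrors the "<message> | elasticapm transaction.id=... trace.id=... span.id=..."
# layout produced by the format string shown earlier.
LINE_PATTERN = re.compile(
    r"^(?P<msg>.*) \| elasticapm "
    r"transaction\.id=(?P<transaction_id>\S+) "
    r"trace\.id=(?P<trace_id>\S+) "
    r"span\.id=(?P<span_id>\S+)$"
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


# Made-up sample line in the format described above.
sample = (
    "2024-01-01 12:00:00,000 - demo - INFO - hello | elasticapm "
    "transaction.id=tx-123 trace.id=trace-456 span.id=None"
)
parsed = parse_line(sample)
```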