Field References Deep Dive

It is often useful to be able to refer to a field or collection of fields by name. To do this, you can use the Logstash field reference syntax.

The syntax to access a field specifies the entire path to the field, with each fragment wrapped in square brackets. When a field name contains square brackets, they must be properly escaped.

Field References can be expressed literally within Conditional statements in your pipeline configurations, as string arguments to your pipeline plugins, or within sprintf statements that will be used by your pipeline plugins:

filter {
  #  +----literal----+     +----literal----+
  #  |               |     |               |
  if [@metadata][date] and [@metadata][time] {
    mutate {
      add_field => {
        "[@metadata][timestamp]" => "%{[@metadata][date]} %{[@metadata][time]}"
      # |                      |    |  |               |    |               | |
      # +----string-argument---+    |  +--field-ref----+    +--field-ref----+ |
      #                             +-------- sprintf format string ----------+
      }
    }
  }
}
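
To make the sprintf usage above concrete, here is a minimal Ruby sketch of how a format string containing field references could be expanded against a nested Hash. The helpers `fetch_ref` and `sprintf_sketch` are hypothetical names invented for this illustration and are not Logstash's implementation; the event is modeled as a plain Hash and the date and time values are made up.

```ruby
# Hypothetical helper (not Logstash's implementation): look up a field
# reference literal such as "[a][b]" in a nested Hash.
def fetch_ref(event, ref)
  keys = ref.scan(/\[([^\[\]]+)\]/).flatten
  keys.reduce(event) { |node, key| node.is_a?(Hash) ? node[key] : nil }
end

# Hypothetical helper: replace each %{[...][...]} placeholder with the
# value of the referenced field, converted to a string.
def sprintf_sketch(event, template)
  template.gsub(/%\{((?:\[[^\[\]]+\])+)\}/) do
    fetch_ref(event, Regexp.last_match(1)).to_s
  end
end

# Illustrative event data, assumed for this example only.
event = { "@metadata" => { "date" => "2024-01-31", "time" => "12:34:56" } }

puts sprintf_sketch(event, "%{[@metadata][date]} %{[@metadata][time]}")
# prints "2024-01-31 12:34:56"
```

In this sketch a reference to a missing field expands to an empty string, which is a simplification chosen for brevity rather than a statement about Logstash's behavior.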

Formal Grammar

Below is the formal grammar of the Field Reference, with notes and examples.

Field Reference Literal

A Field Reference Literal is a sequence of one or more Path Fragments that can be used directly in Logstash pipeline conditionals without any additional quoting (e.g. [request], [response][status]).

fieldReferenceLiteral
  : ( pathFragment )+
  ;

In Logstash 7.x and earlier, a quoted value (such as ["foo"]) is considered a field reference and isn’t treated as a single element array. This behavior might cause confusion in conditionals, such as [message] in ["foo", "bar"] compared to [message] in ["foo"]. We discourage using names with quotes, such as "\"foo\"", as this behavior might change in the future.

Field Reference (Event APIs)

The Event API’s methods for manipulating the fields of an event or using the sprintf syntax are more flexible than the pipeline grammar in what they accept as a Field Reference. Top-level fields can be referenced directly by their Field Name without the square brackets, and there is some support for Composite Field References, simplifying use of programmatically-generated Field References.

A Field Reference for use with the Event API is therefore one of:

  • a single Field Reference Literal; OR
  • a single Field Name (referencing a top-level field); OR
  • a single Composite Field Reference.

eventApiFieldReference
  : fieldReferenceLiteral
  | fieldName
  | compositeFieldReference
  ;
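
As a rough illustration of the first two alternatives, the Ruby sketch below converts a bare top-level Field Name into its bracketed form and passes anything that already contains brackets through unchanged. The helper name `to_bracketed` is hypothetical and is not part of the Event API.

```ruby
# Hypothetical helper (not part of Logstash): accept an Event API style
# reference and return it in bracketed form.
def to_bracketed(ref)
  # A string with no square brackets is treated as a top-level Field Name.
  return "[#{ref}]" unless ref.match?(/[\[\]]/)
  # Anything else (literal or composite) is passed through unchanged here.
  ref
end

puts to_bracketed("message")             # prints "[message]"
puts to_bracketed("[response][status]")  # prints "[response][status]"
```

Under this reading, a call such as `event.get("message")` in a Ruby filter would name the same field as `event.get("[message]")`.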

Path Fragment

A Path Fragment is a Field Name wrapped in square brackets (e.g., [request]).

pathFragment
  : '[' fieldName ']'
  ;

Field Name

A Field Name is a sequence of characters that are not square brackets ([ or ]).

fieldName
  : ( ~( '[' | ']' ) )+
  ;
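
The Field Name and Path Fragment rules are enough to recognize a Field Reference Literal. The following Ruby sketch is one way to express that with regular expressions; `parse_literal` is a hypothetical name used only for this example, and the code is an approximation of the grammar, not the parser Logstash ships.

```ruby
# One Path Fragment: '[' fieldName ']', where fieldName has no brackets.
FRAGMENT = /\[([^\[\]]+)\]/

# Hypothetical parser for: fieldReferenceLiteral : ( pathFragment )+
# Returns the list of Field Names, or nil if the input is not a literal.
def parse_literal(ref)
  # The whole string must consist of one or more Path Fragments.
  return nil unless ref.match?(/\A(?:\[[^\[\]]+\])+\z/)
  ref.scan(FRAGMENT).flatten
end

p parse_literal("[request]")           # prints ["request"]
p parse_literal("[response][status]")  # prints ["response", "status"]
p parse_literal("response")            # prints nil (bare Field Name)
p parse_literal("[a][[b]]")            # prints nil (nested brackets)
```

Returning `nil` for non-literals is a design choice of this sketch; it keeps the bare Field Name and composite forms, which only the Event API accepts, clearly separate from literals.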

Composite Field Reference

In some cases, it may be necessary to programmatically compose a Field Reference from one or more Field References, such as when manipulating fields in a plugin or while using the Ruby Filter plugin and the Event API.

    fieldReference = "[path][to][deep nested field]"
    compositeFieldReference = "[@metadata][#{fieldReference}][size]"
    # => "[@metadata][[path][to][deep nested field]][size]"

Canonical Representations of Composite Field References

Acceptable Composite Field Reference    Canonical Field Reference Representation
[[deep][nesting]][field]                [deep][nesting][field]
[foo][[bar]][bingo]                     [foo][bar][bingo]
[[ok]]                                  [ok]

A Composite Field Reference is a sequence of one or more Path Fragments or Embedded Field References.

compositeFieldReference
  : ( pathFragment | embeddedFieldReference )+
  ;

Composite Field References are supported by the Event API, but are not supported as literals in the Pipeline Configuration.

Embedded Field Reference

embeddedFieldReference
  : '[' fieldReference ']'
  ;

An Embedded Field Reference is a Field Reference that is itself wrapped in square brackets ([ and ]), and can be a component of a Composite Field Reference.
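
The relationship between a Composite Field Reference and its canonical form can be sketched in Ruby by repeatedly unwrapping embedded references until only plain Path Fragments remain. The function `canonicalize` is a hypothetical helper written for this illustration; it is an assumption about how the flattening could be done and does not describe Logstash's internals.

```ruby
# Hypothetical helper: flatten a Composite Field Reference into its
# canonical sequence of Path Fragments.
def canonicalize(ref)
  current = ref
  loop do
    # Unwrap one level: '[' + one or more plain fragments + ']'
    # is replaced by the fragments themselves.
    unwrapped = current.gsub(/\[((?:\[[^\[\]]+\])+)\]/, '\1')
    break if unwrapped == current
    current = unwrapped
  end
  # After unwrapping, a valid input is a plain Field Reference Literal.
  unless current.match?(/\A(?:\[[^\[\]]+\])+\z/)
    raise ArgumentError, "not a valid composite field reference: #{ref}"
  end
  current
end

puts canonicalize("[[deep][nesting]][field]")  # prints "[deep][nesting][field]"
puts canonicalize("[foo][[bar]][bingo]")       # prints "[foo][bar][bingo]"
puts canonicalize("[[ok]]")                    # prints "[ok]"
```

Inputs with unbalanced or stray brackets fail the final check and raise `ArgumentError` in this sketch.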

Escape Sequences

For Logstash to reference a field whose name contains a character that has special meaning in the field reference grammar, the character must be escaped. Logstash can be globally configured to use one of three field reference escape modes:

  • none (default): no escape sequence processing is done. Fields containing literal square brackets cannot be referenced by the Event API.
  • percent: URI-style percent encoding of UTF-8 bytes. The left square bracket ([) is expressed as %5B, and the right square bracket (]) is expressed as %5D.
  • ampersand: HTML-style ampersand encoding (&# + decimal Unicode codepoint + ;). The left square bracket ([) is expressed as &#91;, and the right square bracket (]) is expressed as &#93;.
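
The bracket encodings listed above can be reproduced with a few lines of Ruby. The helper `escape_field_name` below is hypothetical and only rewrites the two square bracket characters; whether other characters also need escaping in each mode is not covered by this sketch.

```ruby
# Hypothetical helper: escape square brackets in a single Field Name
# according to the chosen mode. Only '[' and ']' are handled here.
def escape_field_name(name, mode)
  case mode
  when :none
    name
  when :percent
    # URI-style: '%' followed by the two-digit uppercase hex byte value.
    name.gsub(/[\[\]]/) { |ch| format("%%%02X", ch.ord) }
  when :ampersand
    # HTML-style: '&#' + decimal codepoint + ';'.
    name.gsub(/[\[\]]/) { |ch| "&#" + ch.ord.to_s + ";" }
  else
    raise ArgumentError, "unknown escape mode: #{mode}"
  end
end

puts escape_field_name("foo[bar]", :percent)    # prints "foo%5Bbar%5D"
puts escape_field_name("foo[bar]", :ampersand)  # prints "foo&#91;bar&#93;"
```

With percent mode active, a top-level field named foo[bar] would then be written as the Path Fragment [foo%5Bbar%5D].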