Discover in Kibana uses the fields API in 7.12

With Elastic 7.12, Discover now uses the fields API by default. Reading from _source is still supported through a switch in the Advanced Settings. This change builds on Elasticsearch 7.11, which extended the Search API with the new fields parameter.

When the new search parameter is used, Elasticsearch consults both a document’s raw source and the index mappings to load and return values. Because it makes use of the mappings, fields has some advantages over referencing _source directly: it accepts multi-fields and field aliases, and it formats field values such as dates in a consistent way.

In short, the fields option hides the nuances of reading from _source and other places by handling multi-fields and aliases, type coercion, doc_values, and other edge cases.
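As a rough illustration of what such a request can look like, the sketch below builds a search body that uses the fields parameter and turns off _source. The helper name build_fields_search and all field names are hypothetical, chosen only for this example. The resulting JSON is meant to be sent as the body of a search request against an index of your choice.

```python
import json


def build_fields_search(fields, include_source=False, size=10):
    """Build a search request body that retrieves values via the fields parameter.

    The helper itself is hypothetical; the keys it emits (size, query,
    fields, _source) are standard search body options.
    """
    return {
        "size": size,
        "query": {"match_all": {}},
        "fields": list(fields),
        "_source": include_source,
    }


# All field names below are made up for illustration.
body = build_fields_search([
    "user.name",                                        # a regular field
    "message.keyword",                                  # a multi-field subfield
    "http.response.*",                                  # a wildcard pattern
    {"field": "@timestamp", "format": "epoch_millis"},  # a date with an explicit format
])

print(json.dumps(body, indent=2))
```

Setting _source to false is optional. It is used here only to show that values can be retrieved without returning the raw document.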

What are the benefits of the new search parameter?

More and more use cases arise where Elasticsearch is treated as a data store. To make retrieving fields simpler, without expecting expertise in areas like mappings, doc_values, and stored fields, we decided to extend the existing API.

Some benefits of the new parameter:

  • If a non-standard field like a field alias, multi-field, or constant_keyword is specified in fields, Elasticsearch uses the mappings to find and return the right value.
  • The fields are returned in a flat list, as opposed to structured JSON.
  • Each value is returned in a 'canonical' format. For example, if a field is mapped as an integer, it is returned as an integer even if it was specified as a string in _source.
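To make the difference in shape concrete, here is a minimal sketch of reading the same value from the flat fields section and from the structured _source of a search hit. The hit below is a hand-written, hypothetical example, and the two helper functions are not part of any client library. It assumes bytes is mapped as a numeric type.

```python
# Hypothetical hit, shaped like a search response that includes both sections.
hit = {
    "_source": {"user": {"name": "jdoe"}, "bytes": "1024"},
    "fields": {"user.name": ["jdoe"], "bytes": [1024]},
}


def field_values(hit, name):
    """Return the values of a field from the flat fields section.

    Values in the fields section are always lists, keyed by the full
    dotted field name. A missing field yields an empty list.
    """
    return hit.get("fields", {}).get(name, [])


def source_value(hit, dotted_path):
    """Walk the structured _source object along a dotted path."""
    node = hit["_source"]
    for part in dotted_path.split("."):
        node = node[part]
    return node


print(field_values(hit, "user.name"))  # flat lookup by dotted name
print(source_value(hit, "user.name"))  # nested lookup, one level at a time
print(field_values(hit, "bytes"))      # number, per the (assumed) mapping
print(source_value(hit, "bytes"))      # string, exactly as ingested
```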

What should I expect?

The most meaningful change is the support for runtime fields. When a runtime field is defined in a mapping, it will show up in Discover like any other field type. The next sections expand on all of the improvements and changes.
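For readers who have not defined a runtime field yet, the sketch below shows one possible mapping update and a matching search body. The field names price and price_with_tax and the tax factor are hypothetical. The first body would be sent as a mapping update for an existing index and the second as a search request against it.

```python
import json

# Body for a mapping update that adds a runtime field.
# price_with_tax and price are made-up names for illustration.
runtime_mapping_body = {
    "runtime": {
        "price_with_tax": {
            "type": "double",
            "script": {
                # Painless script: compute the value from an indexed field.
                "source": "emit(doc['price'].value * 1.2)"
            },
        }
    }
}

# Search body: the runtime field is requested like any other field.
search_body = {
    "fields": ["price", "price_with_tax"],
    "_source": False,
}

print(json.dumps(runtime_mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```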

Default column is now called Document

If there are no columns configured, the default column changes from _source to Document.

Existing saved searches with a _source column will be changed to Document. Unlike _source, the Document column cannot be combined with other columns in the grid.

discover-fields-api-document-columns.png

Although it is formatted similarly to the _source column, it is a collection of fields as returned in the Elasticsearch response, so combining it with other columns would not make much sense.

Multi-fields are grouped together

A multi-field is a field that is mapped in different ways for different purposes. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations.

Due to the changes in the API, the subfields of a multi-field will now have values where before they didn’t. To avoid suddenly increasing the number of fields that are shown, we’ve decided to group these subfields under the root field so they are still usable as columns, but not as prominent.

discover-fields-api-multi-fields.png

Differences in how object fields are displayed

Object fields usually consist of one or more leaf fields. Example of an object field:

"manager": {
  "age": 30,
  "name": {
    "first": "John",
    "last": "Smith"
  }
}

Before the introduction of the fields API, Discover would show the object roots, together with the leaf fields.

With the introduction of the fields API, object roots won't be shown in the field list anymore (unless there is a previously saved search containing one as a column). Leaf fields will now be correctly detected as having data, where previously they were not. This will be the case even when the document has an array of objects. Leaf fields will work for all objects, and the values of all objects in that array will be shown flattened.
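The flattening of leaf fields can be pictured with a small sketch. This is not Elasticsearch's implementation, only a simplified imitation of the resulting shape: leaf values are collected under their full dotted names, and values from every object in an array end up in the same list. The function name flatten_leaves and the second manager entry are made up for illustration.

```python
def flatten_leaves(obj, prefix=""):
    """Collect leaf values under dotted field names.

    Arrays of objects are walked element by element, so the values of
    all objects in the array land in one flat list per leaf field.
    """
    flat = {}
    items = obj if isinstance(obj, list) else [obj]
    for item in items:
        if isinstance(item, dict):
            for key, value in item.items():
                path = f"{prefix}.{key}" if prefix else key
                for leaf, values in flatten_leaves(value, path).items():
                    flat.setdefault(leaf, []).extend(values)
        else:
            flat.setdefault(prefix, []).append(item)
    return flat


# The first entry matches the example above; the second is hypothetical.
document = {
    "manager": [
        {"age": 30, "name": {"first": "John", "last": "Smith"}},
        {"age": 45, "name": {"first": "Alice", "last": "White"}},
    ]
}

for name, values in flatten_leaves(document).items():
    print(name, values)
```

Note that no entry is produced for the object roots manager or manager.name, only for the leaf fields.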

discover-fields-api-object-fields.png

Differences in how nested fields are displayed

The nested data type is used for indexing arrays of objects when there is a need to maintain the independence of each object in the array. The nested type must be set explicitly in the mapping; otherwise the field is treated as an object data type.

Before the introduction of the fields API, Discover would show a root field only if at least one loaded document had an array of objects for that field. Leaf fields would be shown if they appeared in at least one document. Leaf fields that did not appear in any document, or appeared only in documents that have arrays of objects for this field, would not appear in the field list unless "Hide missing fields" was switched off.

In the table, a root field used as a column would show the JSON of the field’s content only if the document contained an array of objects, and would be empty otherwise.

Let’s say we have a mapping as follows:

PUT /discover_test
{
  "mappings": {
    "properties": {
      "products": {
        "type": "nested",
        "properties": {
          "name": { "type": "keyword" },
          "price": { "type": "double" }
        }
      }
    }
  }
}

And the following documents:

POST /discover_test/_doc
{
  "products": [
    { "name": "Kibana", "price": 42.23 },
    { "name": "Faber-Castell Polychromos", "price": 29.95 }
  ]
}
POST /discover_test/_doc
{
  "products": { "name": "Product name", "price": 123.21 }
}

In this case, when using _source, Discover would only correctly display the JSON of the products object in the first document since it is an array. The second document, however, is a single object, meaning it would display as empty:

discover-fields-api-document-1.png

Leaf fields as columns would show values only if the document did not contain an array of objects, and would stay empty otherwise:

discover-fields-api-document-2.png

With the introduction of the fields API, the root of a nested field can be added from the field list and will show the structured JSON of that field in the document. Nested field columns now show the JSON correctly for all documents:

discover-fields-api-document-3.png

In the case of an existing saved search with a nested leaf field as a column, this will show as empty now:

discover-fields-api-document-4.png

We want to have this behavior going forward since we assume the nested mapping type is only used when an array of objects is indexed.

Unmapped fields won’t show up if field filters are configured

When a previously unseen field is found in a document, Elasticsearch will add the new field to the mapping (unless the dynamic parameter is set to false). Discover will continue showing these fields as it did previously, except when field filters are configured. In that case, unmapped fields will not show up.

Limitations of _source

_source contains the raw information from the document, exactly as it was ingested. Reading from it directly means users lose the benefit of features applied on top of _source, like runtime fields, multi-fields, date formatting, and alias fields. Also, the whole _source object must be loaded and parsed even if only a small number of fields are requested.

With the introduction of the fields parameter, developers can retrieve the field values after the logic from these features is applied. This abstraction removes the need for developers to know implementation details about a field. Whether or not a field is a runtime field, it is always accessed in the same way.

Here is an example. Let’s say we have a mapping and two documents defined as follows:

PUT /my-index
{
  "mappings": {
    "properties": {
      "xlong": {
        "type": "long"
      },
      "xfloat": {
        "type": "float"
      }
    }
  }
}
POST /my-index/_doc/1
{
  "xlong": "1.5",
  "xfloat": "1.5"
}
POST /my-index/_doc/2
{
  "xlong": "2",
  "xfloat": "2"
}

The field xlong is mapped as long, so the indexed value is truncated toward zero: 1.5 becomes 1. When executing the following search in Discover,

xlong > 1 and xlong < 2

the search will not return any documents, which is expected. However, when Discover reads fields from _source, the value is displayed exactly as it was ingested, i.e., 1.5:

discover-fields-api-document-ingest-1.png

This used to create a lot of confusion. With the fields option enabled, the actual mapped value (1) is displayed:

discover-fields-api-document-ingest-2.png
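The mismatch between the ingested and the indexed value can be reproduced with a few lines of arithmetic. The sketch below only imitates the truncation described above and is not Elasticsearch code. The function name coerce_to_long is made up for illustration.

```python
def coerce_to_long(raw):
    """Imitate coercion into a long field: parse the number, drop the fraction.

    Python's int() truncates toward zero, which matches the behavior
    described for the xlong field.
    """
    return int(float(raw))


source_value = "1.5"                          # what _source shows, as ingested
indexed_value = coerce_to_long(source_value)  # what the long field holds

# The search from the example: xlong > 1 and xlong < 2
matches = 1 < indexed_value < 2

print(source_value, indexed_value, matches)
```

Because the indexed value is 1, the range check is false, even though the raw value 1.5 shown from _source looks like it should match.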

Wrap up

We are excited to bring the power and flexibility of runtime fields to Discover, including the switch to using the fields API. If you are currently not on Elastic 7.12, but would still like to test out these updates, starting a free Elasticsearch Service trial on Elastic Cloud is a fast and easy way to do so. 

To learn more about other recent product updates, be sure to read our Elastic 7.12 release blog.