Tutorial: First visualization in Vega-Lite
In this tutorial, you will learn how to edit Vega-Lite in Kibana to create a stacked area chart from an Elasticsearch search query, using one of the Kibana sample data sets. It covers only the basics, giving you a starting point for a more comprehensive introduction to Vega-Lite.
Before beginning this tutorial, install the eCommerce sample data set.
When you first open the Vega editor in Kibana, you will see a pre-populated line chart which shows the total number of documents across all your indices within the time range.
The text editor contains a Vega-Lite spec written in HJSON, which is similar to JSON but optimized for human editing. HJSON supports:
- Comments using // or /* syntax
- Object keys without quotes
- String values without quotes
- Optional commas
- Double or single quotes
- Multiline strings
Small steps
Always work on Vega in the smallest steps possible, and save your work frequently. Even small changes can cause unexpected results. Click the "Save" button now.
The first step is to change the index to one of the sample data sets. Change
index: _all
to:
index: kibana_sample_data_ecommerce
Click "Update". The result is probably not what you expect. You should see a flat line with 0 results.
You’ve only changed the index, so the difference must be that the query is returning no results. You can try the Vega debugging process, but intuition may be faster for this particular problem.
In this case, the problem is that you are querying the field @timestamp, which does not exist in the kibana_sample_data_ecommerce data. Find and replace @timestamp with order_date. This fixes the problem, leaving you with this spec:
    {
      $schema: https://vega.github.io/schema/vega-lite/v4.json
      title: Event counts from ecommerce
      data: {
        url: {
          %context%: true
          %timefield%: order_date
          index: kibana_sample_data_ecommerce
          body: {
            aggs: {
              time_buckets: {
                date_histogram: {
                  field: order_date
                  interval: {%autointerval%: true}
                  extended_bounds: {
                    min: {%timefilter%: "min"}
                    max: {%timefilter%: "max"}
                  }
                  min_doc_count: 0
                }
              }
            }
            size: 0
          }
        }
        format: {property: "aggregations.time_buckets.buckets" }
      }

      mark: line

      encoding: {
        x: {
          field: key
          type: temporal
          axis: { title: null }
        }
        y: {
          field: doc_count
          type: quantitative
          axis: { title: "Document count" }
        }
      }
    }
Now, let’s make the visualization more interesting by adding another aggregation to create a stacked area chart. To verify that you have constructed the right query, it is easiest to use the Kibana Dev Tools in a separate tab from the Vega editor. Open the Dev Tools from the Management section of the navigation.
This query is roughly equivalent to the one that is used in the default Vega-Lite spec. Copy it into the Dev Tools:
    POST kibana_sample_data_ecommerce/_search
    {
      "query": {
        "range": {
          "order_date": {
            "gte": "now-7d"
          }
        }
      },
      "aggs": {
        "time_buckets": {
          "date_histogram": {
            "field": "order_date",
            "fixed_interval": "1d",
            "extended_bounds": {
              "min": "now-7d"
            },
            "min_doc_count": 0
          }
        }
      },
      "size": 0
    }
The original query does not return enough data to create a stacked area chart, so we will add a new terms aggregation:

    POST kibana_sample_data_ecommerce/_search
    {
      "query": {
        "range": {
          "order_date": {
            "gte": "now-7d"
          }
        }
      },
      "aggs": {
        "categories": {
          "terms": { "field": "category.keyword" },
          "aggs": {
            "time_buckets": {
              "date_histogram": {
                "field": "order_date",
                "fixed_interval": "1d",
                "extended_bounds": {
                  "min": "now-7d"
                },
                "min_doc_count": 0
              }
            }
          }
        }
      },
      "size": 0
    }
You’ll see that the response format looks different from the previous query:
    {
      "aggregations" : {
        "categories" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [{
            "key" : "Men's Clothing",
            "doc_count" : 1661,
            "time_buckets" : {
              "buckets" : [{
                "key_as_string" : "2020-06-30T00:00:00.000Z",
                "key" : 1593475200000,
                "doc_count" : 19
              }, {
                "key_as_string" : "2020-07-01T00:00:00.000Z",
                "key" : 1593561600000,
                "doc_count" : 71
              }]
            }
          }]
        }
      }
    }
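To make the difference concrete, here is a minimal Python sketch that resolves the two dotted paths against a trimmed copy of the response above. The default spec reads its rows from aggregations.time_buckets.buckets, which no longer exists at the top level of the new response. The resolve helper is hypothetical and written only for this illustration; it is not a Vega or Kibana function, and Vega's own path handling may differ in detail.

```python
# Trimmed copy of the response shown above (one category, two daily buckets).
response = {
    "aggregations": {
        "categories": {
            "buckets": [
                {
                    "key": "Men's Clothing",
                    "doc_count": 1661,
                    "time_buckets": {
                        "buckets": [
                            {"key": 1593475200000, "doc_count": 19},
                            {"key": 1593561600000, "doc_count": 71},
                        ]
                    },
                }
            ]
        }
    }
}


def resolve(document, dotted_path):
    """Follow a dotted path through nested dicts; return None if a key is missing.

    Hypothetical helper for illustration only.
    """
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Path used by the default spec: it does not exist in the new response.
old_rows = resolve(response, "aggregations.time_buckets.buckets")
# Path to the category buckets in the new response.
new_rows = resolve(response, "aggregations.categories.buckets")

print(old_rows)
print([row["key"] for row in new_rows])  # category names, not timestamps
```

Each top-level row is now a category, and the timestamps sit one level deeper under time_buckets.buckets.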
Now that we have data that we’re happy with, it’s time to convert from an isolated Elasticsearch query into a query with Kibana integration. Looking at the reference for writing Elasticsearch queries in Vega, you will see the full list of special tokens that are used in this query, such as %context%: true. This query has also replaced "fixed_interval": "1d" with interval: {%autointerval%: true}. Copy the final query into your spec:
    data: {
      url: {
        %context%: true
        %timefield%: order_date
        index: kibana_sample_data_ecommerce
        body: {
          aggs: {
            categories: {
              terms: { field: "category.keyword" }
              aggs: {
                time_buckets: {
                  date_histogram: {
                    field: order_date
                    interval: {%autointerval%: true}
                    extended_bounds: {
                      min: {%timefilter%: "min"}
                      max: {%timefilter%: "max"}
                    }
                    min_doc_count: 0
                  }
                }
              }
            }
          }
          size: 0
        }
      }
      format: {property: "aggregations.categories.buckets" }
    }

If you copy and paste that into your Vega-Lite spec, and click "Update", you will see a warning saying Infinite extent for field "key": [Infinity, -Infinity].
Let’s use our Vega debugging skills to understand why.
Vega-Lite generates data using the names source_0 and data_0. source_0 contains the results from the Elasticsearch query, and data_0 contains the visually encoded results which are shown in the chart. To debug this problem, you need to compare both.
To look at the source, open the browser dev tools console and type VEGA_DEBUG.view.data('source_0'). You will see:

    [{
      doc_count: 454
      key: "Men's Clothing"
      time_buckets: {buckets: Array(57)}
      Symbol(vega_id): 12822
    }, ...]

To compare to the visually encoded data, open the browser dev tools console and type VEGA_DEBUG.view.data('data_0'). You will see:

    [{
      doc_count: 454
      key: NaN
      time_buckets: {buckets: Array(57)}
      Symbol(vega_id): 13879
    }]
The issue seems to be that the key property is not being converted the right way, which makes sense because the key is now Men's Clothing instead of a timestamp.
To fix this, try updating the encoding of your Vega-Lite spec to:

    encoding: {
      x: {
        field: time_buckets.buckets.key
        type: temporal
        axis: { title: null }
      }
      y: {
        field: time_buckets.buckets.doc_count
        type: quantitative
        axis: { title: "Document count" }
      }
    }
This will show more errors, and you can inspect VEGA_DEBUG.view.data('data_0') to understand why. This now shows:

    [{
      doc_count: 454
      key: "Men's Clothing"
      time_buckets: {buckets: Array(57)}
      time_buckets.buckets.doc_count: undefined
      time_buckets.buckets.key: null
      Symbol(vega_id): 14094
    }]
It looks like the problem is that the time_buckets inner array is not being extracted by Vega. The solution is to use a Vega-Lite flatten transformation, available in Kibana 7.9 and later. If using an older version of Kibana, the flatten transformation is available in Vega but not Vega-Lite.
Add this section between the data and encoding sections:

    transform: [{
      flatten: ["time_buckets.buckets"]
    }]
This does not yet produce the results you expect. Inspect the transformed data by typing VEGA_DEBUG.view.data('data_0') into the console again:

    [{
      doc_count: 453
      key: "Men's Clothing"
      time_buckets.buckets.doc_count: undefined
      time_buckets: {buckets: Array(57)}
      time_buckets.buckets: {
        key_as_string: "2020-06-30T15:00:00.000Z",
        key: 1593529200000,
        doc_count: 2
      }
      time_buckets.buckets.key: null
      Symbol(vega_id): 21564
    }]
The debug view shows undefined values where you would expect to see numbers, and the cause is that there are duplicate names which are confusing Vega-Lite. This can be fixed by making this change to the transform and encoding blocks:

    transform: [{
      flatten: ["time_buckets.buckets"],
      as: ["buckets"]
    }]

    mark: area

    encoding: {
      x: {
        field: buckets.key
        type: temporal
        axis: { title: null }
      }
      y: {
        field: buckets.doc_count
        type: quantitative
        axis: { title: "Document count" }
      }
      color: {
        field: key
        type: nominal
      }
    }
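For intuition about what the flatten transform with an as name does to the rows, here is a simplified Python model, not the Vega-Lite implementation. The flatten function below is written only for this illustration and handles a single field, whereas the real transform accepts several. The sample row reuses values from the Dev Tools response shown earlier.

```python
def flatten(rows, field, as_name):
    """Emit one output row per element of the nested array at `field`.

    `field` is a dotted path such as "time_buckets.buckets". Each element is
    stored under `as_name`, and the other columns of the row are repeated.
    Simplified illustration only, not Vega-Lite's own code.
    """
    output = []
    for row in rows:
        nested = row
        for part in field.split("."):
            nested = nested[part]
        for element in nested:
            flat = dict(row)         # copy the parent columns (key, doc_count, ...)
            flat[as_name] = element  # attach a single inner bucket
            output.append(flat)
    return output


rows = [
    {
        "key": "Men's Clothing",
        "doc_count": 1661,
        "time_buckets": {
            "buckets": [
                {"key": 1593475200000, "doc_count": 19},
                {"key": 1593561600000, "doc_count": 71},
            ]
        },
    }
]

flat_rows = flatten(rows, "time_buckets.buckets", "buckets")
for r in flat_rows:
    # Category name, bucket timestamp, bucket count
    print(r["key"], r["buckets"]["key"], r["buckets"]["doc_count"])
```

After flattening, every row carries both the category name (key) and one time bucket (buckets.key, buckets.doc_count), which is the shape the x, y, and color encodings above refer to.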
At this point, you have a stacked area chart that shows the top categories, but the chart is still missing some common features that we expect from a Kibana visualization. Let’s add hover states and tooltips next.
Hover states are handled differently in Vega-Lite and Vega. In Vega-Lite this is done using a concept called selection, which has many permutations that are not covered in this tutorial. We will be adding a simple tooltip and hover state.
Because Kibana has enabled the Vega tooltip plugin, tooltips can be defined in several ways:
- Automatic tooltip based on the data, via { content: "data" }
- Array of fields, like [{ field: "key", type: "nominal" }]
- Defining a custom JavaScript object using the calculate transform
For the simple tooltip, add this to your encoding:
    encoding: {
      tooltip: [{
        field: buckets.key
        type: temporal
        title: "Date"
      }, {
        field: key
        type: nominal
        title: "Category"
      }, {
        field: buckets.doc_count
        type: quantitative
        title: "Count"
      }]
    }
As you hover over the area series in your chart, a multi-line tooltip will appear, but nothing marks the data point that the tooltip describes. To indicate the nearest point, we need to add a second layer.
The first step is to remove the mark: area from your visualization. Once you’ve removed the previous mark, add a layer block with two marks at the end of the Vega-Lite spec:

    layer: [{
      mark: area
    }, {
      mark: point
    }]
You’ll see that the points do not line up with the area chart. The reason is that the points are not being stacked. Change your Y encoding to this:

    y: {
      field: buckets.doc_count
      type: quantitative
      axis: { title: "Document count" }
      stack: true
    }
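To see why unstacked points drift away from the areas, consider this small Python sketch of stacking. Area marks are stacked by default in Vega-Lite while point marks are not, so without stack: true a point is drawn at its raw count rather than on top of the series below it. The stack function, the column names (time, count, y0, y1), and the "Women's Clothing" value are assumptions made up for the illustration; only the "Men's Clothing" counts come from the response shown earlier.

```python
from collections import defaultdict


def stack(rows):
    """Add y0/y1 to each row by accumulating counts per timestamp.

    Illustrative model of stacking, not Vega-Lite's implementation.
    """
    running_total = defaultdict(int)
    stacked = []
    for row in rows:
        y0 = running_total[row["time"]]  # top of the series below
        y1 = y0 + row["count"]           # top of this series
        running_total[row["time"]] = y1
        stacked.append({**row, "y0": y0, "y1": y1})
    return stacked


rows = [
    {"category": "Men's Clothing", "time": 1593475200000, "count": 19},
    # Hypothetical second category, invented for this example.
    {"category": "Women's Clothing", "time": 1593475200000, "count": 25},
    {"category": "Men's Clothing", "time": 1593561600000, "count": 71},
]

stacked_rows = stack(rows)
for r in stacked_rows:
    print(r["category"], r["time"], r["y0"], r["y1"])
```

In this model the second category's area spans 19 to 44 at the first timestamp. An unstacked point for that row would sit at 25, inside the first category's band, while a stacked point lands at 44 on the upper edge of its own area.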
Now, we will add a selection block inside the point mark:

    layer: [{
      mark: area
    }, {
      mark: point

      selection: {
        pointhover: {
          type: single
          on: mouseover
          clear: mouseout
          empty: none
          fields: ["buckets.key", "key"]
          nearest: true
        }
      }

      encoding: {
        size: {
          condition: {
            selection: pointhover
            value: 100
          }
          value: 5
        }
        fill: {
          condition: {
            selection: pointhover
            value: white
          }
        }
      }
    }]
Now that you’ve enabled a selection, try moving the mouse around the visualization and watch the points respond to the nearest position.
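The effect of nearest: true can be pictured as picking the rendered point closest to the cursor. The Python sketch below does this with a brute-force distance search. This is an assumption made for clarity: Vega-Lite itself computes an invisible Voronoi diagram to accelerate the lookup, and the point coordinates and the nearest_point helper here are invented for the illustration.

```python
import math


def nearest_point(points, cursor):
    """Return the point with the smallest straight-line distance to the cursor.

    Brute-force illustration; requires Python 3.8+ for math.dist.
    """
    return min(points, key=lambda p: math.dist((p["x"], p["y"]), cursor))


# Hypothetical pixel positions of three rendered points.
points = [
    {"label": "a", "x": 10, "y": 40},
    {"label": "b", "x": 60, "y": 35},
    {"label": "c", "x": 110, "y": 80},
]

selected = nearest_point(points, cursor=(70, 50))
print(selected["label"])
```

Because the lookup is by distance rather than by direct hit, the cursor does not need to be exactly over a point for the hover state to apply.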
The final result of this tutorial is this spec:
    {
      $schema: https://vega.github.io/schema/vega-lite/v4.json
      title: Event counts from ecommerce
      data: {
        url: {
          %context%: true
          %timefield%: order_date
          index: kibana_sample_data_ecommerce
          body: {
            aggs: {
              categories: {
                terms: { field: "category.keyword" }
                aggs: {
                  time_buckets: {
                    date_histogram: {
                      field: order_date
                      interval: {%autointerval%: true}
                      extended_bounds: {
                        min: {%timefilter%: "min"}
                        max: {%timefilter%: "max"}
                      }
                      min_doc_count: 0
                    }
                  }
                }
              }
            }
            size: 0
          }
        }
        format: {property: "aggregations.categories.buckets" }
      }

      transform: [{
        flatten: ["time_buckets.buckets"]
        as: ["buckets"]
      }]

      encoding: {
        x: {
          field: buckets.key
          type: temporal
          axis: { title: null }
        }
        y: {
          field: buckets.doc_count
          type: quantitative
          axis: { title: "Document count" }
          stack: true
        }
        color: {
          field: key
          type: nominal
          title: "Category"
        }
        tooltip: [{
          field: buckets.key
          type: temporal
          title: "Date"
        }, {
          field: key
          type: nominal
          title: "Category"
        }, {
          field: buckets.doc_count
          type: quantitative
          title: "Count"
        }]
      }

      layer: [{
        mark: area
      }, {
        mark: point

        selection: {
          pointhover: {
            type: single
            on: mouseover
            clear: mouseout
            empty: none
            fields: ["buckets.key", "key"]
            nearest: true
          }
        }

        encoding: {
          size: {
            condition: {
              selection: pointhover
              value: 100
            }
            value: 5
          }
          fill: {
            condition: {
              selection: pointhover
              value: white
            }
          }
        }
      }]
    }