Using Elastic Stack to Explore Australia 2018 Budget

The Treasurer handed down Budget 2018-19 at 7:30pm on Tuesday 8 May 2018. The budget is a complex set of documents, and for an ordinary individual, understanding or even navigating them can be a daunting task. Interactive visuals have long been a preferred way to decipher numbers, particularly multi-dimensional dollar values that run across a number of years.

In this article, we will demonstrate how Kibana can be used to visualise a subset of the budget data more elegantly than one would normally do using a spreadsheet application.

The challenge of multi-dimensional data in its raw form

With data provided in raw form (e.g., .csv or .xls), one would normally open a spreadsheet application, select the rows and columns in a sheet, and configure some sort of visualisation. The visuals thus created are static: one cannot select a dimension or dig deeper into a column. Moreover, creating the visual and then sharing it across an organisation or publicly becomes inefficient and error-prone. And by the way, who wants to read a .csv or .xls file cell by cell?

We need a simple tool to be able to visualise data and understand its content.

Visualising Australia 2018 Budget - An example

Budget 2018-19 Tables and Data are publicly available here.

One of the files, “2018-19 Budget Dept Expenses.csv”, would look like this if opened in a spreadsheet:

spreadsheet-image.jpg

Using Kibana, that data is now presented in an interactive manner as below:

Australia budget data presented in Kibana

If you were interested in drilling down into the “human services” department expenses, just clicking on that bar would take you to a visualisation like the one below:

Drilling down into budget data with Kibana

By the way, have you noticed that these visuals are a part of a webpage and the http link could be shared with anyone you wish?

We wanted to see and understand more, so we opened the file “2018-19-pbs-program-expense-line-items.csv”. Plotted as a pie chart, the results showed far more detail.

Detail drilldown via pie chart in Kibana

Drilling down into one of the programs, e.g., Employment Services, we can see how much funding the budget has allocated to it, as depicted below:

Drilling into programs within the pie chart
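
Conceptually, a drill-down like this is a filter on one field followed by a sum over the year columns. The Python sketch below illustrates that idea only; it is not how Kibana or Elasticsearch computes the chart. The function name drill_down and the sample rows are made up for this example, and the dollar values are placeholders, not real budget figures.

```python
# Year columns as they appear in the program expense CSV header.
YEARS = ["2017-18", "2018-19", "2019-20", "2020-21", "2021-22"]


def drill_down(rows, program):
    """Keep only the rows for one program, then total each year column."""
    totals = {year: 0.0 for year in YEARS}
    for row in rows:
        if row["Program"] == program:
            for year in YEARS:
                totals[year] += row[year]
    return totals


# Hypothetical rows for illustration only; NOT real budget data.
sample_rows = [
    {"Program": "Employment Services", "2017-18": 1.0, "2018-19": 2.0,
     "2019-20": 3.0, "2020-21": 4.0, "2021-22": 5.0},
    {"Program": "Employment Services", "2017-18": 1.0, "2018-19": 1.0,
     "2019-20": 1.0, "2020-21": 1.0, "2021-22": 1.0},
    {"Program": "Some Other Program", "2017-18": 9.0, "2018-19": 9.0,
     "2019-20": 9.0, "2020-21": 9.0, "2021-22": 9.0},
]

employment_totals = drill_down(sample_rows, "Employment Services")
```

In Kibana the same effect comes from clicking a slice, which adds a filter to the dashboard so that every visualisation is recomputed for the selected program.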

After the technical setup, the creation of the visualisations, and some final touches, the dashboard is ready to be shared in a multi-tenant way, and it could look as simple as the one below:

The final budget dashboard in Kibana

At Elastic, we make things simple!

Now on the technical side of things...

Importing, Storing, and Visualising Australia 2018 Budget - An Engineering Approach

Follow these steps to get to Kibana visuals that can be shared with a wide audience:

  • Sign up for an Elastic Cloud account.
  • Download and install Logstash.
  • Copy the configuration below into your Logstash config path.
  • Run the Logstash pipeline.
  • Start creating visualisations and dashboard in Kibana.

Below, you will find a Logstash configuration for importing a csv file. You may name this file “2018-19-pbs-program-expense-line-items.conf”:

# This is where your .csv file will be read
input {
    file {
        path => "/usr/share/logstash/pipeline/aubudget2018/2018-19-pbs-program-expense-line-items.csv"
        start_position => beginning
    }
}
# This section will allow you to specify the columns, their types, and any additional formatting
filter {
   dissect {
      mapping => {
        "message" => "%{Portfolio},%{Department/Agency},%{Outcome},%{Program},%{Expense type},%{Appropriation type},%{Description},%{2017-18},%{2018-19},%{2019-20},%{2020-21},%{2021-22},%{Source document},%{Source table},%{URL}"
       }
   }
   mutate {
    # Convert the five financial-year columns from strings to floats
    convert => {
      "2017-18" => "float"
      "2018-19" => "float"
      "2019-20" => "float"
      "2020-21" => "float"
      "2021-22" => "float"
    }
   }
}
# This section defines the Elasticsearch output configuration
output {
   elasticsearch {
    hosts => "https://elasticsearch-hostname:9243/"
    ssl => true
    user => "elastic"
    password => "youwishyouknewthisdidn'tyou?"
    index => "2018-19-pbs-program-expense-line-items"
   }
}
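
To make the filter stage more concrete, here is a rough Python sketch of what the dissect mapping and the float conversion do to a single line of the file. It is an illustration under stated assumptions, not how Logstash is implemented: the function name dissect_line and the sample line are made up for this example, and the sketch does not try to reproduce how Logstash handles values that fail to convert. Like dissect, it splits purely on the comma delimiter and lets the last field take the rest of the line, so a quoted field that itself contains a comma would shift the columns.

```python
# Field names copied from the dissect mapping above, in column order.
FIELDS = [
    "Portfolio", "Department/Agency", "Outcome", "Program", "Expense type",
    "Appropriation type", "Description", "2017-18", "2018-19", "2019-20",
    "2020-21", "2021-22", "Source document", "Source table", "URL",
]
YEAR_FIELDS = ["2017-18", "2018-19", "2019-20", "2020-21", "2021-22"]


def dissect_line(line):
    """Split one line positionally on commas and convert the year columns.

    Returns None when the line has fewer fields than the mapping expects.
    """
    # Split on the first 14 commas only; the last field keeps the remainder.
    values = line.rstrip("\n").split(",", len(FIELDS) - 1)
    if len(values) != len(FIELDS):
        return None
    event = dict(zip(FIELDS, values))
    for name in YEAR_FIELDS:
        event[name] = float(event[name])
    return event


# Hypothetical line for illustration only; NOT a real row from the budget file.
sample_line = (
    "Jobs and Small Business,Department of Jobs and Small Business,Outcome 1,"
    "Employment Services,Departmental,Annual,Sample description,"
    "1.5,2.5,3.5,4.5,5.5,PBS,Table 1,https://example.com/a,b"
)
event = dissect_line(sample_line)
```

Each resulting dictionary corresponds roughly to one document that the output stage sends to Elasticsearch, with the year columns stored as numbers so that Kibana can sum and chart them.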

If you have Docker installed, you could use the following command to run the Logstash pipeline:

docker run --rm -it -v `pwd`:/usr/share/logstash/pipeline -e xpack.monitoring.enabled=false -e log.level=debug -e pipeline.workers=4 docker.elastic.co/logstash/logstash:6.2.4

For more information on the dissect filter being used above, please visit the documentation.

That is it for now, friends.