Use elastic-package to create and bootstrap a new Elastic integration

At Elastic we use elastic-package on a daily basis to create and maintain Integrations. Today we’ll learn how to quickly bootstrap a new Elastic Integration using a built-in creator, and how to start observing the service.

Elastic Integrations are a new way of using Elastic Agents to observe logs and metrics. By using Internet of Things smart devices, you can track data like electricity consumption, gas, or water usage in your household. However, there are countless IoT devices out there, and they all have proprietary apps or web pages to monitor data. What if you want to have a dashboard with all your IoT data in one place? That’s where Integrations come in.

Say you want to monitor your home’s water usage for leaks. An Elastic Integration makes this easy. Your IoT device has a REST endpoint with a data feed. By using an Elastic integration to create a smart pipeline to Kibana Alerts, you’ve got your service. Repeat as needed and you’ve got a dashboard to monitor your home. Further, with a small investment you can build IoT detectors on your own using any prototyping platform (for example Arduino) and basic electronic components (like water sensors).

These Integrations don’t contain any Go code — all functionality is provided using YAML or JSON configuration. If writing configuration files sounds like an exhausting challenge, that’s where the built-in creator comes in, shortening the initial time for bootstrapping the package using an embedded package archetype.

If you have never heard of Elastic Agent or Elastic Integrations and want to learn more before jumping in, this blog post is a great place to start. Otherwise, let's dive in!

Review observed service

The candidate for implementing a new package will be the IPFS Node application, a client for the decentralized peer-to-peer (P2P) network for storing and sharing data in a distributed file system. It uses content-addressing to uniquely identify each file. The application can run in both daemon and console modes, and it exposes a few network ports:

  • 4001 - default libp2p swarm port
  • 5001 - API internal port, shouldn’t be exposed publicly
  • 8080 - Gateway to serve content

The node application is also distributed as a Docker image, which can be started easily:

$ docker run --rm --name ipfs-node -p 127.0.0.1:5001:5001 ipfs/go-ipfs@sha256:f7e30972e35a839ea8ce00c060412face29aa31624fd2dc87a5e696f99835a91

Changing user to ipfs
ipfs version 0.9.1
generating ED25519 keypair...done
peer identity: 12D3KooWR6NcsnsW7bzaMnfYWL9D8f411P351TnfTX2D9pxRY74t
initializing IPFS node at /data/ipfs
to get started, enter:

	ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme

Initializing daemon...
go-ipfs version: 0.9.1-dc2715a
Repo version: 11
System version: amd64/linux
Golang version: go1.15.2
2021/09/27 13:01:00 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Swarm listening on /ip4/127.0.0.1/tcp/4001
Swarm listening on /ip4/127.0.0.1/udp/4001/quic
Swarm listening on /ip4/172.17.0.2/tcp/4001
Swarm listening on /ip4/172.17.0.2/udp/4001/quic
Swarm listening on /p2p-circuit
Swarm announcing /ip4/127.0.0.1/tcp/4001
Swarm announcing /ip4/127.0.0.1/udp/4001/quic
Swarm announcing /ip4/172.17.0.2/tcp/4001
Swarm announcing /ip4/172.17.0.2/udp/4001/quic
API server listening on /ip4/0.0.0.0/tcp/5001
WebUI: http://0.0.0.0:5001/webui
Gateway (readonly) server listening on /ip4/0.0.0.0/tcp/8080
Daemon is ready

In this tutorial we are interested in the API running on port 5001, which exposes the following methods:

  • /api/v0/stats/bw - get IPFS bandwidth information
  • /api/v0/repo/stat - get statistics for currently used repository

Both API methods respond to POST calls:

$ curl -X POST http://127.0.0.1:5001/api/v0/stats/bw
{"TotalIn":18136337,"TotalOut":944694,"RateIn":34125.006805975005,"RateOut":4088.311294056906}

$ curl -X POST http://127.0.0.1:5001/api/v0/repo/stat
{"RepoSize":9321761,"StorageMax":10000000000,"NumObjects":95,"RepoPath":"/data/ipfs","Version":"fs-repo@11"}
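
The same two calls can be sketched in Python, which makes it easier to see the shape of the data the integration will later collect. This is a minimal sketch, assuming the node is reachable at the address used in the curl examples; the function names (parse_stats, fetch_stats, repo_usage_percent) are hypothetical helpers and not part of IPFS or elastic-package.

```python
import json
import urllib.request

# Base URL taken from the curl examples above; adjust it if your node listens elsewhere.
BASE_URL = "http://127.0.0.1:5001"


def parse_stats(raw: str) -> dict:
    """Decode a JSON response body returned by the IPFS API."""
    return json.loads(raw)


def fetch_stats(path: str, base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Send an empty POST request to the given API path and decode the reply."""
    request = urllib.request.Request(base_url + path, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_stats(response.read().decode("utf-8"))


def repo_usage_percent(repo_stat: dict) -> float:
    """Share of StorageMax currently used by the repository, in percent."""
    return 100.0 * repo_stat["RepoSize"] / repo_stat["StorageMax"]
```

With the container from the previous step running, calling fetch_stats("/api/v0/repo/stat") should return a dictionary with the same keys as the curl output, which can then be passed to repo_usage_percent.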

The goal of this exercise is to create an integration which scrapes metrics from the described API and collects standard application logs from the ipfs-node.

Bootstrap new "ipfs_node" package

Let's start by creating a new repository (or pick any other location) to store the integration. Next, use the built-in creator to bootstrap your new package:

$ elastic-package create package
Create a new package
? Package name: ipfs_node
? Version: 0.0.1
? Package title: IPFS Node
? Description: Collect logs and metrics from IPFS node.
? Categories: custom, network
? Release: experimental
? Kibana version constraint: ^7.15.0
? Github owner: mtojek
New package has been created: ipfs_node
Done

The package has been created, but we also need to add three data streams: traffic (metrics), repository (metrics) and application (logs):

$ cd ipfs_node
$ elastic-package create data-stream
Create a new data stream
? Data stream name: traffic
? Data stream title: Traffic
? Type: metrics
New data stream has been created: traffic
Done
$ elastic-package create data-stream
Create a new data stream
? Data stream name: repository
? Data stream title: Repository
? Type: metrics
New data stream has been created: repository
Done
$ elastic-package create data-stream
Create a new data stream
? Data stream name: application
? Data stream title: Application logs
? Type: logs
New data stream has been created: application
Done

Once the creator has finished its job, let's check all files in the directory:

$ tree
.
├── changelog.yml
├── data_stream
│   ├── application
│   │   ├── agent
│   │   │   └── stream
│   │   │       └── stream.yml.hbs
│   │   ├── elasticsearch
│   │   │   └── ingest_pipeline
│   │   │       └── default.yml
│   │   ├── fields
│   │   │   └── base-fields.yml
│   │   └── manifest.yml
│   ├── repository
│   │   ├── agent
│   │   │   └── stream
│   │   │       └── stream.yml.hbs
│   │   ├── fields
│   │   │   └── base-fields.yml
│   │   └── manifest.yml
│   └── traffic
│       ├── agent
│       │   └── stream
│       │       └── stream.yml.hbs
│       ├── fields
│       │   └── base-fields.yml
│       └── manifest.yml
├── docs
│   └── README.md
├── img
│   ├── sample-logo.svg
│   └── sample-screenshot.png
└── manifest.yml

17 directories, 15 files

The package root contains three data streams (traffic, repository and application), a basic README file, a changelog file, and sample graphics (icon and screenshot). Every data stream contains a manifest, an agent stream definition, field definitions and, optionally, a stub for an ingest pipeline.

Adjust configuration of data streams

Now it's time to fill in all the templates. Let's modify the package manifest (manifest.yml) and replace the default policy templates with the following:

policy_templates:
 - name: application
   title: IPFS node logs and metrics
   description: Collect IPFS node logs and metrics
   inputs:
     - type: logfile
       title: Collect application logs
       description: Collecting application logs from IPFS node
     - type: http/metrics
       title: Collect application metrics
       description: Collecting repository and traffic metrics from IPFS node
       vars:
         - name: hosts
           type: text
           title: Hosts
           description: Base URL of the internal endpoint
           required: true
           default: http://localhost:5001

The package manifest describes two kinds of inputs: logfile and http/metrics. The logfile input is a standard Filebeat input that reads entries from files, while http/metrics is a Metricbeat module that fetches data from external HTTP endpoints (it supports the JSON format). The package manifest can also define common variables that apply to multiple data streams; in this case we use one to keep the base URL of the IPFS node.

Let's define data streams:

  • application - read standard application logs
  • repository - read IPFS repository statistics
  • traffic - read bandwidth metrics for the node


Here is the data stream manifest for the "application" data stream:

title: "Application logs"
type: logs
streams:
 - input: logfile
   title: Standard logs
   description: Collect IPFS node application logs
   vars:
     - name: paths
       type: text
       title: Paths
       multi: true
       default:
         - /var/log/ipfs-node-*.log
         - /var/log/ipfs-debug-*.log

It's relatively small and defines one variable, paths (the location of log files). For the purpose of this exercise we will not introduce more variables and will focus only on the basic lifecycle of an integration. Once you have modified the "application" data stream manifest, adjust the manifests for "repository" and "traffic".

"Repository" data stream manifest:

title: "Repository"
type: metrics
streams:
 - input: http/metrics
   title: Repository metrics
   description: Collect repository metrics from IPFS node
   vars:
     - name: period
       type: text
       title: Period
       default: 10s

"Traffic" data stream manifest:

title: "Traffic"
type: metrics
streams:
 - input: http/metrics
   title: Traffic metrics
   description: Collect bandwidth metrics from IPFS node
   vars:
     - name: period
       type: text
       title: Period
       default: 10s

Both manifests define a single variable, period, which sets the interval between consecutive metrics fetch operations. All manifest files are used by the Fleet UI (Kibana) to render the configuration forms of the Integration.

Let's adjust the agent's stream configuration files. This is the configuration passed down to the Elastic Agent instance to reconfigure the supervised Filebeat and Metricbeat processes. The agent stream for "application" uses the standard file input:

paths:
{{#each paths as |path i|}}
 - {{path}}
{{/each}}
exclude_files: [".gz$"]
processors:
 - add_locale: ~

The agent stream for "repository" uses the HTTP module with the JSON metricset enabled:

metricsets: ["json"]
hosts:
{{#each hosts}}
 - {{this}}/api/v0/repo/stat
{{/each}}
period: {{period}}
method: "POST"
namespace: "repository"

The agent stream for "traffic" is configured similarly:

metricsets: ["json"]
hosts:
{{#each hosts}}
 - {{this}}/api/v0/stats/bw
{{/each}}
period: {{period}}
method: "POST"
namespace: "traffic"
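
To make the template expansion more concrete, the sketch below approximates in Python what the rendered configuration might look like once the variables are filled in. It is an illustration only: the real rendering is done from the Handlebars template, the helper name render_http_stream is hypothetical, and hosts is assumed to arrive as a list of base URLs.

```python
def render_http_stream(hosts, api_path, period, namespace):
    """Approximate the YAML produced from stream.yml.hbs after variable substitution."""
    lines = ['metricsets: ["json"]', "hosts:"]
    # Mirrors the {{#each hosts}} loop: every base URL gets the API path appended.
    lines += [f" - {host}{api_path}" for host in hosts]
    lines += [f"period: {period}", 'method: "POST"', f'namespace: "{namespace}"']
    return "\n".join(lines)


# Values below are the defaults from the manifests shown earlier.
print(render_http_stream(["http://localhost:5001"], "/api/v0/stats/bw", "10s", "traffic"))
```

The only differences between the "repository" and "traffic" streams are the API path and the namespace, which is why both templates share the same structure.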

The "application" data stream pushes logs to an ingest pipeline which runs in Elasticsearch. It can transform logs in multiple ways: skip fields, add new fields, trim content, replace values conditionally, etc. For the purpose of this exercise we will not introduce any complex processing:

data_stream/application/elasticsearch/ingest_pipeline/default.yml

---
description: Pipeline for processing sample logs
processors:
 - set:
     field: ecs.version
     value: '1.11.0'
 - trim:
     field: message
 - drop:
     description: 'Drop if the log message is empty'
     if: ctx.message == ''
on_failure:
 - set:
     field: error.message
     value: '{{ _ingest.on_failure_message }}'
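
The effect of these three processors on a single log event can be approximated locally. The following Python sketch only emulates the behavior described above (set, trim, drop, and the on_failure handler); it is not how Elasticsearch executes the pipeline, and run_pipeline is a hypothetical name.

```python
from typing import Optional

ECS_VERSION = "1.11.0"


def run_pipeline(doc: dict) -> Optional[dict]:
    """Apply set, trim and drop in order; return None when the document is dropped."""
    result = dict(doc)
    try:
        # set processor: write ecs.version
        result["ecs"] = {**result.get("ecs", {}), "version": ECS_VERSION}
        # trim processor: strip leading and trailing whitespace from message
        result["message"] = result["message"].strip()
        # drop processor: discard events whose message is empty after trimming
        if result["message"] == "":
            return None
    except Exception as exc:
        # stands in for on_failure: record the error instead of failing the event
        result["error"] = {"message": str(exc)}
    return result
```

For example, an event whose message contains only whitespace is dropped, while an event without a message field keeps flowing but carries an error.message field.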

The core package files are ready now, so it's a good moment to run a few extra commands:

  • elastic-package format - to format the package source code
  • elastic-package lint - to double-check that all files are in line with package-spec
  • elastic-package build - to build the integration package (mind that this will create the build directory with a built package)
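
If you prefer to run the three commands in one go, a small wrapper can stop at the first failure. This is a sketch under the assumption that elastic-package is available on PATH and that it is run from the package root; build_commands and run_checks are hypothetical helper names.

```python
import shutil
import subprocess

# Subcommands named in the list above, in the order they are meant to run.
STEPS = ["format", "lint", "build"]


def build_commands(binary: str = "elastic-package") -> list:
    """Return the argument lists for each step."""
    return [[binary, step] for step in STEPS]


def run_checks(package_dir: str = ".") -> bool:
    """Run each step inside package_dir, stopping at the first failure."""
    if shutil.which("elastic-package") is None:
        print("elastic-package not found on PATH")
        return False
    for command in build_commands():
        print("running:", " ".join(command))
        if subprocess.run(command, cwd=package_dir).returncode != 0:
            return False
    return True
```

Calling run_checks() from the ipfs_node directory returns True only when all three steps exit successfully.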

Once every command has passed successfully, we can switch to testing. The elastic-package tool can boot up the Elastic stack locally for development and testing purposes. The stack consists of Docker containers for Elasticsearch, Kibana, Fleet Server and Elastic Agent. As all containers run in the same network, it's a good idea to run the IPFS node in a container belonging to the same Docker network.

Create the _dev/deploy/docker directory in the package root and place the following files there:

docker-compose.yml

version: '2.3'
services:
 ipfs_node:
   build: .
   ports:
     - 5001
   volumes:
     - ${SERVICE_LOGS_DIR}:/var/log/ipfs

docker-entrypoint.sh

#!/bin/sh

/usr/local/bin/start_ipfs daemon --migrate=true | tee /var/log/ipfs/ipfs-node-0.log

Dockerfile

FROM ipfs/go-ipfs@sha256:f7e30972e35a839ea8ce00c060412face29aa31624fd2dc87a5e696f99835a91

RUN mkdir -p /var/log/ipfs

ADD docker-entrypoint.sh /

ENV IPFS_LOGGING "info"

ENTRYPOINT ["/docker-entrypoint.sh"]

We will use them in system tests to boot up an instance of an IPFS node in the Docker network and make it observable by Elastic Agent. Now it's time to boot up the Elastic stack. Navigate to the package root and run the command:

elastic-package stack up -d

The tool will discover the locally built package and include it in the Package Registry - see the command output:

Custom build packages directory found: /Users/marcin.tojek/go/src/github.com/mtojek/elastic-blog-posts/build/integrations
Packages from the following directories will be loaded into the package-registry:
- built-in packages (package-storage:snapshot Docker image)
- /Users/marcin.tojek/go/src/github.com/mtojek/elastic-blog-posts/build/integrations

The instance of the Package Registry will include your prebuilt ipfs_node package and expose it under: http://localhost:8080/search?package=ipfs_node&experimental=1

[
  {
    "name": "ipfs_node",
    "title": "IPFS Node",
    "version": "0.0.1",
    "release": "experimental",
    "description": "Collect logs and metrics from IPFS node.",
    "type": "integration",
    "download": "/epr/ipfs_node/ipfs_node-0.0.1.zip",
    "path": "/package/ipfs_node/0.0.1",
    "icons": [
      {
        "src": "/img/sample-logo.svg",
        "path": "/package/ipfs_node/0.0.1/img/sample-logo.svg",
        "title": "Sample logo",
        "size": "32x32",
        "type": "image/svg+xml"
      }
    ],
    "policy_templates": [
      {
        "name": "application",
        "title": "IPFS node logs and metrics",
        "description": "Collect IPFS node logs and metrics"
      }
    ]
  }
]
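
A search reply like the one above can also be checked programmatically. The sketch below parses such a reply and summarizes the matching entry; it assumes the response is a JSON array shaped like the output shown here, and find_package and summarize are hypothetical helper names.

```python
import json


def find_package(search_response: str, name: str):
    """Return the first entry with a matching name from a registry search reply, else None."""
    for entry in json.loads(search_response):
        if entry.get("name") == name:
            return entry
    return None


def summarize(entry: dict) -> str:
    """Build a one-line summary: name, version, release and policy template names."""
    templates = ", ".join(t["name"] for t in entry.get("policy_templates", []))
    return f'{entry["name"]} {entry["version"]} ({entry["release"]}): {templates}'
```

With the stack running, the reply body could be read with urllib.request.urlopen from the search URL given above and passed to find_package as text.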

Navigate to the local Kibana panel: http://localhost:5601 (login: elastic, password: changeme), visit the Integrations page and confirm that the IPFS node package is present. Click on the "Add IPFS node" button to see the configuration form (rendered from the manifests).
