Manual instrumentation with OpenTelemetry for Node.js applications


DevOps and SRE teams are transforming the process of software development. While DevOps engineers focus on efficient software applications and service delivery, SRE teams are key to ensuring reliability, scalability, and performance. These teams must rely on a full-stack observability solution that allows them to manage and monitor systems and ensure issues are resolved before they impact the business.  

Observability across the entire stack of modern distributed applications requires data collection, processing, and correlation often in the form of dashboards. Ingesting all system data requires installing agents across stacks, frameworks, and providers — a process that can be challenging and time-consuming for teams who have to deal with version changes, compatibility issues, and proprietary code that doesn't scale as systems change.      

Thanks to OpenTelemetry (OTel), DevOps and SRE teams now have a standard way to collect and send data that doesn't rely on proprietary code and has a large support community, reducing vendor lock-in.

In a previous blog, we also reviewed how to use the OpenTelemetry demo and connect it to Elastic®, as well as some of Elastic’s capabilities with OpenTelemetry and Kubernetes. 

In this blog, we will show how to use manual instrumentation for OpenTelemetry with the Node.js service of our application called Elastiflix. This approach is slightly more complex than using auto-instrumentation.

The beauty of this is that there is no need for the OTel Collector! This setup enables you to slowly and easily migrate an application to OTel with Elastic according to a timeline that best fits your business.

Application, prerequisites, and config

The application that we use for this blog is called Elastiflix, a movie streaming application. It consists of several microservices written in .NET, Node.js, Go, and Python.

Before we instrument our sample application, we will first need to understand how Elastic can receive the telemetry data.

Configuration

All of Elastic Observability’s APM capabilities are available with OTel data. Some of these include:

  • Service maps
  • Service details (latency, throughput, failed transactions)
  • Dependencies between services, distributed tracing
  • Transactions (traces)
  • Machine learning (ML) correlations
  • Log correlation

In addition to Elastic’s APM and a unified view of the telemetry data, you will also be able to use Elastic’s powerful machine learning capabilities to reduce analysis effort, along with alerting, to help reduce MTTR.

Prerequisites

View the example source code

The full source code, including the Dockerfile used in this blog, can be found on GitHub. The repository also contains the same application without instrumentation. This allows you to compare each file and see the differences.

Before we begin, let’s look at the non-instrumented code first.

This is our simple index.js file that can receive a POST request. See the full code here.

const pino = require('pino');
const ecsFormat = require('@elastic/ecs-pino-format')
const log = pino({ ...ecsFormat({ convertReqRes: true }) })
const expressPino = require('express-pino-logger')({ logger: log });

var API_ENDPOINT_FAVORITES = process.env.API_ENDPOINT_FAVORITES || "127.0.0.1:5000";
API_ENDPOINT_FAVORITES = API_ENDPOINT_FAVORITES.split(",")

const express = require("express");
const cors = require("cors")({ origin: true });
const cookieParser = require("cookie-parser");
const { json } = require("body-parser");

const PORT = process.env.PORT || 3001;

const app = express().use(cookieParser(), cors, json(), expressPino);

const axios = require('axios');

app.use(express.json());
app.use(express.urlencoded({ extended: false }));
app.use((err, req, res, next) => {
  log.error(err.stack)
  res.status(500).json({error: err.message, code: err.code})
  })


var favorites = {}

app.post("/api/favorites", (req, res) => {
  var randomIndex = Math.floor(Math.random() * API_ENDPOINT_FAVORITES.length);
  if (process.env.THROW_NOT_A_FUNCTION_ERROR == "true" && Math.random() < 0.5) {
    // randomly choose one of the endpoints
    axios.post('http://' + API_ENDPOINT_FAVORITES[randomIndex]  + '/favorites?user_id=1' , req.body)
    .then(function (response) {
      favorites = response.data
      // intentionally call the non-existent res.jsonn() to simulate a "not a function" error
      res.jsonn({ favorites: favorites });
    })
    .catch(function (error) {
      res.json({"error": error, favorites: []})
    });
  } else {
    axios.post('http://' + API_ENDPOINT_FAVORITES[randomIndex]  + '/favorites?user_id=1', req.body)
    .then(function (response) {
      favorites = response.data
      res.json({ favorites: favorites });
    })
    .catch(function (error) {
      res.json({"error": error, favorites: []})
    });
  }

});

app.listen(PORT, () => {
  console.log(`Server listening on ${PORT}`);
});

Step-by-step guide

Step 0. Log in to your Elastic Cloud account

This blog assumes you have an Elastic Cloud account — if not, follow the instructions to get started on Elastic Cloud.


Step 1. Install and initialize OpenTelemetry

As a first step, we’ll need to add some additional modules to our application. 

const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http');
const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express');

We start by creating a collectorOptions object with parameters such as the url and headers for connecting to the Elastic APM Server or OpenTelemetry collector. 

const collectorOptions = {
  url: OTEL_EXPORTER_OTLP_ENDPOINT,
  headers: OTEL_EXPORTER_OTLP_HEADERS
};

In order to pass additional parameters to OpenTelemetry, we will read the OTEL_RESOURCE_ATTRIBUTES variable and convert it into an object.

const envAttributes = process.env.OTEL_RESOURCE_ATTRIBUTES || '';

// Parse the environment variable string into an object
const attributes = envAttributes.split(',').reduce((acc, curr) => {
  const [key, value] = curr.split('=');
  if (key && value) {
    acc[key.trim()] = value.trim();
  }
  return acc;
}, {});
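As a quick sanity check, here is a small, self-contained sketch (with the same hypothetical values we pass to Docker later) showing what this parsing produces for a typical attribute string:

// Standalone sketch with hypothetical values (not part of the sample app)
const sample = 'service.name=node-server-otel-manual,service.version=1.0,deployment.environment=production';
const parsed = sample.split(',').reduce((acc, curr) => {
  const [key, value] = curr.split('=');
  if (key && value) {
    acc[key.trim()] = value.trim();
  }
  return acc;
}, {});

console.log(parsed);
// {
//   'service.name': 'node-server-otel-manual',
//   'service.version': '1.0',
//   'deployment.environment': 'production'
// }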

Next, we use these parameters to populate the resource configuration.

const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: attributes['service.name'] || 'node-server-otel-manual',
  [SemanticResourceAttributes.SERVICE_VERSION]: attributes['service.version'] || '1.0.0',
  [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: attributes['deployment.environment'] || 'production',
});

We then set up the trace provider using the previously created resource, followed by the exporter, which takes the collectorOptions from before. The trace provider will allow us to create spans later.

Additionally, we specify the use of the BatchSpanProcessor. A span processor is an interface that provides hooks for the span start and end method invocations.

In OpenTelemetry, different span processors are offered. The BatchSpanProcessor batches spans and sends them in bulk. Multiple span processors can be active at the same time by using the MultiSpanProcessor. See the OpenTelemetry documentation for more details.

We also added the resource module. This allows us to specify attributes such as service.name, service.version, and more. See the OpenTelemetry semantic conventions documentation for more details.

const tracerProvider = new NodeTracerProvider({
  resource: resource,
});

const exporter = new OTLPTraceExporter(collectorOptions);
tracerProvider.addSpanProcessor(new BatchSpanProcessor(exporter));
tracerProvider.register();
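If you want to verify locally what the SDK is exporting, you can optionally register a second span processor that prints spans to the console. This is a debugging sketch and is not part of the sample application; remove it before production, as the console exporter is verbose:

// Optional debugging sketch: print spans to the console in addition to sending them via OTLP
const { ConsoleSpanExporter, SimpleSpanProcessor } = require("@opentelemetry/sdk-trace-base");

tracerProvider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));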

Next, we are going to register some instrumentations. This will automatically instrument Express and HTTP for us. While it’s possible to do this step fully manually as well, it would be complex and time-consuming. This way, we ensure that any incoming and outgoing request is captured properly and that functionality such as distributed tracing works without any additional work.

registerInstrumentations({
  instrumentations: [
    new HttpInstrumentation(),
    new ExpressInstrumentation()
  ],
  tracerProvider: tracerProvider,
});

As a last step, we will now get an instance of the tracer that we can use to create custom spans.

const tracer = opentelemetry.trace.getTracer();

Step 2. Adding custom spans

Now that we have the modules added and initialized, we can add custom spans.

Our sample application has a POST request which calls a downstream service. If we want to have additional instrumentation for this part of our app, we simply wrap the function code with:

tracer.startActiveSpan('favorites', (span) => {...

The wrapped code is as follows:

app.post("/api/favorites", (req, res, next) => {
  tracer.startActiveSpan('favorites', (span) => {
    axios.post('http://' + API_ENDPOINT_FAVORITES + '/favorites?user_id=1', req.body)
      .then(function (response) {
        favorites = response.data
        span.end();
        res.json({ favorites: favorites });
      })
      .catch(next)
  }); 
});
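If you want the custom span to carry more context, you can also set attributes and an error status on it before ending it. The snippet below is an optional sketch rather than part of the sample application, and the attribute key is made up for illustration:

const { SpanStatusCode } = require("@opentelemetry/api");

app.post("/api/favorites", (req, res) => {
  tracer.startActiveSpan('favorites', (span) => {
    // hypothetical attribute for illustration; any key/value pair works
    span.setAttribute('app.user_id', '1');
    axios.post('http://' + API_ENDPOINT_FAVORITES + '/favorites?user_id=1', req.body)
      .then(function (response) {
        favorites = response.data
        span.end();
        res.json({ favorites: favorites });
      })
      .catch(function (error) {
        // record the failure on the span before ending it
        span.recordException(error);
        span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
        span.end();
        res.json({ "error": error.message, favorites: [] });
      });
  });
});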

Automatic error handling
For automatic error handling, we add an Express error-handling middleware that records the exception for any error that happens at runtime.

app.use((err, req, res, next) => {
  log.error(err.stack)
  const span = opentelemetry.trace.getActiveSpan()
  if (span) {
    span.recordException(err)
    span.end()
  }
  res.status(500).json({error: err.message, code: err.code})
})

Additional code
In addition to modules and span instrumentation, the sample application also checks some environment variables at startup. When sending data to Elastic without an OTel collector, the OTEL_EXPORTER_OTLP_HEADERS variable is required, as it contains the authentication. The same is true for OTEL_EXPORTER_OTLP_ENDPOINT, the host where we’ll send the telemetry data.

const OTEL_EXPORTER_OTLP_HEADERS = process.env.OTEL_EXPORTER_OTLP_HEADERS;
// error if secret token is not set
if (!OTEL_EXPORTER_OTLP_HEADERS) {
  throw new Error("OTEL_EXPORTER_OTLP_HEADERS environment variable is not set");
}

const OTEL_EXPORTER_OTLP_ENDPOINT = process.env.OTEL_EXPORTER_OTLP_ENDPOINT;
// error if server url is not set
if (!OTEL_EXPORTER_OTLP_ENDPOINT) {
  throw new Error("OTEL_EXPORTER_OTLP_ENDPOINT environment variable is not set");
}

Final code
For comparison, this is the instrumented code of our sample application. You can find the full source code in GitHub.

const pino = require('pino');
const ecsFormat = require('@elastic/ecs-pino-format')
const log = pino({ ...ecsFormat({ convertReqRes: true }) })
const expressPino = require('express-pino-logger')({ logger: log });

// Add OpenTelemetry packages
const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

const { registerInstrumentations } = require('@opentelemetry/instrumentation');

// Import OpenTelemetry instrumentations
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http');
const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express');


var API_ENDPOINT_FAVORITES = process.env.API_ENDPOINT_FAVORITES || "127.0.0.1:5000";
API_ENDPOINT_FAVORITES = API_ENDPOINT_FAVORITES.split(",")

const OTEL_EXPORTER_OTLP_HEADERS = process.env.OTEL_EXPORTER_OTLP_HEADERS;
// error if secret token is not set
if (!OTEL_EXPORTER_OTLP_HEADERS) {
  throw new Error("OTEL_EXPORTER_OTLP_HEADERS environment variable is not set");
}

const OTEL_EXPORTER_OTLP_ENDPOINT = process.env.OTEL_EXPORTER_OTLP_ENDPOINT;
// error if server url is not set
if (!OTEL_EXPORTER_OTLP_ENDPOINT) {
  throw new Error("OTEL_EXPORTER_OTLP_ENDPOINT environment variable is not set");
}

const collectorOptions = {
  // url is optional and can be omitted - default is http://localhost:4317
  // Unix domain sockets are also supported: 'unix:///path/to/socket.sock'
  url: OTEL_EXPORTER_OTLP_ENDPOINT,
  headers: OTEL_EXPORTER_OTLP_HEADERS
};

const envAttributes = process.env.OTEL_RESOURCE_ATTRIBUTES || '';

// Parse the environment variable string into an object
const attributes = envAttributes.split(',').reduce((acc, curr) => {
  const [key, value] = curr.split('=');
  if (key && value) {
    acc[key.trim()] = value.trim();
  }
  return acc;
}, {});

// Create and configure the resource object
const resource = new Resource({
  [SemanticResourceAttributes.SERVICE_NAME]: attributes['service.name'] || 'node-server-otel-manual',
  [SemanticResourceAttributes.SERVICE_VERSION]: attributes['service.version'] || '1.0.0',
  [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: attributes['deployment.environment'] || 'production',
});

// Create and configure the tracer provider
const tracerProvider = new NodeTracerProvider({
  resource: resource,
});
const exporter = new OTLPTraceExporter(collectorOptions);
tracerProvider.addSpanProcessor(new BatchSpanProcessor(exporter));
tracerProvider.register();

//Register instrumentations
registerInstrumentations({
  instrumentations: [
    new HttpInstrumentation(),
    new ExpressInstrumentation()
  ],
  tracerProvider: tracerProvider,
});

const express = require("express");
const cors = require("cors")({ origin: true });
const cookieParser = require("cookie-parser");
const { json } = require("body-parser");

const PORT = process.env.PORT || 3001;

const app = express().use(cookieParser(), cors, json(), expressPino);

const axios = require('axios');

app.use(express.json());
app.use(express.urlencoded({ extended: false }));
app.use((err, req, res, next) => {
  log.error(err.stack)
  const span = opentelemetry.trace.getActiveSpan()
  if (span) {
    span.recordException(err)
    span.end()
  }
  res.status(500).json({error: err.message, code: err.code})
})

const tracer = opentelemetry.trace.getTracer();


var favorites = {}

app.post("/api/favorites", (req, res, next) => {
  tracer.startActiveSpan('favorites', (span) => {
    var randomIndex = Math.floor(Math.random() * API_ENDPOINT_FAVORITES.length);

    if (process.env.THROW_NOT_A_FUNCTION_ERROR == "true" && Math.random() < 0.5) {
      // randomly choose one of the endpoints
      axios.post('http://' + API_ENDPOINT_FAVORITES[randomIndex] + '/favorites?user_id=1', req.body)
        .then(function (response) {
          favorites = response.data
          // intentionally call the non-existent res.jsonn() to simulate a "not a function" error
          span.end();
          res.jsonn({ favorites: favorites });
        })
        .catch(next)
    } else {
      axios.post('http://' + API_ENDPOINT_FAVORITES[randomIndex] + '/favorites?user_id=1', req.body)
        .then(function (response) {
          favorites = response.data
          span.end();
          res.json({ favorites: favorites });
        })
        .catch(next)
    }
  }); 
});

app.listen(PORT, () => {
  log.info(`Server listening on ${PORT}`);
});

Step 3. Running the Docker image with environment variables

We will use environment variables to pass in the configuration values that enable the application to connect with Elastic Observability’s APM server.

Because Elastic accepts OTLP natively, we just need to provide the endpoint and authentication that the OTel exporter uses to send the data, as well as some other environment variables.

Getting Elastic Cloud variables
You can copy the endpoints and token from Kibana® under the path `/app/home#/tutorial/apm`.


You will need to copy the following environment variables:

OTEL_EXPORTER_OTLP_ENDPOINT
OTEL_EXPORTER_OTLP_HEADERS
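For reference, these typically look like the following (the values below are placeholders; copy the real endpoint and secret token from the Kibana APM tutorial page):

# Placeholder values; replace with the endpoint and secret token from your deployment
OTEL_EXPORTER_OTLP_ENDPOINT="https://my-deployment.apm.us-central1.gcp.cloud.es.io:443"
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <your-secret-token>"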

Build the image

docker build -t node-otel-manual-image .

Run the image

docker run \
       -e OTEL_EXPORTER_OTLP_ENDPOINT="<REPLACE WITH OTEL_EXPORTER_OTLP_ENDPOINT>" \
       -e OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <REPLACE WITH TOKEN>" \
       -e OTEL_RESOURCE_ATTRIBUTES="service.version=1.0,deployment.environment=production,service.name=node-server-otel-manual" \
       -p 3001:3001 \
       node-otel-manual-image

You can now issue a few requests in order to generate trace data. Note that these requests are expected to return an error, as this service relies on some downstream services that you may not have running on your machine.

curl localhost:3001/api/login
curl localhost:3001/api/favorites

# or alternatively issue a request every second

while true; do curl "localhost:3001/api/favorites"; sleep 1; done;

Step 4. Explore in Elastic APM

Now that the service is instrumented, open the transactions section of your Node.js service in Elastic APM to see the incoming traces.


Notice how this mirrors the auto-instrumented version.


Is it worth it?

This is the million-dollar question. Depending on the level of detail you need, manual instrumentation may be necessary. It lets you add custom spans, custom labels, and metrics where you want or need them. It allows you to get a level of detail that otherwise would not be possible and that is often important for tracking business-specific KPIs.

Your operations, and whether you need to troubleshoot or analyze the performance of specific parts of the code, will dictate when and what to instrument. But it’s helpful to know that you have the option to manually instrument.

You may have noticed that we didn’t instrument metrics yet; that is a topic for another blog. We covered logs in a previous blog.

Conclusion

In this blog, we discussed the following:

  • How to manually instrument Node.js with OpenTelemetry
  • The different modules needed when using Express
  • How to properly initialize and instrument spans
  • How to easily set the OTLP ENDPOINT and OTLP HEADERS from Elastic without the need for a collector

Hopefully, this provides an easy-to-understand walk-through of instrumenting Node.js with OpenTelemetry and how easy it is to send traces into Elastic.

Don’t have an Elastic Cloud account yet? Sign up for Elastic Cloud and try out the manual instrumentation approach that I discussed above. I would be interested in getting your feedback about your experience in gaining visibility into your application stack with Elastic.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.