Getting started with OpenTelemetry instrumentation with a sample application


Application performance management (APM) has moved beyond traditional monitoring to become an essential tool for developers, offering deep insights into applications at the code level. With APM, teams can not only detect issues but also understand their root causes, optimizing software performance and end-user experiences. The modern landscape presents a wide range of APM tools and companies offering different solutions. Additionally, OpenTelemetry is becoming the open ingestion standard for APM. With OpenTelemetry, DevOps teams have a consistent approach to collecting and ingesting telemetry data. 

Elastic® offers its own APM Agents, which can be used to instrument your code. In addition, Elastic also supports OpenTelemetry natively.

Navigating the differences and understanding how to instrument applications with these tools can be challenging. That's where our sample application, Elastiflix, a UI for movie search, comes into play. We've crafted it to demonstrate the nuances of both OTel and Elastic APM, guiding you through the process of APM instrumentation and showing how you can use one or the other, depending on your preference.

The sample application

We deliberately kept the movie search UI really simple. It displays some movies, has a search bar, and, at the time of writing, only one real functionality: you can add a movie to your list of favorites.


Services, languages, and instrumentation

Our application has a few different services: 

  • javascript-frontend: A React frontend, talking to the node service and Elasticsearch®
  • node-server: Node backend, talking to other backend services
  • dotnet-login: A login service that returns a random username

Rather than adding more distinct services, which would have introduced additional complexity to the architecture of the application, we reimplemented the “favorite” service in a few different languages:

  • go-favorite: A Go service that stores a list of favorite movies in Redis
  • java-favorite: A Java service that stores a list of favorite movies in Redis
  • python-favorite: A Python service that stores a list of favorite movies in Redis
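All three implementations follow the same basic pattern: keep a per-user collection of favorite movie IDs in Redis. A minimal Python sketch of that core idea follows; the function names, key layout, and the in-memory stand-in are illustrative, not taken from the repository.

```python
class InMemoryRedis:
    """Tiny stand-in for a Redis client so the sketch runs without Redis.
    A real service would use a client such as redis-py instead."""

    def __init__(self):
        self._sets = {}

    def sismember(self, key, member):
        return member in self._sets.get(key, set())

    def sadd(self, key, member):
        self._sets.setdefault(key, set()).add(member)

    def srem(self, key, member):
        self._sets.get(key, set()).discard(member)

    def smembers(self, key):
        return set(self._sets.get(key, set()))


def toggle_favorite(client, user_id, movie_id):
    """Add the movie to the user's favorites, or remove it if already there.
    Returns True if the movie is now a favorite."""
    key = f"user:{user_id}:favorites"
    if client.sismember(key, movie_id):
        client.srem(key, movie_id)
        return False
    client.sadd(key, movie_id)
    return True


def list_favorites(client, user_id):
    """Return the set of the user's favorite movie IDs."""
    return client.smembers(f"user:{user_id}:favorites")
```

Because the sketch only relies on Redis set commands (SISMEMBER, SADD, SREM, SMEMBERS), the same logic maps directly onto any real Redis client, whichever language the service is written in.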

In addition, there are a few other supporting containers:

  • movie-data-loader: Loads the movie database into your Elasticsearch cluster
  • redis: Used as a datastore for keeping track of the user’s favorites
  • locust: A load generator that talks to the node service to introduce artificial load

The main difference compared to some other sample application repositories is that we’ve coded it in several languages, with each language version showcasing almost all possible types of instrumentation.


Why this approach?

While sample applications provide good insight into how tools work, they often showcase only one version, leaving developers to find all of the necessary modifications themselves. We've taken a different approach. By offering multiple versions, we intend to bridge the knowledge gap, making it straightforward for developers to see and comprehend the transition process from non-instrumented code to either Elastic or OTEL instrumented versions.

Instead of simply starting the already instrumented version, you can instrument the base version yourself, by following some of our other blogs. This will teach you much more than just looking at an already built version.

Prerequisites

Before starting the sample application, ensure you've set up your Elastic deployment details. Populate the .env file (located in the same directory as the compose files) with the necessary credentials. You can copy these from the Cloud UI and from within Kibana® under the path /app/home#/tutorial/apm.

ELASTIC_APM_SERVER_URL="https://foobar.apm.us-central1.gcp.cloud.es.io"
ELASTIC_APM_SECRET_TOKEN="secret123"
ELASTICSEARCH_USERNAME="elastic"
ELASTICSEARCH_PASSWORD="changeme"
ELASTICSEARCH_URL="https://foobar.es.us-central1.gcp.cloud.es.io"
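A missing or empty key in the .env file tends to surface later as hard-to-diagnose connection errors inside the containers. As an optional sanity check, here is a small stdlib-only Python sketch (the key names match the example above; the helper itself is not part of the repository) that parses a .env file and reports any required keys that are absent or empty:

```python
# Required by the compose files, per the .env example above.
REQUIRED_KEYS = {
    "ELASTIC_APM_SERVER_URL",
    "ELASTIC_APM_SECRET_TOKEN",
    "ELASTICSEARCH_USERNAME",
    "ELASTICSEARCH_PASSWORD",
    "ELASTICSEARCH_URL",
}


def parse_env(text):
    """Parse simple KEY="value" lines, ignoring blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def missing_keys(text):
    """Return the required keys that are absent or have an empty value."""
    env = parse_env(text)
    return {key for key in REQUIRED_KEYS if not env.get(key)}
```

Running missing_keys() over the contents of your .env before starting the stack tells you immediately which credentials still need to be filled in.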

Starting the application

You can start our sample app in three distinct ways, each corresponding to a different instrumentation scenario.

We provide public Docker images that you can use when you supply the --no-build flag. Otherwise the images will be built from source on your machine, which will take around 5–10 minutes.

1. Non-instrumented version

cd Elastiflix
docker-compose -f docker-compose.yml up -d --no-build

2. Elastic instrumented version

cd Elastiflix
docker-compose -f docker-compose-elastic.yml up -d --no-build

3. OpenTelemetry instrumented version

cd Elastiflix
docker-compose -f docker-compose-elastic-otel.yml up -d --no-build

After launching the desired version, explore the application at localhost:9000. We also deploy a load generator on localhost:8089, where you can increase the number of concurrent users. Note that the load generator talks directly to the node backend service. If you want to generate RUM data from the JavaScript frontend, you have to manually browse to localhost:9000 and visit a few pages.

Simulation and failure scenarios

In the real world, applications are subject to varying conditions, random bugs, and misconfigurations. We've incorporated some of these to mimic potential real-life situations. You can find a list of possible environment variables here.

Non-instrumented scenarios

# healthy
docker-compose -f docker-compose.yml up -d

# pause redis for 5 seconds, every 30 seconds
TOGGLE_CLIENT_PAUSE=true docker-compose -f docker-compose.yml up -d

# add a 100ms artificial delay to the python service, and delay 50% of requests by a further 1000ms
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 docker-compose -f docker-compose.yml up -d

# add a 100ms artificial delay to the python service, delay 50% of requests by a further 1000ms, and fail 20% of those
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 TOGGLE_CANARY_FAILURE=0.2 docker-compose -f docker-compose.yml up -d

# throw error in nodejs service, 50% of the time
THROW_NOT_A_FUNCTION_ERROR=true docker-compose -f docker-compose.yml up -d 
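Conceptually, TOGGLE_SERVICE_DELAY adds a fixed delay to every request, TOGGLE_CANARY_DELAY delays roughly half of the requests by an extra amount, and TOGGLE_CANARY_FAILURE fails a fraction of those delayed requests. A simplified Python sketch of that behavior follows; each service implements this in its own language, and the function here is illustrative only:

```python
import random


def simulate_request(service_delay_ms=0, canary_delay_ms=0,
                     canary_failure=0.0, rng=random):
    """Return (total_delay_ms, failed) for one simulated request.

    service_delay_ms: fixed delay applied to every request
    canary_delay_ms:  extra delay applied to roughly 50% of requests
    canary_failure:   probability that a delayed ("canary") request fails
    """
    delay = service_delay_ms
    failed = False
    if canary_delay_ms and rng.random() < 0.5:  # ~50% of requests are canaries
        delay += canary_delay_ms
        failed = rng.random() < canary_failure  # a fraction of canaries fail
    return delay, failed
```

With all toggles unset the function is a no-op, mirroring the healthy scenario; setting the toggles produces the bimodal latency and partial failures you will later see in the APM latency histograms.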

Elastic instrumented scenarios

# healthy
docker-compose -f docker-compose-elastic.yml up -d

# pause redis for 5 seconds, every 30 seconds
TOGGLE_CLIENT_PAUSE=true docker-compose -f docker-compose-elastic.yml up -d 

# add a 100ms artificial delay to the python service, and delay 50% of requests by a further 1000ms
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 docker-compose -f docker-compose-elastic.yml up -d

# add a 100ms artificial delay to the python service, delay 50% of requests by a further 1000ms, and fail 20% of those
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 TOGGLE_CANARY_FAILURE=0.2 docker-compose -f docker-compose-elastic.yml up -d

# throw error in nodejs service, 50% of the time
THROW_NOT_A_FUNCTION_ERROR=true docker-compose -f docker-compose-elastic.yml up -d 

OpenTelemetry instrumented scenarios

# healthy
docker-compose -f docker-compose-elastic-otel.yml up -d

# pause redis for 5 seconds, every 30 seconds
TOGGLE_CLIENT_PAUSE=true docker-compose -f docker-compose-elastic-otel.yml up -d 

# add a 100ms artificial delay to the python service, and delay 50% of requests by a further 1000ms
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 docker-compose -f docker-compose-elastic-otel.yml up -d

# add a 100ms artificial delay to the python service, delay 50% of requests by a further 1000ms, and fail 20% of those
TOGGLE_SERVICE_DELAY=100 TOGGLE_CANARY_DELAY=1000 TOGGLE_CANARY_FAILURE=0.2 docker-compose -f docker-compose-elastic-otel.yml up -d


# throw error in nodejs service, 50% of the time
THROW_NOT_A_FUNCTION_ERROR=true docker-compose -f docker-compose-elastic-otel.yml up -d 

Mix Elastic and OTel

Since the repository contains the services in all possible permutations, with the “favorite” service even written in multiple languages, you can also run them in a mixed mode.

You can also run some of them in parallel, like we do for the “favorite” service.

Elastic and OTel are fully compatible, so you could run some services instrumented with OTel while others are running with the Elastic APM Agent.

Take a look at the existing compose files and simply copy one of the snippets for each service type, for example:

  favorite-java-otel-auto:
    build: java-favorite-otel-auto/.
    image: docker.elastic.co/demos/workshop/observability/elastiflix-java-favorite-otel-auto:${ELASTIC_VERSION}-${BUILD_NUMBER}
    depends_on:
      - redis
    networks:
      - app-network
    ports:
      - "5004:5000"
    environment:
      - ELASTIC_APM_SECRET_TOKEN=${ELASTIC_APM_SECRET_TOKEN}
      - OTEL_EXPORTER_OTLP_ENDPOINT=${ELASTIC_APM_SERVER_URL}
      - OTEL_METRICS_EXPORTER=otlp
      - OTEL_RESOURCE_ATTRIBUTES=service.version=1.0,deployment.environment=production
      - OTEL_SERVICE_NAME=java-favorite-otel-auto
      - OTEL_TRACES_EXPORTER=otlp
      - REDIS_HOST=redis
      - TOGGLE_SERVICE_DELAY=${TOGGLE_SERVICE_DELAY}
      - TOGGLE_CANARY_DELAY=${TOGGLE_CANARY_DELAY}
      - TOGGLE_CANARY_FAILURE=${TOGGLE_CANARY_FAILURE}

Working with the source code

The repository contains all possible permutations of the services. 

  • Subdirectories are named in the format $language-$serviceName-(elastic|otel)-(auto|manual). As an example, python-favorite-otel-auto is a Python service; its name is “favorite,” and it’s instrumented with OpenTelemetry, using auto-instrumentation.
  • You can now compare this directory to the non-instrumented version of this service available under the directory python-favorite.
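The naming scheme is regular enough to be parsed mechanically. As a quick illustration of the convention (this helper is not part of the repository), a few lines of Python can split any directory name into its components:

```python
def parse_service_dir(name):
    """Split a directory name like 'python-favorite-otel-auto' into its parts.
    Non-instrumented directories such as 'python-favorite' have no vendor/mode."""
    parts = name.split("-")
    info = {"language": parts[0], "service": parts[1],
            "vendor": None, "mode": None}
    if len(parts) > 2 and parts[2] in ("elastic", "otel"):
        info["vendor"] = parts[2]
        if len(parts) > 3 and parts[3] in ("auto", "manual"):
            info["mode"] = parts[3]
    return info
```

Reading a directory name through this lens tells you at a glance which instrumentation variant you are looking at, and which non-instrumented sibling to diff it against.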

This allows you to easily see the difference between the two. In addition, you can also start from scratch with the non-instrumented version and try to instrument it yourself.

Conclusion

Monitoring is more than just observing; it's about understanding and optimizing. Our sample application seeks to guide you on your journey with Elastic APM or OpenTelemetry, providing you with the tools to build resilient and high-performing applications.

Don’t have an Elastic Cloud account yet? Sign up for Elastic Cloud and try out the auto-instrumentation capabilities that I discussed above. I would be interested in getting your feedback about your experience in gaining visibility into your application stack with Elastic.  

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.