Auto-instrumentation of Java applications with OpenTelemetry


In the fast-paced universe of software development, especially in the cloud-native realm, DevOps and SRE teams are increasingly emerging as essential partners in application stability and growth.

DevOps engineers continuously optimize software delivery, while SRE teams act as the stewards of application reliability, scalability, and top-tier performance. The challenge? These teams need an observability solution with full-stack insights, one that lets them rapidly detect, monitor, and fix potential disruptions before they escalate into operational problems.

Observability in our modern distributed software ecosystem goes beyond mere monitoring — it demands limitless data collection, precision in processing, and the correlation of this data into actionable insights. However, the road to achieving this holistic view is paved with obstacles, from navigating version incompatibilities to wrestling with restrictive proprietary code.

Enter OpenTelemetry (OTel), with the following benefits for those who adopt it:

  • Escape vendor constraints with OTel, freeing yourself from vendor lock-in and ensuring top-notch observability.
  • See the harmony of unified logs, metrics, and traces come together to provide a complete system view.
  • Improve your application oversight through richer instrumentation.
  • Embrace the benefits of backward compatibility to protect your prior instrumentation investments.
  • Embark on the OpenTelemetry journey with an easy learning curve, simplifying onboarding and scalability.
  • Rely on a proven, future-ready standard to boost your confidence in every investment.

In this blog, we will explore how you can use automatic instrumentation in your Java application using Docker, without the need to refactor any part of your application code. We will use an application called Elastiflix, which helps highlight auto-instrumentation in a simple way.

Application, prerequisites, and config

The application that we use for this blog is called Elastiflix, a movie-streaming application. It consists of several microservices written in .NET, NodeJS, Go, Python, and Java; this blog focuses on the Java service.

Before we instrument our sample application, we will first need to understand how Elastic can receive the telemetry data.

Elastic configuration options for OpenTelemetry

All of Elastic Observability’s APM capabilities are available with OTel data. Some of these include:

  • Service maps
  • Service details (latency, throughput, failed transactions)
  • Dependencies between services, distributed tracing
  • Transactions (traces)
  • Machine learning (ML) correlations
  • Log correlation

In addition to Elastic’s APM and a unified view of the telemetry data, you will also be able to use Elastic’s machine learning capabilities to reduce analysis time, and alerting to help reduce mean time to resolution (MTTR).

Prerequisites

View the example source code

The full source code, including the Dockerfile used in this blog, can be found on GitHub. The repository also contains the same application without instrumentation. This allows you to compare each file and see the differences.

The following steps will show you how to instrument this application and run it on the command line or in Docker. If you are interested in a more complete OTel example, take a look at the docker-compose file here, which will bring up the full project.

Step-by-step guide

Step 0. Log in to your Elastic Cloud account

This blog assumes you have an Elastic Cloud account — if not, follow the instructions to get started on Elastic Cloud.


Step 1. Configure auto-instrumentation for the Java service

We are going to use automatic instrumentation with the Java service from the Elastiflix demo application.

We will be using the following service from Elastiflix:

Elastiflix/java-favorite-otel-auto

Per the OpenTelemetry Automatic Instrumentation for Java documentation, you only need to download the OpenTelemetry Java agent JAR and attach it to the JVM; no changes to the application code are needed.

Create a local OTel directory to download the OpenTelemetry Java agent. Download opentelemetry-javaagent.jar.

mkdir /otel

curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar --output /otel/opentelemetry-javaagent.jar

If you are going to run the service on the command line, then you can use the following command:

java -javaagent:/otel/opentelemetry-javaagent.jar \
-jar /usr/src/app/target/favorite-0.0.1-SNAPSHOT.jar --server.port=5000

For our application, we will do this as part of the Dockerfile.

Dockerfile

# Start with a base image containing the Java runtime
FROM maven:3.8.2-openjdk-17-slim as build

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Change to the app directory
WORKDIR /usr/src/app

# Copy the local code to the container
COPY . .

# Build the application
RUN mvn clean install

USER root
RUN apt-get update && apt-get install -y zip curl
RUN mkdir /otel
RUN curl -L -o /otel/opentelemetry-javaagent.jar https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.28.0/opentelemetry-javaagent.jar

COPY start.sh /start.sh
RUN chmod +x /start.sh

ENTRYPOINT ["/start.sh"]

Step 2. Running the Docker Image with environment variables

As specified in the OTEL Java documentation, we will use environment variables to pass in the configuration values that let the agent connect to Elastic Observability’s APM server.

Because Elastic accepts OTLP natively, we just need to provide the endpoint the OTEL exporter should send data to and the authentication it should use, along with a few other environment variables.

Getting Elastic Cloud variables
You can copy the endpoints and token from Kibana under the path `/app/home#/tutorial/apm`.


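You will need the values of the following environment variables (the secret token passed to docker run below is the part of OTEL_EXPORTER_OTLP_HEADERS that follows "Authorization=Bearer "):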

OTEL_EXPORTER_OTLP_ENDPOINT
OTEL_EXPORTER_OTLP_HEADERS

Build the Docker image

docker build -t java-otel-auto-image .

Run the Docker image

docker run \
       -e OTEL_EXPORTER_OTLP_ENDPOINT="REPLACE WITH OTEL_EXPORTER_OTLP_ENDPOINT" \
       -e ELASTIC_APM_SECRET_TOKEN="REPLACE WITH THE BIT AFTER Authorization=Bearer " \
       -e OTEL_RESOURCE_ATTRIBUTES="service.version=1.0,deployment.environment=production" \
       -e OTEL_SERVICE_NAME="java-favorite-otel-auto" \
       -p 5000:5000 \
       java-otel-auto-image

You can now issue a few requests in order to generate trace data. Note that these requests are expected to return an error, as this service relies on a connection to Redis that you don’t currently have running. As mentioned before, you can find a more complete example using docker-compose here.

curl localhost:5000/favorites

# or alternatively issue a request every second

while true; do curl "localhost:5000/favorites"; sleep 1; done;

Step 3. Explore traces and logs in Elastic APM

Once you have this up and running, you can ping the endpoint for your instrumented service (in our case, this is /favorites), and you should see the app appear in the Services view of Elastic APM.

Elastic APM will begin by tracking throughput and latency, two critical metrics for SREs to pay attention to.

Digging in, we can see an overview of all our transactions and then look at specific transactions in detail.

Click on Logs, and we see that logs are also brought over. The OTel agent will automatically bring in logs and correlate them with traces for you.

This gives you complete visibility across logs, metrics, and traces!

Basic concepts: How APM works with Java

Before we continue, let's first understand a few basic concepts and terms.

  • Java Agent: This is a tool that can be used to instrument (or modify) the bytecode of class files in the Java Virtual Machine (JVM). Java agents are used for many purposes like performance monitoring, logging, security, and more.
  • Bytecode: This is the intermediary code generated by the Java compiler from your Java source code. This code is interpreted or compiled on the fly by the JVM to produce machine code that can be executed.
  • Byte Buddy: Byte Buddy is a code generation and manipulation library for Java. It is used to create, modify, or adapt Java classes at runtime. In the context of a Java Agent, Byte Buddy provides a powerful and flexible way to modify bytecode. Both the Elastic APM Agent and the OpenTelemetry Agent use Byte Buddy under the covers.

Now, let's talk about how automatic instrumentation works with Byte Buddy:

Automatic instrumentation is the process by which an agent modifies the bytecode of your application's classes, often to insert monitoring code. The agent doesn't modify the source code directly, but rather the bytecode that is loaded into the JVM. This is done while the JVM is loading the classes, so the modifications are in effect during runtime.
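To make the class-loading hook concrete, here is a minimal sketch that uses only the JDK's java.lang.instrument API rather than Byte Buddy. It is not how the OpenTelemetry agent is implemented; the class name LoggingAgent, the ObservingTransformer helper, and the "com/example/" package filter are all hypothetical and chosen for illustration. The transformer only observes which classes are loaded and leaves their bytecode untouched.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical agent class; the name and the package filter are illustrative only. */
class LoggingAgent {

    /** Names of the classes that passed through the transformer. */
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    /** A transformer that observes class loading but does not change any bytecode. */
    static class ObservingTransformer implements ClassFileTransformer {
        private final String prefix;

        ObservingTransformer(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // className uses the internal form, e.g. "com/example/FavoriteService".
            if (className != null && className.startsWith(prefix)) {
                SEEN.add(className);
            }
            // Returning null tells the JVM that no transformation was performed.
            return null;
        }
    }

    /** Invoked by the JVM before main() when the JVM is started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ObservingTransformer("com/example/"));
    }
}
```

For the JVM to find premain, the agent JAR's manifest has to name the class in its Premain-Class attribute. A real instrumentation agent would return modified bytecode from transform instead of null, which is the part Byte Buddy makes easier.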

Here's a simplified explanation of the process:

  1. Start the JVM with the agent: When starting your Java application, you specify the Java agent with the -javaagent command line option. This instructs the JVM to load your agent before the main method of your application is invoked. At this point, the agent has the opportunity to set up class transformers.

  2. Register a class file transformer with Byte Buddy: Your agent uses Byte Buddy to build a class file transformer and registers it with the JVM's instrumentation API. A transformer is a piece of code that is invoked every time a class is loaded into the JVM. This transformer receives the bytecode of the class, and it can modify this bytecode before the class is actually used.

  3. Transform the bytecode: When your transformer is invoked, it will use Byte Buddy's API to modify the bytecode. Byte Buddy allows you to specify your transformations in a high-level, expressive way rather than manually writing complex bytecode. For example, you could specify a certain class and method within that class that you want to instrument and provide an "interceptor" that will add new behavior to that method.

    1. For instance, let's say you want to measure the execution time of a method. You would instruct Byte Buddy to target the specific class and method and then provide an interceptor that wraps the method call with timing code. Every time this method is invoked, your interceptor is called first and measures the start time, then it calls the original method, and finally it measures the end time and prints the duration.

  4. Use the transformed classes: Once the agent has set up its transformers, the JVM continues to load classes as usual. Each time a class is loaded, your transformers are invoked, allowing them to modify the bytecode. Your application then uses these transformed classes as if they were the original ones, but they now have the extra behavior that you've injected through your interceptor.
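The timing interceptor described in step 3 can be sketched as follows. Note the swap: to keep the example free of external dependencies, it uses a JDK dynamic proxy (java.lang.reflect.Proxy) instead of Byte Buddy, so it wraps an object at the interface level rather than rewriting bytecode at class-load time. The shape of the interceptor (record the start time, call the original method, record the end time) is the same. FavoriteService, TimingInterceptor, and TimingDemo are hypothetical names, not part of Elastiflix or OpenTelemetry.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical service interface standing in for application code. */
interface FavoriteService {
    String favoritesFor(String userId);
}

/** Interceptor that wraps every call to the target with timing code. */
class TimingInterceptor implements InvocationHandler {
    private final Object target;
    /** Names of the methods that were timed, in call order. */
    final List<String> timedMethods = new ArrayList<>();

    TimingInterceptor(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime();           // measure the start time
        try {
            return method.invoke(target, args);   // call the original method
        } catch (InvocationTargetException e) {
            throw e.getCause();                   // rethrow the original exception
        } finally {
            long elapsed = System.nanoTime() - start;  // measure the end time
            timedMethods.add(method.getName());
            System.out.println(method.getName() + " took " + elapsed + " ns");
        }
    }

    /** Returns a proxy that behaves like target but times every call. */
    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                new TimingInterceptor(target));
    }
}

class TimingDemo {
    public static void main(String[] args) {
        FavoriteService real = userId -> "favorites:" + userId;
        FavoriteService timed = TimingInterceptor.wrap(real, FavoriteService.class);
        System.out.println(timed.favoritesFor("42"));
    }
}
```

The key difference from an agent is that the caller has to ask for the wrapped object explicitly here, whereas bytecode instrumentation changes the class itself, so existing code picks up the new behavior without any modification.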


In essence, automatic instrumentation with Byte Buddy is about modifying the behavior of your Java classes at runtime, without needing to alter the source code directly. This is especially useful for cross-cutting concerns like logging, monitoring, or security, as it allows you to centralize this code in your Java Agent, rather than scattering it throughout your application.

Summary

With this Dockerfile, you've transformed your simple Java application into one that's automatically instrumented with OpenTelemetry. This will aid greatly in understanding application performance, tracing errors, and gaining insights into how users interact with your software.

Remember, observability is a crucial aspect of modern application development, especially in distributed systems. With tools like OpenTelemetry, understanding complex systems becomes a tad bit easier.

In this blog, we discussed the following:

  • How to auto-instrument Java with OpenTelemetry.
  • Using standard commands in a Dockerfile, auto-instrumentation can be done efficiently and without adding code in multiple places, which keeps the application manageable.
  • Using OpenTelemetry and its support for multiple languages, DevOps and SRE teams can auto-instrument their applications with ease, gain immediate insights into the health of the entire application stack, and reduce mean time to resolution (MTTR).

Since Elastic can support a mix of methods for ingesting data, whether it be using auto-instrumentation of open-source OpenTelemetry or manual instrumentation with its native APM agents, you can plan your migration to OTel by focusing on a few applications first and then using OpenTelemetry across your applications later on in a manner that best fits your business needs.

Don’t have an Elastic Cloud account yet? Sign up for Elastic Cloud and try out the auto-instrumentation capabilities that I discussed above. I would be interested in getting your feedback about your experience in gaining visibility into your application stack with Elastic.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.