Auto-instrumentation of Java applications with OpenTelemetry
In the fast-paced universe of software development, especially in the cloud-native realm, DevOps and SRE teams are increasingly emerging as essential partners in application stability and growth.
DevOps engineers continuously optimize software delivery, while SRE teams act as the stewards of application reliability, scalability, and performance. The challenge? These teams require an observability solution that provides full-stack insights, so they can rapidly manage, monitor, and rectify potential disruptions before they escalate into operational problems.
Observability in our modern distributed software ecosystem goes beyond mere monitoring — it demands limitless data collection, precision in processing, and the correlation of this data into actionable insights. However, the road to achieving this holistic view is paved with obstacles, from navigating version incompatibilities to wrestling with restrictive proprietary code.
Enter OpenTelemetry (OTel), with the following benefits for those who adopt it:
- Escape vendor lock-in while keeping top-notch observability.
- Get a complete view of your system from unified logs, metrics, and traces.
- Improve your application oversight through richer instrumentation.
- Benefit from backward compatibility, which protects your prior instrumentation investments.
- Start the OpenTelemetry journey with an easy learning curve that simplifies onboarding and scaling.
- Rely on a proven, future-ready standard to boost your confidence in every investment.
In this blog, we will explore how you can use automatic instrumentation in your Java application using Docker, without the need to refactor any part of your application code. We will use an application called Elastiflix, which helps highlight auto-instrumentation in a simple way.
Application, prerequisites, and config
The application that we use for this blog is called Elastiflix, a movie-streaming application. It consists of several microservices written in .NET, Node.js, Go, Python, and Java.
Before we instrument our sample application, we will first need to understand how Elastic can receive the telemetry data.
All of Elastic Observability’s APM capabilities are available with OTel data. Some of these include:
- Service maps
- Service details (latency, throughput, failed transactions)
- Dependencies between services, distributed tracing
- Transactions (traces)
- Machine learning (ML) correlations
- Log correlation
In addition to Elastic’s APM and a unified view of the telemetry data, you will also be able to use Elastic’s powerful machine learning capabilities to reduce analysis effort, and alerting to help reduce MTTR.
View the example source code
The full source code, including the Dockerfile used in this blog, can be found on GitHub. The repository also contains the same application without instrumentation. This allows you to compare each file and see the differences.
The following steps will show you how to instrument this application and run it on the command line or in Docker. If you are interested in a more complete OTel example, take a look at the docker-compose file here, which will bring up the full project.
Step 1. Configure auto-instrumentation for the Java service
We are going to use automatic instrumentation with the Java favorite service from the Elastiflix demo application.
Per the OpenTelemetry Automatic Instrumentation for Java documentation, you simply need to install the appropriate Java package, which is the OpenTelemetry Java agent.
Create a local otel directory and download opentelemetry-javaagent.jar into it:

    mkdir /otel
    curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar --output /otel/opentelemetry-javaagent.jar
If you are going to run the service on the command line, then you can use the following command:
    java -javaagent:/otel/opentelemetry-javaagent.jar \
        -jar /usr/src/app/target/favorite-0.0.1-SNAPSHOT.jar --server.port=5000
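For our application, we will do this as part of the Dockerfile. The Dockerfile downloads the agent into the image and hands control to a start.sh entry point script (not shown here), which is where the -javaagent option above would be passed to the JVM.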
    # Start with a base image containing the Java runtime
    FROM maven:3.8.2-openjdk-17-slim as build

    # Make port 5000 available to the world outside this container
    EXPOSE 5000

    # Change to the app directory
    WORKDIR /usr/src/app

    # Copy the local code to the container
    COPY . .

    # Build the application
    RUN mvn clean install

    USER root
    RUN apt-get update && apt-get install -y zip curl

    # Download the OpenTelemetry Java agent
    RUN mkdir /otel
    RUN curl -L -o /otel/opentelemetry-javaagent.jar https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.28.0/opentelemetry-javaagent.jar

    COPY start.sh /start.sh
    RUN chmod +x /start.sh
    ENTRYPOINT ["/start.sh"]
Step 2. Run the Docker image with environment variables
Because Elastic accepts OTLP natively, we only need to tell the OTel exporter where to send the data (the endpoint) and how to authenticate (the headers), along with a few other environment variables.
Getting Elastic Cloud variables
You can copy the endpoints and token from Kibana under the path `/app/home#/tutorial/apm`.
You will need to copy the following environment variables: `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS`.
Build the Docker image
docker build -t java-otel-auto-image .
Run the Docker image
    docker run \
        -e OTEL_EXPORTER_OTLP_ENDPOINT="REPLACE WITH OTEL_EXPORTER_OTLP_ENDPOINT" \
        -e OTEL_EXPORTER_OTLP_HEADERS="REPLACE WITH OTEL_EXPORTER_OTLP_HEADERS" \
        -e OTEL_RESOURCE_ATTRIBUTES="service.version=1.0,deployment.environment=production" \
        -e OTEL_SERVICE_NAME="java-favorite-otel-auto" \
        -p 5000:5000 \
        java-otel-auto-image
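The OTEL_RESOURCE_ATTRIBUTES value passed to the container is a comma-separated list of key=value pairs. The sketch below only illustrates that format; the `ResourceAttributes` class is a hypothetical helper written for this post, not part of OpenTelemetry, and the agent does its own parsing internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only (not an OpenTelemetry API):
// splits a "key=value,key=value" string into an ordered map.
class ResourceAttributes {

    static Map<String, String> parse(String raw) {
        Map<String, String> attributes = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return attributes;
        }
        for (String pair : raw.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries such as "novalue" or "=x"
            }
            attributes.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Falls back to the value used in the docker run command when the
        // environment variable is not set.
        String raw = System.getenv().getOrDefault(
                "OTEL_RESOURCE_ATTRIBUTES",
                "service.version=1.0,deployment.environment=production");
        parse(raw).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Attributes such as service.version and deployment.environment end up attached to the telemetry the service emits, which is what lets you filter by version or environment later on.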
You can now issue a few requests in order to generate trace data. Note that these requests are expected to return an error, as this service relies on a connection to Redis that you don’t currently have running. As mentioned before, you can find a more complete example using docker-compose here.
    curl localhost:5000/favorites

    # or alternatively issue a request every second
    while true; do curl "localhost:5000/favorites"; sleep 1; done;

Step 3. Explore traces and logs in Elastic APM
Once you have this up and running, you can ping the endpoint for your instrumented service (in our case, this is /favorites), and you should see the app appear in Elastic APM, as shown below:
It will begin by tracking throughput and latency, critical metrics for SREs to pay attention to.
Digging in, we can see an overview of all our Transactions.
And look at specific transactions:
Click on Logs, and we see that logs are also brought over. The OTel Agent will automatically bring in logs and correlate them with traces for you:
This gives you complete visibility across logs, metrics, and traces!
Basic concepts: How APM works with Java
Before we continue, let's first understand a few basic concepts and terms.
- Java Agent: This is a tool that can be used to instrument (or modify) the bytecode of class files in the Java Virtual Machine (JVM). Java agents are used for many purposes like performance monitoring, logging, security, and more.
- Bytecode: This is the intermediary code generated by the Java compiler from your Java source code. This code is interpreted or compiled on the fly by the JVM to produce machine code that can be executed.
- Byte Buddy: Byte Buddy is a code generation and manipulation library for Java. It is used to create, modify, or adapt Java classes at runtime. In the context of a Java Agent, Byte Buddy provides a powerful and flexible way to modify bytecode. Both the Elastic APM Agent and the OpenTelemetry Agent use Byte Buddy under the covers.
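To make the idea of bytecode a little more concrete, here is a minimal sketch that reads the first four bytes of a compiled class file using only the JDK. The `BytecodePeek` class is a made-up name for this illustration. It assumes a top-level class whose .class file can be read as a resource, which holds for JDK classes such as `Object`.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative only: peeks at the compiled bytecode of a class.
class BytecodePeek {

    // Every valid class file starts with the 4-byte magic number 0xCAFEBABE.
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE;

    // Assumes a top-level class; nested classes use a different file name.
    static int readMagic(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file found for " + type.getName());
            }
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int magic = readMagic(Object.class);
        System.out.println("Looks like a class file: " + (magic == CLASS_FILE_MAGIC));
    }
}
```

These bytes are the same kind of input that an agent receives and rewrites when a class is loaded.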
Now, let's talk about how automatic instrumentation works with Byte Buddy:
Automatic instrumentation is the process by which an agent modifies the bytecode of your application's classes, often to insert monitoring code. The agent doesn't modify the source code directly, but rather the bytecode that is loaded into the JVM. This is done while the JVM is loading the classes, so the modifications are in effect during runtime.
Here's a simplified explanation of the process:
- Start the JVM with the agent: When starting your Java application, you specify the Java agent with the -javaagent command line option. This instructs the JVM to load your agent before the main method of your application is invoked. At this point, the agent has the opportunity to set up class transformers.
- Register a class file transformer with Byte Buddy: Your agent will register a class file transformer with Byte Buddy. A transformer is a piece of code that is invoked every time a class is loaded into the JVM. This transformer receives the bytecode of the class, and it can modify this bytecode before the class is actually used.
- Transform the bytecode: When your transformer is invoked, it will use Byte Buddy's API to modify the bytecode. Byte Buddy allows you to specify your transformations in a high-level, expressive way rather than manually writing complex bytecode. For example, you could specify a certain class and method within that class that you want to instrument and provide an "interceptor" that will add new behavior to that method.
  For instance, let's say you want to measure the execution time of a method. You would instruct Byte Buddy to target the specific class and method and then provide an interceptor that wraps the method call with timing code. Every time this method is invoked, your interceptor is called first and measures the start time, then it calls the original method, and finally it measures the end time and prints the duration.
- Use the transformed classes: Once the agent has set up its transformers, the JVM continues to load classes as usual. Each time a class is loaded, your transformers are invoked, allowing them to modify the bytecode. Your application then uses these transformed classes as if they were the original ones, but they now have the extra behavior that you've injected through your interceptor.
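The steps above can be sketched with the JDK's own java.lang.instrument API, which is the mechanism agent libraries such as Byte Buddy build on. This is a minimal illustration, not how the OpenTelemetry agent is actually implemented: `SketchAgent`, `LoggingTransformer`, and the `com/example/` package prefix are hypothetical names, and the transformer only records class names instead of rewriting bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent class. In a real agent jar it would be declared public
// and named in the Premain-Class attribute of the jar's MANIFEST.MF.
class SketchAgent {

    // Start the JVM with the agent: when the JVM is launched with
    // -javaagent:<path-to-agent.jar>, it calls premain before the
    // application's main method.
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        // Register a class file transformer that sees classes as they load.
        instrumentation.addTransformer(new LoggingTransformer("com/example/"));
    }

    static class LoggingTransformer implements ClassFileTransformer {
        // Class names arrive in internal form, e.g. "com/example/Favorites".
        private final String packagePrefix;
        final List<String> seen = new CopyOnWriteArrayList<>();

        LoggingTransformer(String packagePrefix) {
            this.packagePrefix = packagePrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // Transform the bytecode: a real agent would rewrite
            // classfileBuffer here (for example with Byte Buddy).
            // This sketch only records the names of matching classes.
            if (className != null && className.startsWith(packagePrefix)) {
                seen.add(className);
            }
            // Returning null tells the JVM that no transformation was
            // performed, so the original bytecode is used.
            return null;
        }
    }
}
```

Because the transformer runs inside the class-loading path, the application itself never has to reference the agent, which is why the only change needed in this post is the -javaagent option.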
In essence, automatic instrumentation with Byte Buddy is about modifying the behavior of your Java classes at runtime, without needing to alter the source code directly. This is especially useful for cross-cutting concerns like logging, monitoring, or security, as it allows you to centralize this code in your Java Agent, rather than scattering it throughout your application.
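The timing interceptor described earlier can be approximated without any agent at all. The sketch below uses a JDK dynamic proxy as a stand-in: it is not Byte Buddy and does not rewrite bytecode, but it shows the same "measure, call the original, measure again" wrapping. `TimingSketch` and `FavoritesService` are hypothetical names for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class TimingSketch {

    // Hypothetical service interface standing in for application code.
    interface FavoritesService {
        String favoritesFor(String userId);
    }

    // Wraps target so that every call is timed and recorded in log.
    static <T> T timed(Class<T> type, T target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();          // measure the start time
            try {
                return method.invoke(target, args);  // call the original method
            } catch (InvocationTargetException e) {
                throw e.getCause();                  // rethrow the original exception
            } finally {
                long elapsed = System.nanoTime() - start;
                log.add(method.getName() + " took " + elapsed + " ns");
            }
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        FavoritesService service =
                timed(FavoritesService.class, userId -> "favorites of " + userId, log);
        System.out.println(service.favoritesFor("user-1"));
        log.forEach(System.out::println);
    }
}
```

The difference is that a proxy only works for interfaces and has to be wired in by hand, while an agent applies the equivalent wrapping to existing classes as they load, with no changes to the application.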
With this Dockerfile, you've transformed your simple Java application into one that's automatically instrumented with OpenTelemetry. This will aid greatly in understanding application performance, tracing errors, and gaining insights into how users interact with your software.
Remember, observability is a crucial aspect of modern application development, especially in distributed systems. With tools like OpenTelemetry, understanding complex systems becomes a bit easier.
In this blog, we discussed the following:
- How to auto-instrument Java with OpenTelemetry.
- Auto-instrumentation was done with standard commands in a Dockerfile, without adding code in multiple places, which keeps the setup manageable.
- With OpenTelemetry and its support for multiple languages, DevOps and SRE teams can auto-instrument their applications with ease, gain immediate insights into the health of the entire application stack, and reduce mean time to resolution (MTTR).
Since Elastic supports a mix of methods for ingesting data, whether auto-instrumentation with open-source OpenTelemetry or manual instrumentation with its native APM agents, you can plan your migration to OTel by focusing on a few applications first and then rolling out OpenTelemetry across the rest of your applications in a manner that best fits your business needs.
Additional resources for OpenTelemetry with Elastic:
- Elastiflix application, a guide to instrument different languages with OpenTelemetry
- Python: Auto-instrumentation, Manual-instrumentation
- Java: Auto-instrumentation, Manual-instrumentation
- Node.js: Auto-instrumentation, Manual-instrumentation
- .NET: Auto-instrumentation, Manual-instrumentation
- Go: Manual-instrumentation
- Best practices for instrumenting OpenTelemetry
General configuration and use case resources:
- Independence with OpenTelemetry on Elastic
- Modern observability and security on Kubernetes with Elastic and OpenTelemetry
- 3 models for logging with OpenTelemetry and Elastic
- Adding free and open Elastic APM as part of your Elastic Observability deployment
- Capturing custom metrics through OpenTelemetry API in code with Elastic
- Future-proof your observability platform with OpenTelemetry and Elastic
- Elastic Observability: Built for open technologies like Kubernetes, OpenTelemetry, Prometheus, Istio, and more
Don’t have an Elastic Cloud account yet? Sign up for Elastic Cloud and try out the auto-instrumentation capabilities that I discussed above. I would be interested in getting your feedback about your experience in gaining visibility into your application stack with Elastic.