Manual instrumentation of Java applications with OpenTelemetry


In cloud-native software development, DevOps and SRE teams are increasingly essential partners in application stability and growth.

DevOps engineers continuously optimize software delivery, while SRE teams act as the stewards of application reliability, scalability, and performance. The challenge? These teams need an observability solution with full-stack insight, so they can rapidly detect, monitor, and fix potential disruptions before they escalate into operational problems.

Observability in modern distributed systems goes beyond mere monitoring: it demands broad data collection, precise processing, and the correlation of this data into actionable insights. However, the road to this holistic view is paved with obstacles, from navigating version incompatibilities to wrestling with restrictive proprietary code.

Enter OpenTelemetry (OTel), with the following benefits for those who adopt it:

  • Escape vendor constraints with OTel, freeing yourself from vendor lock-in and ensuring top-notch observability.
  • See the harmony of unified logs, metrics, and traces come together to provide a complete system view.
  • Improve your application oversight through richer and enhanced instrumentations.
  • Embrace the benefits of backward compatibility to protect your prior instrumentation investments.
  • Embark on the OpenTelemetry journey with an easy learning curve, simplifying onboarding and scalability.
  • Rely on a proven, future-ready standard to boost your confidence in every investment.

In this blog, we will explore how to manually instrument a Java application with OpenTelemetry and run it in Docker. We will use an application called Elastiflix. Manual instrumentation means adding OpenTelemetry code to the application itself, so this approach is slightly more complex than automatic instrumentation, but it gives you greater control over what is traced.

A benefit of this setup is that there is no need for the otel-collector: the application sends its telemetry directly to Elastic. This enables you to slowly and easily migrate an application to OTel with Elastic according to a timeline that best fits your business.

Application, prerequisites, and config

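The application that we use for this blog is called Elastiflix, a movie streaming application. It consists of several microservices written in different languages, including .NET, NodeJS, Go, and Python. The service instrumented in this blog, java-favorite, is written in Java.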

Before we instrument our sample application, we will first need to understand how Elastic can receive the telemetry data.

Elastic configuration options for OpenTelemetry

All of Elastic Observability’s APM capabilities are available with OTel data. Some of these include:

  • Service maps
  • Service details (latency, throughput, failed transactions)
  • Dependencies between services, distributed tracing
  • Transactions (traces)
  • Machine learning (ML) correlations
  • Log correlation

In addition to Elastic’s APM and a unified view of the telemetry data, you will also be able to use Elastic’s machine learning capabilities to reduce analysis time, and alerting to help reduce MTTR.

Prerequisites

View the example source code

The full source code, including the Dockerfile used in this blog, can be found on GitHub. The repository also contains the same application without instrumentation. This allows you to compare each file and see the differences.

In particular, we will be working through the following file:

Elastiflix/java-favorite/src/main/java/com/movieapi/ApiServlet.java

The following steps will show you how to instrument this application and run it on the command line or in Docker. If you are interested in a more complete OTel example, take a look at the docker-compose file here, which will bring up the full project.

Before we begin, let’s look at the non-instrumented code first.

Step-by-step guide

Step 0. Log in to your Elastic Cloud account

This blog assumes you have an Elastic Cloud account — if not, follow the instructions to get started on Elastic Cloud.


Step 1. Set up OpenTelemetry

The first step is to set up the OpenTelemetry SDK in your Java application. Start by adding the OpenTelemetry Java SDK and its dependencies to your project's build file (for example, Maven's pom.xml or the Gradle build file). In our example application, we are using Maven. Add the dependencies below to your pom.xml:

    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
      <version>1.25.1-alpha</version>
    </dependency>

    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-api</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-sdk</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-exporter-otlp</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-semconv</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-exporter-otlp-logs</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-appender-1.0</artifactId>
      <version>1.25.1-alpha</version>
    </dependency>

And add the following bill of materials from OpenTelemetry too:

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-bom</artifactId>
        <version>1.25.0</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
      <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-bom-alpha</artifactId>
        <version>1.25.0-alpha</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>

Step 2. Add the application configuration

We recommend that you add the following configuration to the application’s main method so that it runs before any application code. This gives you more control and flexibility and ensures that OpenTelemetry is available at every stage of the application lifecycle. In the examples, we put this code before the Spring Boot application startup. Elastic supports OTLP over HTTP and OTLP over gRPC. In this example, we are using gRPC.

String SERVICE_NAME = System.getenv("OTEL_SERVICE_NAME");

// set service name on all OTel signals
Resource resource = Resource.getDefault().merge(Resource.create(Attributes.of(
        ResourceAttributes.SERVICE_NAME, SERVICE_NAME,
        ResourceAttributes.SERVICE_VERSION, "1.0",
        ResourceAttributes.DEPLOYMENT_ENVIRONMENT, "production")));

// init OTel logger provider with export to OTLP
SdkLoggerProvider sdkLoggerProvider = SdkLoggerProvider.builder()
        .setResource(resource)
        .addLogRecordProcessor(BatchLogRecordProcessor.builder(OtlpGrpcLogRecordExporter.builder()
                .setEndpoint(System.getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))
                .addHeader("Authorization", "Bearer " + System.getenv("ELASTIC_APM_SECRET_TOKEN"))
                .build()).build())
        .build();

// init OTel trace provider with export to OTLP
SdkTracerProvider sdkTracerProvider = SdkTracerProvider.builder()
        .setResource(resource)
        .setSampler(Sampler.alwaysOn())
        .addSpanProcessor(BatchSpanProcessor.builder(OtlpGrpcSpanExporter.builder()
                .setEndpoint(System.getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))
                .addHeader("Authorization", "Bearer " + System.getenv("ELASTIC_APM_SECRET_TOKEN"))
                .build()).build())
        .build();

// init OTel meter provider with export to OTLP
SdkMeterProvider sdkMeterProvider = SdkMeterProvider.builder()
        .setResource(resource)
        .registerMetricReader(PeriodicMetricReader.builder(OtlpGrpcMetricExporter.builder()
                .setEndpoint(System.getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))
                .addHeader("Authorization", "Bearer " + System.getenv("ELASTIC_APM_SECRET_TOKEN"))
                .build()).build())
        .build();

// create sdk object and set it as global
OpenTelemetrySdk sdk = OpenTelemetrySdk.builder()
        .setTracerProvider(sdkTracerProvider)
        .setLoggerProvider(sdkLoggerProvider)
        .setMeterProvider(sdkMeterProvider)
        .setPropagators(ContextPropagators.create(W3CTraceContextPropagator.getInstance()))
        .build();
GlobalOpenTelemetry.set(sdk);
// connect logger
GlobalLoggerProvider.set(sdk.getSdkLoggerProvider());
// Add hook to close SDK, which flushes logs
Runtime.getRuntime().addShutdownHook(new Thread(sdk::close));
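One detail worth noting about the configuration above: System.getenv returns null for an unset variable, so a missing token would quietly turn into the header value "Bearer null". Below is a minimal sketch of a fail-fast check that uses only the JDK. The class OtelEnv and its methods are hypothetical helpers written for this illustration; they are not part of OpenTelemetry or the Elastiflix source. The environment is passed in as a map so the helper can be exercised without real variables.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of OpenTelemetry or Elastiflix.
final class OtelEnv {
    private OtelEnv() {}

    // Returns the trimmed value of a required variable, or fails with a clear message.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value.trim();
    }

    // Builds the Authorization header value in the same shape the exporters above use.
    static String bearer(String token) {
        return "Bearer " + token;
    }
}
```

With a helper like this, the exporters could be given OtelEnv.require(System.getenv(), "OTEL_EXPORTER_OTLP_ENDPOINT") and OtelEnv.bearer(OtelEnv.require(System.getenv(), "ELASTIC_APM_SECRET_TOKEN")), so a misconfigured container stops at startup instead of exporting with an invalid header.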

Step 3. Create the Tracer and start the OpenTelemetry Span inside the TracingFilter

In the Spring Boot example, you will notice that we have a TracingFilter class that extends the OncePerRequestFilter class. A filter is a component placed at the front of the request processing chain. Its primary role is to intercept incoming requests and outgoing responses in order to perform tasks such as logging, authentication, and transformation of request/response entities. Here, we intercept the request as it comes into the Favorite service so that we can pull out the headers, which may contain tracing information from upstream systems.

We start by obtaining an OpenTelemetry Tracer, a core component of OpenTelemetry that allows you to create spans, start and stop them, and add attributes and events. In your Java code, import the necessary OpenTelemetry classes and create an instance of the Tracer within your application.

We use the Tracer to create a new span that continues as a child of the span created in the upstream system, using the context extracted from the upstream request. In our Elastiflix example, the upstream system is the NodeJS application. The getter passed to extract is a TextMapGetter for HttpServletRequest, defined elsewhere in the application (not shown here), that tells the propagator how to read header names and values from the request.

@Override
protected void doFilterInternal(jakarta.servlet.http.HttpServletRequest request, jakarta.servlet.http.HttpServletResponse response, jakarta.servlet.FilterChain filterChain) throws jakarta.servlet.ServletException, IOException {
        Tracer tracer = GlobalOpenTelemetry.getTracer(SERVICE_NAME);

        Context extractedContext = GlobalOpenTelemetry.getPropagators()
                .getTextMapPropagator()
                .extract(Context.current(), request, getter);

        Span span = tracer.spanBuilder(request.getRequestURI())
                .setSpanKind(SpanKind.SERVER)
                .setParent(extractedContext)
                .startSpan();

        try (Scope scope = span.makeCurrent()) {
            filterChain.doFilter(request, response);
        } catch (Exception e) {
            span.setStatus(StatusCode.ERROR);
            throw e;
        } finally {
            span.end();
        }
    }
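The SDK configured in Step 2 registers W3CTraceContextPropagator, so the tracing information pulled from the upstream request travels in the W3C traceparent header. To make that header's layout concrete, here is a small JDK-only sketch that parses it. It illustrates the header format only and is not the propagator's actual implementation. TraceParent and ParentInfo are hypothetical names, and the sketch accepts only the four-field layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the W3C traceparent layout; not an OpenTelemetry class.
final class TraceParent {
    // version-traceId-parentSpanId-flags, all lowercase hex
    private static final Pattern LAYOUT =
            Pattern.compile("([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    static final class ParentInfo {
        final String traceId;
        final String parentSpanId;
        final boolean sampled;

        ParentInfo(String traceId, String parentSpanId, boolean sampled) {
            this.traceId = traceId;
            this.parentSpanId = parentSpanId;
            this.sampled = sampled;
        }
    }

    // Returns the parsed header, or null when it is absent or malformed.
    static ParentInfo parse(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = LAYOUT.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        String version = m.group(1);
        String traceId = m.group(2);
        String parentSpanId = m.group(3);
        // Version ff and all-zero ids are invalid per the specification.
        if (version.equals("ff") || isAllZeros(traceId) || isAllZeros(parentSpanId)) {
            return null;
        }
        int flags = Integer.parseInt(m.group(4), 16);
        // The lowest flag bit marks the trace as sampled.
        return new ParentInfo(traceId, parentSpanId, (flags & 0x01) != 0);
    }

    private static boolean isAllZeros(String hex) {
        for (int i = 0; i < hex.length(); i++) {
            if (hex.charAt(i) != '0') {
                return false;
            }
        }
        return true;
    }
}
```

For example, TraceParent.parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01") yields the trace ID 4bf92f3577b34da6a3ce929d0e0e4736, the parent span ID 00f067aa0ba902b7, and a sampled flag of true. When the upstream request carries no valid header, the extracted context typically has no parent, and the span created in the filter starts a new trace.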

Step 4. Instrument other interesting code with spans

To track specific regions of your code, use the Tracer's SpanBuilder to create spans. To measure the duration of an operation accurately, start and end the span at the appropriate locations in your code: call startSpan() on the SpanBuilder to mark the beginning and end() on the resulting Span to mark the end. For example, you can create a span around a specific method or operation in your code, as shown here in the handleCanary method:

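private void handleCanary() throws Exception {
    Span span = GlobalOpenTelemetry.getTracer(SERVICE_NAME).spanBuilder("handleCanary").startSpan();
    // try-with-resources closes the scope even if the body throws
    try (Scope scope = span.makeCurrent()) {

        ///.....

        span.setStatus(StatusCode.OK);
    } finally {
        // always end the span, after the scope has been closed
        span.end();
    }
}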
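Because a span should be ended and its scope closed even when the instrumented code throws, the try-with-resources pattern used by the TracingFilter in Step 3 is worth reusing for spans like this one. The following JDK-only sketch shows the order in which Java runs the pieces of that pattern. ScopeOrderDemo is a hypothetical class, and the AutoCloseable lambda is only a stand-in for an OpenTelemetry Scope.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; the AutoCloseable below stands in for an OpenTelemetry Scope.
final class ScopeOrderDemo {
    private ScopeOrderDemo() {}

    // Records the order of events for a successful or failing unit of work.
    static List<String> run(boolean fail) {
        final List<String> events = new ArrayList<String>();
        events.add("span started");
        try (AutoCloseable scope = () -> events.add("scope closed")) {
            events.add("work");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (Exception e) {
            // Runs after the resource has already been closed.
            events.add("error recorded");
        } finally {
            // Runs last, so the span is ended on every path.
            events.add("span ended");
        }
        return events;
    }
}
```

ScopeOrderDemo.run(true) returns the events in the order span started, work, scope closed, error recorded, span ended. The scope is closed before the catch and finally blocks run, which is why ending the span in finally is safe on both the success and the failure path.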

Step 5. Add attributes and events to spans

You can enhance the spans with additional attributes and events to provide more context and details about the operation being tracked. Attributes can be key-value pairs that describe the span, while events can be used to mark significant points in the span's lifecycle. This is also shown in the handleCanary method:

private void handleCanary() throws Exception {
        // ... inside the span and scope created in Step 4 ...
        Span.current().setAttribute("canary", "test-new-feature");
        Span.current().setAttribute("quiz_solution", "correlations");

        // span is the handleCanary span created in Step 4
        span.addEvent("a span event", Attributes
                .of(AttributeKey.longKey("someKey"), Long.valueOf(93)));
    }

Step 6. Instrument backends

Let's consider an example where we are instrumenting a Redis database call. We're using the Java OpenTelemetry SDK, and our goal is to create a trace that captures each "Post User Favorites" operation to the database.

Below is the Java method that performs the operation and collects telemetry data:

public void postUserFavorites(String user_id, String movieID) {
  ...
}

Let's go through it line by line:

Initializing a span
The first important line of our method is where we initialize a span. A span represents a single operation within a trace, which could be a database call, a remote procedure call (RPC), or any segment of code that you want to measure.

Span span = GlobalOpenTelemetry.getTracer(SERVICE_NAME).spanBuilder("Redis.Post").setSpanKind(SpanKind.CLIENT).startSpan();

Setting span attributes
Next, we add attributes to our span. Attributes are key-value pairs that provide additional information about the span. In order to get the backend call to appear correctly in the service map, it is critical that the attributes are set correctly for the backend call type. In this example, we set the db.system attribute to redis.

span.setAttribute("db.system", "redis");
span.setAttribute("db.connection_string", redisHost);
span.setAttribute("db.statement", "POST user_id " + user_id +" AND movie_id "+movieID);

This ensures that calls to the Redis backend are tracked and appear in the service map.
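Because the service map depends on these attribute keys being set consistently, one option is to build them in a single place. Below is a minimal sketch, assuming a hypothetical helper class RedisSpanAttributes that is not part of the Elastiflix source. It uses only the JDK and returns the same three keys as the snippet above in a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; the keys match the span attributes set above.
final class RedisSpanAttributes {
    private RedisSpanAttributes() {}

    // Collects the attributes for one Redis call, keeping insertion order.
    static Map<String, String> forStatement(String redisHost, String statement) {
        Map<String, String> attrs = new LinkedHashMap<String, String>();
        attrs.put("db.system", "redis");
        attrs.put("db.connection_string", redisHost);
        attrs.put("db.statement", statement);
        return attrs;
    }
}
```

The resulting entries could then be copied onto the span in a loop that calls span.setAttribute(key, value) for each one, so every Redis span carries the same set of keys.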

Capturing the result of the operation
We then execute the operation we're interested in, within a try-catch block. If an exception occurs during the execution of the operation, we record it in the span.

try (Scope scope = span.makeCurrent()) {
    ...
} catch (Exception e) {
    span.setStatus(StatusCode.ERROR, "Error while getting data from Redis");
    span.recordException(e);
}

Closing resources
Finally, we close the Redis connection and end the span.

finally {
    jedis.close();
    span.end();
}

Step 7. Configure logging

Logging is an essential part of application monitoring and troubleshooting. OpenTelemetry allows you to integrate with existing logging frameworks, such as Logback or Log4j, to capture logs along with the telemetry data. Configure the logging framework of your choice to capture logs related to the instrumented spans. In our example application, check out the logback configuration, which shows how to export logs directly to Elastic. 

<?xml version="1.0" encoding="UTF-8"?>
<configuration debug="true">

    <appender name="otel-otlp"
        class="io.opentelemetry.instrumentation.logback.appender.v1_0.OpenTelemetryAppender">
        <captureExperimentalAttributes>false</captureExperimentalAttributes>
        <captureCodeAttributes>true</captureCodeAttributes>
        <captureKeyValuePairAttributes>true</captureKeyValuePairAttributes>
    </appender>

    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="DEBUG">
        <appender-ref ref="otel-otlp" />
        <appender-ref ref="STDOUT" />
    </root>
</configuration>

Step 8. Running the Docker image with environment variables

As specified in the OTel Java documentation, we will use environment variables to pass in the configuration values that enable the application to connect to Elastic Observability’s APM server.

Because Elastic accepts OTLP natively, we only need to provide the endpoint to which the OTel exporter sends data and the authentication token, along with a few other environment variables.

Getting Elastic Cloud variables
You can copy the endpoints and token from Kibana under the path `/app/home#/tutorial/apm`.


You will need to copy the following environment variable:

OTEL_EXPORTER_OTLP_ENDPOINT

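As well as the token from the following variable. The code in Step 2 adds the `Bearer ` prefix itself, so pass only the token value as ELASTIC_APM_SECRET_TOKEN: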

OTEL_EXPORTER_OTLP_HEADERS

Build the Docker image

docker build -t java-otel-manual-image .

Run the Docker image

docker run \
       -e OTEL_EXPORTER_OTLP_ENDPOINT="REPLACE WITH OTEL_EXPORTER_OTLP_ENDPOINT" \
       -e ELASTIC_APM_SECRET_TOKEN="REPLACE WITH TOKEN" \
       -e OTEL_RESOURCE_ATTRIBUTES="service.version=1.0,deployment.environment=production" \
       -e OTEL_SERVICE_NAME="java-favorite-otel-manual" \
       -p 5000:5000 \
       java-otel-manual-image
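The OTEL_RESOURCE_ATTRIBUTES variable in the command above is a comma-separated list of key=value pairs. The code in Step 2 sets service.version and deployment.environment explicitly, so the sketch below is only an illustration of how the same values could be read from the variable instead. ResourceAttributesEnv is a hypothetical class that uses only the JDK, and it does not handle percent-encoded values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of OpenTelemetry or Elastiflix.
final class ResourceAttributesEnv {
    private ResourceAttributesEnv() {}

    // Parses "key1=value1,key2=value2"; entries without '=' or with an empty key are skipped.
    static Map<String, String> parse(String raw) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        if (raw == null) {
            return result;
        }
        for (String pair : raw.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (!key.isEmpty()) {
                result.put(key, value);
            }
        }
        return result;
    }
}
```

Calling ResourceAttributesEnv.parse(System.getenv("OTEL_RESOURCE_ATTRIBUTES")) with the value from the docker run command gives a map with service.version mapped to 1.0 and deployment.environment mapped to production.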

You can now issue a few requests in order to generate trace data. Note that these requests are expected to return an error, as this service relies on a connection to Redis that you don’t currently have running. As mentioned before, you can find a more complete example using docker-compose here.

curl localhost:5000/favorites

# or alternatively issue a request every second

while true; do curl "localhost:5000/favorites"; sleep 1; done;

Step 9. Explore traces and logs in Elastic APM

Once you have this up and running, you can ping the endpoint of your instrumented service (in our case, this is /favorites), and you should see the app appear in the services list in Elastic APM.

Elastic APM begins by tracking throughput and latency, which are critical metrics for SREs to pay attention to.

Digging in, we can see an overview of all our transactions and drill into specific ones.

Click on Logs, and we see that logs are also brought over. The OpenTelemetry Logback appender configured in Step 7 exports the logs, and they are correlated with traces for you.

This gives you complete visibility across logs, metrics, and traces!

Wrapping up

Manually instrumenting your Java applications with OpenTelemetry gives you greater control over what to track and monitor. By following the steps outlined in this blog post, you can effectively monitor the performance of your Java applications, identify issues, and gain insights into the overall health of your application.

Remember, OpenTelemetry is a powerful tool, and proper instrumentation requires careful consideration of what metrics, traces, and logs are essential for your specific use case. Experiment with different configurations, leverage the OpenTelemetry SDK for Java documentation, and continuously iterate to achieve the observability goals of your application.

In this blog, we discussed the following:

  • How to manually instrument Java with OpenTelemetry 
  • How to properly initialize and instrument spans
  • How to easily set the OTLP ENDPOINT and OTLP HEADERS from Elastic without the need for a collector

Hopefully, this provided an easy-to-understand walk-through of instrumenting Java with OpenTelemetry and how easy it is to send traces into Elastic.

Don’t have an Elastic Cloud account yet? Sign up for Elastic Cloud and try out the manual instrumentation approach discussed above. We would be interested in getting your feedback about your experience in gaining visibility into your application stack with Elastic.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.