Migrating from Elastic’s Go APM agent to OpenTelemetry Go SDK

As we’ve already shared, Elastic is committed to helping OpenTelemetry (OTel) succeed, which means, in some cases, building distributions of language SDKs.

Elastic is strategically standardizing on OTel for observability and security data collection. Additionally, Elastic is committed to working with the OTel community to become the best data collection infrastructure for the observability ecosystem. Elastic is deepening its relationship with OTel beyond the recent contributions of the Elastic Common Schema (ECS) to OpenTelemetry, invokedynamic in the OTel Java agent, and the upcoming profiling agent donation.

Since version 7.14, Elastic has supported OTel natively by directly ingesting OpenTelemetry protocol (OTLP)-based traces, metrics, and logs.

The Go SDK is a bit different from the other language SDKs, because Go inherently lacks the dynamic features that would allow building a distribution without forking the SDK.

Nevertheless, the absence of a distribution doesn’t mean you shouldn’t use OTel for data collection from Go applications with the Elastic Stack.

Elastic currently has an APM Go agent, but we recommend switching to the OTel Go SDK. In this post, we cover two ways you can do that migration:

  • By replacing all telemetry in your application’s code (a “big bang migration”) and shipping the change

  • By splitting the migration into atomic changes, to reduce the risk of regressions

A big bang migration

The simplest way to migrate from our APM Go agent to the OTel SDK may be to remove all telemetry provided by the agent and replace it with OTel instrumentation in a single change.

Automatic instrumentation

Most of your instrumentation may be provided automatically, as it is part of the frameworks or libraries you are using.

For example, if you use the Elastic Go agent, you may be using our net/http auto instrumentation module like this:

package main

import (
	"fmt"
	"net/http"

	"go.elastic.co/apm/module/apmhttp/v2"
)

func handler(w http.ResponseWriter, req *http.Request) {
	fmt.Fprintf(w, "Hello World!")
}

func main() {
	// Wrap the handler so the agent records a transaction for each request.
	http.ListenAndServe(
		":8080",
		apmhttp.Wrap(http.HandlerFunc(handler)),
	)
}

With OpenTelemetry, you would use the otelhttp module instead:

package main

import (
	"fmt"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

func handler(w http.ResponseWriter, req *http.Request) {
	fmt.Fprintf(w, "Hello World!")
}

func main() {
	// Wrap the handler so OTel creates a span for each request.
	// The second argument is the operation name.
	http.ListenAndServe(
		":8080",
		otelhttp.NewHandler(http.HandlerFunc(handler), "http"),
	)
}

You should perform this same change for every other module you use from our agent.

Manual instrumentation

Your application may also have manual instrumentation, which consists of creating transactions and spans directly within your application code by calling the Elastic APM agent API.

You may be creating transactions and spans like this with Elastic’s APM SDK:

package main

import (
	"context"

	"go.elastic.co/apm/v2"
)

func main() {
	// Create a transaction, and assign it to the context.
	tx := apm.DefaultTracer().StartTransaction("GET /", "request")
	defer tx.End()
	ctx := apm.ContextWithTransaction(context.Background(), tx)

	// Create a span. The third argument is the span type.
	// Pass the returned ctx on to any downstream calls.
	span, ctx := apm.StartSpan(ctx, "span", "custom")
	defer span.End()
}

OpenTelemetry uses the same API for both transactions and spans: what Elastic calls a “transaction” is simply a span with no parent in OTel (a “root span”).

So, your instrumentation becomes the following:

package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

func main() {
	tracer := otel.Tracer("my library")

	// Create a root span.
	// It is assigned to the returned context automatically.
	ctx, span := tracer.Start(context.Background(), "GET /")
	defer span.End()

	// Create a child span (as the context has a parent).
	// Pass the returned ctx on to any downstream calls.
	ctx, child := tracer.Start(ctx, "span")
	defer child.End()
}

With a big bang migration, you will need to migrate everything before shipping it to production. You cannot split the migration into smaller chunks.

For small applications or ones that only use automatic instrumentation, that constraint may be fine. It allows you to quickly validate the migration and move on.

However, if you are working on a complex set of services, a large application, or one with a lot of manual instrumentation, you probably want to be able to ship code multiple times during the migration instead of all at once.

An atomic migration

An atomic migration is one where you ship small, self-contained changes gradually while your application keeps working normally. You pull the final plug only at the end, once you are ready to do so.

To help with atomic migrations, we provide a bridge between our APM Go agent and OpenTelemetry.

This bridge allows you to run our agent and OTel alongside each other, with instrumentation from both libraries in the same process. The data from both is transmitted to the same location and in the same format.

You can configure the OTel bridge with our agent like this:

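package main

import (
	"log"

	"go.elastic.co/apm/module/apmotel/v2"
	"go.opentelemetry.io/otel"
)

func main() {
	// Create a tracer provider backed by the Elastic APM agent.
	provider, err := apmotel.NewTracerProvider()
	if err != nil {
		log.Fatal(err)
	}
	// Register it globally so otel.Tracer(...) uses the bridge.
	otel.SetTracerProvider(provider)
}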

Once this configuration is set, every span created by OTel will be transmitted to the Elastic APM agent.

With this bridge, you can make your migration much safer with the following process:

  • Add the bridge to your application.

  • Switch one instrumentation (automatic or manual) from the agent to OpenTelemetry, just as in the big bang migration above, but one at a time.

    • Repeat until everything has been migrated.

  • Remove the bridge and our agent, and configure OpenTelemetry to transmit the data via its SDK.

Each of those steps can be a single change within your application and go to production right away.

If any issue arises during the migration process, you should then be able to see it immediately and fix it before moving on.
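
For the final step of that process, a minimal sketch of configuring the OTel SDK to export traces over OTLP might look like the following. It assumes the OTLP/HTTP trace exporter and that the endpoint and any authentication headers are supplied through the standard OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS environment variables. The service name "my-service" and the five-second shutdown timeout are placeholders, not recommendations. Treat this as a starting point under those assumptions rather than a definitive setup.

```go
package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// With no explicit options, the exporter takes its endpoint and headers
	// from the OTEL_EXPORTER_OTLP_* environment variables (assumed to be set).
	exporter, err := otlptracehttp.New(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Batch spans before export and attach a service name to all of them.
	provider := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resource.NewSchemaless(
			attribute.String("service.name", "my-service"), // placeholder name
		)),
	)

	// Flush any buffered spans before the process exits.
	defer func() {
		shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		if err := provider.Shutdown(shutdownCtx); err != nil {
			log.Printf("failed to shut down tracer provider: %v", err)
		}
	}()

	// This replaces the bridge's provider registered earlier in the migration.
	otel.SetTracerProvider(provider)

	// Existing OTel instrumentation keeps working unchanged.
	_, span := otel.Tracer("my library").Start(ctx, "GET /")
	span.End()
}
```

Because the instrumentation code only talks to the global tracer provider, swapping the bridge for the SDK is confined to this setup code.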

Observability benefits from building with OTel

As OTel is quickly becoming an industry standard, and Elastic is committed to making it even better, it can be very beneficial to your engineering teams to migrate to it.

In Go, whether you migrate in one big bang or gradually using Elastic’s OTel bridge, you will benefit from instrumentation maintained by the global community, making your observability more effective and helping you better understand what’s happening within your application.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.