Álex Cámara

Migrate Logstash Pipelines from Azure Event Hubs to OTel Collector Kafka Receiver

Step-by-step guide to migrating Logstash pipelines from the Azure Event Hubs plugin to the OpenTelemetry Collector Kafka receiver.

Introduction

This article is a companion guide to the Logstash Azure Event Hubs to Kafka input plugin migration, covering an alternative path: replacing logstash-input-azure_event_hubs with the OpenTelemetry Collector kafka receiver to consume from the Azure Event Hubs Kafka endpoint. For the reasons to migrate, authentication considerations, and key behavior changes such as offset handling, refer to the original article.

Reference: For detailed OTel Kafka receiver configuration options and parameter defaults, see the Kafka Receiver README.

Converting your configuration

TLS configuration

Azure Event Hubs requires TLS for all Kafka connections on port 9093. The tls: {} block enables TLS with default settings (system CA certificates, no client certificate), which is sufficient for Azure Event Hubs. Omitting this block will cause the connection to fail because the broker expects a TLS handshake.
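
In most environments the empty block is all you need. If your outbound traffic passes through a TLS-inspecting proxy with a private CA, or you require mutual TLS, the Collector's standard TLS settings can be placed inside the same block. A minimal sketch, with placeholder file paths:

receivers:
  kafka:
    # ... other params ...
    tls:
      ca_file: /etc/otelcol/certs/ca.pem          # custom CA bundle (placeholder path)
      # cert_file: /etc/otelcol/certs/client.pem  # only needed for mutual TLS
      # key_file: /etc/otelcol/certs/client.key   # only needed for mutual TLS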

Encoding

The encoding field controls how the receiver interprets each Kafka message payload. For events consumed from Azure Event Hubs, the most common options are:

  • text: decodes the payload as text and inserts it as the body of a log record. Uses UTF-8 by default; use text_<ENCODING> (e.g., text_shift_jis) for other character sets.
  • raw: inserts the payload bytes as-is into the log record body.
  • json: decodes the payload as JSON and inserts it as the log record body.
  • azure_resource_logs: converts Azure Resource Logs format to OpenTelemetry format.

Additional encodings such as otlp_proto, otlp_json, and trace-specific formats (jaeger_proto, zipkin_json, etc.) are also available. See the Kafka Receiver README for the full list.
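
For example, if the Event Hub carries Azure platform diagnostic logs, a sketch of the logs block using the azure_resource_logs encoding (mirroring the structure of the configurations below):

receivers:
  kafka:
    # ... brokers, auth, and tls as in the basic configuration below ...
    logs:
      topics:
        - "<EVENT_HUB_NAME>"
      encoding: azure_resource_logs  # convert Azure Resource Logs payloads to OTel log records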

Basic configuration

Minimal configuration to consume logs from one Event Hub with SASL/PLAIN.

receivers:
  kafka:
    brokers:
      - "<NAMESPACE>.servicebus.windows.net:9093"
    group_id: "<CONSUMER_GROUP_NAME>"
    auth:
      sasl:
        username: "$ConnectionString"
        password: "Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<ACCESS_KEY_NAME>;SharedAccessKey=<ACCESS_KEY>"
        mechanism: "PLAIN"
    tls: {}
    logs:
      topics:
        - "<EVENT_HUB_NAME>"
      encoding: text
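
To keep the shared access key out of the configuration file, the Collector's environment variable expansion can supply the connection string at runtime. A sketch, assuming a variable named EVENTHUB_CONNECTION_STRING (an illustrative name) is set in the Collector's environment:

receivers:
  kafka:
    # ... other params ...
    auth:
      sasl:
        username: "$ConnectionString"
        password: "${env:EVENTHUB_CONNECTION_STRING}"  # expanded by the Collector at startup
        mechanism: "PLAIN"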

Advanced configuration

Example with multiple Event Hubs.

receivers:
  kafka/eh1:
    brokers:
      - "<NAMESPACE>.servicebus.windows.net:9093"
    group_id: "<CONSUMER_GROUP_1>"
    auth:
      sasl:
        username: "$ConnectionString"
        password: "Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<KEY_1>;SharedAccessKey=<ACCESS_KEY_1>"
        mechanism: "PLAIN"
    tls: {}
    logs:
      topics:
        - "<EVENT_HUB_1>"
      encoding: text

  kafka/eh2:
    brokers:
      - "<NAMESPACE>.servicebus.windows.net:9093"
    group_id: "<CONSUMER_GROUP_2>"
    auth:
      sasl:
        username: "$ConnectionString"
        password: "Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<KEY_2>;SharedAccessKey=<ACCESS_KEY_2>"
        mechanism: "PLAIN"
    tls: {}
    logs:
      topics:
        - "<EVENT_HUB_2>"
      encoding: text
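
Receivers only consume data once they are referenced by a pipeline. A minimal sketch of the service section wiring both receivers into a single logs pipeline (the batch processor and debug exporter here are illustrative; replace the exporter with your actual destination):

processors:
  batch: {}

exporters:
  debug: {}

service:
  pipelines:
    logs:
      receivers: [kafka/eh1, kafka/eh2]
      processors: [batch]
      exporters: [debug]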

Configuration parameters mapping

The following section maps each logstash-input-azure_event_hubs parameter to its OpenTelemetry Collector kafka receiver equivalent.

  1. checkpoint_interval: Direct mapping to autocommit.interval.

    Units: Azure checkpoint_interval is in seconds. OTel autocommit.interval requires a duration string (e.g., 10s, 500ms).

    Azure config:

    input {
        azure_event_hubs {
            # ... other params ...
            checkpoint_interval => 10 # Default 5
        }
    }
    

    OTel receiver equivalent:

    receivers:
      kafka:
        # ... other params ...
        autocommit:
          interval: 10s # Default 1s
    
  2. initial_position: Maps to initial_offset.

    Azure config:

    input {
        azure_event_hubs {
            initial_position => "end"
        }
    }
    

    OTel receiver equivalent:

    receivers:
      kafka:
        initial_offset: latest
    

    Value mapping:

    Azure value     OTel value
    beginning       earliest
    end             latest (default)
    look_back       Not directly supported

    Note: Since the Kafka receiver can't read the old Blob Storage checkpoints, it treats the migration as a first-time connection. To avoid reprocessing data the legacy plugin already handled, set initial_offset: latest for the initial deployment.

  3. max_batch_size: No direct 1:1 mapping.

    In OTel, the maximum number of events processed per batch cannot be controlled directly by the receiver. The receiver only controls how much data is read per fetch request, via min_fetch_size, max_fetch_size, and max_fetch_wait.

    The actual event batching happens at the processing layer via the batch processor, which groups telemetry before it is handed to the exporters.

    Units: min_fetch_size and max_fetch_size are in bytes. max_fetch_wait uses duration strings (e.g., 250ms). send_batch_size is the number of records. timeout uses duration strings (e.g., 5s).

    Azure config:

    input {
        azure_event_hubs {
            max_batch_size => 125
        }
    }
    

    OTel receiver example:

    receivers:
      kafka:
        max_fetch_size: 2097152  # bytes (2 MiB)
        max_fetch_wait: 250ms
    
    processors:
      batch:
        send_batch_size: 125  # number of log records
    
  4. threads: No direct mapping.

    Event Hubs distributes work by partition. A single Collector Kafka client can read from multiple partitions in parallel because the underlying Kafka client (franz-go) uses internal goroutines to fetch and process partition data concurrently. This concurrency is handled internally and is not exposed as a user-facing threads setting.

  5. decorate_events: Not supported by Kafka receiver.
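
    The per-event metadata that decorate_events adds (partition, offset, sequence number, and so on) therefore has no direct equivalent. If you need some per-message context, the receiver's header_extraction option can copy Kafka message headers onto the consumed records as attributes, which covers part of that use case. A minimal sketch, with an illustrative header name (see the Kafka Receiver README for the exact attribute naming):

    receivers:
      kafka:
        # ... other params ...
        header_extraction:
          extract_headers: true
          headers:
            - "source"  # illustrative header name; each listed header is attached as an attribute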

Performance comparison

These results use the same test environment described in the companion article: same Event Hub namespace, same number of partitions, and same batch/thread configuration. The absolute numbers are environment-specific, but the relative difference is what matters.

Component                            Payload    Throughput (events/s)
Logstash azure_event_hubs plugin     100B       ~5700
OTel Collector kafka receiver        100B       ~10900
Logstash azure_event_hubs plugin     1KB        ~1500
OTel Collector kafka receiver        1KB        ~1900
Logstash azure_event_hubs plugin     10KB       ~170
OTel Collector kafka receiver        10KB       ~190

Across all payload sizes, the OTel Collector kafka receiver outperforms the Logstash azure_event_hubs plugin, with the largest gain at small payloads (~1.9x at 100B), where protocol overhead dominates, and narrower gains at larger sizes (~1.3x at 1KB, ~1.1x at 10KB). It does not reach the throughput of the Logstash kafka plugin from the companion article, but it still improves on the legacy plugin at every tested payload size. On top of that, dropping the Blob Storage and GPv2 dependencies eliminates two pieces of infrastructure that would otherwise need to be provisioned, secured, and monitored.

Conclusions

Both migration paths eliminate the Blob Storage checkpoint dependency and improve throughput over the legacy azure_event_hubs plugin. The Logstash kafka plugin is the lower-friction option: the configuration change is minimal, the offset model carries over, and it delivers the highest throughput of the options tested. The OTel Collector kafka receiver is the better fit if you want to remove Logstash from the pipeline entirely and align with OpenTelemetry. It accepts lower peak throughput and the absence of a decorate_events equivalent in exchange for a vendor-neutral ingestion layer that can run alongside other pipelines in the same Collector.

Next steps

With the GPv1 retirement deadline (October 2026) approaching, starting this migration sooner reduces the time spent managing storage infrastructure that is no longer needed.

If any issues arise during migration, check the Kafka Receiver README for configuration details and the companion article for the authentication and offset-handling considerations.
