kafka

  • Version: 5.1.0
  • Released on: 2016-11-17
  • Changelog
  • Compatible: 5.1.2, 5.1.1, 5.0.0, 2.4.1, 2.4.0

This input will read events from a Kafka topic. It uses the newly designed 0.10 version of the consumer API provided by Kafka to read messages from the broker.

Here’s a compatibility matrix that shows the Kafka client versions that are compatible with each combination of Logstash and the Kafka input plugin:

| Kafka Client Version | Logstash Version | Plugin Version | Security Features | Why? |
|----------------------|------------------|----------------|-------------------|------|
| 0.8                  | 2.0.0 - 2.x.x    | <3.0.0         |                   | Legacy, 0.8 is still popular |
| 0.9                  | 2.0.0 - 2.3.x    | 3.x.x          | Basic Auth, SSL   | Works with the old Ruby Event API (event['product']['price'] = 10) |
| 0.9                  | 2.4.0 - 5.0.x    | 4.x.x          | Basic Auth, SSL   | Works with the new getter/setter APIs (event.set('[product][price]', 10)) |
| 0.10                 | 2.4.0 - 5.0.x    | 5.x.x          | Basic Auth, SSL   | Not compatible with the 0.9 broker |

Note

We recommend that you use matching Kafka client and broker versions. During upgrades, you should upgrade brokers before clients because brokers target backwards compatibility. For example, the 0.9 broker is compatible with both the 0.8 consumer and 0.9 consumer APIs, but not the other way around.

The Logstash Kafka consumer handles group management and uses the default offset management strategy using Kafka topics.

Logstash instances by default form a single logical group to subscribe to Kafka topics. Each Logstash Kafka consumer can run multiple threads to increase read throughput. Alternatively, you could run multiple Logstash instances with the same group_id to spread the load across physical machines. Messages in a topic will be distributed to all Logstash instances with the same group_id.

Ideally you should have as many threads as the number of partitions for a perfect balance. More threads than partitions means that some threads will be idle.

For more information see http://kafka.apache.org/documentation.html#theconsumer

Kafka consumer configuration: http://kafka.apache.org/documentation.html#consumerconfigs

This version also adds support for SSL/TLS secured connections to Kafka. By default, SSL is disabled but can be turned on as needed.

 

Synopsis

This plugin supports the following configuration options:

Required configuration options:

kafka {
}

Available configuration options:

| Setting | Input type | Required | Default value |
|---------|------------|----------|---------------|
| add_field | hash | No | {} |
| auto_commit_interval_ms | string | No | "5000" |
| auto_offset_reset | string | No | |
| bootstrap_servers | string | No | "localhost:9092" |
| check_crcs | string | No | |
| client_id | string | No | "logstash" |
| codec | codec | No | "plain" |
| connections_max_idle_ms | string | No | |
| consumer_threads | number | No | 1 |
| decorate_events | boolean | No | false |
| enable_auto_commit | string | No | "true" |
| enable_metric | boolean | No | true |
| exclude_internal_topics | string | No | |
| fetch_max_wait_ms | string | No | |
| fetch_min_bytes | string | No | |
| group_id | string | No | "logstash" |
| heartbeat_interval_ms | string | No | |
| id | string | No | |
| jaas_path | a valid filesystem path | No | |
| kerberos_config | a valid filesystem path | No | |
| key_deserializer_class | string | No | "org.apache.kafka.common.serialization.StringDeserializer" |
| max_partition_fetch_bytes | string | No | |
| max_poll_records | string | No | |
| metadata_max_age_ms | string | No | |
| partition_assignment_strategy | string | No | |
| poll_timeout_ms | number | No | 100 |
| receive_buffer_bytes | string | No | |
| reconnect_backoff_ms | string | No | |
| request_timeout_ms | string | No | |
| retry_backoff_ms | string | No | |
| sasl_kerberos_service_name | string | No | |
| sasl_mechanism | string | No | "GSSAPI" |
| security_protocol | string, one of ["PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"] | No | "PLAINTEXT" |
| send_buffer_bytes | string | No | |
| session_timeout_ms | string | No | |
| ssl_key_password | password | No | |
| ssl_keystore_location | a valid filesystem path | No | |
| ssl_keystore_password | password | No | |
| ssl_keystore_type | string | No | |
| ssl_truststore_location | a valid filesystem path | No | |
| ssl_truststore_password | password | No | |
| ssl_truststore_type | string | No | |
| tags | array | No | |
| topics | array | No | ["logstash"] |
| topics_pattern | string | No | |
| type | string | No | |
| value_deserializer_class | string | No | "org.apache.kafka.common.serialization.StringDeserializer" |

Details

 

add_field

  • Value type is hash
  • Default value is {}

Add a field to an event.

auto_commit_interval_ms

  • Value type is string
  • Default value is "5000"

The frequency in milliseconds that the consumer offsets are committed to Kafka.

auto_offset_reset

  • Value type is string
  • There is no default value for this setting.

What to do when there is no initial offset in Kafka or if an offset is out of range:

  • earliest: automatically reset the offset to the earliest offset
  • latest: automatically reset the offset to the latest offset
  • none: throw exception to the consumer if no previous offset is found for the consumer’s group
  • anything else: throw exception to the consumer.
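As an illustration of the options above, here is a minimal sketch of an input that starts from the oldest available message when the consumer group has no committed offset. The topic name is a placeholder chosen for this example.

input {
  kafka {
    # Hypothetical topic name, used only for illustration
    topics => ["app-logs"]
    # With no committed offset for this group, start from the earliest offset
    auto_offset_reset => "earliest"
  }
}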

bootstrap_servers

  • Value type is string
  • Default value is "localhost:9092"

A list of host:port pairs to use for establishing the initial connection to the cluster. This list should be in the form host1:port1,host2:port2. These addresses are used only for the initial connection, to discover the full cluster membership (which may change dynamically), so the list need not contain the full set of servers. You may want more than one, though, in case a server is down.
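For example, a configuration that lists two brokers for the initial connection might look like the following sketch. The host names are placeholders, not real servers.

input {
  kafka {
    # Hypothetical broker addresses; reaching either one is enough
    # to discover the rest of the cluster
    bootstrap_servers => "kafka1.example.com:9092,kafka2.example.com:9092"
  }
}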

check_crcs

  • Value type is string
  • There is no default value for this setting.

Automatically check the CRC32 of the records consumed. This ensures no on-the-wire or on-disk corruption to the messages occurred. This check adds some overhead, so it may be disabled in cases seeking extreme performance.

client_id

  • Value type is string
  • Default value is "logstash"

The id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included.

codec

  • Value type is codec
  • Default value is "plain"

The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.
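As a sketch, assuming the messages on the topic are JSON documents, the json codec can decode them as they are read so that no separate json filter is needed. The topic name is a placeholder.

input {
  kafka {
    # Hypothetical topic name
    topics => ["app-logs"]
    # Decode each message as JSON instead of the default "plain" codec
    codec => "json"
  }
}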

connections_max_idle_ms

  • Value type is string
  • There is no default value for this setting.

Close idle connections after the number of milliseconds specified by this config.

consumer_threads

  • Value type is number
  • Default value is 1

Ideally you should have as many threads as the number of partitions for a perfect balance. More threads than partitions means that some threads will be idle.
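A minimal sketch of this sizing rule, assuming a hypothetical topic that has four partitions and a single Logstash instance reading it:

input {
  kafka {
    # Hypothetical topic, assumed to have 4 partitions
    topics => ["app-logs"]
    # One thread per partition; a fifth thread would sit idle
    consumer_threads => 4
  }
}

If several Logstash instances share the same group_id, the rule applies to the total number of threads across all of them.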

decorate_events

  • Value type is boolean
  • Default value is false

Option to add Kafka metadata, such as the topic and message size, to the event. This will add a field named kafka to the Logstash event containing the following attributes:

  • topic: The topic this message is associated with
  • consumer_group: The consumer group used to read in this event
  • partition: The partition this message is associated with
  • offset: The offset from the partition this message is associated with
  • key: A ByteBuffer containing the message key
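The kafka field described above can be referenced later in the pipeline. The following sketch copies the topic name into a top-level field; the field name source_topic is an assumption made for this example.

input {
  kafka {
    decorate_events => true
  }
}

filter {
  mutate {
    # "source_topic" is a hypothetical field name
    add_field => { "source_topic" => "%{[kafka][topic]}" }
  }
}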

enable_auto_commit

  • Value type is string
  • Default value is "true"

If true, periodically commit to Kafka the offsets of messages already returned by the consumer. This committed offset will be used when the process fails as the position from which the consumption will begin.

enable_metric

  • Value type is boolean
  • Default value is true

Disable or enable metric collection and reporting for this specific plugin instance. By default we record metrics from all plugins, but you can disable metrics collection for a specific plugin.

exclude_internal_topics

  • Value type is string
  • There is no default value for this setting.

Whether records from internal topics (such as offsets) should be exposed to the consumer. If set to true the only way to receive records from an internal topic is subscribing to it.

fetch_max_wait_ms

  • Value type is string
  • There is no default value for this setting.

The maximum amount of time the server will block before answering the fetch request if there isn’t sufficient data to immediately satisfy fetch_min_bytes. This should be less than or equal to the timeout used in poll_timeout_ms.

fetch_min_bytes

  • Value type is string
  • There is no default value for this setting.

The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request.

group_id

  • Value type is string
  • Default value is "logstash"

The identifier of the group this consumer belongs to. A consumer group is a single logical subscriber that happens to be made up of multiple processors. Messages in a topic will be distributed to all Logstash instances with the same group_id.
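A sketch of spreading load across machines: run the same input block on each Logstash instance. The group and topic names are assumptions made for this example.

input {
  kafka {
    # Hypothetical topic and group names
    topics => ["app-logs"]
    # Every instance that uses this group_id joins the same consumer group
    group_id => "logstash-indexers"
  }
}

Kafka assigns each partition to one consumer in the group, so the instances divide the topic's partitions between them rather than each receiving every message.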

heartbeat_interval_ms

  • Value type is string
  • There is no default value for this setting.

The expected time between heartbeats to the consumer coordinator. Heartbeats are used to ensure that the consumer’s session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.
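As a sketch of the one-third guideline, the two settings can be set together. The values below are illustrative only and are not recommendations from this plugin.

input {
  kafka {
    # Illustrative values: 10000 is one third of 30000
    session_timeout_ms => "30000"
    heartbeat_interval_ms => "10000"
  }
}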

id

  • Value type is string
  • There is no default value for this setting.

Add a unique named ID to the plugin instance. This ID is used for tracking information for a specific configuration of the plugin and will be useful for debugging purposes.

output {
  stdout {
    id => "debug_stdout"
  }
}

If you don’t explicitly set this field, Logstash will generate a unique name.

jaas_path

  • Value type is path
  • There is no default value for this setting.

The Java Authentication and Authorization Service (JAAS) API supplies user authentication and authorization services for Kafka. This setting provides the path to the JAAS file. Sample JAAS file for Kafka client:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  renewTicket=true
  serviceName="kafka";
};

Please note that specifying jaas_path and kerberos_config in the config file will add these to the global JVM system properties. This means if you have multiple Kafka inputs, all of them would be sharing the same jaas_path and kerberos_config. If this is not desirable, you would have to run separate instances of Logstash on different JVM instances.
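A sketch of how these settings might be combined for a Kerberos-secured cluster. Both file paths are hypothetical, and the service name assumes the broker principal is kafka, as in the sample JAAS file above.

input {
  kafka {
    security_protocol => "SASL_PLAINTEXT"
    sasl_kerberos_service_name => "kafka"
    # Hypothetical paths; both are applied JVM-wide, as noted above
    jaas_path => "/etc/logstash/kafka_client_jaas.conf"
    kerberos_config => "/etc/krb5.conf"
  }
}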

kerberos_config

  • Value type is path
  • There is no default value for this setting.

Optional path to the Kerberos config file. This is krb5.conf style, as detailed in https://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html

key_deserializer_class

  • Value type is string
  • Default value is "org.apache.kafka.common.serialization.StringDeserializer"

Java class used to deserialize the record’s key.

max_partition_fetch_bytes

  • Value type is string
  • There is no default value for this setting.

The maximum amount of data per-partition the server will return. The maximum total memory used for a request will be #partitions * max.partition.fetch.bytes. This size must be at least as large as the maximum message size the server allows or else it is possible for the producer to send messages larger than the consumer can fetch. If that happens, the consumer can get stuck trying to fetch a large message on a certain partition.
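To make the memory formula concrete, here is a sketch with illustrative numbers. The value and the partition count are assumptions, not recommendations.

input {
  kafka {
    # 1 MiB per partition (illustrative value)
    max_partition_fetch_bytes => "1048576"
    # If this consumer is assigned 8 partitions, a single fetch can use up to
    # 8 * 1048576 = 8388608 bytes (8 MiB)
  }
}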

max_poll_records

  • Value type is string
  • There is no default value for this setting.

The maximum number of records returned in a single call to poll().

metadata_max_age_ms

  • Value type is string
  • There is no default value for this setting.

The period of time in milliseconds after which we force a refresh of metadata, even if we haven’t seen any partition leadership changes, to proactively discover any new brokers or partitions.

partition_assignment_strategy

  • Value type is string
  • There is no default value for this setting.

The class name of the partition assignment strategy that the client will use to distribute partition ownership amongst consumer instances.
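As a sketch, the Kafka client library ships a round-robin assignor alongside its range assignor, both in the org.apache.kafka.clients.consumer package. Selecting the round-robin one by its full class name might look like this:

input {
  kafka {
    # Class provided by the Kafka client library
    partition_assignment_strategy => "org.apache.kafka.clients.consumer.RoundRobinAssignor"
  }
}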

poll_timeout_ms

  • Value type is number
  • Default value is 100

The time, in milliseconds, that the Kafka consumer will wait to receive new messages from topics.

receive_buffer_bytes

  • Value type is string
  • There is no default value for this setting.

The size of the TCP receive buffer (SO_RCVBUF) to use when reading data.

reconnect_backoff_ms

  • Value type is string
  • There is no default value for this setting.

The amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all requests sent by the consumer to the broker.

request_timeout_ms

  • Value type is string
  • There is no default value for this setting.

This setting controls the maximum amount of time the client will wait for the response to a request. If the response is not received before the timeout elapses, the client will resend the request if necessary or fail the request if retries are exhausted.

retry_backoff_ms

  • Value type is string
  • There is no default value for this setting.

The amount of time to wait before attempting to retry a failed fetch request to a given topic partition. This avoids repeated fetching-and-failing in a tight loop.

sasl_kerberos_service_name

  • Value type is string
  • There is no default value for this setting.

The Kerberos principal name that the Kafka broker runs as. This can be defined either in Kafka’s JAAS config or in Kafka’s config.

sasl_mechanism

  • Value type is string
  • Default value is "GSSAPI"

SASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism.

security_protocol

  • Value can be any of: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
  • Default value is "PLAINTEXT"

Security protocol to use, which can be one of PLAINTEXT, SSL, SASL_PLAINTEXT, or SASL_SSL.

send_buffer_bytes

  • Value type is string
  • There is no default value for this setting.

The size of the TCP send buffer (SO_SNDBUF) to use when sending data.

session_timeout_ms

  • Value type is string
  • There is no default value for this setting.

The timeout after which, if poll() is not invoked, the consumer is marked dead and a rebalance operation is triggered for the group identified by group_id.

ssl (DEPRECATED)

  • DEPRECATED WARNING: This configuration item is deprecated and may not be available in future versions.
  • Value type is boolean
  • Default value is false

Enable SSL/TLS secured communication to Kafka broker.

ssl_key_password

  • Value type is password
  • There is no default value for this setting.

The password of the private key in the key store file.

ssl_keystore_location

  • Value type is path
  • There is no default value for this setting.

If client authentication is required, this setting stores the keystore path.

ssl_keystore_password

  • Value type is password
  • There is no default value for this setting.

If client authentication is required, this setting stores the keystore password.

ssl_keystore_type

  • Value type is string
  • There is no default value for this setting.

The keystore type.

ssl_truststore_location

  • Value type is path
  • There is no default value for this setting.

The JKS truststore path to validate the Kafka broker’s certificate.

ssl_truststore_password

  • Value type is password
  • There is no default value for this setting.

The truststore password.

ssl_truststore_type

  • Value type is string
  • There is no default value for this setting.

The truststore type.
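The SSL settings above are typically used together. The following is a sketch only: every path and password is a placeholder, and the keystore lines are needed only if the brokers require client authentication.

input {
  kafka {
    security_protocol => "SSL"
    # Hypothetical truststore used to validate the broker's certificate
    ssl_truststore_location => "/etc/logstash/kafka.client.truststore.jks"
    ssl_truststore_password => "truststore-password-here"
    # Hypothetical keystore, only for brokers that require client authentication
    ssl_keystore_location => "/etc/logstash/kafka.client.keystore.jks"
    ssl_keystore_password => "keystore-password-here"
  }
}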

tags

  • Value type is array
  • There is no default value for this setting.

Add any number of arbitrary tags to your event.

This can help with processing later.

topics

  • Value type is array
  • Default value is ["logstash"]

A list of topics to subscribe to, defaults to ["logstash"].

topics_pattern

  • Value type is string
  • There is no default value for this setting.

A topic regex pattern to subscribe to. The topics configuration will be ignored when using this configuration.
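For instance, assuming topic names that share a hypothetical app- prefix, a single pattern can subscribe to all of them:

input {
  kafka {
    # Matches hypothetical topics such as app-web and app-db;
    # any topics setting is ignored when this is used
    topics_pattern => "app-.*"
  }
}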

type

  • Value type is string
  • There is no default value for this setting.

Add a type field to all events handled by this input.

Types are used mainly for filter activation.

The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.

If you try to set a type on an event that already has one (for example when you send an event from a shipper to an indexer) then a new input will not override the existing type. A type set at the shipper stays with that event for its life even when sent to another Logstash server.
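A sketch of filter activation by type: the input sets a type, and a conditional applies a filter only to events that carry it. The type and tag names are assumptions made for this example.

input {
  kafka {
    # Hypothetical type name
    type => "kafka_app"
  }
}

filter {
  if [type] == "kafka_app" {
    mutate {
      # Hypothetical tag, added only to events from this input
      add_tag => ["from_kafka"]
    }
  }
}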

value_deserializer_class

  • Value type is string
  • Default value is "org.apache.kafka.common.serialization.StringDeserializer"

Java class used to deserialize the record’s value.