IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

« imap jmx »

› ›

jdbc

edit

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

jdbc

edit

This plugin was created as a way to ingest data in any database with a JDBC interface into Logstash. You can periodically schedule ingestion using a cron syntax (see schedule setting) or run the query one time to load data into Logstash. Each row in the resultset becomes a single event. Columns in the resultset are converted into fields in the event.

Drivers

edit

This plugin does not come packaged with JDBC driver libraries. The desired jdbc driver library must be explicitly passed in to the plugin using the jdbc_driver_library configuration option.

Scheduling

edit

Input from this plugin can be scheduled to run periodically according to a specific schedule. This scheduling syntax is powered by rufus-scheduler. The syntax is cron-like with some extensions specific to Rufus (e.g. timezone support ).

Examples:

`* 5 * 1-3 *`	will execute every minute of 5am every day of January through March.
`0 * * * *`	will execute on the 0th minute of every hour every day.
`0 6 * * * America/Chicago`	will execute at 6:00am (UTC/GMT -5) every day.

Further documentation describing this syntax can be found here.

State

edit

The plugin will persist the sql_last_value parameter in the form of a metadata file stored in the configured last_run_metadata_path. Upon query execution, this file will be updated with the current value of sql_last_value. Next time the pipeline starts up, this value will be updated by reading from the file. If clean_run is set to true, this value will be ignored and sql_last_value will be set to Jan 1, 1970, or 0 if use_column_value is true, as if no query has ever been executed.

Dealing With Large Result-sets

edit

Many JDBC drivers use the fetch_size parameter to limit how many results are pre-fetched at a time from the cursor into the client’s cache before retrieving more results from the result-set. This is configured in this plugin using the jdbc_fetch_size configuration option. No fetch size is set by default in this plugin, so the specific driver’s default size will be used.

Usage:

edit

Here is an example of setting up the plugin to fetch data from a MySQL database. First, we place the appropriate JDBC driver library in our current path (this can be placed anywhere on your filesystem). In this example, we connect to the mydb database using the user: mysql and wish to input all rows in the songs table that match a specific artist. The following examples demonstrates a possible Logstash configuration for this. The schedule option in this example will instruct the plugin to execute this input statement on the minute, every minute.

input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from songs where artist = :favorite_artist"
  }
}

Configuring SQL statement

edit

A sql statement is required for this input. This can be passed-in via a statement option in the form of a string, or read from a file (statement_filepath). File option is typically used when the SQL statement is large or cumbersome to supply in the config. The file option only supports one SQL statement. The plugin will only accept one of the options. It cannot read a statement from a file as well as from the statement configuration parameter.

Predefined Parameters

edit

Some parameters are built-in and can be used from within your queries. Here is the list:

sql_last_value

The value used to calculate which rows to query. Before any query is run, this is set to Thursday, 1 January 1970, or 0 if use_column_value is true and tracking_column is set. It is updated accordingly after subsequent queries are run.

Example:

input {
  jdbc {
    statement => "SELECT id, mycolumn1, mycolumn2 FROM my_table WHERE id > :sql_last_value"
    use_column_value => true
    tracking_column => id
    # ... other configuration bits
  }
}

Synopsis

edit

This plugin supports the following configuration options:

Required configuration options:

jdbc {
    jdbc_connection_string => ...
    jdbc_driver_class => ...
    jdbc_user => ...
}

Available configuration options:

Setting	Input type	Required	Default value
`add_field`	hash	No	`{}`
`clean_run`	boolean	No	`false`
`codec`	codec	No	`"plain"`
`jdbc_connection_string`	string	Yes
`jdbc_default_timezone`	string	No
`jdbc_driver_class`	string	Yes
`jdbc_driver_library`	string	No
`jdbc_fetch_size`	number	No
`jdbc_page_size`	number	No	`100000`
`jdbc_paging_enabled`	boolean	No	`false`
`jdbc_password`	password	No
`jdbc_pool_timeout`	number	No	`5`
`jdbc_user`	string	Yes
`jdbc_validate_connection`	boolean	No	`false`
`jdbc_validation_timeout`	number	No	`3600`
`last_run_metadata_path`	string	No	`"$USER_HOME/.logstash_jdbc_last_run"`
`lowercase_column_names`	boolean	No	`true`
`parameters`	hash	No	`{}`
`record_last_run`	boolean	No	`true`
`connection_retry_attempts`	number	No	`1`
`connection_retry_attempts_wait_time`	number	No	`0.5`
`schedule`	string	No
`sequel_opts`	hash	No	`{}`
`sql_log_level`	string, one of `["fatal", "error", "warn", "info", "debug"]`	No	`"info"`
`statement`	string	No
`statement_filepath`	a valid filesystem path	No
`tags`	array	No
`tracking_column`	string	No
`type`	string	No
`use_column_value`	boolean	No	`false`

Details

edit

`add_field`

edit

Value type is hash
Default value is {}

Add a field to an event

`clean_run`

edit

Value type is boolean
Default value is false

Whether the previous run state should be preserved

`codec`

edit

Value type is codec
Default value is "plain"

The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.

`connection_retry_attempts`

edit

Value type is number
Default value is 1

`connection_retry_attempts_wait_time`

edit

Value type is number
Default value is 0.5

`jdbc_connection_string`

edit

This is a required setting.
Value type is string
There is no default value for this setting.

JDBC connection string

`jdbc_default_timezone`

edit

Value type is string
There is no default value for this setting.

`jdbc_driver_class`

edit

This is a required setting.
Value type is string
There is no default value for this setting.

JDBC driver class to load, for example, "org.apache.derby.jdbc.ClientDriver". If you are using the Oracle JDBC driver (ojdbc6.jar) the correct jdbc_driver_class is "Java::oracle.jdbc.driver.OracleDriver"

`jdbc_driver_library`

edit

Value type is string
There is no default value for this setting.

JDBC driver library path to third party driver library. If not provided, Plugin will look for the driver class in the Logstash Java classpath.

`jdbc_fetch_size`

edit

Value type is number
There is no default value for this setting.

JDBC fetch size. if not provided, respective driver’s default will be used

`jdbc_page_size`

edit

Value type is number
Default value is 100000

JDBC page size

`jdbc_paging_enabled`

edit

Value type is boolean
Default value is false

JDBC enable paging. This will cause a sql statement to be broken up into multiple queries. Each query will use limits and offsets to collectively retrieve the full result-set. The limit size is set with jdbc_page_size.

`jdbc_password`

edit

Value type is password
There is no default value for this setting.

Password

`jdbc_pool_timeout`

edit

Value type is number
Default value is 5

`jdbc_user`

edit

This is a required setting.
Value type is string
There is no default value for this setting.

`jdbc_validate_connection`

edit

Value type is boolean
Default value is false

Connection pool configuration. Validate connection before use.

`jdbc_validation_timeout`

edit

Value type is number
Default value is 3600

Connection pool configuration. How often to validate a connection (in seconds)

`last_run_metadata_path`

edit

Value type is string
Default value is $USER_HOME/.logstash_jdbc_last_run

Path to file with last run time

`lowercase_column_names`

edit

Value type is boolean
Default value is true

Whether to force the lowercasing of identifier fields

`parameters`

edit

Value type is hash
Default value is {}

Hash of query parameter, for example { "target_id" => "321" }

`record_last_run`

edit

Value type is boolean
Default value is true

Whether to save state or not in last_run_metadata_path

`schedule`

edit

Value type is string
There is no default value for this setting.

Schedule of when to periodically run statement, in Cron format for example: "* * * * *" (execute query every minute, on the minute)

There is no schedule by default. If no schedule is given, then the statement is run exactly once.

`sequel_opts`

edit

Value type is hash
Default value is {}

General/Vendor-specific Sequel configuration options. An example of an optional connection pool configuration max_connections - The maximum number of connections the connection pool examples of vendor-specific options can be found in this documentation page: https://github.com/jeremyevans/sequel/blob/master/doc/opening_databases.rdoc

`sql_log_level`

edit

Value can be any of: fatal, error, warn, info, debug
Default value is "info"

Log level at which to log SQL queries, the accepted values are the common ones fatal, error, warn, info and debug. The default value is info.

`statement`

edit

Value type is string
There is no default value for this setting.

Statement to execute

To use parameters, use named parameter syntax. For example:

"SELECT * FROM MYTABLE WHERE id = :target_id"

here, ":target_id" is a named parameter. You can configure named parameters with the parameters setting.

`statement_filepath`

edit

Value type is path
There is no default value for this setting.

Path of file containing statement to execute

`tags`

edit

Value type is array
There is no default value for this setting.

Add any number of arbitrary tags to your event.

This can help with processing later.

`tracking_column`

edit

Value type is string
There is no default value for this setting.

If tracking column value rather than timestamp, the column whose value is to be tracked

`type`

edit

Value type is string
There is no default value for this setting.

Add a type field to all events handled by this input.

Types are used mainly for filter activation.

The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.

If you try to set a type on an event that already has one (for example when you send an event from a shipper to an indexer) then a new input will not override the existing type. A type set at the shipper stays with that event for its life even when sent to another Logstash server.

`use_column_value`

edit

Value type is boolean
Default value is false

Use an incremental column value rather than a timestamp

« imap jmx »