Migrate Elastic Cloud Hosted data to Elastic Cloud Serverless with Logstash
Logstash is a data collection engine that uses a large ecosystem of plugins to collect, process, and forward data from a variety of sources to a variety of destinations. Here we focus on using the Elasticsearch input plugin to read from your Elastic Cloud Hosted deployment, and the Elasticsearch output plugin to write to your Elastic Cloud Serverless project.
Familiarity with Elastic Cloud Hosted, Elasticsearch, and Logstash is helpful, but not required.
This guide focuses on a basic data migration scenario for moving static data from an Elastic Cloud Hosted deployment to an Elastic Cloud Serverless project.
The Elasticsearch input plugin offers additional configuration options that can support more advanced use cases and migrations. More information about those options is available near the end of this topic.
Before you begin, make sure you have:

- An Elastic Cloud Hosted deployment with data to migrate
- An Elastic Cloud Serverless project configured and running
- Logstash installed on your local machine or a server
- API keys in Logstash format for authentication with both the deployment and the project
Kibana assets must be migrated separately using the Kibana export/import APIs or recreated manually. Templates, data stream definitions, and ILM policies must be in place before you start data migration.
Visual components, such as dashboards and visualizations, can be migrated after you have migrated the data.
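As a rough sketch of migrating dashboards with the saved objects export and import APIs, assuming placeholder Kibana hostnames and file name. Note that the `Authorization` header expects a base64-encoded API key, not the Logstash `id:api_key` format:

```
# Export dashboards (and their dependencies) from the hosted deployment's Kibana
curl -X POST "https://<HOSTED_KIBANA_URL>/api/saved_objects/_export" \
  -H "kbn-xsrf: true" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey <HOSTED_ENCODED_API_KEY>" \
  -d '{ "type": ["dashboard"], "includeReferencesDeep": true }' \
  -o export.ndjson

# Import the exported objects into the Serverless project's Kibana
curl -X POST "https://<SERVERLESS_KIBANA_URL>/api/saved_objects/_import?overwrite=true" \
  -H "kbn-xsrf: true" \
  -H "Authorization: ApiKey <SERVERLESS_ENCODED_API_KEY>" \
  -F "file=@export.ndjson"
```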
Create a new Logstash pipeline configuration file (`migration.conf`) that uses the Elasticsearch input and the Elasticsearch output:
- The input reads from your Elastic Cloud Hosted deployment.
- The output writes to your Elastic Cloud Serverless project.
```
input {
  elasticsearch {
    cloud_id => "<HOSTED_DEPLOYMENT_CLOUD_ID>"
    api_key => "<HOSTED_API_KEY>"
    index => "index_pattern*"
    docinfo => true  # Includes metadata about each document, such as its original index name or doc ID, which can be used to preserve index information on the destination.
  }
}
```
- `cloud_id` connects Logstash to your Elastic Cloud Hosted deployment.
- `api_key` authenticates the connection.
- `index` sets the index or index pattern to read (such as `logs-*,metrics-*`).
To migrate multiple indices at the same time, use a wildcard in the index name. For example, `index => "logs-*"` migrates all indices starting with `logs-`.
```
output {
  elasticsearch {
    hosts => [ "https://<SERVERLESS_HOST_URL>:443" ]
    api_key => "<SERVERLESS_API_KEY>"
    index => "%{[@metadata][input][elasticsearch][_index]}"
  }
  stdout { codec => rubydebug { metadata => true } }
}
```
- The endpoint URL for your Serverless project; set the port to 443.
- The API key (in Logstash format) for your Serverless project.
- The `index` setting uses the metadata captured by `docinfo` to write each document to its original index name.
- The `stdout` output prints each event, including metadata, to the console for debugging; consider removing it when migrating large volumes of data.
When you create an API key for Logstash, be sure to select Logstash from the API key format dropdown. This option formats the API key in the `id:api_key` format required by Logstash.
Start Logstash:
```
bin/logstash -f migration.conf
```
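Optionally, you can check the pipeline configuration for syntax errors before starting a full run:

```
bin/logstash -f migration.conf --config.test_and_exit
```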
After running Logstash, verify that the data has been migrated successfully:
- Log in to your Elastic Cloud Serverless project.
- Navigate to Index Management and select the relevant index.
- Confirm that the migrated data is visible.
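You can also compare document counts between the source deployment and the destination project with the `_count` API. A quick sketch, assuming the `logs-*` pattern from earlier and a base64-encoded API key:

```
# Count documents on the Serverless project; run the equivalent request
# against the hosted deployment's endpoint to compare totals.
curl -H "Authorization: ApiKey <SERVERLESS_ENCODED_API_KEY>" \
  "https://<SERVERLESS_HOST_URL>:443/logs-*/_count"
```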
The Elasticsearch input includes more configuration options that offer greater flexibility and can handle more advanced migrations. Some options that can be particularly relevant for a migration use case are:
- `size`: Controls how many documents are retrieved per scroll. Larger values increase throughput but use more memory.
- `slices`: Enables parallel reads from the source index.
- `scroll`: Adjusts how long Elasticsearch keeps the scroll context alive.
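For example, here is a sketch of an input tuned for a larger migration; the values are illustrative starting points, not recommendations:

```
input {
  elasticsearch {
    cloud_id => "<HOSTED_DEPLOYMENT_CLOUD_ID>"
    api_key => "<HOSTED_API_KEY>"
    index => "logs-*"
    docinfo => true
    size => 2000    # documents per scroll page (default is 1000)
    slices => 4     # split the scroll into 4 parallel slices
    scroll => "5m"  # keep each scroll context alive for 5 minutes (default is 1m)
  }
}
```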
The Elasticsearch input plugin supports cursor-like pagination functionality, unlocking more advanced migration features, including the ability to resume migration tasks after a Logstash restart, and support for ongoing data migration over time. Tracking field options are:
- `tracking_field`: The plugin records the value of this field for the last document retrieved in a run.
- `tracking_field_seed`: Sets the starting value for `tracking_field` if no `last_run_metadata_path` is set.
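As an illustration, here is a sketch of an input that tails the source on a schedule, based on the pattern described in the plugin documentation. It assumes your documents carry a monotonically increasing `event.ingested` timestamp; consult the linked documentation for the exact query shape:

```
input {
  elasticsearch {
    cloud_id => "<HOSTED_DEPLOYMENT_CLOUD_ID>"
    api_key => "<HOSTED_API_KEY>"
    index => "logs-*"
    docinfo => true
    schedule => "* * * * *"                        # re-run the query every minute
    tracking_field => "[event][ingested]"          # value of the last document is saved between runs
    tracking_field_seed => "1970-01-01T00:00:00Z"  # where the first run starts
    # :last_value is replaced with the saved tracking value on each run
    query => '{ "query": { "range": { "event.ingested": { "gt": ":last_value" } } }, "sort": [ { "event.ingested": "asc" } ] }'
  }
}
```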
Check out the Elasticsearch input plugin documentation for more details and code samples: Tracking a field's value across runs.