Reindex from a remote cluster
You can use reindex from remote to migrate indices from your old cluster to a new 7.0.1 cluster. This enables you to move to 7.0.1 from a pre-6.7 cluster without interrupting service.
Elasticsearch provides backwards compatibility support that enables indices from the previous major version to be upgraded to the current major version. Skipping a major version means that you must resolve any backward compatibility issues yourself.
Elasticsearch does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster.
If you use machine learning features and you’re migrating indices from a 6.5 or earlier cluster, the job and datafeed configuration information is not stored in an index. You must recreate your machine learning jobs in the new cluster. If you are migrating from a 6.6 or later cluster, it is a good idea to temporarily halt the tasks associated with your machine learning jobs and datafeeds to prevent inconsistencies between different machine learning indices that are reindexed at slightly different times. Use the set upgrade mode API or stop all datafeeds and close all machine learning jobs.
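As a rough illustration of the upgrade mode option, the sketch below builds the request line for the set upgrade mode API. It assumes the `_ml/set_upgrade_mode` endpoint with its `enabled` and `timeout` query parameters is available on the cluster version you are running; check the API reference for your version. The helper name `upgrade_mode_request` is hypothetical, and the function only describes the request without sending it.

```python
from urllib.parse import urlencode


def upgrade_mode_request(enabled, timeout=None):
    """Return (method, path) for toggling machine learning upgrade mode.

    Hypothetical helper: it only builds the request line. Send it with
    whatever HTTP client you already use.
    """
    params = {"enabled": "true" if enabled else "false"}
    if timeout is not None:
        # Optional wait time, for example "10m".
        params["timeout"] = timeout
    return ("POST", "/_ml/set_upgrade_mode?" + urlencode(params))


if __name__ == "__main__":
    # Pause machine learning tasks before reindexing, then resume afterwards.
    print(*upgrade_mode_request(True, timeout="10m"))
    print(*upgrade_mode_request(False))
```

Enabling upgrade mode before reindexing and disabling it afterwards is one way to avoid stopping every datafeed and closing every job by hand.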
To migrate your indices:
- Set up a new 7.0.1 cluster and add the existing cluster to the reindex.remote.whitelist in elasticsearch.yml:

      reindex.remote.whitelist: oldhost:9200

  The new cluster doesn’t have to start fully scaled out. As you migrate indices and shift the load to the new cluster, you can add nodes to the new cluster and remove nodes from the old one.
- For each index that you need to migrate to the new cluster:
  - Create an index with the appropriate mappings and settings. Set the refresh_interval to -1 and set number_of_replicas to 0 for faster reindexing.
  - Use the reindex API to pull documents from the remote index into the new 7.0.1 index:

        POST _reindex
        {
          "source": {
            "remote": {
              "host": "http://oldhost:9200",
              "username": "user",
              "password": "pass"
            },
            "index": "source",
            "query": {
              "match": {
                "test": "data"
              }
            }
          },
          "dest": {
            "index": "dest"
          }
        }

    If you run the reindex job in the background by setting wait_for_completion to false, the reindex request returns a task_id that you can use to monitor the progress of the reindex job with the task API: GET _tasks/TASK_ID.
  - When the reindex job completes, set the refresh_interval and number_of_replicas to the desired values (the default settings are 1s and 1).
  - Once reindexing is complete and the status of the new index is green, you can delete the old index.
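The per-index steps above can be sketched as an ordered list of requests. This is a minimal sketch, not an official client: the function name `build_migration_requests` and its parameters are hypothetical, the index names and host are the placeholders used in the example request, and the code only builds the request bodies without sending anything.

```python
import json


def build_migration_requests(source_index, dest_index, old_host,
                             final_refresh_interval, final_replicas,
                             mappings=None, username=None, password=None):
    """Return (method, path, body) tuples for migrating one index.

    Hypothetical helper: it describes the requests in order and leaves
    sending them (and waiting for the reindex task) to the caller.
    """
    # Step 1: create the destination index tuned for fast bulk writes.
    create_body = {
        "settings": {
            "index": {"refresh_interval": "-1", "number_of_replicas": 0}
        }
    }
    if mappings:
        create_body["mappings"] = mappings

    # Step 2: pull documents from the remote (old) cluster.
    remote = {"host": old_host}
    if username is not None and password is not None:
        remote["username"] = username
        remote["password"] = password
    reindex_body = {
        "source": {"remote": remote, "index": source_index},
        "dest": {"index": dest_index},
    }

    # Step 3: after the reindex task finishes, restore the desired settings.
    restore_body = {
        "index": {
            "refresh_interval": final_refresh_interval,
            "number_of_replicas": final_replicas,
        }
    }

    return [
        ("PUT", f"/{dest_index}", create_body),
        ("POST", "/_reindex?wait_for_completion=false", reindex_body),
        ("PUT", f"/{dest_index}/_settings", restore_body),
    ]


if __name__ == "__main__":
    # "1s" and 1 are illustrative target values; substitute your own.
    requests = build_migration_requests(
        "source", "dest", "http://oldhost:9200", "1s", 1)
    for method, path, body in requests:
        print(method, path)
        print(json.dumps(body, indent=2))
```

The third request should be sent only after the task returned by the reindex request has completed. Deleting the old index is deliberately left out of the sketch because it should happen only once the new index reports a green status.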