
How to migrate from Swiftype App Search to Elastic Cloud

If you are a current App Search user on Swiftype.com and you’d like to move to Elastic Cloud to benefit from the latest developments, look no further! Here’s a sampling of the long-term benefits you’ll enjoy after migrating:

  • Faster performance, plus data locality
  • More flexibility and scalability 
  • Easier log and analytics management

Note: If you’re not yet sure about moving, you can read about the benefits of migrating.

In this blog post, we look at automating engine transfers from App Search on Swiftype.com to App Search on Elastic Cloud, using nothing but simple Python scripts and the available APIs. At the end of this process, you’ll have the same engines on Elastic Cloud that you had on Swiftype.com.

Before proceeding, we highly recommend looking over the migration guide to get an understanding of the steps involved, as well as the impact of this migration.

To get ready to move, you’ll need your existing App Search engines on Swiftype.com and a new deployment on Elastic Cloud with Enterprise Search enabled. If you’re not sure how to get started, have a look at this Quick Start to get things up and running in Elastic Cloud.

Scripting the steps

Let’s consider the steps we must take along with code examples that allow us to automate things via some Python scripting (see the full code at the end of the blog post).

Step 1: Download the Swiftype App Search engine metadata

  • First, we use App Search APIs to download all the engine metadata we need.
  • To script this step, we used 1_get_settings_st.py
    • We’ll need to input the engine names, a local directory to store files temporarily, and the API key/endpoint for Swiftype App Search.
  • Then save and run the script. This will download the schema, settings, curations, and synonyms for each engine locally to disk in .json files.
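
Before moving on, it can help to confirm that every expected file was written and parses as JSON. The following is a minimal sketch rather than part of the original scripts: the helper name check_metadata_files is hypothetical, and it assumes the file naming used by 1_get_settings_st.py (engine name, underscore, then schema, settings, curations, or synonyms).

```python
import json
from pathlib import Path

# The four suffixes match the file names written by 1_get_settings_st.py.
SUFFIXES = ("schema", "settings", "curations", "synonyms")


def check_metadata_files(data_folder, engines):
    """Return (path, problem) tuples for missing or unparseable files."""
    problems = []
    for engine in engines:
        for suffix in SUFFIXES:
            path = Path(data_folder) / f"{engine}_{suffix}.json"
            if not path.is_file():
                problems.append((str(path), "missing"))
                continue
            try:
                json.loads(path.read_text())
            except json.JSONDecodeError:
                problems.append((str(path), "invalid JSON"))
    return problems
```

Calling check_metadata_files('/tmp/', ['dev-products', 'prod-products']) returns an empty list when all eight files are in place. Note that this only checks that the files are readable JSON; it does not confirm that the API returned the data you expected.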

Step 2: Create new Elastic Cloud App Search engines

  • For this step, 2_create_eng_ec iterates over the engine names and creates each engine on Elastic Cloud.
    • We’ll need to input the same engine names and directory from the previous step.
    • This time we'll need to input the Elastic Cloud App Search API key/endpoint.
  • It also populates the schema for each of the created engines with the previously downloaded schemas.
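
Once the engines exist, one way to confirm that the schema arrived intact is to fetch it from the new engine and compare it with the downloaded file. The sketch below is an illustration under assumptions, not part of the original scripts: schema_diff is a hypothetical helper, and it assumes both schemas are flat JSON objects that map field names to field types.

```python
def schema_diff(expected, actual):
    """Compare two schemas given as {field_name: field_type} dicts."""
    # Fields present in the downloaded schema but absent from the new engine
    missing = sorted(f for f in expected if f not in actual)
    # Fields the new engine has that the downloaded schema does not
    extra = sorted(f for f in actual if f not in expected)
    # Fields present on both sides with different types
    mismatched = {
        f: (expected[f], actual[f])
        for f in expected
        if f in actual and expected[f] != actual[f]
    }
    return {"missing": missing, "extra": extra, "mismatched": mismatched}
```

For example, you could load the local _schema.json file as expected, pass the JSON body of a GET request to the new engine's schema endpoint as actual, and treat any non-empty entry in the result as something to investigate.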

Step 3: Backfill documents in Elastic Cloud App Search (optional)

  • We strongly recommend performing this step by backfilling directly from source. It alleviates unnecessary complications. However, we know this isn’t always possible, so the script will move these over for you if needed.
  • The script 3_migrate_docs iterates over pages of documents in Swiftype App Search and copies them over to App Search on Elastic Cloud.
    • Once again, we’ll need the API key/endpoint for both instances of App Search.
    • This time we don’t use any local storage; documents are simply held in memory and migrated over in chunks.
  • Note: The List Documents API can retrieve at most 10,000 documents, so if an engine holds more documents than that, the script will not move all of them.
    Instead, you’ll need to use the Search API to retrieve the documents in slices of fewer than 10,000 and populate them in chunks. We don’t have a standard piece of code to automate this process, as it will be unique to the data being moved. However, if you’d like assistance with this, Elastic offers Consulting and Support services to help you through it.
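
If you do go the Search API route, the building blocks tend to look similar even though the slicing itself depends on your data. The sketch below is illustrative only and rests on several assumptions: that each search result wraps field values as {"raw": value}, that keys starting with an underscore (such as "_meta") are search metadata, that range filters use "from" and "to" bounds, and that indexing requests are sent in batches of up to 100 documents. All three function names are hypothetical.

```python
def flatten_search_result(result):
    """Turn one search result into a plain document for re-indexing."""
    doc = {}
    for field, value in result.items():
        if field.startswith("_"):
            continue  # skip search metadata such as "_meta"
        if isinstance(value, dict) and "raw" in value:
            doc[field] = value["raw"]  # keep only the raw field value
    return doc


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def range_filter(field, lower, upper):
    """Build a filter body that selects one slice of a numeric or date field."""
    return {field: {"from": lower, "to": upper}}
```

The idea is to pick a field whose ranges split the engine into slices of fewer than 10,000 documents each, page through the search results for each slice, flatten every result, and post the documents to the new engine one batch at a time. Fields that are returned without a raw value are dropped by this sketch, so check the result fields configuration before relying on it.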

Step 4: Migrate the search settings, curations, and synonyms

  • These were previously downloaded in Step 1.
  • The script 4_add_settings.py iterates over all engines and adds the downloaded settings, curations, and synonyms to each one in Elastic Cloud App Search.
    • We’ll need to populate the App Search API key/endpoint and local directory path again.
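
When replaying curations and synonyms, each item needs two small adjustments: the id from the old engine is removed, since the new engine assigns its own, and the body is serialized as JSON. Here is a minimal sketch of that preparation step. The helper name creation_payloads is hypothetical, and it assumes the downloaded files hold the items under a "results" key, as the scripts in the appendix do.

```python
import json


def creation_payloads(listing):
    """Yield one JSON request body per item in listing["results"], minus "id"."""
    for item in listing.get("results", []):
        # Build a copy without the old ID so the loaded data stays untouched
        body = {key: value for key, value in item.items() if key != "id"}
        yield json.dumps(body)
```

Each yielded string can be sent as the body of a POST request to the curations or synonyms endpoint of the new engine.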

Step 5: Migrate the result settings and user roles

Finishing up

After testing, running, and verifying the results of the steps above, there are a few things to iron out:

  • The client applications currently consuming App Search APIs on Swiftype.com will need to be switched over to Elastic Cloud.
  • The parity between both engines should be tested by sending queries and writes to the new engines.
    • If documents were added to the old engine while the migration was in progress, the new data needs to be copied over to ensure it’s completely up to date.
  • The analytics from Swiftype.com can be exported and backed up if you need them.
  • Finally, after verifying that there are no new events in the Swiftype App Search API logs, we can decommission the old engines.
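
One simple way to test parity is to send the same query to both engines and compare the top results. The sketch below is an assumption-laden illustration rather than an official tool: compare_top_results and result_ids are hypothetical names, and the code assumes each search response holds its hits under "results" with the document ID at ["id"]["raw"].

```python
def result_ids(search_response, top=10):
    """Return the document IDs of the first `top` results, in order."""
    return [r["id"]["raw"] for r in search_response["results"][:top]]


def compare_top_results(old_response, new_response, top=10):
    """Compare the top results of one query run against both engines."""
    old_ids = result_ids(old_response, top)
    new_ids = result_ids(new_response, top)
    return {
        "same_order": old_ids == new_ids,
        "only_old": sorted(set(old_ids) - set(new_ids)),
        "only_new": sorted(set(new_ids) - set(old_ids)),
    }
```

Running this over a list of representative queries gives a quick signal: IDs that appear only on the old side may point to documents added during the migration, while a different order with the same IDs may point to settings or curations that were not carried over.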

More info

To help you learn more, we put together a webinar that highlights all the benefits of migrating to App Search on Elastic Cloud, including an overview of the process.

If you’d like to take a spin around App Search on Elastic Cloud, you can always start up a free trial.

Appendix

1_get_settings_st.py

import requests
import json

# Populate API key, host, temp folder and engine names
headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer private-<API_KEY>'}
host = 'https://host-abc123.api.swiftype.com'
data_folder = '/tmp/'
engines = ['dev-products', 'prod-products']

for engine in engines:
    st_api = host+'/api/as/v1/engines/'+engine+'/'
    print('Downloading engine metadata: '+engine)
    # Get schema
    st_schema = st_api+'schema'
    r = requests.get(st_schema, headers=headers)
    with open(data_folder+engine+'_schema.json', "w") as file:
        file.write(json.dumps(r.json()))
    # Get search settings
    st_settings = st_api+'search_settings'
    r = requests.get(st_settings, headers=headers)
    with open(data_folder+engine+'_settings.json', "w") as file:
        file.write(json.dumps(r.json()))
    # Get curations
    st_curations = st_api+'curations'
    r = requests.get(st_curations, headers=headers)
    with open(data_folder+engine+'_curations.json', "w") as file:
        file.write(json.dumps(r.json()))
    # Get synonyms
    st_synonyms = st_api+'synonyms'
    r = requests.get(st_synonyms, headers=headers)
    with open(data_folder+engine+'_synonyms.json', "w") as file:
        file.write(json.dumps(r.json()))
2_create_eng_ec

import requests
import json

# Populate API key, host, temp folder and engine names
headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer private-<API_KEY>'}
data_folder = '/tmp/'
host = 'https://abc123.ent-search.westeurope.azure.elastic-cloud.com'
engines = ['dev-products', 'prod-products']

for engine in engines:
    ec_api = host+'/api/as/v1/engines/'
    # Create engine
    r = requests.post(ec_api, headers=headers, json={"name": engine})
    print('Create '+engine+' response: '+str(r.status_code))
    # Update schema from the file downloaded in step 1
    ec_schema = ec_api+engine+'/schema'
    with open(data_folder+engine+'_schema.json', 'rb') as file:
        schema = file.read()
    r = requests.post(ec_schema, headers=headers, data=schema)
    print('Add schema '+engine+' response: '+str(r.status_code))
3_migrate_docs

import requests
import json

# Populate API keys, hosts, engine names
headers_st = {'Content-Type': 'application/json', 'Authorization': 'Bearer private-<API_KEY>'}
headers_ec = {'Content-Type': 'application/json', 'Authorization': 'Bearer private-<API_KEY>'}
st_host = 'https://host-abc123.api.swiftype.com'
ec_host = 'https://abc123.ent-search.westeurope.azure.elastic-cloud.com'
engines = ['dev-products', 'prod-products']

# Migrate documents between services, up to 10,000
for engine in engines:
    st_docs = st_host+'/api/as/v1/engines/'+engine+'/documents/list'
    ec_docs = ec_host+'/api/as/v1/engines/'+engine+'/documents'
    page = 1  # pages are numbered from 1
    print('Migrating docs in '+engine)
    while True:
        # Fetch one page of documents from the old engine
        r_st = requests.get(st_docs, headers=headers_st, params={'page[current]': page})
        body = r_st.json()
        # Index that page into the new engine
        docs = json.dumps(body['results'])
        r_ec = requests.post(ec_docs, headers=headers_ec, data=docs)
        print(r_ec.status_code)
        # Print info, check if we break
        meta = body['meta']['page']
        print('Page '+str(meta['current'])+' of '+str(meta['total_pages']))
        # >= also ends the loop when an engine reports zero pages
        if meta['current'] >= meta['total_pages']:
            break
        page += 1
4_add_settings.py

import requests
import json

# Populate API key, host, temp folder and engine names
headers = {'Content-Type': 'application/json', 'Authorization': 'Bearer private-<API_KEY>'}
data_folder = '/tmp/'
host = 'https://abc123.ent-search.westeurope.azure.elastic-cloud.com'
engines = ['dev-products', 'prod-products']

# Update search settings, curations and synonyms for these engines
for engine in engines:
    ec_api = host+'/api/as/v1/engines/'+engine
    # Update settings
    ec_settings = ec_api+'/search_settings'
    with open(data_folder+engine+'_settings.json', 'rb') as file:
        settings = file.read()
    r = requests.put(ec_settings, headers=headers, data=settings)
    print('Adding settings for '+engine+': '+str(r.status_code))
    # Update curations
    ec_curations = ec_api+'/curations'
    with open(data_folder+engine+'_curations.json') as json_file:
        data = json.load(json_file)
    for c in data['results']:
        del c['id']  # drop the ID assigned by the old engine
        # Send the curation as a JSON body
        r = requests.post(ec_curations, headers=headers, data=json.dumps(c))
        print('Adding curation '+str(c)+': '+str(r.status_code))
    # Update synonyms
    ec_synonyms = ec_api+'/synonyms'
    with open(data_folder+engine+'_synonyms.json') as json_file:
        data = json.load(json_file)
    for c in data['results']:
        del c['id']  # drop the ID assigned by the old engine
        # Send the synonym set as a JSON body
        r = requests.post(ec_synonyms, headers=headers, data=json.dumps(c))
        print('Adding synonym '+str(c)+': '+str(r.status_code))