Client helpers
editClient helpers
editThe client comes with an handy collection of helpers to give you a more comfortable experience with some APIs.
The client helpers are experimental, and the API may change in the next minor releases. The helpers will not work in any Node.js version lower than 10.
Bulk helper
editAdded in v7.7.0
Running bulk requests can be complex due to the shape of the API, this helper aims to provide a nicer developer experience around the Bulk API.
Usage
editconst { createReadStream } = require('fs') const split = require('split2') const { Client } = require('@elastic/elasticsearch') const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const result = await client.helpers.bulk({ datasource: createReadStream('./dataset.ndjson').pipe(split()), onDocument (doc) { return { index: { _index: 'my-index' } } } }) console.log(result) // { // total: number, // failed: number, // retry: number, // successful: number, // time: number, // bytes: number, // aborted: boolean // }
To create a new instance of the Bulk helper, access it as shown in the example above, the configuration options are:
|
An array, async generator or a readable stream with the data you need to index/create/update/delete.
It can be an array of strings or objects, but also a stream of json strings or JavaScript objects. const { createReadStream } = require('fs') const split = require('split2') const b = client.helpers.bulk({ // if you just use split(), the data will be used as array of strings datasource: createReadStream('./dataset.ndjson').pipe(split()) // if you need to manipulate the data, you can pass JSON.parse to split datasource: createReadStream('./dataset.ndjson').pipe(split(JSON.parse)) }) |
|
A function that is called for each document of the datasource. Inside this function you can manipulate the document and you must return the operation you want to execute with the document. Look at the Bulk API documentation to see the supported operations. const b = client.helpers.bulk({ onDocument (doc) { return { index: { _index: 'my-index' } } } }) |
|
A function that is called for everytime a document can’t be indexed and it has reached the maximum amount of retries. const b = client.helpers.bulk({ onDrop (doc) { console.log(doc) } }) |
|
The size of the bulk body in bytes to reach before to send it. Default of 5MB. const b = client.helpers.bulk({ flushBytes: 1000000 }) |
|
How much time (in milliseconds) the helper waits before flushing the body from the last document read. const b = client.helpers.bulk({ flushInterval: 30000 }) |
|
How many request is executed at the same time. const b = client.helpers.bulk({ concurrency: 10 }) |
|
How many times a document is retried before to call the const b = client.helpers.bulk({ retries: 3 }) |
|
How much time to wait before retries in milliseconds. const b = client.helpers.bulk({ wait: 3000 }) |
|
If const b = client.helpers.bulk({ refreshOnCompletion: true // or refreshOnCompletion: 'index-name' }) |
Supported operations
editIndex
editclient.helpers.bulk({ datasource: myDatasource, onDocument (doc) { return { index: { _index: 'my-index' } } } })
Create
editclient.helpers.bulk({ datasource: myDatasource, onDocument (doc) { return { create: { _index: 'my-index', _id: doc.id } } } })
Update
editclient.helpers.bulk({ datasource: myDatasource, onDocument (doc) { // Note that the update operation requires you to return // an array, where the first element is the action, while // the second are the document option return [ { update: { _index: 'my-index', _id: doc.id } }, { doc_as_upsert: true } ] } })
Delete
editclient.helpers.bulk({ datasource: myDatasource, onDocument (doc) { return { delete: { _index: 'my-index', _id: doc.id } } } })
Abort a bulk operation
editIf needed, you can abort a bulk operation at any time. The bulk helper returns a
thenable, which has an abort
method.
The abort method stops the execution of the bulk operation, but if you are using a concurrency higher than one, the operations that are already running will not be stopped.
const { createReadStream } = require('fs') const split = require('split2') const { Client } = require('@elastic/elasticsearch') const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const b = client.helpers.bulk({ datasource: createReadStream('./dataset.ndjson').pipe(split()), onDocument (doc) { return { index: { _index: 'my-index' } } }, onDrop (doc) { b.abort() } }) console.log(await b)
Passing custom options to the Bulk API
editYou can pass any option supported by the link: Bulk API to the helper, and the helper uses those options in conjunction with the Bulk API call.
const result = await client.helpers.bulk({ datasource: [...], onDocument (doc) { return { index: { _index: 'my-index' } } }, pipeline: 'my-pipeline' })
Usage with an async generator
editconst { Client } = require('@elastic/elasticsearch') async function * generator () { const dataset = [ { user: 'jon', age: 23 }, { user: 'arya', age: 18 }, { user: 'tyrion', age: 39 } ] for (const doc of dataset) { yield doc } } const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const result = await client.helpers.bulk({ datasource: generator(), onDocument (doc) { return { index: { _index: 'my-index' } } } }) console.log(result)
Modifying a document before operation
editAdded in v8.8.2
If you need to modify documents in your datasource before it is sent to Elasticsearch, you can return an array in the onDocument
function rather than an operation object. The first item in the array must be the operation object, and the second item must be the document or partial document object as you’d like it to be sent to Elasticsearch.
const { Client } = require('@elastic/elasticsearch') const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const result = await client.helpers.bulk({ datasource: [...], onDocument (doc) { return [ { index: { _index: 'my-index' } }, { ...doc, favorite_color: 'mauve' }, ] } }) console.log(result)
Multi search helper
editAdded in v7.8.0
If you send search request at a high rate, this helper might be useful
for you. It uses the multi search API under the hood to batch the requests
and improve the overall performances of your application. The result
exposes a
documents
property as well, which allows you to access directly the hits
sources.
Usage
editconst { Client } = require('@elastic/elasticsearch') const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const m = client.helpers.msearch() m.search( { index: 'stackoverflow' }, { query: { match: { title: 'javascript' } } } ) .then(result => console.log(result.body)) // or result.documents .catch(err => console.error(err))
To create a new instance of the multi search (msearch) helper, you should access it as shown in the example above, the configuration options are:
|
How many search operations should be sent in a single msearch request. const m = client.helpers.msearch({ operations: 10 }) |
|
How much time (in milliseconds) the helper waits before flushing the operations from the last operation read. const m = client.helpers.msearch({ flushInterval: 500 }) |
|
How many request is executed at the same time. const m = client.helpers.msearch({ concurrency: 10 }) |
|
How many times an operation is retried before to resolve the request. An operation is retried only in case of a 429 error. const m = client.helpers.msearch({ retries: 3 }) |
|
How much time to wait before retries in milliseconds. const m = client.helpers.msearch({ wait: 3000 }) |
Stopping the msearch helper
editIf needed, you can stop an msearch processor at any time. The msearch helper
returns a thenable, which has an stop
method.
If you are creating multiple msearch helpers instances and using them for a
limitied period of time, remember to always use the stop
method once you have
finished using them, otherwise your application will start leaking memory.
The stop
method accepts an optional error, that will be dispatched every
subsequent search request.
The stop method stops the execution of the msearch processor, but if you are using a concurrency higher than one, the operations that are already running will not be stopped.
const { Client } = require('@elastic/elasticsearch') const client = new Client({ cloud: { id: '<cloud-id>' }, auth: { apiKey: 'base64EncodedKey' } }) const m = client.helpers.msearch() m.search( { index: 'stackoverflow' }, { query: { match: { title: 'javascript' } } } ) .then(result => console.log(result.body)) .catch(err => console.error(err)) m.search( { index: 'stackoverflow' }, { query: { match: { title: 'ruby' } } } ) .then(result => console.log(result.body)) .catch(err => console.error(err)) setImmediate(() => m.stop())
Search helper
editAdded in v7.7.0
A simple wrapper around the search API. Instead of returning the entire result
object it returns only the search documents source. For improving the
performances, this helper automatically adds filter_path=hits.hits._source
to
the query string.
const documents = await client.helpers.search({ index: 'stackoverflow', query: { match: { title: 'javascript' } } }) for (const doc of documents) { console.log(doc) }
Scroll search helper
editAdded in v7.7.0
This helpers offers a simple and intuitive way to use the scroll search API.
Once called, it returns an
async iterator
which can be used in conjuction with a for-await…of. It handles automatically
the 429
error and uses the maxRetries
option of the client.
const scrollSearch = client.helpers.scrollSearch({ index: 'stackoverflow', query: { match: { title: 'javascript' } } }) for await (const result of scrollSearch) { console.log(result) }
Clear a scroll search
editIf needed, you can clear a scroll search by calling result.clear()
:
for await (const result of scrollSearch) { if (condition) { await result.clear() } }
Quickly getting the documents
editIf you only need the documents from the result of a scroll search, you can
access them via result.documents
:
for await (const result of scrollSearch) { console.log(result.documents) }
Scroll documents helper
editAdded in v7.7.0
It works in the same way as the scroll search helper, but it returns only the
documents instead. Note, every loop cycle returns a single document, and you
can’t use the clear
method. For improving the performances, this helper
automatically adds filter_path=hits.hits._source
to the query string.
const scrollSearch = client.helpers.scrollDocuments({ index: 'stackoverflow', query: { match: { title: 'javascript' } } }) for await (const doc of scrollSearch) { console.log(doc) }