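To set up an enrich processor, check the prerequisites, add enrich data to your source indices, create an enrich policy, execute the policy, add an enrich processor to an ingest pipeline, and then ingest and enrich documents.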
The enrich processor performs several operations and may impact the speed of your ingest pipeline.
We strongly recommend testing and benchmarking your enrich processors before deploying them in production.
We do not recommend using the enrich processor to append real-time data. The enrich processor works best with reference data that doesn’t change frequently.
If you use Elasticsearch security features, you must have:
- read index privileges for any indices used
To begin, add documents to one or more source indices. These documents should contain the enrich data you eventually want to add to incoming documents.
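As a rough sketch of this step, the Python snippet below builds an index-document request for one reference record. The index name `users`, the document ID, and every field are hypothetical. The snippet only prints the request; sending it requires an HTTP client and a running cluster, which are left out here.

```python
import json


def index_request(index, doc_id, document):
    """Describe an index-document request as (method, path, body)."""
    return ("PUT", f"/{index}/_doc/{doc_id}", document)


# Hypothetical reference record that later enriches incoming documents.
source_doc = {
    "email": "mardy.brown@example.com",
    "first_name": "Mardy",
    "last_name": "Brown",
    "city": "New Orleans",
}

method, path, body = index_request("users", "1", source_doc)
print(method, path)
print(json.dumps(body, indent=2))
```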
After adding enrich data to your source indices, use the create enrich policy API to create an enrich policy.
Once created, you can’t update or change an enrich policy. See Update an enrich policy.
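A minimal sketch of a create enrich policy request, assuming a match policy over a hypothetical `users` source index. The policy name `users-policy`, the match field, and the enrich fields are illustrative choices, not values taken from this page.

```python
import json


def match_policy_request(name, indices, match_field, enrich_fields):
    """Describe a create enrich policy request as (method, path, body)."""
    body = {
        "match": {
            "indices": indices,              # source index or indices
            "match_field": match_field,      # field compared with incoming documents
            "enrich_fields": enrich_fields,  # fields copied into incoming documents
        }
    }
    return ("PUT", f"/_enrich/policy/{name}", body)


# All names below are hypothetical.
method, path, body = match_policy_request(
    "users-policy", "users", "email", ["first_name", "last_name", "city"]
)
print(method, path)
print(json.dumps(body, indent=2))
```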
Next, execute the enrich policy to create an enrich index. The enrich index contains documents from the policy’s source indices.
Enrich index names always begin with a fixed prefix, and the indices are force merged.
Enrich indices should be used by the enrich processor only. Avoid using enrich indices for other purposes.
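Executing a policy is a single request with no body. The sketch below describes it for a hypothetical policy named `users-policy`. It only prints the request rather than sending it.

```python
def execute_policy_request(name):
    """Describe an execute enrich policy request as (method, path, body)."""
    return ("PUT", f"/_enrich/policy/{name}/_execute", None)


# "users-policy" is a hypothetical policy name.
method, path, body = execute_policy_request("users-policy")
print(method, path)
```

The same request applies whenever the source indices change, because each execution builds a new enrich index.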
Once you have source indices, an enrich policy, and the related enrich index in place, you can set up an ingest pipeline that includes an enrich processor for your policy.
When defining the enrich processor, you must include at least the following:
- The enrich policy to use.
- The field used to match incoming documents to the documents in your enrich index.
- The target field to add to incoming documents. This target field contains the match and enrich fields specified in your enrich policy.
You also can use the max_matches option to set the number of enrich documents an incoming document can match. If set to the default of 1, data is added to an incoming document’s target field as a JSON object. Otherwise, the data is added as an array.
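The effect of max_matches on the shape of the target field can be illustrated with a small pure-Python model. This is not Elasticsearch code. The function `apply_enrich`, the field names, and the sample matches are hypothetical, and leaving the document unchanged when nothing matches is an assumption of this sketch.

```python
def apply_enrich(document, matches, target_field, max_matches=1):
    """Return a copy of document with enrich data stored under target_field."""
    enriched = dict(document)
    selected = matches[:max_matches]
    if not selected:
        # Assumption of this sketch: no match leaves the document unchanged.
        return enriched
    if max_matches == 1:
        enriched[target_field] = selected[0]  # single JSON object
    else:
        enriched[target_field] = selected     # array of objects
    return enriched


incoming = {"email": "mardy.brown@example.com"}
matches = [
    {"email": "mardy.brown@example.com", "city": "New Orleans"},
    {"email": "mardy.brown@example.com", "city": "Baton Rouge"},
]

single = apply_enrich(incoming, matches, "user")
multiple = apply_enrich(incoming, matches, "user", max_matches=2)
print(single["user"])
print(multiple["user"])
```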
See Enrich for a full list of configuration options.
You also can add other processors to your ingest pipeline.
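A minimal sketch of a pipeline definition with one enrich processor, covering the three required options plus max_matches. The pipeline name `user_lookup`, the policy name `users-policy`, and the field names are hypothetical.

```python
import json


def enrich_pipeline_request(pipeline, policy_name, field, target_field, max_matches=1):
    """Describe a create pipeline request as (method, path, body)."""
    body = {
        "description": "Enrich incoming documents with reference data",
        "processors": [
            {
                "enrich": {
                    "policy_name": policy_name,    # enrich policy to use
                    "field": field,                # field matched against the enrich index
                    "target_field": target_field,  # where the enrich data is added
                    "max_matches": max_matches,
                }
            }
        ],
    }
    return ("PUT", f"/_ingest/pipeline/{pipeline}", body)


# All names below are hypothetical.
method, path, body = enrich_pipeline_request(
    "user_lookup", "users-policy", "email", "user"
)
print(method, path)
print(json.dumps(body, indent=2))
```

Other processors can be appended to the `processors` list in the order they should run.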
You can now use your ingest pipeline to enrich and index documents.
Before implementing the pipeline in production, we recommend indexing a few test documents first and verifying enrich data was added correctly using the get API.
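The check described above might look like the following sketch: one request indexes a test document through the pipeline, a second fetches it, and a helper inspects the fetched source for the target field. The index `my-index`, the pipeline `user_lookup`, the target field `user`, and the sample response are all hypothetical. The response is hand-written for illustration and was not returned by a cluster.

```python
from urllib.parse import urlencode


def ingest_request(index, doc_id, document, pipeline):
    """Describe an index request that routes the document through a pipeline."""
    query = urlencode({"pipeline": pipeline})
    return ("PUT", f"/{index}/_doc/{doc_id}?{query}", document)


def get_request(index, doc_id):
    """Describe a get request for the indexed document."""
    return ("GET", f"/{index}/_doc/{doc_id}", None)


def has_enrich_data(get_response, target_field):
    """Report whether the fetched document contains the target field."""
    return target_field in get_response.get("_source", {})


put = ingest_request("my-index", "test-1", {"email": "mardy.brown@example.com"}, "user_lookup")
get = get_request("my-index", "test-1")
print(put[0], put[1])
print(get[0], get[1])

# Hand-written response used only to exercise the helper.
sample_response = {
    "_index": "my-index",
    "_id": "test-1",
    "_source": {
        "email": "mardy.brown@example.com",
        "user": {"email": "mardy.brown@example.com", "city": "New Orleans"},
    },
}
print(has_enrich_data(sample_response, "user"))
```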
Once created, you cannot update or index documents to an enrich index. Instead, update your source indices and execute the enrich policy again. This creates a new enrich index from your updated source indices. The previous enrich index is deleted by a delayed maintenance job. By default, this job runs every 15 minutes.
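Once created, you can’t update or change an enrich policy. Instead, you can create and execute a new enrich policy, replace the previous policy with the new one in any enrich processors that use it, and then delete the previous policy.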
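One possible replacement sequence is sketched below as an ordered list of requests: create the new policy, execute it, point the pipeline at it, and delete the old policy last. The function name, policy names, pipeline name, and request bodies are hypothetical.

```python
def replace_policy_requests(old_policy, new_policy, new_policy_body, pipeline, pipeline_body):
    """Return the ordered requests for swapping one enrich policy for another."""
    return [
        ("PUT", f"/_enrich/policy/{new_policy}", new_policy_body),  # create
        ("PUT", f"/_enrich/policy/{new_policy}/_execute", None),    # build enrich index
        ("PUT", f"/_ingest/pipeline/{pipeline}", pipeline_body),    # switch processor
        ("DELETE", f"/_enrich/policy/{old_policy}", None),          # remove old policy
    ]


# All names and bodies below are hypothetical.
new_body = {
    "match": {
        "indices": "users",
        "match_field": "email",
        "enrich_fields": ["first_name", "last_name", "city", "zip"],
    }
}
pipeline_body = {
    "processors": [
        {"enrich": {"policy_name": "users-policy-v2", "field": "email", "target_field": "user"}}
    ]
}

steps = replace_policy_requests(
    "users-policy", "users-policy-v2", new_body, "user_lookup", pipeline_body
)
for method, path, _ in steps:
    print(method, path)
```

The delete comes last because a policy that is still referenced by an ingest pipeline cannot be deleted.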
The enrich coordinator is a component that manages and performs the searches required to enrich documents on each ingest node. It combines searches from all enrich processors in all pipelines into bulk multi-searches.
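The batching idea can be illustrated with a conceptual sketch. This is not Elasticsearch's implementation: `batch_lookups`, the tuple layout, and the policy names are hypothetical, and the sketch only shows how pending lookups from several processors could be grouped into multi-search-sized batches.

```python
def batch_lookups(lookups, max_per_request):
    """Group pending lookups into batches of at most max_per_request items."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        lookups[start:start + max_per_request]
        for start in range(0, len(lookups), max_per_request)
    ]


# Hypothetical (policy, match value) pairs queued by different enrich processors.
pending = [
    ("users-policy", "a@example.com"),
    ("users-policy", "b@example.com"),
    ("geo-policy", "10.0.0.1"),
]

batches = batch_lookups(pending, 2)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```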
The enrich policy executor is a component that manages the executions of all enrich policies. When an enrich policy is executed, this component creates a new enrich index and removes the previous enrich index. The enrich policy executions are managed from the elected master node. The execution of these policies occurs on a different node.
The enrich processor has node settings for the enrich coordinator and the enrich policy executor.
The enrich coordinator supports the following node settings:
Maximum number of searches to cache for enriching documents. Defaults to 1000. There is a single cache for all enrich processors in the cluster. This setting determines the size of that cache.
Maximum number of concurrent multi-search requests to
run when enriching documents. Defaults to
Maximum number of searches to include in a multi-search
request when enriching documents. Defaults to
The enrich policy executor supports the following node settings:
Maximum batch size when reindexing a source index into an enrich index. Defaults
Maximum number of force merge attempts allowed on an
enrich index. Defaults to
How often Elasticsearch checks whether unused enrich indices can be deleted. Defaults to 15 minutes.
Maximum number of enrich policies to execute concurrently. Defaults to