NOTE: You are looking at documentation for an older release. For the latest information, see the current release documentation.
Using the Attachment Processor in a Pipeline
Table 1. Attachment options
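| Name | Required | Default | Description |
|---|---|---|---|
| `field` | yes | - | The field to get the base64 encoded field from |
| `target_field` | no | attachment | The field that will hold the attachment information |
| `indexed_chars` | no | 100000 | The number of chars being used for extraction to prevent huge fields. Use `-1` for no limit. |
| `indexed_chars_field` | no | `null` | Field name from which you can overwrite the number of chars being used for extraction. See `indexed_chars`. |
| `properties` | no | all properties | Array of properties to select to be stored. Can be `content`, `title`, `name`, `author`, `keywords`, `date`, `content_type`, `content_length`, `language` |
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document |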
For example, this:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data"
      }
    }
  ]
}

PUT my_index/_doc/my_id?pipeline=attachment
{
  "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}

GET my_index/_doc/my_id
Returns this:
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 22,
  "_primary_term": 1,
  "_source": {
    "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
    "attachment": {
      "content_type": "application/rtf",
      "language": "ro",
      "content": "Lorem ipsum dolor sit amet",
      "content_length": 28
    }
  }
}
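The `data` value in the request above is simply the base64 encoding of a small RTF file. As a minimal sketch of how a client might prepare that value, the following Python snippet uses only the standard library; the helper name `encode_attachment` is hypothetical and not part of any Elasticsearch client.

```python
import base64


def encode_attachment(raw: bytes) -> str:
    """Return the base64 text to place in the processor's source field.

    Hypothetical helper for illustration; not an Elasticsearch API.
    """
    return base64.b64encode(raw).decode("ascii")


# The RTF bytes behind the example request (note the CRLF line endings).
rtf = b"{\\rtf1\\ansi\r\nLorem ipsum dolor sit amet\r\n\\par }"

# Request body for: PUT my_index/_doc/my_id?pipeline=attachment
document = {"data": encode_attachment(rtf)}
print(document["data"])
# e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=
```

For a real file, the bytes would typically come from reading the file in binary mode before encoding.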
To specify only some fields to be extracted:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "properties": [ "content", "title" ]
      }
    }
  ]
}
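If pipeline definitions are generated from code rather than written by hand, the request body above can be assembled programmatically. The sketch below is one possible way to do that in Python; the function `attachment_pipeline` is a hypothetical helper, and it only produces the JSON body, leaving the actual `PUT _ingest/pipeline/attachment` call to whatever HTTP client is in use.

```python
import json


def attachment_pipeline(field, properties=None):
    """Build a pipeline body with a single attachment processor.

    Hypothetical helper: `properties` is omitted from the body when not
    given, so the processor falls back to its default behavior.
    """
    processor = {"field": field}
    if properties is not None:
        processor["properties"] = list(properties)
    return {
        "description": "Extract attachment information",
        "processors": [{"attachment": processor}],
    }


# Same body as the request above: keep only the content and title properties.
body = attachment_pipeline("data", properties=["content", "title"])
print(json.dumps(body, indent=2))
```

Calling `attachment_pipeline("data")` without `properties` yields the body of the first pipeline example.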
Extracting content from binary data is a resource-intensive operation. It is highly recommended to run pipelines that use this processor on a dedicated ingest node.