IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Extract information

These NLP tasks enable you to extract information from your unstructured text:

Named entity recognition

The named entity recognition (NER) task can identify and categorize certain entities – typically proper nouns – in your unstructured text. Named entities usually refer to objects in the real world such as persons, locations, organizations, and other miscellaneous entities that are consistently referenced by a proper name.

NER is a useful tool for identifying key information, adding structure, and gaining insight into your content. It’s particularly useful when processing and exploring large collections of text such as news articles, wiki pages, or websites. It makes it easier to understand the subject of a text and to group similar pieces of content together.

In the following example, a short text is analyzed for named entities. The model extracts not only single-word entities but also entities that span multiple words, such as "Mountain View".

...
{
    "text_field": "Elastic is headquartered in Mountain View, California."
}
...

The task returns the following result:

...
{
  "results": [
    {
      "entity": "Elastic",
      "class": "organization"
    },
    {
      "entity": "Mountain View",
      "class": "location"
    },
    {
      "entity": "California",
      "class": "location"
    }
  ]
}
...
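
Once entities have been extracted, a common next step is to group them by class so that the text gains some structure. The following Python snippet is a minimal sketch of that step. It assumes the simplified result shape shown in the example above (a `results` list of objects with `entity` and `class` keys); the actual response of a deployed model may contain additional or differently named fields. The helper name `group_entities_by_class` is hypothetical and not part of any Elastic client.

```python
from collections import defaultdict

# Simplified NER result, copied from the example above.
# Assumption: a real response may carry more fields than these two.
ner_result = {
    "results": [
        {"entity": "Elastic", "class": "organization"},
        {"entity": "Mountain View", "class": "location"},
        {"entity": "California", "class": "location"},
    ]
}


def group_entities_by_class(result):
    """Map each entity class to the list of entities found for it."""
    grouped = defaultdict(list)
    for item in result.get("results", []):
        grouped[item["class"]].append(item["entity"])
    return dict(grouped)


grouped = group_entities_by_class(ner_result)
print(grouped)
# {'organization': ['Elastic'], 'location': ['Mountain View', 'California']}
```

Grouping by class keeps multi-word entities such as "Mountain View" intact, because each entity is handled as a single string rather than as separate words.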

Fill-mask

The objective of the fill-mask task is to predict a missing word from a text sequence. The model uses the context of the masked word to predict the most likely word to complete the text.

The fill-mask task can be used to quickly and easily test your model.

In the following example, the special word “[MASK]” is used as a placeholder to tell the model which word to predict.

...
{
    "input": "The capital city of France is [MASK]."
}
...

The task returns the following result:

...
{
  "result": "Paris"
}
...
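
To illustrate how the placeholder and the prediction fit together, the following Python snippet is a minimal sketch that builds a masked input and substitutes a predicted word back into it. It assumes the simplified `input` and `result` shapes shown in the example above and does not call a model; the helper names `mask_word` and `fill_in` are hypothetical.

```python
MASK_TOKEN = "[MASK]"  # placeholder used in the example above


def mask_word(text, word):
    """Replace the first occurrence of `word` with the mask placeholder."""
    if word not in text:
        raise ValueError(f"{word!r} not found in text")
    return {"input": text.replace(word, MASK_TOKEN, 1)}


def fill_in(request, response):
    """Substitute the predicted word back into the masked input."""
    return request["input"].replace(MASK_TOKEN, response["result"], 1)


request = mask_word("The capital city of France is Paris.", "Paris")

# Simplified result copied from the example above; a real model
# response may contain additional fields.
response = {"result": "Paris"}

print(request["input"])             # The capital city of France is [MASK].
print(fill_in(request, response))   # The capital city of France is Paris.
```

Masking a word whose correct value is already known, as done here, is one simple way to check whether a model's prediction matches expectations.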
