June 13, 2014 Engineering

Elasticsearch 1.2: Adding Context to Suggestions

By Alexander Reelsen

The need for speed

If you have not yet read the introductory blog post about the completion suggester, why we built it, and what makes it so fast, you should do so right now!

Speed is nothing without control

One of the most requested features for the suggester was the ability to apply filters to the suggestions. Because the query executed for a completion suggest request is not a real search request and accesses a different data structure, applying filters was not as simple as one would hope.

We spent some time thinking about this problem, and came up with a solution we call context suggestions. The name describes the difference between this one and other suggesters: users want to get suggestions back, but in the scope of a context.

What can a scope include? Like with many things in Elasticsearch, there are endless possibilities. You, the person using your data, will know best what the correct scope is. However, we wanted to show you a couple of examples of scoping a suggestion.

Filtering by fields

The first possibility is to filter by fields. One common use case is when you want to return suggestions only for a certain type. The mapping would look like this:

DELETE /posts
PUT /posts

PUT /posts/article/_mapping
{
  "article" : {
    "properties" : {
      "suggest_field": {
        "type": "completion",
        "context": {
          "type": { 
            "type": "category",
            "path": "_type"
          }
        }
      }
    }
  }
}

PUT /posts/teaser/_mapping
{
  "teaser" : {
    "properties" : {
      "suggest_field": {
        "type": "completion",
        "context": {
          "type": { 
            "type": "category",
            "path": "_type"
          }
        }
      }
    }
  }
}

Now index two documents:

PUT /posts/article/1
{
  "suggest_field" : {
    "input" : [ "Medicine - Better than homeopathy?" ]
  }
}

PUT /posts/teaser/2
{
  "suggest_field" : {
    "input" : [ "Music - can it help plants to grow?" ]
  }
}

Now a suggestion needs to contain context information:

POST /posts/_suggest?pretty
{
    "suggest" : {
        "text" : "m",
        "completion" : {
            "field" : "suggest_field",
            "size": 10,
            "context": {
                "type": "article"
            }
        }
    }
}

Because the suggester is using the article type as its context, only the first suggestion (“Medicine - Better than homeopathy?”) will be returned. The second suggestion is ignored because its type is teaser, not article.

A very common use case for this example would be an e-commerce shop; imagine you have selected a category and only want to return products which are inside of the selected product category.
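As a minimal sketch of how a client might assemble such a request body, the Python helper below builds the same JSON structure as the suggest request shown earlier. The function name `build_suggest_body` and its parameters are hypothetical and not part of any Elasticsearch client; only the body layout is taken from the example in this post.

```python
import json


def build_suggest_body(text, field, context, size=10, name="suggest"):
    """Build a completion suggest body with a context (hypothetical helper).

    The layout mirrors the request shown in this post:
    a named suggestion holding the prefix text and a completion block.
    """
    return {
        name: {
            "text": text,
            "completion": {
                "field": field,
                "size": size,
                "context": context,
            },
        }
    }


# Same request as above: suggestions starting with "m", scoped to articles.
body = build_suggest_body("m", "suggest_field", {"type": "article"})
print(json.dumps(body, indent=2))
```

For the e-commerce case, the same helper could be called with whatever context name and value your own mapping defines for the selected product category.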

Using geo locations

Another interesting use case is to take geo locations into account. Imagine you are retrieving suggestions for restaurants; you probably want to suggest restaurants near the user. Ideally, we would filter the suggestions to include only those within about 2 km of the user's location. This means that you need to supply location information at both query time and index time. Let us take care of the mapping first:

DELETE /venues
PUT /venues

PUT /venues/poi/_mapping
{
  "poi" : {
    "properties" : {
      "suggest_field": {
        "type": "completion",
        "context": {
          "location": { 
            "type": "geo",
            "precision" : "500m"
          }
        }
      }
    }
  }
}

The next step is to index a document, which contains location information in the suggest field:

PUT /venues/poi/1
{
  "suggest_field": {
    "input": ["The Shed", "shed"],
    "context": {
      "location": {
        "lat": 51.9481442,
        "lon": -5.1817516
      }
    }
  }
}

And now, all of a sudden, you can get suggestions back which only apply to a certain area:

POST /venues/_suggest
{
  "suggest" : {
    "text" : "s",
    "completion" : {
      "field" : "suggest_field",
      "context": {
        "location": {
          "value": {
            "lat": 51.938119,
            "lon": -5.174051
          }
        }
      }
    }
  }
}
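To see how close the query location is to the indexed venue, the following Python sketch computes the great-circle distance between the two sample coordinates with the haversine formula. This is only a sanity check on the example data: the suggester itself does not compute distances this way, and the Earth radius constant is an approximation chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


venue = (51.9481442, -5.1817516)  # location indexed with "The Shed"
user = (51.938119, -5.174051)     # location sent with the suggest request

distance = haversine_km(*venue, *user)
print(f"{distance:.2f} km")  # roughly 1.2 km
```

The two points are a little over a kilometre apart, which is the kind of "nearby" relationship the geo context is meant to capture.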

Combining several contexts

Once you understand this principle, the next question is: “How can I find suggestions for all the Indian restaurants near my location?” Even though this seems trickier, the great part about the context suggester is that you can combine several contexts. You can create a completion field mapping that requires both a field value and a geo location in order to return suggestions.

So, let us create a new index with a new field mapping that contains two contexts. There are just two in this example, but you can have arbitrarily many contexts!

DELETE /venues
PUT /venues

PUT /venues/poi/_mapping
{
  "poi" : {
    "properties" : {
      "suggest_field": {
        "type": "completion",
        "context": {
          "type": { 
            "type": "category",
            "path": "type"
          },        
          "location": { 
            "type": "geo",
            "precision" : "500m"
          }
        }
      }
    }
  }
}

Next, index a new point of interest of type restaurant:

PUT /venues/poi/1
{
  "suggest_field": {
    "input": ["The Shed", "shed"],
    "output" : "The Shed - fresh sea food",
    "context": {
      "location": {
        "lat": 51.9481442,
        "lon": -5.1817516
      }
    }
  },
  "type" : "restaurant"
}

And now, use the type context and the location context to find suggestions for restaurants in that area:

POST /venues/_suggest
{
  "suggest" : {
    "text" : "s",
    "completion" : {
      "field" : "suggest_field",
      "context": {
        "type" :"restaurant",
        "location": {
          "value": {
            "lat": 51.938119,
            "lon": -5.174051
          }
        }
      }
    }
  }
}

Internally, Elasticsearch creates two prefix graphs (remember the cute graph in the completion suggester blog post?) in addition to the usual suggestion graph. The first one is for the type field and the second is for the location.

The location is a geo_point, and you may be wondering how this can be a graph. The solution is simple: the geo point is first converted into a geohash, which is a string, and a graph is then created from the geohashes. This principle is also why you can have an unlimited number of such graphs: you simply build a graph out of more prefix graphs. Examples include the category ID on your e-commerce site, the location and type of a restaurant, or the type of point of interest you want to visit.
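To make the geohash idea concrete, here is a minimal Python sketch of the standard geohash encoding, which interleaves longitude and latitude bits and emits one base-32 character per five bits. It is a generic illustration of the public algorithm, not Elasticsearch's internal code, and the function name `geohash_encode` is an assumption made for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=6):
    """Encode a lat/lon pair as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


venue_hash = geohash_encode(51.9481442, -5.1817516)  # indexed venue
user_hash = geohash_encode(51.938119, -5.174051)     # query location

print(venue_hash, user_hash)  # both start with "gchr"
```

Because nearby points tend to share a geohash prefix, a location lookup can be treated as a string prefix match, which is what lets it live in a prefix graph. In this sample the two points share only their first four characters, so a real implementation also has to account for neighbouring cells; how Elasticsearch handles that is not covered by this sketch.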

More documentation

We intentionally left out a couple of possible mapping options in this quick introduction, which you might want to read up on in the context suggester documentation.

We are very interested in the use cases you might apply this functionality to, and would love to hear back from you about this highly requested feature. Let us know what you think!