29 April 2014 Engineering

DRY - Keeping Search Requests Short

By Alexander Reelsen

With the release of elasticsearch 1.1.0 it is now possible to have query and search templates for all your requests. This blog post explains the how and why.

Most of us know this principle from programming: Don’t repeat yourself. Write code once and leave out repetitive code that does not change. The same applies to search requests: there should be no need to repeat the parts of a query that rarely change. Yet when executing queries via HTTP, you have to send a similar JSON data structure over and over again.

Template Query

The first DRY helper is the template query. It consists of two parts: a template, which is simply a mustache template, and a set of parameters. The two are compiled together, and the result is executed as a normal query. Take a look at this simple example:

GET /_search
{
    "query": {
        "template": {
            "query": {"match_{{template}}": {}},
            "params" : {
                "template" : "all"
            }
        }
    }
}
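To make the compile step concrete, here is a minimal Python sketch of what happens to the example above. It is not how elasticsearch renders templates: the `render` helper is hypothetical and only handles plain `{{name}}` placeholders, while real mustache supports considerably more.

```python
import json
import re


def render(template, params):
    """Replace each {{name}} placeholder with the matching parameter value.

    Simplified illustration only: real mustache also supports sections,
    escaping rules, and more.
    """
    def substitute(match):
        return str(params[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# The template and parameters from the example above.
template = '{"match_{{template}}": {}}'
params = {"template": "all"}

rendered = render(template, params)
query = json.loads(rendered)  # the rendered text must be valid JSON
print(query)
```

After rendering, the query is an ordinary match_all query, which is what gets executed.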

Now this does not exactly save you a lot of code, does it? Very true, so let us make this example more useful. Most queries are not a simple match_all query but a longer one, in which a single search phrase (possibly consisting of several terms) is usually matched against several fields, like this:

GET /_search
{
    "query": {
        "template": {
            "query": {
                "bool" : {
                    "must" : [
                        { "match" : { "name": "{{name}}" } }
                    ],
                    "should" : [
                        { "match" : { "firstname": "{{name}}" } }
                    ]
                }
            },
            "params" : {
                "name" : "alexander"
            }
        }
    }
}

Again, this query did not become any shorter. The next step is to stop sending the query with every request and to store it on the server side instead. This is exactly one of the features of the template query. You can put the query part into a file in the config/scripts/ directory, for example config/scripts/my-script.mustache, as mustache is used for rendering. The query then becomes this short:

GET /_search
{
    "query": {
        "template": {
            "query": "my-script",
            "params" : {
                "name" : "alexander"
            }
        }
    }
}
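As a rough sketch of how the two pieces fit together, the Python below shows what the stored file might contain and how a client could build the short request. The file contents are an assumption (the query part of the previous example, unchanged), and `stored_template_query` is a hypothetical helper, not part of any client library.

```python
import json

# Assumed contents of config/scripts/my-script.mustache: the "query" part
# of the previous example, with the {{name}} placeholder left in place.
MY_SCRIPT_MUSTACHE = """
{
  "bool": {
    "must":   [ { "match": { "name": "{{name}}" } } ],
    "should": [ { "match": { "firstname": "{{name}}" } } ]
  }
}
"""


def stored_template_query(script_name, params):
    """Build the short request body that refers to a stored template by name."""
    return {"query": {"template": {"query": script_name, "params": params}}}


body = stored_template_query("my-script", {"name": "alexander"})
print(json.dumps(body, indent=2))

# Sanity check: once the placeholder is filled in, the file is plain JSON.
filled = json.loads(MY_SCRIPT_MUSTACHE.replace("{{name}}", "alexander"))
```

Only the template name and the parameters travel with each request; the query structure lives in one place on the server.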

You can add the script file while elasticsearch is running; it will be picked up automatically, without the need for a restart.

So, now we are saving some bytes per request. But can we do better? There is still some redundancy above, such as the enclosing query part. It would also be nice to template the full request, including, for example, aggregations or highlighting fields.

Search template

GET /_search/template
{
  "template": {
    "query": {
      "term": {
         "{{field}}" : "{{value}}"
      }
    },
    "aggs" : { 
      "{{field}}" : 
        { "terms" : { "field": "{{field}}"} }
    }
  },
  "params": {
    "field" : "name",
    "value" : "alexander"
  }
}

As you can see, you can template the whole request. And again you can refer to an already stored script and shorten it dramatically!

GET /_search/template
{
  "template": "my-request",
  "params": {
    "field" : "username",
    "value" : "alexander"
  }
}
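From application code, the short form above could be prepared along these lines. This is a sketch using only the Python standard library; the base URL is an assumption (a node on localhost:9200), `search_template_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request

# Assumption: a node listening on localhost:9200; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def search_template_request(template_name, params):
    """Prepare (but do not send) a search template request."""
    body = json.dumps({"template": template_name, "params": params})
    return urllib.request.Request(
        BASE_URL + "/_search/template",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


request = search_template_request(
    "my-request", {"field": "username", "value": "alexander"}
)
print(request.get_method(), request.full_url)

# To execute it against a running node:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the application only knows the template name and its parameters, the stored template can change without touching the calling code.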

In addition, you can use more complex features of the mustache templating engine; see the search documentation.

There are several reasons for this feature. Saving some bytes on the wire might be one. Another might be restricting clients to a small set of predefined search operations, for example to build a saved-searches feature, to run A/B tests by easily switching between queries, or to add ACL-driven filters to all queries. Templates could also become a way of sharing queries between applications, perhaps even ones written in different languages.

Up next…

Storing the script in the scripts directory of elasticsearch still implies you have to copy it manually to each node. Another idea might be to store it in an index or the cluster state. We will work on that as well.

Also, feel free to drop us some feedback on whether you think mustache is a good fit here or whether it makes sense to support other template languages. As mustache is a so-called logic-less template language, some people might consider it too limited, so we are eager to hear about your use case.

In addition, we’re discussing whether it is a good idea to allow a purely parameter-driven GET request without a body, like this:

GET /_search/template/my-request?field=username&value=root&size=10

An advantage of this approach might be that it is easy to cache with a proxy, if needed. What do you think?
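To illustrate what a client would have to do under this idea, here is a small Python sketch that builds such a URL. Keep in mind that the body-less form is only under discussion, not an existing endpoint, and `template_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def template_url(template_name, **params):
    """Build the body-less URL form discussed above (a proposal, not a shipped API)."""
    path = "/_search/template/" + quote(template_name, safe="")
    return path + "?" + urlencode(params)


url = template_url("my-request", field="username", value="root", size=10)
print(url)
```

Since the whole request is expressed in the URL, identical searches produce identical URLs, which is what would make them easy for a proxy to cache.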