13 December 2013

New in Logstash 1.3: Elasticsearch Index Template Management

By Aaron Mildenstein

For a long time in the Logstash community we’ve been advising users to apply an index mapping template. There are a number of compelling reasons to do this, including:

  • Elasticsearch tuning ("index.refresh_interval" : "5s", for example, reduces disk I/O)
  • Custom definitions for fields (numbers, strings, IP, etc.)
  • Custom lists of stopwords
  • Custom analyzers
  • Your fields can be set to not_analyzed, so terms faceting won’t see useless or non-existing data (see Jordan’s awesome screen caps of this here)

Not having that last option set has resulted in a near-constant stream of similar questions, such as “Why do the terms break up on hyphens?” Answering these questions, and preventing similar ones in the future, became one of the first tasks I was assigned as an employee at Elasticsearch.
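To make the list above concrete, here is a minimal sketch of a template that applies two of those items: the refresh interval setting and a not_analyzed string field. It is written to a file from the shell so it can be uploaded or referenced later. The file name example-template.json and the choice of the host field are illustrative assumptions, and this is not the template bundled with Logstash.

```shell
# Write a small index template to a file (the file name is hypothetical).
cat > example-template.json <<'EOF'
{
  "template": "logstash-*",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "properties": {
        "host": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
EOF
```

With a mapping like this, a terms facet on host would see each host name as a single term instead of the pieces on either side of a hyphen.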

Important note

The elasticsearch plugin uses Java API calls to manage the template, while the elasticsearch_http plugin uses REST API calls. With this comes an important caveat.

In order to use template management with the elasticsearch plugin, you must be running Elasticsearch 0.90.5 or newer. The Java API calls did not exist prior to that. If you attempt to use Logstash v1.3+ and the elasticsearch output plugin with a version of Elasticsearch older than 0.90.5, the template management features will not work, and a stack trace in the log files will indicate the absence of those API calls.

The REST API has no such constraints. If you are using the elasticsearch_http output with an older version of Elasticsearch it will still attempt to assign the template. Some new template options may not exist in very old versions of Elasticsearch, so be sure to upgrade. Upgrading is good. You want all of the performance benefits and new features that come with new releases, right?

Configuration options

Common to both the elasticsearch and elasticsearch_http plugins are the following options (with their defaults):

  • manage_template (true)
  • template_name (“logstash”)
  • template (if unset, use internal)
  • template_overwrite (false)

manage_template

The manage_template option is a boolean. Because automatic template management is on by default, the option exists only to turn it off. Who would do that? Why wouldn’t you want the awesomeness of automatic template management? One reason might be that you have dynamically named indices. For example, if you want a different index name for production logs than for staging logs, but in the same Elasticsearch cluster, you could configure that in Logstash:

output {
  elasticsearch { 
    cluster => "mycluster" 
    manage_template => false 
    index => "%{segment_name}-logstash-%{+YYYY.MM.dd}" 
  }
}

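In this example the index name is determined in part by the date and in part by an event field, segment_name. The included template targets indices named logstash-*, so it would not match names like these. When using complex index names we recommend setting the template manually.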
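Setting the template manually could look something like the sketch below, which uses the REST API. It assumes Elasticsearch is reachable at http://localhost:9200 and that a wildcard pattern of *-logstash-* covers the index names from the example. The put_template helper, the template name segment-logstash, and the file name are all hypothetical.

```shell
# Base URL of the cluster (assumption; override via the ES_URL variable).
ES_URL="${ES_URL:-http://localhost:9200}"

# Template matching indices such as production-logstash-2013.12.13.
cat > segment-template.json <<'EOF'
{
  "template": "*-logstash-*",
  "settings": { "index.refresh_interval": "5s" }
}
EOF

# Hypothetical helper: upload a template file under the given name.
# $1 = template name, $2 = path to the JSON file.
put_template() {
  curl -XPUT "$ES_URL/_template/$1?pretty" -d @"$2"
}
```

With the cluster running, put_template segment-logstash segment-template.json would store the template so that it applies to newly created indices matching the pattern.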

template_name

The template_name option determines the name under which the template is stored in Elasticsearch. The default is logstash. There is an important caveat to note with this setting. If you change the template_name option in a fully configured and running system, the template stored under the old name stays in Elasticsearch. In this case you may want to delete the old template so it’s not left around.

curl -XDELETE 'http://localhost:9200/_template/OLD_template_name?pretty'

where OLD_template_name is the previous template name.
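Before deleting, it can help to look at what is actually stored under the old name. The following is a small sketch, again assuming the cluster is at http://localhost:9200; the helper names template_url, show_template, and delete_template are made up for illustration.

```shell
# Base URL of the cluster (assumption; override via the ES_URL variable).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the REST URL for a named template.
template_url() {
  printf '%s/_template/%s?pretty' "$ES_URL" "$1"
}

# Print what is stored under a template name.
show_template() {
  curl -XGET "$(template_url "$1")"
}

# Remove the template stored under a name.
delete_template() {
  curl -XDELETE "$(template_url "$1")"
}
```

Running show_template OLD_template_name first, and delete_template OLD_template_name once you are sure, keeps the cleanup deliberate.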

template

The template configuration option is useful if you want to use the template engine but provide your own template instead of the included one. An example might be:

output {
  elasticsearch { 
    cluster => "mycluster" 
    template => "/path/to/mytemplate.json"
  }
}

Because the default is to manage templates, this configuration would read the JSON from the indicated file and attempt to upload the template if there isn’t one already there. This brings me to…

template_overwrite

When set to true, the template_overwrite option always overwrites the template stored in Elasticsearch under template_name with either the file indicated by template or the included one. This option is set to false by default. If you always want to stay up to date with the template provided by Logstash, this option could be very useful to you. Likewise, if you have your own template file managed by Puppet, for example, and you want to be able to update it regularly, this option could help there as well.
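Taken together, the options described above amount to a small decision about whether a template gets uploaded. The function below is only a model of the behavior as described in this post, not the plugin's actual code; the name should_upload and its argument order are made up for illustration.

```shell
# Model of the upload decision described in this post (not plugin code).
# Arguments: manage_template, template_overwrite, template_exists,
# each given as the string "true" or "false".
# Exit status 0 means "upload", 1 means "leave Elasticsearch alone".
should_upload() {
  manage_template="$1"
  template_overwrite="$2"
  template_exists="$3"

  # Template management disabled: never touch the stored template.
  [ "$manage_template" = "true" ] || return 1

  # Overwrite requested: always upload.
  [ "$template_overwrite" = "true" ] && return 0

  # Otherwise upload only when nothing is stored under that name yet.
  [ "$template_exists" = "false" ]
}

# Defaults (manage_template => true, template_overwrite => false)
# with a template already stored: nothing is uploaded.
if should_upload true false true; then echo "upload"; else echo "skip"; fi
# prints: skip
```

The same call with template_overwrite set to true (should_upload true true true) succeeds, which is the case where a Puppet-managed file would replace whatever is stored.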

Conclusion

Hopefully this helps make the configuration options clearer. Happy Logstashing!