March 20, 2015 Technical

Curator 3.0 Released

By Aaron Mildenstein

Curator 3.0 is a terrific update! It may not look like much has changed if you have been using a 2.x version, but there have been many improvements!

I am pleased to present these changes to you!

Why change it?

Is Curator 2 working just fine for you? I'm so glad to hear that! You can, of course, continue to use Curator 2. I was dissatisfied with the performance of some operations, though, so I decided to fix them. For example, the Elasticsearch API does not require you to delete indices one at a time, but Curator 2 did. I was also dissatisfied with the index selection parameters. And while I did have some unit testing, Curator was sorely lacking command-line integration testing, which might easily have caught some bugs. All of these, and more still, led me to write Curator 3.

Installation and Upgrading

Install the latest version of Curator with:

    pip install elasticsearch-curator

Curator can be upgraded to the latest version by running:

    pip install -U elasticsearch-curator

Note: This is a major release update. Command lines written for Curator 2 will not work with Curator 3. With some small modifications, though, they should be good to go.

Learn more about alternative methods of installation at https://github.com/elastic/curator/wiki/Installation.

What's new?

I am so glad you asked!

  • Updated API
  • Command-line based on Click
  • Pipelined index and snapshot selection
  • Simultaneous operations (where allowed)
  • Improved Python package
  • New commands
  • Improved testing
  • Python3 support restored


Let's take a brief look at each of these.

Updated API

In Curator 2, many of the API methods did their own index (or snapshot) selection. This required a staggering number of command-line options and fairly complex methods. In Curator 3, index selection and filtering are handled before any command method (close, delete, optimize, etc.) is called. The API methods now expect the list of indices to act on to be passed in as a parameter. As of Curator 3.0.1, a build_filter method has been added to help build index filters, just as the command line does.
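To make the "select first, then act" split concrete, here is a minimal, self-contained sketch. It does not use the real Curator API: FakeClient, select, and delete_indices are hypothetical names invented for illustration, and the in-memory client only stands in for an Elasticsearch connection.

```python
import re


class FakeClient:
    """Hypothetical stand-in for an Elasticsearch client (in-memory only)."""

    def __init__(self, names):
        self.names = set(names)
        self.delete_calls = []  # records each delete request for inspection

    def list_indices(self):
        return sorted(self.names)

    def delete(self, names):
        self.delete_calls.append(list(names))
        self.names -= set(names)


def select(indices, pattern):
    """Selection step: keep only the names matching a regular expression."""
    regex = re.compile(pattern)
    return [name for name in indices if regex.search(name)]


def delete_indices(client, indices):
    """Command step: acts only on the list it is handed; does no selection."""
    if indices:
        client.delete(indices)  # one request for the whole list
    return indices


client = FakeClient(["logs-2015.02.01", "logs-2015.02.02", "metrics-2015.02.01"])
working_list = select(client.list_indices(), r"^logs-")
deleted = delete_indices(client, working_list)
print(deleted)
print(client.list_indices())
```

Because the command function never filters anything itself, the same selection code can feed any command, and the whole list goes out in a single request.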

Learn more at http://curator.readthedocs.org.

Command-line based on Click

Click is an amazing tool to help build sophisticated command-line interfaces. Using Click also allows me to do full integration testing of the software as though I were issuing commands at a command-line.

The big change for the command-line in Curator 3 is the additional sub-commands indices and snapshots, for index and snapshot selection, respectively.

To see how this works, you can use the --help flag at each sub-command:

    curator --help
    curator show --help
    curator show indices --help
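For readers who have not used Click, the following is a rough sketch of how nested sub-commands and in-process command-line testing fit together. It assumes the click package is installed. The toy cli, show, and indices commands, the --prefix option, and the hard-coded index names are illustrative assumptions, not Curator's actual code.

```python
import click
from click.testing import CliRunner


@click.group()
def cli():
    """Toy top-level command (stands in for `curator`)."""


@cli.group()
def show():
    """Toy `show` command."""


@show.command()
@click.option("--prefix", default="", help="Only show names with this prefix.")
def indices(prefix):
    """Toy `indices` sub-command: print matching index names."""
    # Hard-coded names stand in for a real cluster lookup.
    for name in ["logs-2015.02.01", "logs-2015.02.02", "metrics-2015.02.01"]:
        if name.startswith(prefix):
            click.echo(name)


# CliRunner invokes the command in-process, as if it were typed at a shell.
runner = CliRunner()
result = runner.invoke(cli, ["show", "indices", "--prefix", "logs"])
help_result = runner.invoke(cli, ["show", "--help"])
print(result.exit_code)
print(result.output)
```

Each level gets its own --help for free, and a test suite can assert on result.exit_code and result.output without spawning a separate process.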

Learn more in the wiki, in index selection and snapshot selection.

Pipelined index and snapshot selection

To continue where the last section left off, the new index selection parameters can work together in ways that Curator 2 could not touch.

You can now specify a line like:

    curator delete indices --older-than 30 --newer-than 60 --time-unit days \
        --timestring '%Y.%m.%d' --prefix logs --suffix prod \
        --exclude logs-2015.02.01-prod --exclude 2015.01.31 \
        --index logs-2015.02.01-dev

This line will delete all indices that are older than 30 days but newer than 60 days and have logs as a prefix and prod as a suffix. It excludes any indices matching the given exclude patterns, and force-includes logs-2015.02.01-dev.

Simply put, Curator 2 could not achieve this level of customizability.
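The way these filters chain together can be sketched in plain Python. This is an illustrative model only, not Curator's implementation: the function names are hypothetical, the age boundaries are assumed to be strict, exclude patterns are assumed to be simple substring matches, and "now" is pinned to a fixed date so the result is reproducible.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"\d{4}\.\d{2}\.\d{2}")  # matches the '%Y.%m.%d' timestring


def index_age_days(name, now, timestring="%Y.%m.%d"):
    """Return the age in days of the date embedded in `name`, or None."""
    match = DATE_RE.search(name)
    if match is None:
        return None
    return (now - datetime.strptime(match.group(), timestring)).days


def select(names, now, prefix, suffix, older_than, newer_than, exclude, include):
    # Each stage narrows the output of the previous one.
    selected = [n for n in names if n.startswith(prefix)]
    selected = [n for n in selected if n.endswith(suffix)]
    ages = {n: index_age_days(n, now) for n in selected}
    selected = [n for n in selected
                if ages[n] is not None and older_than < ages[n] < newer_than]
    selected = [n for n in selected if not any(pat in n for pat in exclude)]
    # Force-included names bypass every filter above.
    selected += [n for n in include if n in names and n not in selected]
    return sorted(selected)


names = [
    "logs-2015.01.01-prod",     # too old
    "logs-2015.01.31-prod",     # excluded by pattern
    "logs-2015.02.01-prod",     # excluded by pattern
    "logs-2015.02.05-prod",     # selected
    "logs-2015.03.10-prod",     # too new
    "logs-2015.02.05-dev",      # wrong suffix
    "logs-2015.02.01-dev",      # wrong suffix, but force-included
    "metrics-2015.02.05-prod",  # wrong prefix
]
result = select(
    names, now=datetime(2015, 3, 20), prefix="logs", suffix="prod",
    older_than=30, newer_than=60,
    exclude=["logs-2015.02.01-prod", "2015.01.31"],
    include=["logs-2015.02.01-dev"],
)
print(result)  # ['logs-2015.02.01-dev', 'logs-2015.02.05-prod']
```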

Curator 3 also comes with an --all-indices option that selects all indices. However, if you use the delete command with --all-indices, the Kibana indices .kibana, kibana-int, and .kibana-marvel are automatically pruned from the selection, so they will not be deleted. If you want Curator to delete one or more of these indices, you have to include them explicitly with the --index flag. The --index flag bypasses all other filtering and adds the specified index to the list to be operated on.

In fact, the --index flag, if used by itself, will allow you to act on specified indices directly:

    curator delete indices --index index1 --index index2 --index index3

This allows for operations on individual indices, which previously required the use of curl or direct API calls.
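The interplay between --all-indices, the Kibana pruning on delete, and --index can be modeled with a few lines of Python. This is a simplified sketch under stated assumptions: select and its parameters are hypothetical names, and the set of protected index names is copied from the text above rather than from Curator's source.

```python
KIBANA_INDICES = {".kibana", "kibana-int", ".kibana-marvel"}  # as named in the text


def select(cluster, command, all_indices=False, forced=()):
    """Toy selection: optional select-all, Kibana pruning on delete, forced adds."""
    selected = list(cluster) if all_indices else []
    if command == "delete" and all_indices:
        # Protect the Kibana indices from a blanket delete.
        selected = [n for n in selected if n not in KIBANA_INDICES]
    # Explicitly named indices bypass the filtering above.
    selected += [n for n in forced if n in cluster and n not in selected]
    return selected


cluster = [".kibana", "kibana-int", "index1", "index2", "index3"]

blanket = select(cluster, "delete", all_indices=True)
with_kibana = select(cluster, "delete", all_indices=True, forced=[".kibana"])
direct = select(cluster, "delete", forced=["index1", "index3"])

print(blanket)      # ['index1', 'index2', 'index3']
print(with_kibana)  # ['index1', 'index2', 'index3', '.kibana']
print(direct)       # ['index1', 'index3']
```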

The best part of this change is that you can use Curator on non-time-series indices now! Curator is no longer restricted to operating on time-series indices!

Learn more in the wiki, in index selection and snapshot selection.

Simultaneous operations (where allowed)

Curator 2 would only do one action at a time. This could make some operations take far longer than necessary. Curator 3 now only acts one-at-a-time when:

  • Optimizing indices
  • Adding or removing indices from aliases
  • Deleting snapshots
  • Disabling bloom filters with a --delay specified (only in Elasticsearch versions older than 1.4)

All other operations are performed simultaneously, or in batches. By default, Elasticsearch has a 4K limit for incoming requests (this default can be changed). If the list of indices is too long, a request could exceed that limit and fail with an error. To prevent this, Curator tries to compensate automatically for extremely long lists of indices by slicing them into smaller batches. For example, 365 indices named logstash-YYYY.MM.dd take 4 or 5 batches to complete.
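As a rough illustration of the batching idea, the sketch below splits a list of index names so that each comma-joined batch stays under a character budget. The to_batches helper and the 3072-character budget are assumptions made for this example, not Curator's actual code or threshold, so the number of batches it produces will not necessarily match the figure quoted above.

```python
from datetime import date, timedelta


def to_batches(names, max_chars):
    """Split names into lists whose comma-joined form fits in max_chars.

    A single name longer than max_chars still gets a batch of its own.
    """
    batches, current, length = [], [], 0
    for name in names:
        # +1 accounts for the comma separating this name from the previous one.
        added = len(name) + (1 if current else 0)
        if current and length + added > max_chars:
            batches.append(current)
            current, length = [], 0
            added = len(name)
        current.append(name)
        length += added
    if current:
        batches.append(current)
    return batches


# One year of daily indices, e.g. logstash-2014.01.01 ... logstash-2014.12.31
names = [(date(2014, 1, 1) + timedelta(days=i)).strftime("logstash-%Y.%m.%d")
         for i in range(365)]
batches = to_batches(names, max_chars=3072)  # assumed budget, for illustration

for batch in batches:
    print(len(batch), len(",".join(batch)))
```

Each batch can then be sent as one request, which keeps the number of calls small while staying under the limit.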

Improved Python package

Because everything is segmented into separate modules in the API or CLI, the package is much cleaner. This makes adding or reviewing code much, much simpler.

New commands

Introducing the open and replicas commands. With the improved index selection, you can now re-open closed indices in a batch! You can also change the replica count of indices, even when closed!

Improved testing

While the current tests do not necessarily catch every conceivable case or condition, a lot of effort went into getting complete code coverage, meaning that every line of code is executed at least once during testing. This has already caught several potential bugs before they could make it into a released version! Code coverage is currently at 99%, and that includes integration tests that simulate the command line! For the first time in Curator's history, the CLI is being tested, and this is a great beginning.

A separate blog post will follow to discuss this in further detail.

Python3 support restored

Somewhere around version 2.1, Curator stopped working with Python3. I apologize if this affected you. I made it a top priority to make Curator 3 work with Python3. All tests pass with 99% code coverage in both Python2 and Python3!

Conclusion

Curator 3 is a leap forward in performance, usability, and reliability. I would love to have your feedback, positive or negative. As always, if you come across any issues, do not hesitate to raise an issue on GitHub.

Happy Curating!