07 January 2014

Ignore Filters: The Latest Feature in elmah.io, Courtesy of Elasticsearch

By Leslie Hawthorn

Today we’re bringing you another story of Elasticsearch in the field: elmah.io and its latest feature, Ignore Filters. elmah.io is a cloud-based error logger for .NET web applications. Thomas Ardal, one of their backend developers, was kind enough to share his story of implementing the latest functionality for elmah.io in just 30 minutes, all based on Elasticsearch’s redesigned percolator.

At elmah.io, we’ve set a course to create the best cloud-based error logging framework for .NET web applications: simply add a NuGet package to your web project, and all of your website errors are logged to elmah.io. On the elmah.io site, you can search through all of your errors and logs using a variety of search inputs. All of the errors we receive are indexed in a cluster of Elasticsearch instances, which makes the service highly available and searching through all of your errors super-fast.

Let’s talk about how we’ve used some of the nice features of Elasticsearch for some of our recently added functionality. Like any good company, we eat our own dog food and use elmah.io to perform all logging for our own website. One day, I noticed new errors in our logs with the user agent saying something like *bot*.

You’ve probably seen something similar yourself; Google or someone else is trying to request pages that don’t exist on the server or somehow manipulate the URL in an unintended way. We could sit down and implement special code handling bot requests, but what we really want is a way to tell elmah.io not to index certain types of errors that don’t yield useful, actionable information. That’s why we decided to build Ignore Filters. In this blog post, I will tell you how we did that using C#, NEST and Elasticsearch’s Percolator API.

In short, the Percolator API implements a sort of reverse query in Elasticsearch. Usually you index documents and then query them. With the Percolator, you index queries and then ask Elasticsearch which of those queries match a given document: a perfect use case for implementing ignore queries!
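To make the reverse-query idea concrete, here is a minimal in-memory sketch. It is not how Elasticsearch implements percolation, and ToyPercolator, ErrorDoc, and the two sample filters are hypothetical names made up for illustration. Queries are modeled as plain C# predicates rather than Lucene queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual illustration only: stored queries are tested against each new document.
public sealed class ToyPercolator<TDoc>
{
    // Registered queries keyed by ID; each query is modeled as a predicate.
    private readonly Dictionary<string, Func<TDoc, bool>> _queries =
        new Dictionary<string, Func<TDoc, bool>>();

    // "Index" a query so that later documents can be checked against it.
    public void Register(string id, Func<TDoc, bool> query)
    {
        _queries[id] = query;
    }

    // Return the IDs of all registered queries that match the document, sorted by ID.
    public List<string> Percolate(TDoc document)
    {
        return _queries
            .Where(q => q.Value(document))
            .Select(q => q.Key)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToList();
    }
}

// Hypothetical, heavily simplified error document.
public sealed class ErrorDoc
{
    public string UserAgent { get; set; }
    public int StatusCode { get; set; }
}

public static class ToyPercolatorDemo
{
    // Builds a percolator holding two sample ignore filters.
    public static ToyPercolator<ErrorDoc> BuildSample()
    {
        var percolator = new ToyPercolator<ErrorDoc>();
        percolator.Register("ignore-bots", d =>
            d.UserAgent != null &&
            d.UserAgent.IndexOf("bot", StringComparison.OrdinalIgnoreCase) >= 0);
        percolator.Register("ignore-404", d => d.StatusCode == 404);
        return percolator;
    }
}
```

Percolating an ErrorDoc with the user agent "Googlebot/2.1" and status code 404 against the sample returns both filter IDs, while a 500 error from a regular browser returns an empty list. An empty result is the signal that the error should be kept.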

Ignore Filters were something we had been thinking about for some time. We wanted a solution that was easy for our users to set up and possible for us to implement without spending months of development time. After reading a blog post explaining the Percolator API in Elasticsearch, I got curious and, before I knew it, I had implemented a prototype of the Ignore Filters feature. In just 30 minutes!

To get started, users need to input search queries telling elmah.io what errors to ignore. We built a new UI for this task:

The user adds their own queries or chooses one of the templates defined at the bottom. Saving a filter from C# is easy with NEST, the wonderful Elasticsearch client package written by Martijn Laarman:

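// Connect to the Elasticsearch index that should hold the ignore filter.
var connectionSettings = new ConnectionSettings(url);
var elasticClient = new ElasticClient(connectionSettings);
// Register the user's Lucene query as a percolator query under a generated ID.
var registerPercolator = elasticClient.RegisterPercolator(id, p => p.QueryString(qs => qs.Query(query)));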

We start by creating a new ElasticClient instance pointing to the index that we want to add the ignore filter to. Then, we call the RegisterPercolator method, which takes an ID that we generate from a random number, as well as a query. We’ve decided to let the user input Lucene queries in the UI, which map to the query_string query in Elasticsearch. This solution is not optimal, because users may inadvertently write slow queries this way. Recent versions of Elasticsearch include a new query type, the simple_query_string query, which we plan to migrate to in the next version of elmah.io.

And that’s it! The user is now able to register ignore filters, based on Lucene queries and the Percolator API. Each time we receive an error, we ask the Percolator for queries (ignore filters) matching the new error using NEST:

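// Ask the Percolator which registered ignore filters match the incoming error.
var percolateResponse = elasticClient.Percolate(errorDocument);
// At least one filter matched: skip indexing and respond right away.
if (percolateResponse.OK && percolateResponse.Matches.Any())
    return Request.CreateResponse(HttpStatusCode.OK);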

In other words: if the percolate request succeeded and returned at least one matching query, we return a response immediately without indexing the error. To let clients distinguish errors that were logged from errors that were dropped by an ignore filter, we return status code 200 (OK) when everything went well but the error was ignored, and status code 201 (Created) when the error was actually logged.
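The status code decision can be isolated in a small helper. This is only a sketch: IgnoreFilterResponse and Choose are hypothetical names, and it assumes that 200 means "ignored" (as in the snippet above, which returns OK on a match) and that 201 is returned once the error has been stored successfully.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;

// Hypothetical helper, not part of elmah.io or NEST.
public static class IgnoreFilterResponse
{
    // Picks the HTTP status for an incoming error, given the percolate outcome.
    public static HttpStatusCode Choose(bool percolateOk, IEnumerable<string> matchedFilterIds)
    {
        // A successful percolate call with at least one match means the error is ignored.
        if (percolateOk && matchedFilterIds.Any())
            return HttpStatusCode.OK;

        // Otherwise the error is logged; this assumes the indexing itself succeeds.
        return HttpStatusCode.Created;
    }
}
```

Note that a failed percolate request falls through to logging in this sketch, so an outage of the filter lookup would keep errors rather than silently drop them.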

Looking at the code, the entire feature looks simple. With the Percolator API, this complex feature was indeed simple to implement. Trying to do something similar with a relational database really shows the strength of a NoSQL search engine like Elasticsearch. In conclusion, the Percolator API turned out to be the perfect companion for implementing elmah.io’s Ignore Filters. The Percolator’s simplicity led us in the right direction, making the feature we built for our elmah.io users simple to use as well. We released this feature in the past few weeks, and we’re looking forward to getting more feedback from our users.

Many thanks to Thomas for sharing his experience with us.

If you are interested in learning more about the redesigned percolator forthcoming in Elasticsearch 1.0 (and already available in our beta releases), check out this presentation from Elasticsearch core developer Luca Cavanna:


Last but not least, if you’d like to see your story featured here on the Elasticsearch Developer blog, let us know!