
GA Release of NEST 2.0, our .NET client for Elasticsearch

This marks the first GA release of our 2.0 client, with well over 1,000 commits since 1.7.1 (the last released NEST version in the 1.x range).

Back to the drawing board

We took some time to go back to the drawing board. NEST was originally started in 2010, and many choices had accumulated in the code base that no longer make sense. So we stepped back to properly formalize how we see the lifetime of a call and worked from that. Armed with the following diagram, we completely rewrote NEST's internals: the old Task Parallel Library (TPL) based code is now replaced with async/await, with a much saner approach to exceptions and errors, in addition to exposing enough information as an audit trail that you never have to guess what went down during a call.

Request pipeline in NEST 2.0

Our internals now also reflect this:

  • IElasticClient exposes all the Elasticsearch API endpoints, e.g. client.Search; these call into ITransport's two methods, Request and RequestAsync.
  • The default ITransport uses the passed-in IRequestPipelineFactory to create a RequestPipeline, which implements IPipeline.

This pipeline now handles all of the failover/sniffing/pinging logic and directly reflects the flow diagram.

We also simplified IConnection down to just two methods. This means the outer edges (ITransport and IConnection) are clean, and implementing your own should be really simple. All of these (and also IMemoryStreamProvider and IDateTimeProvider) can be injected through the constructor of the client. A minimal sketch of what that injection can look like follows; the URI, pool choice and the InMemoryConnection test double are illustrative, so check the constructor overloads available in your exact version:
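var pool = new StaticConnectionPool(new[] { new Uri("http://localhost:9200") });

// ConnectionSettings accepts the pieces you want to swap out;
// InMemoryConnection is the in-memory IConnection used for testing
var settings = new ConnectionSettings(pool, new InMemoryConnection());

var client = new ElasticClient(settings);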

Test Framework

Another huge endeavour is the rework of our test framework. NEST 1.x was always well tested, but it used 5 different test projects and 5 years' worth of changing our minds as to how best to write tests and assertions, thus becoming a big hodgepodge of NUnit assertions, Fluent Assertions, FakeItEasy and Moq, combined with several different ways to compare JSON with object graphs and vice versa. Trying to write a new test quickly became an exercise in yak shaving because there was no clear-cut way to write said test.

So the first thing we did as part of our 2.0 branch was to completely delete all of our tests. This act of insanity gave us carte blanche during our rewrite.

As of 2.0, we have one test project, Tests, with all tests written in such a way that they can be run in unit test mode and integration test mode. Write once, run differently. All of the API endpoint tests cover all four request variations: the two DSLs (fluent and object initializer syntax), each with synchronous and asynchronous variants, as sketched below. We also test all of the moving parts of the Elasticsearch DSL (aggregations, sorting, index settings, etc.) in the same way.
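To make the four variations concrete, here is a rough sketch of what they look like for a search (Project is a placeholder document type):

// fluent syntax, synchronous and asynchronous
var fluent = client.Search<Project>(s => s.Query(q => q.MatchAll()));
var fluentAsync = await client.SearchAsync<Project>(s => s.Query(q => q.MatchAll()));

// object initializer syntax, synchronous and asynchronous
var request = new SearchRequest<Project> { Query = new MatchAllQuery() };
var ois = client.Search<Project>(request);
var oisAsync = await client.SearchAsync<Project>(request);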

In addition to the more formal unit and integration tests, we also implemented something we dubbed Literate Testing, which allows us to write tests in a more storytelling form, with multi-line comments serving as the narrative for our asciidoc documentation while using the Roslyn compiler to pick out the interesting bits of code. This gives us the benefit of always-compiling documentation, in addition to having only one place where we document, test and assert how a piece of code is supposed to work.

Another huge component of our testing framework is the Virtual Cluster, which allows us to write tests for any situation and assert how we expect the client to behave.

For example:

/** We set up a 10 node cluster with a global request timeout of 20 seconds.
* Each call on a node takes 10 seconds, so we can only try this call on 2 nodes
* before the max request timeout kills the client call.
*/
var audit = new Auditor(() => Framework.Cluster
  .Nodes(10)
  .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(10)))
  .ClientCalls(r => r.OnPort(9209).SucceedAlways())
  .StaticConnectionPool()
  .Settings(s => s.DisablePing().RequestTimeout(TimeSpan.FromSeconds(20)))
);
audit = await audit.TraceCalls(
  new ClientCall {
    { BadResponse, 9200 }, //10 seconds
    { BadResponse, 9201 }, //20 seconds
    { MaxTimeoutReached }
  },
  /**
  * On the second client call we specify a request timeout override of 80 seconds.
  * We should now see more nodes being tried.
  */
  new ClientCall(r => r.RequestTimeout(TimeSpan.FromSeconds(80)))
  {
    { BadResponse, 9203 }, //10 seconds
    { BadResponse, 9204 }, //20 seconds
    { BadResponse, 9205 }, //30 seconds
    { BadResponse, 9206 }, //40 seconds
    { BadResponse, 9207 }, //50 seconds
    { BadResponse, 9208 }, //60 seconds
    { HealthyResponse, 9209 },
  }
);

This showcases the Virtual Cluster tests combined with Literate Tests and the extensive audit trail information available on each response (or exception).

I'm pleased to say we are back at a decent coverage rate (60%) and will continue to iterate and improve this.

Exception handling

Another big change in NEST 2.0 is how we deal with exceptions.

In NEST 1.x, the client threw a multitude of exceptions: MaxRetryException, ElasticsearchAuthException, ElasticsearchServerException, DslException, etc. This made it challenging for users to handle exceptions and invalid responses, and to understand the root cause of errors. On top of that, the types of exceptions thrown depended on what kind of IConnectionPool was injected, in order to maintain maximum backwards compatibility with NEST 0.x.

In NEST 2.x, exceptions are much more deterministic. The former ThrowOnElasticsearchServerExceptions() setting has been replaced with the more succinct ThrowExceptions(), which determines whether the client should ever throw an exception (client side or server side) or not. Furthermore, the exceptions thrown have been reduced and simplified down to three types:

ElasticsearchClientException: These are known exceptions, either an exception that occurred in the request pipeline (such as max retries or timeout reached, bad authentication, etc.) or an error that Elasticsearch itself returned (could not parse the request, bad query, missing field, etc.). If it is an Elasticsearch error, the ServerError property on the response will contain the actual error that was returned. The inner exception will always contain the root cause.

UnexpectedElasticsearchClientException: These are unknown exceptions, for instance a response from Elasticsearch that could not be properly deserialized. These are usually bugs in the client, and we encourage you to report them. This exception also inherits from ElasticsearchClientException, so an additional catch block isn't strictly necessary, but catching it separately can be helpful in distinguishing between the two.

Runtime exceptions: These are CLR exceptions like ArgumentException, ArgumentNullException, etc. that are thrown when an API in the client is misused.
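A minimal sketch of what handling this can look like, assuming ThrowExceptions() is enabled (otherwise, inspect IsValid and ServerError on the response instead); Project is a placeholder document type:

var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
  .ThrowExceptions();
var client = new ElasticClient(settings);

try
{
  var response = client.Search<Project>(s => s.Query(q => q.MatchAll()));
}
catch (UnexpectedElasticsearchClientException e)
{
  // most likely a bug in the client; please report it
}
catch (ElasticsearchClientException e)
{
  // a known pipeline or server error; the inner exception holds the root cause
}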

Breaking Changes

Even though a lot of work went into the interior, the exterior did not escape unscathed! On top of the many breaking changes that Elasticsearch 2.0 introduces, there are more than a few that NEST 2.0 introduces. We revalidated all the request and response domain objects against Elasticsearch 2.0.

A pretty complete list of breaking changes is available:

Elasticsearch 2.x support

NEST 2.0 supports all the new features in Elasticsearch 2.0, including pipeline aggregations. New features from Elasticsearch 2.2 have not yet been mapped.

Here we'll highlight just a couple of features that are reflected in the NEST changes, to whet your appetite!

Filters are Gone Gone Gone

In Elasticsearch 2.0, the query and filter constructs have merged into a single concept: queries. NEST 2.0 reflects this, so if you were previously using the AndFilter, it is now the AndQuery; beware, however, that some of these filters have been obsoleted, and chances are high you were using them wrong!
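In practice, most and/or filter combinations are better expressed as a bool query with filter clauses. A sketch of what that looks like in NEST 2.0 (Project and its properties are illustrative):

client.Search<Project>(s => s
  .Query(q => q
    .Bool(b => b
      .Filter(
        f => f.Term(p => p.Name, "NEST"),
        f => f.Range(r => r.Field(p => p.NumberOfCommits).GreaterThan(20))
      )
    )
  )
);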

Isolated descriptors

In NEST 1.x we took pride in being a 1-to-1 mapping with the Elasticsearch API. In some cases, however, this purity hid the real depth of parameters. As an example,

in NEST 1.x you could add sorts on Search() using:

client.Search<Project>(s=>s
  .SortAscending(...)
  .SortScript(...)
)

in NEST 2.0 you have to drop down a level first in order to access the same functionality:

client.Search<Project>(s=>s
  .Sort(ss=>ss
    .Field()
    .Script()
    .Ascending()
    .GeoDistance()
  )
)

This encapsulates all of the sort options properly and adheres more strictly to the 1-to-1 mapping. NEST 1.x did also contain this full descriptor; however, mixing and matching the convenience methods on the parent meant that some fluent methods were additive whilst others overwrote what was previously set. With NEST 2.0, this discrepancy is gone!
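A filled-in sketch of the new sort descriptor might look like the following (the properties on Project are illustrative):

client.Search<Project>(s => s
  .Sort(ss => ss
    .Ascending(p => p.Name)
    .Descending(p => p.StartedOn)
    .Field(f => f.Field(p => p.NumberOfCommits).Order(SortOrder.Descending))
  )
);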

This happens in more places, e.g. index settings and mappings.

Filtered query deprecation

With the removal of filters, NEST has added a special construct to its query DSL to easily create a bool query with a filter clause:

.Query(q=> +q.Term(p=>p.Name, "NEST"))

The + will cause the term query to be wrapped inside a bool query's filter clause.

You can even combine + with !:

.Query(q=> !+q.Term(p=>p.Name, "NEST"))

This will wrap the term query inside a bool filter and subsequently inside a bool must_not clause. This approach also works with the object initializer syntax (OIS):

!+new TermQuery {}
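For instance, a complete OIS query using these operators might look like this sketch (the field and value are illustrative):

var request = new SearchRequest<Project>
{
  // + wraps the term query in a bool query's filter clause,
  // ! then wraps that in a bool query's must_not clause
  Query = !+new TermQuery { Field = "name", Value = "NEST" }
};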

Attribute-based mapping

The single ElasticPropertyAttribute has been broken up into individual attributes per property type.

For instance, the following:

[ElasticType(Name = "othername", IdProperty = "MyId")]
public class Foo
{
  [ElasticProperty(Type = FieldType.String)]
  public Guid MyId { get; set; }
  [ElasticProperty(Type = FieldType.String)]
  public string Name { get; set; }
  [ElasticProperty(Type = FieldType.String, Analyzer = "myanalyzer", TermVector = TermVectorOption.WithOffsets)]
  public string Description { get; set; }
  [ElasticProperty(Type = FieldType.Date, Format = "mmddyyyy")]
  public DateTime Date { get; set; }
  [ElasticProperty(Type = FieldType.Integer, Coerce = true)]
  public int Number { get; set; }
  [ElasticProperty(Type = FieldType.Nested, IncludeInParent = true)]
  public List<Bar> Bars { get; set; }
}

becomes

[ElasticsearchType(Name = "othername", IdProperty = "MyId")]
public class Foo
{
  [String]
  public Guid MyId { get; set; }
  [String]
  public string Name { get; set; }
  [String(Analyzer = "myanalyzer", TermVector = TermVectorOption.WithOffsets)]
  public string Description { get; set; }
  [Date(Format = "mmddyyyy")]
  public DateTime Date { get; set; }
  [Number(NumberType.Integer, Coerce = true, DocValues = true)]
  public int Number { get; set; }
  [Nested(IncludeInParent = true)]
  public List<Bar> Bars { get; set; }
}

Aside from a simpler and cleaner API, this allows each attribute to only reflect the options that are available for the particular type instead of exposing options that may not be relevant (as ElasticPropertyAttribute did).
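Note that the mapping still has to be sent to Elasticsearch. A sketch of doing so with AutoMap (the index name is illustrative):

client.Map<Foo>(m => m
  .Index("my-index")
  .AutoMap() // reads the attributes on Foo to build the mapping
);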

Inferred types

Many places that only took a string or primitive type now take a more strongly typed object such as Id, Field, Fields, Index, Indices, Type, Types, DocumentPath<T>, etc. It's good to know that in most cases you can still pass a string or primitive type and it will be implicitly converted to the type where it makes sense.
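A sketch of those implicit conversions in action (the index and type names are illustrative):

// an int converts implicitly to Id / DocumentPath<Project>
var project = client.Get<Project>(1);

// strings convert implicitly to Indices and Types
var search = client.Search<Project>(s => s
  .Index("projects")
  .Type("project")
);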

If you are using C# 6 you can also statically import the Infer class with using static Nest.Infer, allowing you to write:

  • Field<Project>(p => p.Name) anywhere that takes a Field
  • Field<Project>(p => p.Name).And(p => p.Description).And("field") anywhere that takes Fields
  • Index<Project>() for Index

And so on and so forth. If you are using the fluent API, these infer methods are not required, since the fluent API is strongly typed through lambda expressions, but they are another tool at your disposal nonetheless.

C# 6 support

NEST's codebase has been largely rewritten to take advantage of all the cool new C# 6 features, making almost all of the fluent code one-liners:

using static Nest.Infer;
//later..
Field<Project>(p => p.Name);
Index<Project>();
Indices<Project>().And<Developer>();

DNX Support

The 2.0 release ships with a dotnet5.1 version on NuGet that can be used on both the Desktop and Core CLR runtimes of DNX RC1. We are actively tracking RC2, the .NET command-line interface (dotnet-cli) and the new netstandard NuGet target framework, with a plan to release compatible packages once these hit the scene.

Feedback

We'd like to thank everyone who took the alpha and rc releases out for a spin and provided invaluable input while we incubated the 2.0 release. A special shoutout to Blake Niemyjski for sharing his screen with us so we could see firsthand where the pain points were during an upgrade from 1.x to 2.0.

As always, we very much welcome all feedback on our GitHub issues page.