NEST 5.0 released

Today, we are happy to announce the release of NEST and Elasticsearch.Net 5.0.

If you recall, our 2.0 release was full of breaking changes, not only because of the breaking changes in Elasticsearch 2.0, but also because the 2.0 client was a near-complete rewrite. Whilst this was done for very good reasons, we realize how painful it made upgrading from 1.x to 2.x for our users.

This 5.0 release reaps all the benefits of the refactoring that was done in 2.0, introducing minimal breaking changes in comparison. The sole focus of NEST 5.0 was adding support for Elasticsearch 5.0 while minimizing breaking changes as much as possible. That being said, it is still a major version bump, so some breaking changes are to be expected! See the 5.0 Breaking Changes for a complete summary.

Elasticsearch 5.0 Features

Elasticsearch 5.0 shipped with a ton of awesome new features and APIs, all of which are fully supported in the 5.0 release of NEST and Elasticsearch.Net. We'll recap some of the more notable features here.

Ingest Node

One of the biggest features in Elasticsearch 5.0 is Ingest node. As the documentation states, ingest node allows you to pre-process documents directly in Elasticsearch before indexing takes place.

There are a number of APIs and processors associated with an ingest pipeline, all of which are now supported in NEST 5.0.
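As a rough sketch of what this looks like from the client, the following registers a pipeline with a single uppercase processor and then indexes a document through it. The Person type, the "person-pipeline" id and the method name are made up for illustration, and the snippet assumes that client points at a reachable Elasticsearch 5.x node with a default index configured.

```csharp
using Nest;

// Hypothetical document type, used only for this illustration.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class IngestExample
{
    // Returns true when both the pipeline registration and the index call are valid.
    public static bool RegisterAndIndex(IElasticClient client, Person person)
    {
        // Register a pipeline whose only processor uppercases LastName.
        var putPipeline = client.PutPipeline("person-pipeline", p => p
            .Description("Uppercases the last name before indexing")
            .Processors(ps => ps
                .Uppercase<Person>(u => u.Field(f => f.LastName))
            )
        );

        // Send the document through the pipeline at index time.
        var index = client.Index(person, i => i.Pipeline("person-pipeline"));

        return putPipeline.IsValid && index.IsValid;
    }
}
```

The pipeline only needs to be registered once; subsequent index or bulk requests can then refer to it by id.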

For more information, check out our more detailed blog post, Ingest Node: A Client's Perspective, which showcases how to use ingest node with NEST.

X-Pack

NEST also fully supports all of the X-Pack APIs, including Security, Alerting, Monitoring and Graph.

Previously, Watcher (now called Alerting) 1.x APIs were supported in NEST 1.x as a separate NuGet package. This felt like a good idea at first as it allowed us to expose the necessary bits for third-party packages (official or community) to easily extend the client. However, as it transpired, this quickly became too unwieldy to support as the client and Watcher packages were on different release cycles. By moving it into the main package, we can ensure that it's always up to date and compatible with the latest version of NEST.

Client Features

In addition to the new Elasticsearch 5.0 APIs, NEST 5.0 also introduces a few new abstractions and client-specific features.

.NET Core support

When we released NEST 2.x, we supported cross-platform .NET with a dotnet 5.1 release of the client, built using the DNX toolchain. Now that DNX has been superseded by the dotnet toolchain, we continue to support cross-platform .NET in the form of netstandard 1.3 versions of the 5.x and 2.x clients, making NEST usable across Windows, OSX, Linux and any other platform that supports netstandard 1.3.

Say Heya to the all new Scalar property

The fluent and attribute-based mapping APIs in NEST are powerful, and AutoMap() does a very good job of inferring property types. There was a slight nuance, however, when mapping scalar properties (i.e. numeric and date types): it was up to the caller to supply the actual numeric type (int, float, etc.) of the property being mapped. Without this, NEST would fall back to the Elasticsearch defaults (double in 2.x and float in 5.x).

For instance, in order to map MyInteger in the following POCO as an integer in Elasticsearch, the following was required:

public class MyClass
{
    public int MyInteger { get; set; }
}

client.Map<MyClass>(m => m
    .Properties(ps => ps
        .Number(n => n
            .Name(p => p.MyInteger)
            .Type(NumberType.Integer)
            .Coerce()
            .IgnoreMalformed(false)
        )
    )
);

With the new Scalar() mapping methods, the same thing can be achieved in a less verbose and much smarter fashion:

client.Map<MyClass>(m => m
    .Properties(ps => ps
        .Scalar(p => p.MyInteger, n => n
            .Coerce()
            .IgnoreMalformed(false)
        )
    )
);

Scalar() automatically infers the correct numeric type from the CLR type of the property.

Idiomatic async support

The client now follows best practices with regard to asynchronous code execution. In previous versions of NEST, a CancellationToken could only be supplied using RequestConfiguration. We've made this much easier in 5.0, and it is now possible to provide an optional CancellationToken directly to any of the async endpoint methods.

We also call ConfigureAwait(false) on every task awaited inside the client, since the captured synchronization context is not needed there; this improves performance slightly.
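To make the new overloads concrete, here is a minimal sketch that passes a token straight to SearchAsync. The method name and the timeout-based cancellation are assumptions chosen for the example; MyClass is the POCO from the mapping section.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class AsyncExample
{
    // Hypothetical helper: counts all documents, giving up after the timeout.
    public static async Task<long> CountMatchesAsync(IElasticClient client, TimeSpan timeout)
    {
        // The token source cancels itself once the timeout elapses.
        using (var cts = new CancellationTokenSource(timeout))
        {
            // The token is passed directly as the second argument,
            // no RequestConfiguration needed.
            var response = await client
                .SearchAsync<MyClass>(s => s.MatchAll(), cts.Token)
                .ConfigureAwait(false); // this helper does not need the caller's context either

            return response.Total;
        }
    }
}
```

In an ASP.NET or UI application you would more typically pass along a token supplied by the framework rather than create one from a timeout.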

New Connection Settings

.DisableDirectStreaming() on a per-request basis

DisableDirectStreaming() is very handy for debugging as it preserves the original request and response streams so that they can be inspected. Previously, this could only be set at a global level for all requests. It is now possible to enable this on a per-request basis using RequestConfiguration, allowing you to capture ad-hoc requests and responses.
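A small sketch of the per-request form, assuming an existing client and the MyClass POCO from earlier; the helper name and the term query are illustrative only.

```csharp
using System.Text;
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class DebugExample
{
    // Hypothetical helper: runs one search with buffering enabled and
    // returns the JSON that was sent, or null if no bytes were captured.
    public static string CaptureRequestBody(IElasticClient client)
    {
        var response = client.Search<MyClass>(s => s
            // Buffer the request and response bytes for this call only.
            .RequestConfiguration(r => r.DisableDirectStreaming())
            .Query(q => q.Term(t => t.Field(f => f.MyInteger).Value(42)))
        );

        var requestBytes = response.ApiCall.RequestBodyInBytes;
        return requestBytes == null ? null : Encoding.UTF8.GetString(requestBytes);
    }
}
```

The response bytes are available in the same way through response.ApiCall.ResponseBodyInBytes. All other requests made with the same client keep streaming directly.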

.ConnectionLimit()

Since version 2.0.0, NEST has supported both Desktop CLR and .NET Core. Depending on which framework you're running, HttpConnection uses a different HTTP implementation for making requests (HttpWebRequest on the Desktop CLR and HttpClient on .NET Core). The two implementations have different connection characteristics, and each has its own way of setting the maximum number of concurrent connections (ServicePointManager and HttpClientHandler respectively).

We've chosen a sensible default of 80 maximum concurrent connections, but now also expose a new .ConnectionLimit() setting which allows you to override this regardless of the underlying HTTP implementation.
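For example, a client limited to a caller-supplied number of concurrent connections could be set up along these lines (the factory class and its parameters are hypothetical):

```csharp
using System;
using Nest;

public static class ConnectionLimitExample
{
    // Hypothetical factory: builds a client with a custom connection limit.
    public static ElasticClient CreateClient(Uri node, int maxConnections)
    {
        var settings = new ConnectionSettings(node)
            // Overrides the default of 80 concurrent connections,
            // whichever HTTP implementation is used underneath.
            .ConnectionLimit(maxConnections);

        return new ElasticClient(settings);
    }
}
```

Calling ConnectionLimitExample.CreateClient(new Uri("http://localhost:9200"), 40) would then cap the client at 40 concurrent connections.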

Serializer Buffer Size

The default JSON serializer, JsonNetSerializer, now exposes a BufferSize property that can be used to control the size of the byte buffer used when serializing to the request stream. Through benchmarking and profiling the client, a default value of 1024 was found to be a good compromise for all-round performance, but this can be overridden on a derived serializer type should you wish to use a different value.
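A derived serializer might look roughly like the sketch below. It assumes that BufferSize is a protected virtual property and that the base constructor accepts an IConnectionSettingsValues; check both against the version of NEST you are using. The class name and the value 2048 are purely illustrative.

```csharp
using Nest;

// Hypothetical serializer that doubles the default 1024-byte write buffer.
public class LargeBufferSerializer : JsonNetSerializer
{
    public LargeBufferSerializer(IConnectionSettingsValues settings) : base(settings) { }

    // Assumed to be overridable; controls the buffer used when writing
    // the serialized request to the request stream.
    protected override int BufferSize => 2048;
}
```

How the custom serializer is handed to ConnectionSettings is not shown here. Whether a larger buffer helps depends on your document sizes, so it is worth measuring before changing the default.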

BulkAll(), ScrollAll(), and ReindexAll() helpers

While Elasticsearch provides the _bulk and scroll APIs for inserting or fetching large sets of documents, it's still up to the user to size and partition the number of documents for each request appropriately, as well as parallelize the execution of the requests.

NEST now offers new helper methods, BulkAll() and ScrollAll(), which move this burden from the user to the client.

BulkAll() accepts a collection of documents and automatically partitions them into smaller batches of documents and executes individual _bulk requests in parallel.

For instance, suppose we create a lazy stream of 100k documents:

var size = 1000;
var pages = 100;
var seenPages = 0;
var numberOfDocuments = size * pages;
var documents = this.CreateLazyStreamOfDocuments(numberOfDocuments);

We can then set up our BulkAll() call as follows:

var observableBulkAll = client.BulkAll(documents, f => f
    .MaxDegreeOfParallelism(8)
    .BackOff(TimeSpan.FromSeconds(10))
    .NumberOfBackOffRetries(2)
    .Size(size)
    .RefreshOnCompleted()
    .Index(IndexName)
);

and then register an observer:

var bulkAllObserver = new PartitionedBulkObserver(
    onError: (e) => { throw e; },
    onCompleted: () => doSomething(),
    onNext: (b) => Interlocked.Increment(ref seenPages)
);
observableBulkAll.Subscribe(bulkAllObserver);

Similarly, ScrollAll() leverages sliced scrolls to execute multiple scrolls in parallel behind an IObservable:

var scrollObservable = client.ScrollAll<SmallObject>("1m", slices, s => s
    .MaxDegreeOfParallelism(slices / 2)
    .Search(search => search
        .Index(index)
        .AllTypes()
        .MatchAll()
    )
);

Lastly, NEST 5.0 also offers a reindex helper method ReindexAll() that uses both ScrollAll() and BulkAll() under the hood, taking advantage of the concurrency models of both methods.

Response changes

We've also made a few improvements to our responses.

Collections are now ReadOnly

Many responses from Elasticsearch contain JSON collections or objects. These were reflected in NEST as List<T> or Dictionary<TKey,TValue>, but in a very inconsistent manner. For instance, some responses contained IEnumerable<T> whereas others contained concrete types like List<T>; similarly, some contained IDictionary<TKey,TValue> while others contained Dictionary<TKey,TValue>.

In NEST 5.0, all response collections are now truly immutable and represented as either an IReadOnlyCollection<T> or an IReadOnlyDictionary<TKey,TValue>, and this is consistent across all response objects.

Additionally, all collections are initialized to an empty collection, removing the need to null check the collection before enumerating.
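In practice this means a response can be enumerated directly. A minimal sketch, using the MyClass POCO from the mapping section and a hypothetical helper method:

```csharp
using System.Collections.Generic;
using System.Linq;
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class ReadOnlyExample
{
    // Hypothetical helper: sums MyInteger over all returned documents.
    public static int SumIntegers(ISearchResponse<MyClass> response)
    {
        // Documents is exposed as a read-only collection. No null check is
        // needed: a response without hits yields an empty collection.
        IReadOnlyCollection<MyClass> documents = response.Documents;
        return documents.Sum(d => d.MyInteger);
    }
}
```

Code that previously mutated a response collection in place will need to copy it first, for example with ToList().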

Normalized ApiCall and CallDetails

In previous versions, IResponse contained two properties, ApiCall and CallDetails, which held all of the details about the request and response. These were effectively the same property, differing only in one small implementation detail, which made it confusing which one to use. We've normalized this in NEST 5.0 to a single ApiCall property.
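As an illustration, the helper below (a hypothetical name) summarizes any response through its ApiCall property:

```csharp
using Nest;

public static class ApiCallExample
{
    // Hypothetical helper: one-line summary of the HTTP call behind a response.
    public static string Describe(IResponse response)
    {
        var call = response.ApiCall;
        return $"{call.HttpMethod} {call.Uri} -> {call.HttpStatusCode} (success: {call.Success})";
    }
}
```

Because every NEST response implements IResponse, the same helper works for search, index, bulk and any other call.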

Breaking Changes

We've tried to avoid introducing breaking changes as much as possible (except where Elasticsearch itself breaks), and have taken extra steps to mark methods and properties as obsolete in 2.x to help our users plan their upgrade.

With that said, for the full list of breaking changes in NEST please refer to NEST 5.0 Breaking Changes, and for the full list of breaking changes in Elasticsearch.Net, refer to Elasticsearch.Net 5.0 Breaking Changes.

What's Next?

While we continue to support NEST 5.x, 2.x, and 1.x, we have already begun work on NEST 6.0, with a heap of plans cooking to further improve the client. One major goal for NEST 6.0 is removing Json.NET as a dependency and providing our own serializers with stream-based parsers that are able to work with async streams. We already have a few components under way to aid in this venture. We'd also like to improve our support for F# with NEST 6.0 and release a more F#-compatible version of the client. Stay tuned!

Lastly, a huge thank you to our community and everyone who contributed to and jumped on the RC, beta and alpha versions of NEST 5.0. To get started with using NEST 5.0, head on over to the documentation. For any issues, bug reports or feature requests, please visit our GitHub issues page, and for questions and comments, please visit our Discourse forum.