NEST 5.0 released
Today, we are happy to announce the release of NEST and Elasticsearch.Net 5.0.
If you recall, our 2.0 release was full of breaking changes, not only due to the breaking changes in Elasticsearch 2.0, but also because the 2.0 client was a near complete rewrite. Whilst this was done for very good reasons, we realize how painful it made upgrading from 1.x to 2.x for our users.
This 5.0 release reaps all the benefits of the refactoring that was done in 2.0, introducing minimal breaking changes in comparison. The sole focus of NEST 5.0 was adding support for Elasticsearch 5.0 while minimizing breaking changes as much as possible. That being said, it is still a major version bump, so some breaking changes are to be expected! See the 5.0 Breaking Changes for a complete summary.
Elasticsearch 5.0 Features
Elasticsearch 5.0 shipped with a ton of awesome new features and APIs, all of which are fully supported in the 5.0 release of NEST and Elasticsearch.Net. We'll recap some of the more notable features here.
Ingest Node
One of the biggest features in Elasticsearch 5.0 is Ingest node. As the documentation states, ingest node allows you to pre-process documents directly in Elasticsearch before indexing takes place.
There are a number of APIs and processors associated with an ingest pipeline, all of which are now supported in NEST 5.0.
For more information, check out our more detailed blog post Ingest Node: A Client's Perspective which showcases how to use ingest node with NEST.
X-Pack
NEST also fully supports all of the X-Pack APIs, including Security, Alerting, Monitoring and Graph.
Previously, Watcher (now called Alerting) 1.x APIs were supported in NEST 1.x as a separate NuGet package. This felt like a good idea at first as it allowed us to expose the necessary bits for third-party packages (official or community) to easily extend the client. However, as it transpired, this quickly became too unwieldy to support as the client and Watcher packages were on different release cycles. By moving it into the main package, we can ensure that it's always up to date and compatible with the latest version of NEST.
Client Features
In addition to the new Elasticsearch 5.0 APIs, NEST 5.0 also introduces a few new abstractions and client-specific features.
.NET Core support
When we released NEST 2.x, we supported cross-platform .NET with a dotnet 5.1 release of the client, built using the DNX toolchain. Now that DNX has been superseded by the dotnet toolchain, we continue to support cross-platform .NET in the form of netstandard 1.3 versions of the 5.x and 2.x clients, making NEST usable across Windows, OSX, Linux and any other platform that supports netstandard 1.3.
Say Heya to the all new Scalar property
The fluent and attribute-based mapping APIs in NEST are powerful, and AutoMap() does a very good job at inferring property types. There was a slight nuance however when mapping scalar properties (i.e. numeric and date types) in that it was up to the caller to supply the actual numeric type (int, float, etc.) of the property being mapped. Without this, NEST would default to the Elasticsearch defaults (double in 2.x and float in 5.x).
For instance, in order to map MyInteger in the following POCO as an integer in Elasticsearch, the following was required:
    public class MyClass
    {
        public int MyInteger { get; set; }
    }

    client.Map<MyClass>(m => m
        .Properties(ps => ps
            .Number(n => n
                .Name(p => p.MyInteger)
                .Type(NumberType.Integer)
                .Coerce()
                .IgnoreMalformed(false)
            )
        )
    );
With the new Scalar() mapping methods, the same thing can be achieved in a less verbose and much smarter fashion:
    client.Map<MyClass>(m => m
        .Properties(ps => ps
            .Scalar(p => p.MyInteger, n => n
                .Coerce()
                .IgnoreMalformed(false)
            )
        )
    );
which will automatically infer the correct numeric type based on the CLR type.
Idiomatic async support
The client now follows best practices with regard to asynchronous code execution. In previous versions of NEST, a CancellationToken could only be supplied using RequestConfiguration. We've made this much easier in 5.0, and it is now possible to provide an optional CancellationToken directly to any of the async endpoint methods.
We also ensure we call ConfigureAwait(false) on all awaited calls inside the client. Since the client never needs to resume on the captured callback context, this improves performance slightly.
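As a minimal sketch of passing a token directly to an async endpoint method, the following assumes an already configured IElasticClient and reuses the MyClass POCO from the mapping example above. The CountMatchesAsync helper and its timeout parameter are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class AsyncSearchExample
{
    // Hypothetical helper: runs a match_all search whose token is
    // cancelled once the given timeout has elapsed.
    public static async Task<long> CountMatchesAsync(IElasticClient client, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            // The token is the optional second argument of the async method;
            // no RequestConfiguration is needed to supply it.
            var response = await client
                .SearchAsync<MyClass>(s => s.MatchAll(), cts.Token)
                .ConfigureAwait(false);

            return response.Total;
        }
    }
}
```

Calling ConfigureAwait(false) in the helper mirrors what the client does internally and is a reasonable default for library-style code that does not touch UI state.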
New Connection Settings
.DisableDirectStreaming() on a per-request basis
DisableDirectStreaming() is very handy for debugging as it preserves the original request and response streams so that they can be inspected. Previously, this could only be set at a global level for all requests. It is now possible to enable this on a per-request basis using RequestConfiguration, allowing you to capture ad-hoc requests and responses.
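A short sketch of capturing a single request body this way is shown below. It assumes an already configured IElasticClient and the MyClass POCO from earlier; the CaptureRequestJson helper name is hypothetical.

```csharp
using System.Text;
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class DebugRequestExample
{
    // Hypothetical helper: returns the JSON sent for one search request.
    public static string CaptureRequestJson(IElasticClient client)
    {
        var response = client.Search<MyClass>(s => s
            // Buffer the request and response for this call only;
            // all other requests keep streaming directly.
            .RequestConfiguration(r => r.DisableDirectStreaming())
            .MatchAll()
        );

        // The buffered request bytes are exposed on the call details.
        var requestBytes = response.ApiCall.RequestBodyInBytes;
        return requestBytes == null ? null : Encoding.UTF8.GetString(requestBytes);
    }
}
```

The response body can be read the same way from response.ApiCall.ResponseBodyInBytes.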
.ConnectionLimit()
Since version 2.0.0, NEST has supported both Desktop CLR and .NET Core. Depending on what framework you're running, HttpConnection uses a different HTTP implementation for making requests (HttpWebRequest on the Desktop CLR and HttpClient on .NET Core). Both of these implementations have different connection characteristics, and both have a different way of setting the maximum number of concurrent connections (ServicePointManager versus HttpClientHandler respectively).
We've chosen a sensible default of 80 maximum concurrent connections, but now also expose a new .ConnectionLimit() setting which allows you to override this regardless of the underlying HTTP implementation.
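For example, a client with a higher limit might be configured as in the sketch below. The node URI and the value of 200 are placeholders chosen for illustration, not recommendations.

```csharp
using System;
using Nest;

public static class ClientFactory
{
    public static IElasticClient Create()
    {
        // Placeholder node address; replace with your own cluster URI.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            // Overrides the default of 80 concurrent connections,
            // whichever HTTP implementation is in use.
            .ConnectionLimit(200);

        return new ElasticClient(settings);
    }
}
```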
Serializer Buffer Size
The default JSON serializer, JsonNetSerializer, now exposes a BufferSize property that can be used to control the size of the byte buffer used when serializing to the request stream. Through benchmarking and profiling the client, a default value of 1024 was found to be a good compromise for all-round performance, but this can be overridden on a derived serializer type should you wish to use a different value.
BulkAll(), ScrollAll(), and ReindexAll() helpers
While Elasticsearch provides the _bulk and scroll APIs for inserting or fetching large sets of documents, it's still up to the user to size and partition the number of documents for each request appropriately, as well as parallelize the execution of the requests.
NEST now offers new helper methods, BulkAll() and ScrollAll(), which move this burden from the user to the client.
BulkAll() accepts a collection of documents, automatically partitions them into smaller batches and executes individual _bulk requests in parallel.
For instance, suppose we create a lazy stream of 100k documents:
    var size = 1000;
    var pages = 100;
    var seenPages = 0;
    var numberOfDocuments = size * pages;
    var documents = this.CreateLazyStreamOfDocuments(numberOfDocuments);
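The CreateLazyStreamOfDocuments helper is not shown in this post. A minimal sketch of what such a helper could look like follows, assuming a hypothetical SmallObject POCO with a single Id property, and written as a static method rather than the instance method called above. The point is only that documents are produced on demand with yield return, so the full set never has to sit in memory at once.

```csharp
using System.Collections.Generic;

// Hypothetical document type used purely for illustration.
public class SmallObject
{
    public int Id { get; set; }
}

public static class DocumentSource
{
    // Yields documents one at a time; nothing is created until the
    // sequence is enumerated, and no intermediate list is built.
    public static IEnumerable<SmallObject> CreateLazyStreamOfDocuments(int count)
    {
        for (var i = 0; i < count; i++)
        {
            yield return new SmallObject { Id = i };
        }
    }
}
```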
We can then set up our BulkAll() call as follows:
    var observableBulkAll = client.BulkAll(documents, f => f
        .MaxDegreeOfParallelism(8)
        .BackOffTime(TimeSpan.FromSeconds(10))
        .BackOffRetries(2)
        .Size(size)
        .RefreshOnCompleted()
        .Index(IndexName)
    );
and then register an observer:
    var bulkAllObserver = new BulkAllObserver(
        onError: (e) => { throw e; },
        onCompleted: () => doSomething(),
        onNext: (b) => Interlocked.Increment(ref seenPages)
    );

    observableBulkAll.Subscribe(bulkAllObserver);
Similarly, ScrollAll() leverages sliced scrolls to execute multiple scrolls in parallel behind an IObservable:
    var scrollObservable = client.ScrollAll<SmallObject>("1m", slices, s => s
        .MaxDegreeOfParallelism(slices / 2)
        .Search(search => search
            .Index(index)
            .AllTypes()
            .MatchAll()
        )
    );
Lastly, NEST 5.0 also offers a reindex helper method, ReindexAll(), that uses both ScrollAll() and BulkAll() under the hood, taking advantage of the concurrency models of both methods.
Response changes
We've also made a few improvements to our responses.
Collections are now ReadOnly
Many responses from Elasticsearch contain JSON collections or objects. These were reflected in NEST as List<T> or Dictionary<TKey,TValue>, but in a very inconsistent manner. For instance, some responses contained IEnumerable<T> whereas others contained concrete types like List<T>; similarly, some contained IDictionary<TKey,TValue> while others contained Dictionary<TKey,TValue>.
In NEST 5.0, all response collections are now truly immutable and represented as either an IReadOnlyCollection<T> or an IReadOnlyDictionary<TKey,TValue>, and this is consistent across all response objects.
Additionally, all collections are initialized to an empty collection, removing the need to null check the collection before enumerating.
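In practice this means a response collection can be enumerated without a guard. The sketch below assumes an already configured IElasticClient, the MyClass POCO from earlier, and that the Documents collection of a search response is among the collections covered by this change; the SumIntegers helper is a hypothetical name.

```csharp
using Nest;

public class MyClass
{
    public int MyInteger { get; set; }
}

public static class ResponseCollectionsExample
{
    // Hypothetical helper: sums one field across the returned documents.
    public static int SumIntegers(IElasticClient client)
    {
        var response = client.Search<MyClass>(s => s.MatchAll());

        var sum = 0;
        // No null check: when there are no hits the collection is
        // empty rather than null, so the loop body simply never runs.
        foreach (var document in response.Documents)
        {
            sum += document.MyInteger;
        }
        return sum;
    }
}
```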
Normalized ApiCall and CallDetails
IResponse contained two properties, ApiCall and CallDetails, which hold all of the details about the request and response. These were actually the same property with one small difference in implementation detail, making it confusing which to use. We've normalized this in NEST 5.0 to a single ApiCall property.
Breaking Changes
We've tried to avoid introducing breaking changes (except where Elasticsearch itself breaks) as much as possible and taken extra steps in marking methods and properties as obsolete in 2.x to help our users better plan for upgrading.
With that said, for the full list of breaking changes in NEST please refer to NEST 5.0 Breaking Changes, and for the full list of breaking changes in Elasticsearch.Net, refer to Elasticsearch.Net 5.0 Breaking Changes.
What's Next?
While we continue to support NEST 5.x, 2.x, and 1.x, we have already begun work on NEST 6.0, with a heap of plans cooking to further improve the client. One major goal for NEST 6.0 is removing Json.NET as a dependency and providing our own serializers with stream-based parsers that are able to work with async streams. We already have a few components under way to aid in this venture. We'd also like to improve our support for F# with NEST 6.0 and release a more F# compatible version of the client. Stay tuned!
Lastly, a huge thank you to our community and everyone who contributed to and jumped on the RC, beta and alpha versions of NEST 5.0. To get started with using NEST 5.0, head on over to the documentation. For any issues, bug reports or feature requests, please visit our GitHub issues page, and for questions and comments, please visit our Discourse forum.