This week we merged elastic/rally#830, which adds a completely new approach to how Elasticsearch clusters are managed in Rally. Until now, Rally could only spin up uniform clusters, i.e. clusters in which every node is configured identically, which made it hard to automate more complex benchmark setups. With this PR, users can spin up arbitrarily complex cluster architectures (hot-warm, hot-warm-cold, with or without dedicated master nodes, etc.) for any Elasticsearch version between 2.0 and today’s latest commit on master. The new functionality also does not rely on Rally’s internal actor system, which will eventually be replaced to reduce complexity and improve stability. It is already available in the development version of Rally and will be the highlight of the upcoming 1.4.0 release. More examples are available in the documentation.

Snapshots

We merged a significant change to how we handle repository metadata. It uses the cluster state to keep track of the blob in which the latest valid repository metadata can be found, allowing us to work around S3's eventually consistent nature. With this change, eventual consistency can no longer cause us to accidentally overwrite, and in some cases corrupt, the repository metadata.

There is one follow-up change left in this effort, which will use the new information in the cluster state to make a number of repository operations more efficient. The efficiency gains come from two things: we no longer have to list the repository contents to find the latest version of its metadata, and we can remove a number of fallbacks that worked around issues which are now resolved automatically by keeping a pointer to the latest metadata in the cluster state.
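The difference between the two lookups can be illustrated with a toy model. This is a sketch only, not Elasticsearch code: it assumes the repository metadata generations are stored as blobs named `index-N`, and the function names and the shape of the `cluster_state` dictionary are hypothetical.

```python
import re

# Assumption: root-level metadata blobs are named index-<generation>.
INDEX_BLOB = re.compile(r"^index-(\d+)$")


def latest_generation_by_listing(blob_names):
    """Listing approach: scan every root blob and pick the highest index-N.

    On an eventually consistent store the listing may be stale, so the
    result can lag behind the generation that was really written last.
    """
    generations = []
    for name in blob_names:
        match = INDEX_BLOB.match(name)
        if match:
            generations.append(int(match.group(1)))
    return max(generations, default=-1)


def latest_generation_from_cluster_state(cluster_state, repository):
    """Pointer approach: read the generation tracked in the cluster state."""
    return cluster_state["repositories"][repository]["generation"]


# A stale listing that does not yet show the newest blob, index-7.
stale_listing = ["index-5", "index-6", "snap-abc.dat", "index.latest"]
cluster_state = {"repositories": {"backups": {"generation": 7}}}

listed = latest_generation_by_listing(stale_listing)
tracked = latest_generation_from_cluster_state(cluster_state, "backups")
print(listed, tracked)  # prints "6 7"
```

In this model the listing-based lookup would go on to write generation 7 again and overwrite existing metadata, while the pointer-based lookup needs no listing at all.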

Analytics

A new histogram field type was merged on master and 7.x at the end of last week! This field type will allow us to do things like store Prometheus histograms.
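As a rough illustration of what storing a Prometheus histogram could involve, the sketch below converts cumulative Prometheus buckets into per-bucket counts. It assumes the histogram field takes paired `values` and `counts` arrays; the function name, the `latency` field name, and the choice of the bucket upper bound as the representative value are illustrative simplifications, not part of the merged change.

```python
def prometheus_to_histogram_field(buckets):
    """Convert cumulative Prometheus buckets to per-bucket values/counts.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, with the +Inf bucket left out for simplicity.
    """
    values, counts = [], []
    previous = 0
    for upper_bound, cumulative in buckets:
        values.append(upper_bound)
        # Prometheus counts are cumulative, so subtract the previous bucket.
        counts.append(cumulative - previous)
        previous = cumulative
    return {"values": values, "counts": counts}


buckets = [(0.1, 3), (0.5, 10), (1.0, 12)]
doc = {"latency": prometheus_to_histogram_field(buckets)}
print(doc)
# prints {'latency': {'values': [0.1, 0.5, 1.0], 'counts': [3, 7, 2]}}
```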

Geo

After a couple of weeks of getting the EdgeTree into a state where we can properly compare the performance of TriangleTree and EdgeTree, most benchmarks suggest that our aggregations would benefit from being backed by the TriangleTree instead of the GeometryTree. This goes against the consensus hypothesis we held when we first started deciding which tree to implement. Thanks to our determination to check whether that consensus was misguided, we had the luxury of actually testing both strategies against one another.

To help demonstrate which types of queries (in this case, tile queries) favor each tree, we visualized a precision=9 tiling of Switzerland and mapped the cost of each query. One major failure case of the EdgeTree is queries for tiles that lie just outside (and east of) the shape, but within the shape's bounding box. These queries result in the EdgeTree traversing O(n) edges. The EdgeTree tends to be slightly faster overall for most queries, but when it is slower than the TriangleTree it is much slower, which might outweigh its other advantages.

Monitoring

We are discussing the migration strategy for 7.x to 8.0 with respect to internal collectors and Metricbeat collectors. We decided to decouple the shape changes of the monitoring documents from the upgrade to reduce the number of moving parts and to allow monitoring to continue to work across versions. We also decided that Metricbeat collection should not be a requirement prior to upgrade, and that the internal collectors will need some adjustments to ensure a smoother upgrade path.

Snapshot Lifecycle Management in Cloud

We merged a PR that adds Cloud support to SLM. These changes include adequate protections for Cloud-managed policies.

TLS

We have opened a PR for a new “elasticsearch-certutil http” utility.

This sub-command provides a guided process for creating SSL certs for the Elasticsearch HTTP (REST) interface.

As always, It Depends, but most customers are best served by using different certificates for transport and http, because the needs & usage of those interfaces are different.

For transport you want to use certificates to lock down your cluster. Typically that means running a custom Certificate Authority for the cluster so that the nodes trust one another, and no one else.

However, for HTTP you generally want to support access from a variety of clients, in a variety of languages, that have their own built-in trusted CAs (or use the operating system’s CA list). Ideally you want to use a corporate CA or, if you don’t have one, a single CA for all of your ES clusters so that your clients can be configured once and then connect to all of your clusters. You also need a copy of that CA in formats that are suitable for each client (for most clients that’s PEM, but JVM-based clients will typically find PKCS#12 more helpful), and instructions for how to configure those clients.
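On the client side, "configured once" might look like the following minimal sketch, which uses only Python's standard `ssl` module. The helper name `build_client_context` and the `corporate-ca.pem` file name mentioned in the comment are hypothetical.

```python
import ssl


def build_client_context(ca_pem_path=None):
    """Return an SSLContext that verifies the server certificate.

    With ca_pem_path set, only the CAs in that PEM file are trusted;
    without it, the system's default CA list is used.
    """
    return ssl.create_default_context(cafile=ca_pem_path)


# With a shared CA file this would be, for example:
#   context = build_client_context("corporate-ca.pem")
context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The same context can then be reused for every cluster whose HTTP certificate was issued by that CA.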

Over time, we will evolve our instructions to recommend using “certutil cert” for transport level certificates, and “certutil http” for http level certificates.

Apache Lucene Highlights

Lucene 8.4: A branch should be cut later this week, and the first RC will be built shortly afterwards.

Efforts have been restarted to move the build to Gradle.
There is ongoing benchmarking of the new AWS Graviton instances against existing comparable instances for indexing and search loads.
There is discussion around disallowing the configuration of term vectors on a per-document basis. This is something that is already enforced by Elasticsearch on top of Lucene.
A few weeks ago, we removed the root cache for the FST that serves as an index of the terms dictionary. However, it looks like removing similar caches for the Kuromoji and Nori analyzers hurts, so we will keep those.
There is exploration around configuring parallelism of search requests based on the current state of the threadpool: the larger the queue, the fewer threads will be used to search concurrently.
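The idea in the last item can be sketched as a simple back-off function. This is a hypothetical heuristic written for illustration, not the approach being explored in Lucene: the function name, its parameters, and the linear scaling are all assumptions.

```python
def search_slices(max_slices, queue_size, queue_capacity):
    """Pick how many slices to search concurrently for one request.

    Hypothetical linear back-off: an empty queue allows max_slices, a full
    queue falls back to a single slice (sequential search).
    """
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    # Fraction of the queue that is in use, clamped to [0, 1].
    load = min(max(queue_size / queue_capacity, 0.0), 1.0)
    return max(1, round(max_slices * (1.0 - load)))


for queued in (0, 50, 100):
    print(queued, search_slices(8, queued, 100))
# prints:
# 0 8
# 50 4
# 100 1
```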

Changes in Elasticsearch

Changes in 8.0:

Silence lint warnings in server project - part 2 #49728
Migrate some of the Docker tests from old repository #49079
Add healthchecks to distro docker-compose.yml #49710

Changes in 7.6:

Consistent case in CLI option descriptions #49635
Fix task input for docker build #49814
Use Cluster State to Track Repository Generation #49729
Add reusable HistogramValue object #49799
Fix invalid break iterator highlighting on keyword field #49566
Fixes a bug in interval filter serialization #49793
[Transform] automatic deletion of old checkpoints #49496
Scripting: add available languages & contexts API #49652
Replicate write actions before fsyncing them #49746

Changes in 7.5:

Fix external integ test zip dep to expect a zip #49813
Extend systemd timeout during startup #49784
[Transform] Fix possible audit logging disappearance after rolling upgrade #49731
SQL: fix LOCATE function optional parameter handling #49666
SQL: fix NULL handling for FLOOR and CEIL functions #49644
SQL: handle NULL arithmetic operations with INTERVALs #49633

Changes in 6.8:

Support es7 node http publish_address format #49279

Changes in Rally

Changes in 1.4.0:

Expose API for cluster settings #831
Manage Elasticsearch nodes with dedicated subcommands #830
Only keep the most recent build log #832


This Week in Elasticsearch and Apache Lucene - 2019-12-06
