03 August 2018

This Week in Elasticsearch and Apache Lucene - 2018-08-03

By Adrien Grand, Colin Goodheart-Smithe, Paul Sanwald, Boaz Leskes


Rollups fix for doc ID bug

We have raised a PR to fix a doc ID bug in rollups. Rollup doc IDs will now be a 128-bit hash prefixed with the job ID to avoid hash collisions. The change will not cause a backwards compatibility break, as jobs will "upgrade" themselves to the new hash, but we will be recommending that users rebuild their rollup jobs from scratch to protect themselves from any hash collisions that occurred prior to the upgrade.
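To make the new ID scheme concrete, here is a minimal sketch of a job-scoped ID built from a 128-bit hash. It is not the code from the PR: the function name `rollup_doc_id`, the `$` separator, the base64 encoding, and the use of BLAKE2b as the 128-bit hash are all assumptions chosen for illustration.

```python
import base64
import hashlib


def rollup_doc_id(job_id: str, group_key: bytes) -> str:
    """Build an ID of the form '<job_id>$<128-bit hash>' (format is assumed)."""
    # BLAKE2b with a 16-byte digest stands in for whichever 128-bit hash
    # the actual change uses.
    digest = hashlib.blake2b(group_key, digest_size=16).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{job_id}${encoded}"


# Two hypothetical jobs rolling up the same bucket key.
id_a = rollup_doc_id("job-a", b"2018-08-03T00:00:00Z|host-1")
id_b = rollup_doc_id("job-b", b"2018-08-03T00:00:00Z|host-1")
```

Because the job ID is part of the document ID, two jobs that happen to produce the same hash for a bucket still write to different documents.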


We worked on increasing the testability and reproducibility of MasterService, the class that receives Cluster State Update Tasks from the rest of the code base. We put up a PR that adds gossiping to discover other nodes and exchange the information required for discovering an existing master or electing a new one. This is the equivalent of the current ZenDiscovery pinging.
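For readers unfamiliar with gossip-style discovery, the toy simulation below shows the general idea: each node repeatedly asks the peers it already knows for their peer lists and merges the answers. It is only a sketch under simplifying assumptions (no failures, no master election, synchronous rounds), and the names `gossip_round` and `discover` are hypothetical rather than anything from the PR.

```python
from typing import Dict, Set


def gossip_round(known: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """One round: every node contacts each peer it knows and merges peer lists."""
    updated = {node: set(peers) for node, peers in known.items()}
    for node, peers in known.items():
        for peer in peers:
            if peer in known:  # peers outside the simulation are ignored
                # The caller learns everything the contacted peer knew.
                updated[node] |= known[peer]
                # The contacted peer learns about the caller.
                updated[peer].add(node)
    return updated


def discover(known: Dict[str, Set[str]], max_rounds: int = 10) -> Dict[str, Set[str]]:
    """Repeat gossip rounds until nothing changes or max_rounds is reached."""
    for _ in range(max_rounds):
        following = gossip_round(known)
        if following == known:
            break
        known = following
    return known


# Each node starts out knowing itself and at most one seed address.
initial = {
    "a": {"a", "b"},
    "b": {"b", "c"},
    "c": {"c", "d"},
    "d": {"d"},
}
final = discover(initial)
```

In this example every node ends up knowing all four nodes after a few rounds, even though no node started with the full list.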

Cross Cluster Replication:

We reached a first milestone in the AWS support for CCR benchmarking. We are now able to benchmark between regions eu-central-1 and us-east-2, and Terraform is doing all the legwork of setting things up (instances etc.), apart from the creation of the VPCs and peering, which will be the next thing to automate.

We got results from the http_logs track using the AWS c5.2xlarge instance type vs. the n1-highcpu-16 in GCP that we've been using so far. Both scenarios have identical node counts, OS, memory, disk, ES build and ES settings, and each node in a region resides in a different AZ, as is typical for cloud HA setups.

Results are quite similar, as can be seen in the table below, with the exception that in AWS the following indices seem to take a bit longer to fully catch up after bulk indexing has ended. This corroborates my earlier observation that raw throughput per connection across the chosen regions is faster in GCP, but tinkering with some batch and buffer size settings might be worth it on AWS.


Changes in 6.3:

  • High-level client: fix clusterAlias parsing in SearchHit #32465
  • IndicesClusterStateService should replace an init. replica with an init. primary with the same allocation id #32374

Changes in 6.4:

  • Release requests in cors handle (#32410) #32505
  • REST high-level client: parse back _ignored meta field #32362
  • Add licensing enforcement for FIPS mode #32437
  • Improve the error message when an index is incompatible with field aliases. #32482
  • [Kerberos] Remove Kerberos bootstrap checks #32451
  • Make get all app privs requires "*" permission #32460

Changes in 6.5:

  • SQL: Minor fix for javadoc #32573
  • Core: Minor size reduction for AbstractComponent #32509
  • INGEST: Enable default pipelines #32286
  • Scripting: Conditionally use java time api in scripting #31441
  • SQL: Added support for string manipulating functions with more than one parameter #32356
  • Logging: Make node name consistent in logger #31588
  • Scripting: Fix painless compiler loader to know about context classes #32385
  • Increase max chunk size to 256Mb for repo-azure #32101
  • HLRC: Add delete watch action #32337
  • Use the determinant formula for calculating the orientation of a polygon #27967
  • update rollover to leverage write-alias semantics #32216
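As an illustration of the polygon-orientation item in the list above, the determinant (or "shoelace") formula sums 2x2 determinants of consecutive vertices; the sign of the sum gives the winding direction. The sketch below assumes planar coordinates with the y axis pointing up and is not taken from the PR itself; the function names are made up for the example.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def twice_signed_area(ring: Sequence[Point]) -> float:
    """Sum of the determinants x1*y2 - x2*y1 over consecutive vertices."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return total


def orientation(ring: Sequence[Point]) -> str:
    """Classify a ring by the sign of its signed area."""
    area2 = twice_signed_area(ring)
    if area2 > 0:
        return "counterclockwise"
    if area2 < 0:
        return "clockwise"
    return "degenerate"


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Here `orientation(square)` is counterclockwise, and reversing the vertex order flips the sign of the sum and therefore the result.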

Changes in 7.0:

  • NETWORKING: Fix Netty Leaks by upgrading to 4.1.28 #32511
  • Fix AutoIntervalDateHistogram.testReduce random failures #32301
  • [Kerberos] Add missing javadocs #32469

Apache Lucene

Lucene 8.0

When we initially started the discussion about releasing Lucene 8.0, it was argued that we should first make the new top-hits optimizations easier to use and enabled by default, which we did. As a consequence, we revived the email thread to discuss a target date, the initial proposal being October 2018.

Faster queries when not counting hits

- We have a working patch that shows great speedups for boolean queries that mix MUST and SHOULD clauses when not counting total hits.

- We are proposing a simple optimization to dis-max queries that is nonetheless very effective in the common case where one clause dominates the scores, such as when searching across a short and a long field (e.g. title/body).

- Sorted indices should stop calling the comparator when they are only counting hits.
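The reason that skipping the total hit count allows such speedups can be shown with a toy example: once the top k hits are full, any block of documents whose best possible score cannot beat the current k-th score can be skipped without being scored. The sketch below is an illustration of that general idea only, not Lucene's implementation; the block layout and the `top_k` function are assumptions made for the example.

```python
import heapq
from typing import List, Tuple

# A block is (maximum score of any doc in the block, [(doc_id, score), ...]).
Block = Tuple[float, List[Tuple[int, float]]]


def top_k(blocks: List[Block], k: int) -> Tuple[List[Tuple[int, float]], int]:
    """Return the top-k (doc_id, score) pairs and how many docs were scored."""
    heap: List[Tuple[float, int]] = []  # min-heap keyed on score
    scored = 0
    for max_score, docs in blocks:
        # With a full heap, a block that cannot beat the k-th best score
        # is skipped entirely. This is only valid if hits are not counted.
        if len(heap) == k and max_score <= heap[0][0]:
            continue
        for doc_id, score in docs:
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    ranked = sorted(heap, key=lambda hit: (-hit[0], hit[1]))
    return [(doc_id, score) for score, doc_id in ranked], scored


blocks = [
    (9.0, [(0, 9.0), (1, 2.0)]),
    (1.5, [(2, 1.5), (3, 0.5)]),
    (8.0, [(4, 8.0), (5, 3.0)]),
    (2.5, [(6, 2.5), (7, 1.0)]),
]
hits, scored = top_k(blocks, k=2)
```

In this example only four of the eight documents are scored, because two blocks are skipped. If an exact total hit count were required, every matching document would have to be visited regardless of its score.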

BKD-based geo shapes

- We are adding support for indexing lines and points as triangles, so that they can be indexed into the same field as polygons.

- We are adding support for finding all polygons that intersect a given polygon (the query). We currently only support finding polygons that intersect a box.


- Sorted indices failed to validate that fields used for sorting had the expected doc-value type, which caused a very confusing null-pointer-exception upon abort. Jim fixed the NPE, and added index-time validation so that Lucene fails a single document rather than an entire segment.

- CheckIndex's duplication of SegmentInfos serialization was completely out-of-date.

- Even though RAMDirectory should only be used for testing, it deserves some cleanups.

- We shouldn't try to optimize exception handling by using pre-built exceptions in order to avoid filling stack traces. Not only does it make debugging impossible but it also shouldn't be necessary as the JVM can skip creating stack traces if they are not used.

- Should we add support for comparing Object[] arrays to FutureArrays?

- DaciukMihovAutomatonBuilder now has protection against stack overflows.