10 August 2018

This Week in Elasticsearch and Apache Lucene - 2018-08-10

By Colin Goodheart-Smithe

Elasticsearch

Highlights

Stability: We fixed several leaks caused by third-party code. We upgraded to Netty 4.1.28, which fixes a leak in its SSL implementation, and upgraded Log4J to fix a memory leak. The Log4J fix will ship with 5.6.11 and 6.4.0.

Watcher: We migrated Watcher to the PagerDuty v2 API. Their v1 events API can still be used, but we want to upgrade early. We also integrated BulkProcessor with Watcher to batch Watcher's document operations when users run lots of watches.
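The batching idea can be illustrated with a minimal sketch. This is not the Elasticsearch BulkProcessor API: `SimpleBatcher`, its `max_actions` threshold, and the `watch-history` index name are hypothetical names chosen for the example. It only shows how buffering operations turns many single writes into a few bulk writes.

```python
from typing import Callable, List


class SimpleBatcher:
    """Hypothetical stand-in for a bulk processor: buffers operations and flushes them in batches."""

    def __init__(self, flush: Callable[[List[dict]], None], max_actions: int = 3) -> None:
        self._flush = flush
        self._max_actions = max_actions
        self._buffer: List[dict] = []

    def add(self, operation: dict) -> None:
        # Buffer the operation and flush once the batch is full.
        self._buffer.append(operation)
        if len(self._buffer) >= self._max_actions:
            self.close_batch()

    def close_batch(self) -> None:
        # Send whatever is buffered as one batch, then start a new buffer.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush(batch)


sent_batches: List[List[dict]] = []
batcher = SimpleBatcher(sent_batches.append, max_actions=3)
for i in range(7):
    batcher.add({"index": "watch-history", "id": i})  # index name is illustrative
batcher.close_batch()  # flush the final partial batch
batch_sizes = [len(batch) for batch in sent_batches]
```

Here seven document operations result in three bulk calls instead of seven individual ones. A real bulk processor would typically also flush on a time interval or a byte-size limit, which this sketch leaves out.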

Improvements to exception information for search failures: We fixed a few bugs that were uncovered while working on Cross Cluster Search. The first fix makes sure that we preserve the cluster alias when a failure occurs in a cross cluster search. This means that if failures happen on different remote clusters with the same index name, the user can see which cluster each failure happened on. The second fixes QueryShardException so that we report the index_uuid for the shard that failed along with the other shard information.
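To see why the cluster alias matters, consider the sketch below. The `ShardFailure` class, its field names, and the sample values are assumptions made for illustration and do not mirror the actual Elasticsearch classes or response format. It only shows that two failures on identically named indices stay distinguishable once the alias is kept.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShardFailure:
    """Hypothetical record of a failed shard in a cross cluster search."""

    cluster_alias: Optional[str]  # None would mean the local cluster
    index: str
    index_uuid: str
    shard_id: int
    reason: str

    def qualified_index(self) -> str:
        # Prefix the index with its cluster alias, as in "alias:index".
        if self.cluster_alias:
            return f"{self.cluster_alias}:{self.index}"
        return self.index


failures = [
    ShardFailure("cluster_one", "logs", "uuid-a", 0, "query parse error"),
    ShardFailure("cluster_two", "logs", "uuid-b", 0, "query parse error"),
]
labels = [failure.qualified_index() for failure in failures]
bare_names = [failure.index for failure in failures]
```

Without the alias both failures would be reported against `logs`. With it, the labels become `cluster_one:logs` and `cluster_two:logs`, and the `index_uuid` identifies the exact index even if an index name is reused.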

Cross Cluster Replication: We’ve added support to enable metricbeat (system metrics) collection on all launched nodes. We have updated Rally's ccr-stats telemetry device to comply with the changes in the CCR stats API and set it as default for our benchmarks.

Mapping, Bulk and retries: Users have reported replica shards with huge translog files. These were caused by missing sequence numbers that failed to replicate to the replicas, which made the local checkpoint on these replicas fall behind. Since the translog is supposed to retain all operations above the local checkpoint, it became very big. The source of those missing sequence numbers turned out to be a mistake in how we handled timeouts while requesting the master to update the index mapping. In short, the entire bulk request was re-executed, causing the sequence numbers issued up to the point where the dynamic mapping update was required to be lost. This was fixed in the 6.3.0 release of Elasticsearch with a targeted solution, but we wanted a more structural way to avoid retries altogether. That refactoring was completed this week. The gist of the refactoring is that we can now pause execution and continue from where we left off.

This may sound like a very technical internal thing (and it is!) but it is exciting as it opens the door to tackling another issue with mapping updates. When an indexing operation requires a mapping update, it currently has to wait for *all* nodes in the cluster to process that change. This is overkill, as only the nodes holding the primary and its replicas actually need that mapping to complete the operation. It's a shame if an overloaded cold node slows a hot node from processing an operation. With the above refactoring, we now have the underpinnings to untangle this dependency.
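The difference between re-executing a bulk request and resuming it can be shown with a simplified sketch. `ResumableBulk` and everything in it are hypothetical and far simpler than the real bulk execution code. The assumption is only that each executed item gets the next sequence number and that an unknown field forces a pause until the mapping is updated.

```python
from typing import Dict, List, Optional, Set


class ResumableBulk:
    """Hypothetical bulk executor that keeps its position across pauses."""

    def __init__(self, items: List[Dict[str, str]], known_fields: Set[str]) -> None:
        self.items = items
        self.known_fields = known_fields
        self.cursor = 0  # index of the next item to execute
        self.next_seq_no = 0
        self.assigned: List[int] = []  # sequence number given to each executed item

    def run(self) -> Optional[str]:
        """Execute items until finished or until a mapping update is needed.

        Returns the missing field name when paused, or None when all items are done.
        """
        while self.cursor < len(self.items):
            field = self.items[self.cursor]["field"]
            if field not in self.known_fields:
                return field  # pause: cursor and issued sequence numbers are kept
            self.assigned.append(self.next_seq_no)
            self.next_seq_no += 1
            self.cursor += 1
        return None

    def mapping_updated(self, field: str) -> None:
        # Called once the mapping update for the field has been applied.
        self.known_fields.add(field)


bulk = ResumableBulk([{"field": "a"}, {"field": "b"}, {"field": "a"}], {"a"})
missing = bulk.run()           # pauses at the second item, which needs field "b"
cursor_at_pause = bulk.cursor
bulk.mapping_updated(missing)
finished = bulk.run()          # continues from the second item, not from the start
```

Because the cursor survives the pause, the first item is not executed a second time and keeps its original sequence number. A restart from the beginning would have issued a fresh number for it and left the old one unused.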

Changes

Changes in 5.6:

LOGGING: Upgrade to Log4J 2.11.1 #32675
Fix content type detection with leading whitespace #32632
LOGGING: Upgrade to Log4J 2.11.1 #32616

Changes in 6.4:

LOGGING: Upgrade to Log4J 2.11.1 (#32616) #32668
Preserve index_uuid when creating QueryShardException #32677
Make sure that field collapsing supports field aliases. #32648
Add temporary directory cleanup workarounds #32615
Networking+Testing: Fix Netty ByteBuf Leaks in Test Code #32638
Cross-cluster search: preserve cluster alias in shard failures #32608
[Rollup] Improve ID scheme for rollup documents #32558
HLRC: Move commercial clients from XPackClient #32596
Fix race between replica reset and primary promotion #32442

Changes in 6.5:

Fix role query that can match nested documents #32705
Add expected mapping type to MapperException #31564
SQL: Bug fix for the optional "start" parameter usage inside LOCATE function #32576
SQL: Ignore H2 comparative tests for uppercasing/lowercasing string functions #32604
Upgrade to Lucene-7.5.0-snapshot-13b9e28f9d #32730
Whitelisting / from Circuit Breaker Exception (#32325) #32666
LOGGING: Upgrade to Log4J 2.11.1 (#32616) #32656
Core: Fix Java Time DateFormatter printers #32592
BREAKING: Switch WritePipelineResponse to AcknowledgedResponse #32722
TESTS: Explicitly Fail Http Client Timeouts #32708
Prevent cause from being null in ShardOperationFailedException #32640
CORE: Upgrade to Jackson 2.8.11 #32670
Expose whether or not the global checkpoint updated #32659
Include translog path in error message when translog is corrupted #32251
Verify primary mode usage with assertions #32667
Ignore script fields when size is 0 #31917
Tests: Fix Typo Causing Flaky Settings Test #32665
Docs: Allow snippets to have line continuation #32649
INGEST: Fix ThreadWatchDog Throwing on Shutdown #32578
Rest HL client: Add get license action #32438
Adds ckb to the list of unsupported languages #32611
Suppress LicensingDocumentationIT.testPutLicense in release builds #32613
Add cluster UUID to Cluster Stats API response #32206

Changes in 7.0:

serialize suggestion responses as named writeables #30284

Apache Lucene

Faster queries when not counting hits

Nightly benchmarks were fixed, so you can now see how much it helps not to count total hits when computing top hits. This especially helps term queries and disjunctions, though conjunctions and phrase queries got a notable speedup as well. On the other hand, disjunctions within conjunctions slowed down, which we are investigating.

We improved collection of top hits with dis-max queries and boolean queries that contain a mix of SHOULD and MUST clauses.
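A rough intuition for why skipping the hit count helps is sketched below. This is a deliberately simplified illustration and not Lucene's implementation: the `top_k` function, the block layout, and the per-block maximum score are assumptions made for the example. If an exact count is required, every matching document must be visited. If only the top hits are needed, a block whose best possible score cannot beat the current k-th best hit can be skipped.

```python
import heapq
from typing import List, Tuple

# (maximum score in the block, [(doc_id, score), ...])
Block = Tuple[float, List[Tuple[int, float]]]


def top_k(blocks: List[Block], k: int, count_hits: bool) -> Tuple[List[int], int]:
    """Return the top-k doc ids by score and the number of documents scored."""
    heap: List[Tuple[float, int]] = []  # min-heap of (score, doc_id)
    scored = 0
    for max_score, docs in blocks:
        if not count_hits and len(heap) == k and max_score <= heap[0][0]:
            continue  # no document in this block can enter the top k
        for doc_id, score in docs:
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [doc_id for _, doc_id in ranked], scored


blocks: List[Block] = [
    (3.0, [(0, 1.0), (1, 3.0), (2, 2.0)]),
    (1.5, [(3, 1.5), (4, 0.5)]),
    (4.0, [(5, 4.0), (6, 0.2)]),
]
with_count = top_k(blocks, 2, count_hits=True)
without_count = top_k(blocks, 2, count_hits=False)
```

Both calls return the same two top documents, but the second one skips the middle block and scores five documents instead of seven. The saving grows with the number of matching documents that have no chance of being competitive.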

Lucene 8.0

In the coming weeks, we will start an effort to upgrade the master branch of Elasticsearch to a Lucene 8.0 snapshot so that we can validate that changes in Lucene 8 play well with the way that Elasticsearch leverages Lucene.

Other

We fixed a test bug with IndexWriter.
BlendedInfixSuggester's way of computing scores is trappy with small weights.
The unified highlighter should merge overlapping matches.
We are adding support for DISJOINT and WITHIN relations to the new BKD-based shape support.
We propose to benchmark the new BKD-based shape support by reusing the existing benchmark that we use for points.
