We have completed all the deprecation info work for 6.x. Users can now call an API that checks whether they are using deprecated settings, mappings, etc. in 6.x that would prevent them from successfully upgrading to 7.0. The migration assistant in Kibana will use this API to help users with the upgrade to 7.0.

Users will still need to check the deprecation logs to ensure that their client applications are not using deprecated features in their requests.
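
As a rough illustration of how a script might consume the deprecation info API, the sketch below assumes the 6.x endpoint is `GET /_xpack/migration/deprecations` and that the response has `cluster_settings`, `node_settings`, and `index_settings` sections whose entries carry a `level` and a `message`. The helper names, the localhost URL, and the sample payload are illustrative assumptions, not output from a real cluster.

```python
import json
import urllib.request

# Assumed 6.x endpoint for the deprecation info API; adjust for your cluster.
DEPRECATIONS_PATH = "/_xpack/migration/deprecations"


def fetch_deprecations(base_url="http://localhost:9200"):
    """Fetch the deprecation report (requires a reachable cluster)."""
    with urllib.request.urlopen(base_url + DEPRECATIONS_PATH) as resp:
        return json.load(resp)


def critical_issues(report):
    """Return (scope, message) pairs for every issue marked 'critical'."""
    found = []
    # Cluster- and node-level sections are assumed to be flat lists of issues.
    for section in ("cluster_settings", "node_settings"):
        for issue in report.get(section, []):
            if issue.get("level") == "critical":
                found.append((section, issue.get("message", "")))
    # index_settings is assumed to map an index name to a list of issues.
    for index, issues in report.get("index_settings", {}).items():
        for issue in issues:
            if issue.get("level") == "critical":
                found.append((index, issue.get("message", "")))
    return found


# Hypothetical sample response, for illustration only.
SAMPLE_REPORT = {
    "cluster_settings": [
        {"level": "critical", "message": "example cluster issue"},
    ],
    "node_settings": [],
    "index_settings": {
        "logs-old": [
            {"level": "warning", "message": "example index warning"},
            {"level": "critical", "message": "example index issue"},
        ],
    },
}
```

Calling `critical_issues(fetch_deprecations())` against a live cluster would list the blockers to resolve before upgrading; warnings are deliberately left out of this sketch.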

Enabling Nanosecond Timestamps

Before Christmas we spotted a significant reduction in indexing throughput when multiple date parsers are used. We now have an immediate fix and a further improvement, so Java time is now even faster for this scenario in microbenchmarks.

A new store type for accessing index files

Elasticsearch has an expert setting called index.store.type that controls how we read index files. Currently the default is mmapfs, which reads the files using memory mapping. We recently found that mmapfs does not perform well when updates are spread all over a large index (in the TB range).

We have now merged a new hybridfs store type which picks the best method of accessing the index files based on the Lucene file type and resulting access pattern. This store type is available from 6.7.0 onwards and will also be the default in Elasticsearch 7.0.0.

The benefit is dependent on the workload and the index size. Workloads with random accesses (bulk updates and queries) and indices that are large compared to the available page cache benefit most from hybridfs.

hybridfs will not be beneficial in every use case and may reduce indexing throughput for update workloads and some queries when the index size is "small" (compared to the available page cache). We are looking into further possibilities to improve the situation for smaller indices as well.
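
To make the idea of choosing an access method per Lucene file type more concrete, here is a minimal sketch. The set of extensions that get memory-mapped is an assumption made for illustration (the authoritative list lives in the Elasticsearch source), and both function names are hypothetical. The second helper only builds an index-creation body that sets index.store.type.

```python
import os

# Illustrative assumption: file extensions treated as randomly accessed and
# therefore memory-mapped. Everything else is read through NIO in this sketch.
MMAP_EXTENSIONS = {"nvd", "dvd", "tim"}


def access_method(file_name):
    """Pick 'mmap' or 'nio' for a Lucene file based on its extension."""
    extension = os.path.splitext(file_name)[1].lstrip(".")
    return "mmap" if extension in MMAP_EXTENSIONS else "nio"


def index_settings(store_type="hybridfs"):
    """Build an index-creation body that selects the given store type."""
    return {"settings": {"index.store.type": store_type}}
```

For example, `access_method("_0.dvd")` selects memory mapping in this sketch, while a stored-fields file such as `_0.fdt` falls back to NIO.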

Cross Cluster Replication Follower Index UI

We made a number of additions to the CCR UI to allow users to manage follower indices, including adding a table and detail view for displaying follower indices (https://github.com/elastic/kibana/pull/27804), adding UI actions to pause, resume, and unfollow a follower index, adding support on the Kibana server for fetching and creating follower indices, and adding the ability to configure advanced settings when creating a follower index.

Speeding up shard peer recoveries

We are working to speed up peer recoveries. The current recovery implementation sends one file chunk at a time, waiting for acknowledgment of the previous chunk before sending the next one. The new approach allows sending N chunks in parallel to saturate a network pipe more efficiently, which can halve recovery times when using TLS and can have a significant impact even when using plain connections.
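
The difference between the two approaches can be sketched with a bounded window of in-flight chunks. This is only a toy model in Python using asyncio, not the Elasticsearch recovery code: `send_chunks`, `split_file`, and the `send` callback are hypothetical names, and `send` is assumed to return once the receiver has acknowledged the chunk. Setting `max_in_flight=1` reproduces the one-chunk-at-a-time behaviour.

```python
import asyncio


def split_file(data, chunk_size):
    """Split a bytes object into consecutive chunks of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


async def send_chunks(chunks, send, max_in_flight=4):
    """Send all chunks, keeping at most max_in_flight unacknowledged."""
    window = asyncio.Semaphore(max_in_flight)

    async def send_one(position, chunk):
        async with window:  # wait until a slot in the window is free
            await send(position, chunk)  # completes when the chunk is acked

    # Schedule every chunk; the semaphore limits how many run concurrently.
    await asyncio.gather(
        *(send_one(position, chunk) for position, chunk in enumerate(chunks))
    )
```

Because chunks may be acknowledged out of order, the callback receives each chunk's position so the receiver can write it to the right offset.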

Closed replicated indices

We have added unique IDs to cluster blocks to power the 2-phase-commit style close index API. With this in place, the new close index API has been merged into master and is currently being backported to 6.7, where it will be used to provide a clean transition for indices to be frozen.

Faster Top Hits retrieval

There is a new option in the search request to limit the number of total hits that should be tracked. In addition to a boolean true/false, the track_total_hits option now accepts a numeric value that limits the total hits tracked during the request. This option can be used to implement pagination on requests even when the total number of hits is not accurate.
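
A request using a numeric track_total_hits might be built as in the sketch below. The helper names are hypothetical, and the response handling assumes the 7.0-style `hits.total` object with a `value` and a `relation` of either `eq` (exact) or `gte` (lower bound); check the documentation for the version you run.

```python
def search_body(query, size=10, offset=0, track_total_hits=1000):
    """Build a search request that tracks total hits only up to a limit."""
    return {
        "query": query,
        "from": offset,
        "size": size,
        "track_total_hits": track_total_hits,
    }


def describe_total(total):
    """Render a hits.total object, assumed to be {"value": int, "relation": str}."""
    if total["relation"] == "eq":
        return f"{total['value']} results"
    # "gte" means the count is a lower bound, not an exact total.
    return f"at least {total['value']} results"
```

A pagination UI can then show "at least 1000 results" and keep offering a next page without paying for an exact count.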

Reindex from remote and SSL with Security

Currently, reindex from remote doesn't know anything about the certificates that are used by security, and trust can only be configured using the JVM-wide system properties. We have been working on a solution to this and have opened the first PR, which creates a new library for common SSL configuration and loading of keys and certificates. This will be followed up by work that adds support for defining custom SSL configuration specifically for reindex.
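
For context, a reindex-from-remote request has roughly the shape built below, with the remote cluster described under `source.remote`. The second helper shows the standard Java `javax.net.ssl.trustStore` properties, which we assume are the JVM-wide system properties the paragraph refers to. Function names, hosts, and paths are hypothetical.

```python
def reindex_from_remote_body(remote_host, source_index, dest_index,
                             username=None, password=None):
    """Build a _reindex body that pulls documents from a remote cluster."""
    remote = {"host": remote_host}
    if username is not None:
        remote["username"] = username
        remote["password"] = password
    return {
        "source": {"remote": remote, "index": source_index},
        "dest": {"index": dest_index},
    }


def jvm_trust_options(trust_store_path, trust_store_password=None):
    """Return JVM flags that set a process-wide trust store (assumed workaround)."""
    options = [f"-Djavax.net.ssl.trustStore={trust_store_path}"]
    if trust_store_password is not None:
        options.append(f"-Djavax.net.ssl.trustStorePassword={trust_store_password}")
    return options
```

Because these JVM flags apply to the whole process, they cannot express different trust settings per remote cluster, which is the gap the reindex-specific SSL configuration is meant to close.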

OpenID Connect Support

We are working on creating an OpenID Connect realm. The basic realm infrastructure and necessary endpoints are up for review.

Changes

Changes in Elasticsearch

Changes in 7.0:

Fix rest reindex test for IPv4 addresses 37310
Zen2: Add join validation 37203
BREAKING: Support 'include_type_name' in RestGetIndicesAction 37149
[Analysis] Deprecate Standard Html Strip Analyzer in master 26719
Deprecate reference to _type in lookup queries 37016
[API] spelling: unknown 37056
SNAPSHOT: Make Atomic Blob Writes Mandatory 37168
Add the ability to set the number of hits to track accurately 36357
Subclass NIOFSDirectory instead of using FileSwitchDirectory 37140
Deprecate use of the _type field in aggregations. 37131
Add hybridfs store type 36668
[Zen2] Elect freshest master in upgrade 37122
Deprecate use of type in reindex request body 36823
Rename setting to enable mmap 37070
query_string should use indexed prefixes 36895
BREAKING: Remove bwc logic for token invalidation 36893
Change missing authn message to not mention tokens 36750
Make SourceToParse immutable 36971
BREAKING: Remove special handling for ingest plugins 36967
Rewrite SourceToParse with resolved docType 36921
Scripting: Remove deprecated params.ctx 36848
Add JDK 12 to CI rotation 36915
Improve error message for 6.x style realm settings 36876

Changes in 6.7:

tests/fix randomly failing testWatcherRestart 35243
ingest: compile mustache template only if field includes '{{' 37207
Support include_type_name in the field mapping and index template APIs. 37210
Types removal - add constants for includetypenames 37304
Security: reorder realms based on last success 36878
ALLOC: Fail Stale Primary Alloc. Req. without Data 37226
[CCR] FollowingEngine should fail with 403 if operation has no seqno assigned 37213
Add getZone to JodaCompatibleZonedDateTime 37084
Enable Bulk-Merge if all source remains 37269
Fix type inference related compile issue in Eclipse 37264
Add an include_type_name option to 6.x. (#29453) 37147
Use List instead of priority queue for stable sorting in bucket sort aggregator 36748
Separate out validation of groups of settings 34184
[Tests] Change cluster scope in CorruptedFileIT and FlushIT 37229
Handle malformed license signatures 37137
Do not mutate RecoveryResponse 37204
Security: propagate auth result to listeners 36900
HLRC: Use nonblocking entity for requests 32249
Introduce retention lease expiration 37195
Stop automatically nesting mappings in index creation requests. 36924
Introduce shard history retention leases 37167
Ensure that local cluster alias is never treated as remote 37121
Add support for providing absolute start time to SearchRequest 37142
clarify what to run for gradle idea 37058
[API] spelling: subtract 37055
[API] spelling: repositories 37053
[API] spelling: interruptible 37049
[API] spelling: input 37048
[API] spelling: likelihood 37052
[API] spelling: cacheable 37047
Remove single shard optimization when suggesting shard_size 37041
Skip final reduction if SearchRequest holds a cluster alias 37000
Spelling docs 37046
Add support for local cluster alias to SearchRequest 36997
Fix suite scope random initialization 37163
Force Refresh Listeners when Acquiring all Operation Permits 36835
SQL: Improve error message when unable to translate to ES query DSL 37129
Fix Reindex from remote query logic 36908
Fix weighted_avg parser not found for RestHighLevelClient 37027
Implement Atomic Blob Writes for HDFS Repository 37066
SNAPSHOT: Speed up HDFS Repository Writes 37069
restrict node start-up when cluster name in data path 36519
Don't block on peer recovery on the target side 37076
Expose search.throttled on _cat/indices 37073
[ILM] Add Freeze Action 36910
SQL: Preserve original source for each expression 36912
SQL: Enhance message for PERCENTILE[_RANK] with field as 2nd arg 36933
Replace the TreeMap in the composite aggregation 36675
[API] spelling: similar 37054
Deprecation check for Auth realm setting structure 36664
Replaced the word 'shards' with 'replicas' in an error message. (#36234) 36275
Keys are compared in BucketSortPipelineAggregation so making key type… 36407
[CCR] Added auto_follow_exception.timestamp field to auto follow stats 36947
BREAKING: Package ingest-user-agent as a module 36956
BREAKING: Package ingest-geoip as a module 36898
Move ingest-geoip default databases out of config 36949
RecoveryMonitor#lastSeenAccessTime should be volatile 36781
Deprecation check for indices with multiple types 36952

Changes in 6.6:

SQL: Fix bug regarding alias fields with dots 37279
[CCR] Make shard follow tasks more resilient for restarts 37239
[CCR] Resume follow Api should not require a request body 37217
SQL: Proper handling of COUNT(fieldname) and COUNT(DISTINCT fieldname) 37254
Reload SSL context on file change for LDAP 36937
SQL: fix COUNT DISTINCT filtering 37176
Fix setting by time unit 37192
Fix handling of fractional time value settings 37171
Fix handling of fractional byte size value settings 37172
SQL: Handle the bwc Joda ZonedDateTime scripting class in Painless 37024
Make sure to accept empty unnested mappings in create index requests. 37089
Retry JDK download when building Docker image 37113
[CCR] AutoFollowCoordinator and follower index already created 36540
Make CCR resilient against missing remote cluster connections 36682
Fix typo in unitTest task 36930

Changes in 6.5:

SQL: Fix issue with wrong NULL optimization 37124
Fix NPE in CachingUsernamePasswordRealm 36953
Handle Null in FetchSourceContext#fetchSource 36839

Changes in Elasticsearch Hadoop Plugin

Changes in 7.0:

Fix DateIndexFormatterTest 1232

Changes in Elasticsearch Management UI

Changes in 7.0:

trigger full load when encountering 403 for index list reload 28243

Changes in 6.7:

Prevent overwriting ilm config the ui does not know about 28370
[CCR] Put back integration test for remote cluster 27778

Changes in 6.6:

Fix missing escape field name in history list directive. 27112
Fix Index Management enricher response variable 28404
[ILM] Fix Index Management not loading when ILM enricher errors out 28108
[CCR] Tell user when multiple auto-follow patterns try to replicate the same data 27783

Changes in Elasticsearch SQL ODBC Driver

Changes in 6.6:

Fix: parameter length handling, error reporting, timestamp handling 85

Changes in Rally

Changes in 1.0.3:

Warn about skewed results when using node-stats telemetry device 627
Allow to specify a team revision 625
Fix conflicting pipelines and distribution version 617

Changes in Rally Tracks

Added missing files.txt for eventdata and so tracks 58
Make directory for target root match path for curl 57

Lucene

Lucene 8

Awesome news: the 8x branch has been cut in preparation for the next major Lucene/Solr release, 8.0. The next step is to remove all deprecations from master and ensure that we have viable alternatives for them.

GeoShapes

Work is progressing well on adding Contains support for BKD-backed geoshapes, and there is now a patch in review. There is also an effort to decrease I/O pressure when merging BKD segments. We noticed that merging large segments for high-dimensional points (like LatLonShape) caused a lot of I/O. Ignacio has changed the strategy used to merge segments, which appears to improve disk space usage and significantly improves indexing throughput, especially for high dimensions.

Interval Queries

We have been working on improving scoring for the IntervalQuery. Currently this query uses the same term weighting as SpanQuery. To improve on the scoring of similar functionality such as SpanQueries, Alan is experimenting with a sloppy interval frequency and a saturation function to see if more useful scores can be extracted. We hope to open an issue about this in the next couple of days.

PMC extension

Nick Knize has accepted his invitation to join the Lucene PMC in recognition of his efforts driving Geo forward. Congratulations Nick!

This Week in Elasticsearch and Apache Lucene - 2019-01-11
