This Week in Elasticsearch and Apache Lucene - 2018-11-16
We’ve added an optimization for frozen shards that allows executing “can_match” phases efficiently without opening the underlying index reader. This is particularly useful for time-based indices, where older indices are typically moved to frozen indices. Filtering frozen indices with date filters can then efficiently skip large portions of frozen shards that can't match the query in the pre-filter phase. We also added documentation for using frozen indices as well as the relevant _freeze and _unfreeze APIs. If you want to learn more about using frozen indices, give this one a read.
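As a sketch of how this fits together (index and field names here are hypothetical), an older time-based index can be frozen and then searched with a date filter; note that frozen indices are excluded from searches by default, so a flag along the lines of ignore_throttled=false would be needed to include them:

```
POST /logs-2018-01/_freeze

GET /logs-*/_search?ignore_throttled=false
{
  "query": {
    "range": {
      "@timestamp": { "gte": "now-7d" }
    }
  }
}
```

With the pre-filter optimization, frozen shards whose date ranges can't match the query are skipped without ever opening their index readers.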
We’re continuing working on the close index API, making core changes to our replication and shard-level permit system to allow a clean transition from open to closed index.
Geo in SQL
We’ve been working on changes to make geo in Elasticsearch more robust and have now laid enough groundwork to expose the first GeoSQL functions in the SQL plugin. The first such function is ST_WKTToSQL, which converts geo-shapes represented in WKT (Well-Known Text) into GeoShape objects that Elasticsearch understands.
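As a hedged illustration of what calling this function through the SQL endpoint might look like (the query shape is our assumption, not taken from the docs):

```
POST /_xpack/sql?format=txt
{
  "query": "SELECT ST_WKTToSQL('POINT (-122.34 47.61)') AS location"
}
```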
Pre-hashed Password in PUT Users API
We added formal support for the password_hash field in the Put User API. This came about because of a request for a bulk API for creating users; we discussed the notion of a bulk API for users and did not feel it was the right direction. However, the request identified a valid concern with the throughput of the users API. We performed a cursory performance test with a naive, single-threaded Python script that creates users. With default options, we were able to create about 8.5 users per second, whereas with client-side hashing and refresh=false, we were able to create about 37 users per second. For an administrator with 5000 users to create, this would take the overall time from about 10 minutes down to about 2.5 minutes.
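For illustration, a Put User call with a pre-computed hash might look like the following (the user name, roles, and hash value are hypothetical; the hash must be produced with the same algorithm the cluster is configured to use for password hashing):

```
PUT /_xpack/security/user/jacknich?refresh=false
{
  "password_hash": "$2a$10$hypothetical-bcrypt-hash-generated-client-side",
  "roles": [ "monitoring_user" ]
}
```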
We published the CCR getting started guide and overview docs for the CCR beta. With 6.5 out the door, our focus is now on driving to a successful GA. We are moving the auto-follow logic (based on the auto-follow patterns) to long-polling. The first step of this effort is to add a new option to the cluster state API to wait, up to a timeout, for a given metadata version.
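Once that option is in place, a long-poll of the cluster state metadata might look something like this (the parameter names are our assumption of the final shape):

```
GET /_cluster/state/metadata?wait_for_metadata_version=42&wait_for_timeout=30s
```

The call would return immediately if the cluster state metadata version is already at or beyond the requested version, and otherwise block until it is, or until the timeout elapses.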
We’ve implemented a fix to address indexing slowdowns we observed in update- or refresh-heavy use cases and will be benchmarking the effects of this fix.
Tasks API + Security
A bug was found shortly after the 6.5.0 release: the permissions granted by the Kibana system role prevent full use of the tasks API. Ultimately, this boils down to the fact that we leak the implementation detail that tasks are stored in an index, and the kibana_system role doesn't grant access to that index. For 6.5.1, we added these permissions as a short-term fix. As a temporary workaround, users may run Kibana as a user with both the kibana_system role and a role that grants the "create_index", "read", and "create" privileges on the .tasks index.
Audit Request ID
We opened a PR that will add a synthesized ID based on a user’s request to each event in the file audit log. This will allow users to correlate events such as “authentication success” and “access granted” for the same request.
We are working towards running all existing integration tests and REST test suites using Zen2 and are very close to putting the last missing pieces into place for this.
We added the capability for safely down-scaling the number of master-eligible nodes in a cluster. With this, we can switch a substantial portion of our integration tests over to Zen2. We also introduced a bootstrapping process that will allow Elasticsearch to determine its own initial cluster configuration. At the moment this is controlled by a temporary node setting which works well for our REST tests (where we are sure to start the right number of nodes or die trying). Initial tests with this PR show that we can soon switch our REST tests over to Zen2.
We’ve also completed the most critical aspect of the cluster state persistence layer that will give us much stronger atomicity guarantees.
Finally, we implemented serialization compatibility between Zen1 and Zen2 transport actions, allowing a Zen2 node to join a fully-formed cluster with a Zen1 master and vice-versa. Follow-up work will focus on failure conditions and an automated transition from a Zen1 master to a Zen2 master, ultimately enabling a smooth rolling upgrade experience from 6.x to 7.0.
Changes in 6.5:
- Grant .tasks access to kibana_system role #35573
- Handle IndexOrDocValuesQuery in composite aggregation #35392
- SQL: clear the cursor if nested inner hits are enough to fulfill the query required limits #35398
- Correct implemented interface of ParsedReverseNested #35455
- SQL: Fix query translation for scripted queries #35408
- Upgrade to Joda 2.10.1 #35410
Changes in 6.6:
- HLRC: migration api - upgrade #34898
- [RCI] Check blocks while having index shard permit in TransportReplicationAction #35332
- HLRC: Add parameters to stopRollupJob API #35545
- Clean up XPackInfoResponse class and related tests #35547
- Extract RunOnce into a dedicated class #35489
- Suppress CachedTimeThread in hot threads output #35558
- HLRC: Adding ML Update Filter API #35522
- Add Delete Privileges API to HLRC #35454
- HLRC: add support for get license basic/trial status API #33176
- Formal support for "password_hash" in Put User #35242
- Add stop rollup job support to HL REST Client #34702
- [Rollup] Add wait_for_completion option to StopRollupJob API #34811
- HLRC: Adding ml get filters api #35502
- Rest HL client: Add watcher stats API #35185
- HLRC support for getTask #35166
- Allow efficient can_match phases on frozen indices #35431
- [HLRC] Added support for CCR Put Follow API #35409
- Handle OS pretty name on old OS without OS release #35453
- Geo: enables coerce support in WKT polygon parser #35414
- [HLRC] Add GetRollupIndexCaps API #35102
- Fix the names of CCR stats endpoints in usage API #35438
- Add a java level freeze/unfreeze API #35353
Changes in 7.0:
- Replace usages of AtomicBoolean based block of code by the RunOnce class #35553
Most blockers for Lucene 7.6 are resolved and we will cut the release branch soon.
Searching for geo points in a polygon got faster
A costly operation when running point-in-polygon queries is checking whether a line segment crosses a rectangle. We added a simple optimization that first checks whether either end of the segment lies within the rectangle before running more costly computations. This yielded a 20% throughput improvement for one of our benchmarking queries, which searches for points in London.
- Should we stop multiplying every BM25 score by (k1+1)?
- Can we garbage-collect unused fields?
- Toke is proposing interesting ideas to speed up reads on sparse doc values, but there is disagreement about whether this should be added to existing codecs via a cache, or only folded into a new codec and computed at index time.
- Itamar Syn-Hershko would like the simple query parser to support a field operator.
- Perhaps we should make it easier to walk token graphs in order to more easily add support for graphs to a greater number of token filters.
- Christophe Bismuth has worked on making ExitableDirectoryReader capable of interrupting range and geo queries.
- Christophe is also working with us on early-terminating queries sorted by _doc and speeding up collection of top-k hits with constant-scoring queries.
- We found a bug: identifying the ears of a polygon based on the encoded coordinates is unsafe; the decoded coordinates must be used instead.
- We identified room for improvement when merging point fields that have data-only dimensions as we are still doing operations on these dimensions that are only necessary for indexed dimensions.