This Week in Elasticsearch - 2019-12-20

Elasticsearch Highlights

Scripting

We merged #50106, which substantially improves performance when queries that use scripts, including aggregations, are re-executed, provided the other caching requirements are met. The change enables the query shard cache for scripts that avoid the whitelisted API methods that are non-deterministic, such as those that generate random numbers or random UUIDs or read the current system time.
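
The caching rule can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: the class, the method names listed in NON_DETERMINISTIC_CALLS, and the idea of describing a script by the set of methods it calls are all assumptions made for illustration.

```python
# Illustrative sketch only; every name here is hypothetical, not an Elasticsearch API.

# Assumed examples of API methods whose results differ between executions.
NON_DETERMINISTIC_CALLS = frozenset({
    "Math.random",
    "UUID.randomUUID",
    "System.currentTimeMillis",
})


def is_deterministic(called_methods):
    """Return True if the script calls none of the non-deterministic methods."""
    return NON_DETERMINISTIC_CALLS.isdisjoint(called_methods)


class ScriptAwareCache:
    """Caches query results, but only for queries whose scripts are deterministic."""

    def __init__(self):
        self._entries = {}

    def execute(self, request_key, called_methods, run_query):
        # A non-deterministic script must run every time, so bypass the cache.
        if not is_deterministic(called_methods):
            return run_query()
        # Deterministic script: run once, then serve repeats from the cache.
        if request_key not in self._entries:
            self._entries[request_key] = run_query()
        return self._entries[request_key]
```

In this sketch a repeated request whose script only calls deterministic methods runs once, while a script that calls a random-number method is executed on every request.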

Searchable snapshots

We have added a simple interface for creating snapshot-backed indices in Elasticsearch. It works by creating a repository of type "searchable", which wraps another repository so that restoring from it yields a snapshot-backed index instead of a real index. The goal for now is to add just enough functionality to let us start looking at performance aspects. We expect to offer a much more streamlined UX in due course.

We have added a searchable snapshot directory implementation, which exposes the snapshot of a shard as a Lucene Directory, allowing Lucene to interact with the files of the snapshot. We are also adding a directory wrapper that tracks statistics on Lucene's various access patterns. These statistics will guide the design of our node-local caching system, which will play a big role in making access to searchable snapshots blazingly fast.
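
As a rough illustration of the statistics-tracking idea, the sketch below wraps a seekable file and counts reads and seeks. It is written in Python rather than against the Lucene Directory API, and the class name and the particular counters are assumptions; the statistics the real wrapper collects are not described here.

```python
import io


class StatsTrackingFile:
    """Wraps a seekable binary file and records how it is accessed."""

    def __init__(self, inner):
        self._inner = inner
        self.read_calls = 0
        self.bytes_read = 0
        self.forward_seeks = 0
        self.backward_seeks = 0

    def seek(self, position):
        # Classify the seek by direction relative to the current position.
        current = self._inner.tell()
        if position > current:
            self.forward_seeks += 1
        elif position < current:
            self.backward_seeks += 1
        return self._inner.seek(position)

    def read(self, size):
        data = self._inner.read(size)
        self.read_calls += 1
        self.bytes_read += len(data)  # count bytes actually returned
        return data


# Usage: an in-memory buffer stands in for a snapshot file.
snapshot_file = StatsTrackingFile(io.BytesIO(b"0123456789"))
snapshot_file.read(4)
snapshot_file.seek(8)
snapshot_file.read(4)  # only 2 bytes remain
snapshot_file.seek(0)
```

Counters like these show whether access is mostly sequential or random, which is the kind of information a cache design would need.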

Remote cluster connections

We are finishing up the new "proxy" remote connection mode to support CCR/CCS in Cloud environments. After discussion, we added support for sending a hostname in the TLS SNI extension when this mode is used. ECE/ESS encodes the remote cluster name in the hostname, and the proxy, which is shared by multiple clusters, needs that value to route each connection to the appropriate cluster.
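
To show what the proxy sees, here is a small sketch using Python's standard ssl module rather than Elasticsearch's transport code. It starts a TLS handshake in memory and returns the ClientHello bytes, which carry the SNI hostname in clear text. The function name and the hostname used in the usage note are hypothetical.

```python
import ssl


def client_hello_with_sni(server_name):
    """Start a TLS handshake in memory and return the raw ClientHello bytes."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes from the peer (none in this sketch)
    outgoing = ssl.MemoryBIO()  # bytes that would be sent to the proxy
    # server_hostname is the value the ssl module places in the SNI extension.
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and a reply is awaited.
        pass
    return outgoing.read()
```

Calling client_hello_with_sni("cluster-a.example.com") yields bytes that contain that hostname, so a proxy can pick the target cluster before the encrypted part of the session begins.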

We also renamed the new connection mode from "simple" to "proxy". The rename led to a clash with an existing undocumented setting (cluster.remote.cluster_name.proxy), which interferes with any new settings namespaced under proxy (e.g. cluster.remote.cluster_name.proxy.address). We explored removing this setting in the same release that adds the new settings, but that turned out to be very tricky in the presence of rolling upgrades. Instead, we decided to remove the mode namespacing and rely on validation to ensure that settings are not used with the wrong mode.
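
The validation approach might look roughly like the sketch below. The setting keys ("mode", "seeds", "proxy_address") and the name "sniff" for the other connection mode are assumptions for illustration, not the final Elasticsearch setting names.

```python
# Hypothetical setting names and modes, chosen for illustration only.
MODE_SETTINGS = {
    "sniff": {"seeds"},
    "proxy": {"proxy_address"},
}


def validate_remote_settings(settings):
    """Raise ValueError if a setting belonging to another mode is present."""
    mode = settings.get("mode")
    if mode not in MODE_SETTINGS:
        raise ValueError(f"unknown connection mode [{mode}]")
    for other_mode, keys in MODE_SETTINGS.items():
        if other_mode == mode:
            continue
        # Without per-mode namespaces, misuse is caught by checking key ownership.
        misplaced = sorted(keys.intersection(settings))
        if misplaced:
            raise ValueError(
                f"settings {misplaced} require mode [{other_mode}] "
                f"but mode is [{mode}]"
            )
```

With flat setting names, a check of this kind takes over the job that a per-mode namespace would otherwise have done.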

Changes in Elasticsearch

Changes in 8.0:

  • Fixes to task result index mapping #50359
  • Refactor environment variable processing for Docker #49612

Changes in 7.6:

  • Add --data-dir option to run task #50342
  • Scripting: ScriptFactory not required by compile #50344
  • [DOCS] Adds inference processor documentation #50204
  • Add ILM history store index #50287
  • Add per-field metadata. #49419
  • Geo: Switch generated WKT to upper case #50285
  • Scripting: Cache script results if deterministic #50106
  • Extract a create index method that only manipulates the ClusterState #50240
  • Do not load SSLService in plugin constructor #49667
  • Optimize composite aggregation based on index sorting #48399
  • Fix Index Deletion During Partial Snapshot Create #50234
  • Recovery buffer size 16B smaller #50100
  • "CONTAINS" support for BKD-backed geo_shape and shape fields #50141
  • [Transform] add actual timeout in message #50140
  • Respect ES_PATH_CONF on package install #50158
  • Validate exporter type is HTTP for HTTP exporter #49992

Changes in 7.5:

  • Handle renaming the README #50404
  • Always consume the body in has privileges #50298
  • Ensure global buildinfo plugin is applied for distro download #50249
  • Fix ingest simulate response document order if processor executes async #50244
  • SQL: fix NPE for JdbcResultSet.getDate(param, Calendar) calls #50184
  • Account trimAboveSeqNo in committed translog generation #50205
  • Fix Index Deletion during Snapshot Finalization #50202
  • Migrate peer recovery from translog to retention lease #49448
  • Improve DateFieldMapper ignore_malformed handling #50090

Changes in 6.8:

  • SQL: Fix issue with CAST and NULL checking. #50371

Changes in Elasticsearch SQL ODBC Driver

Changes in 7.6:

  • Fix test: force JSON format for tests #203
  • DSN editor: add option for payload compression #202
  • Enable payload compression #201
  • Compression: add zlib subtree #200

Changes in Rally

Changes in 1.4.0:

  • Add task exclude filter #844
  • Store Disk I/O metrics if available #841