Elasticsearch version 8.9.0

Also see Breaking changes in 8.9.

Known issues

  • Question Answering fails on long input text. If the context supplied to the task is longer than the model’s max_sequence_length and truncate is set to none, inference fails with the message "question answering result has invalid dimension". (issue: #97917)
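
The failing combination described above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of Elasticsearch: the function name and its parameters are hypothetical, and the token count is assumed to come from whatever tokenizer the caller already uses for the model.

```python
def triggers_known_issue(context_token_count, max_sequence_length, truncate):
    """Return True when a request matches the failing combination.

    Hypothetical helper: the caller supplies the token count of the
    context, the model's max_sequence_length, and the truncate setting.
    """
    # The issue only occurs when truncation is disabled AND the context
    # is longer than the model can accept.
    return truncate == "none" and context_token_count > max_sequence_length
```

A caller could use the result to pick a truncating setting or to shorten the context before running inference.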

Breaking changes

Aggregations
  • Switch TDigestState to use HybridDigest by default #96904

Bug fixes

Allocation
  • Attempt to fix delay allocation #95921
  • Fix NPE in Desired Balance API #97775
  • Fix autoexpand during node replace #96281
Authorization
  • Resolving wildcard application names without prefix query #96479 (issue: #96465)
CRUD
  • Fix retry_on_conflict parameter in update API to not retry indefinitely #96262
  • Handle failure in TransportUpdateAction#handleUpdateFailureWithRetry #97290 (issue: #97286)
Cluster Coordination
  • Avoid getStateForMasterService where possible #97304
  • Become candidate on publication failure #96490 (issue: #96273)
  • Fix cluster settings update task acknowledgment #97111
Data streams
  • Accept timestamp as object at root level #97401
Geo
  • Fix bug when creating empty geo_lines #97509 (issue: #97311)
  • Fix time-series geo_line to include reduce phase in MergedGeoLines #96953 (issue: #96983)
  • Support for Byte and Short as vector tiles features #97619 (issue: #97612)
ILM+SLM
  • Limit the details field length we store for each SLM invocation #97038 (issue: #96918)
Infra/CLI
Infra/Core
  • Capture max processors in static init #97119 (issue: #97088)
  • Interpret microseconds cpu stats from cgroups2 properly as nanos #96924 (issue: #96089)
Infra/Logging
  • Add slf4j-nop in order to prevent startup warnings #95459
Infra/REST API
  • Fix tchar pattern in RestRequest #96406
Infra/Scripting
  • Fix Painless method lookup over unknown super interfaces #97062 (issue: #97022)
Infra/Settings
  • Enable validation for versionSettings #95874 (issue: #95873)
Ingest Node
  • Fixing DateProcessor when the format is epoch_millis #95996
  • Fixing GeoIpDownloaderStatsAction$NodeResponse serialization by defensively copying inputs #96777 (issue: #96438)
  • Trim field references in reroute processor #96941 (issue: #96939)
Machine Learning
  • Catch exceptions thrown during inference and report as errors #2542
  • Fix WordPiece tokenization where stripping accents results in an empty string #97354
  • Improve model downloader robustness #97274
  • Prevent high memory usage by evaluating batch inference singularly #2538
Mapping
  • Avoid stack overflow while parsing mapping #95705 (issue: #52098)
  • Fix mapping parsing logic to determine synthetic source is active #97355 (issue: #97320)
Ranking
  • Fix sub_searches serialization bug #97587
Recovery
  • Promptly fail recovery from snapshot #96421 (issue: #95525)
Search
  • Prevent instantiation of top_metrics when sub-aggregations are present #96180 (issue: #95663)
  • Set new providers before building FetchSubPhaseProcessors #97460 (issue: #96284)
Snapshot/Restore
  • Fix blob cache races/assertion errors #96458
  • Fix reused/recovered bytes for files that are only partially recovered from cache #95987 (issues: #95970, #95994)
  • Fix reused/recovered bytes for files that are recovered from cache #97278 (issue: #95994)
  • Refactor RestoreClusterStateListener to use ClusterStateObserver #96662 (issue: #96425)
TSDB
  • Error message for misconfigured TSDB index #96956 (issue: #96445)
  • Min score for time series #96878
Task Management
  • Improve cancellability in TransportTasksAction #96279
Transform
  • Improve reporting status of the transform that is about to finish #95672
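
The Infra/Core fix for cgroups2 CPU stats (#96924) comes down to a unit conversion: cgroup v2 reports CPU times in microseconds in keys ending in `_usec`, and they have to be multiplied by 1000 to be read as nanoseconds. The sketch below only illustrates that conversion. It is not the Elasticsearch implementation, and the function name and output key naming are assumptions made for the example.

```python
def parse_cpu_stat_nanos(text):
    """Parse cgroup v2 cpu.stat content, converting *_usec values to nanoseconds.

    Illustrative helper (hypothetical name). Keys ending in "_usec" are
    renamed to end in "_nanos"; all other counters are kept unchanged.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, raw = parts
        value = int(raw)
        if key.endswith("_usec"):
            # 1 microsecond = 1000 nanoseconds
            stats[key[: -len("_usec")] + "_nanos"] = value * 1000
        else:
            stats[key] = value
    return stats
```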

Enhancements

Aggregations
  • Add cluster setting to SearchExecutionContext to configure TDigestExecutionHint #96943
  • Add support for dynamic pruning to cardinality aggregations on low-cardinality keyword fields #92060
  • Make TDigestState configurable #96794
  • Skip SortingDigest when merging a large digest in HybridDigest #97099
  • Support value retrieval in top_hits #95828
Allocation
  • Take into account expectedShardSize when initializing shard in simulation #95734
Analysis
  • Create .synonyms system index #95548
Application
  • Add template parameters to Search Applications #95674
  • Chunk profiling stacktrace response #96340
  • [Profiling] Add status API #96272
  • [Profiling] Allow to upgrade managed ILM policy #96550
  • [Profiling] Introduce ILM for K/V indices #96268
  • [Profiling] Require POST to retrieve stacktraces #96790
  • [Profiling] Tweak default ILM policy #96516
  • [Search Applications] Support arrays in stored mustache templates #96197
Authentication
  • Header validator with Security #95112
Authorization
  • Add Search ACL filter index prefix to the enterprise search user #96885
  • Ensure checking application privileges work with nested-limited roles #96970
Autoscaling
  • Add shard explain info to ReactiveReason about unassigned shards #88590 (issue: #85243)
DLM
  • Add auto force merge functionality to DLM #95204
  • Adding data_lifecycle to the _xpack/usage API #96177
  • Adding manage_data_stream_lifecycle index privilege and expanding view_index_metadata for access to data stream lifecycle APIs #95512
  • Allow for the data lifecycle and the retention to be explicitly nullified #95979
Data streams
  • Add support for logs@custom component template for logs-*-* data streams #95481 (issue: #95469)
  • Adding ECS dynamic mappings component and applying it to logs data streams by default #96171 (issue: #95538)
  • Adjust ECS dynamic templates to support subobjects: false #96712
  • Automatically parse log events in logs data streams, if their message field contains JSON content #96083 (issue: #95522)
  • Change default of ignore_malformed to true in logs-*-* data streams #95329 (issue: #95224)
  • Set @timestamp for documents in logs data streams if missing and add support for custom pipeline #95971 (issues: #95537, #95551)
  • Update data streams implicit timestamp ignore_malformed settings #96051
Engine
  • Cache modification time of translog writer file #95107
  • Trigger refresh when shard becomes search active #96321 (issue: #95544)
Geo
  • Add brute force approach to GeoHashGridTiler #96863
  • Asset tracking - geo_line in time-series aggregations #94954
ILM+SLM
  • Chunk the GET _ilm/policy response #97251 (issue: #96569)
  • Move get lifecycle API to Management thread pool and make cancellable #97248 (issue: #96568)
  • Reduce WaitForNoFollowersStep requests indices shard stats #94510
Indices APIs
  • Bootstrap profiling indices at startup #95666
Infra/Node Lifecycle
  • SIGTERM node shutdown type #95430
Ingest Node
  • Add mappings for enrich fields #96056
  • Ingest: expose reroute inquiry/reset via Elastic-internal API bridge #96958
Machine Learning
  • Improved compliance with memory limitations #2469
  • Improve detection of calendar cyclic components with long bucket lengths #2493
  • Improve detection of time shifts, for example for daylight saving #2479
Mapping
  • Allow unsigned long field to use decay functions #96394 (issue: #89603)
Ranking
  • Add multiple queries for ranking to the search endpoint #96224
Recovery
  • Implement StartRecoveryRequest#getDescription #95731
Search
  • Add search shards endpoint #94534
  • Don’t generate stacktrace in EarlyTerminationException and TimeExceededException #95910
  • Speed up binary vector decoding #96716
  • Improve brute force vector search speed by using Lucene functions #96617
  • Include search idle info to shard stats #95740 (issue: #95727)
  • Integrate CCS with new search_shards API #95894 (issue: #93730)
  • Introduce a filtered collector manager #96824
  • Introduce minimum score collector manager #96834
  • Skip shards when querying constant keyword fields #96161 (issue: #95541)
  • Support CCS minimize round trips in async search #96012
  • Support for pattern_replace filter in keyword normalizer #96588
  • Support null_value for rank_feature field type #95811
Security
  • Add "_storage" internal user #95694
Snapshot/Restore
  • Reduce overhead in blob cache service get #96399
Stats
  • Add ingest information to the cluster info endpoint #96328 (issue: #95392)
  • Add script information to the cluster info endpoint #96613 (issue: #95394)
  • Add thread_pool information to the cluster info endpoint #96407 (issue: #95393)
TSDB
  • Feature: include unit support for time series rate aggregation #96605 (issue: #94630)
Vector Search
  • Leverage SIMD hardware instructions in Vector Search #96453 (issue: #96370)
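
As background for the Mapping enhancement above that allows decay functions on unsigned long fields (#96394), the sketch below evaluates a Gaussian decay for values in the unsigned 64-bit range. It follows the commonly documented gauss decay formula and is an illustration under that assumption, not the Elasticsearch code. The function and parameter names are hypothetical.

```python
import math

UNSIGNED_LONG_MAX = 2**64 - 1  # largest value an unsigned long field can hold


def gauss_decay(value, origin, scale, offset=0, decay=0.5):
    """Gaussian decay score in (0, 1] for an unsigned-long value.

    The score is 1.0 within `offset` of `origin` and equals `decay`
    at a distance of `offset + scale`.
    """
    if not (0 <= value <= UNSIGNED_LONG_MAX and 0 <= origin <= UNSIGNED_LONG_MAX):
        raise ValueError("value and origin must fit in an unsigned 64-bit integer")
    # Python integers are arbitrary precision, so the distance is exact
    # even for values above 2**63.
    distance = max(0, abs(value - origin) - offset)
    sigma_sq = -(scale**2) / (2.0 * math.log(decay))
    return math.exp(-(distance**2) / (2.0 * sigma_sq))
```

Computing the distance with integers before converting to floating point avoids losing precision near the top of the unsigned long range.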

New features

Application
  • Enable analytics geoip in behavioral analytics #96624
Authorization
  • Support restricting access of API keys to only certain workflows #96744
Data streams
  • Adding ability to auto-install ingest pipelines and refer to them from index templates #95782
Geo
ILM+SLM
  • Enhance ILM Health Indicator #96092
Infra/Node Lifecycle
  • Gracefully shut down Elasticsearch #96363
Infra/Plugins
  • [Fleet] Add .fleet-secrets system index #95625 (issue: #95143)
Machine Learning
  • Add support for xlm_roberta tokenized models #94089
  • Removes the technical preview admonition from query_vector_builder docs #96735
Snapshot/Restore
  • Add repo throttle metrics to node stats api response #96678 (issue: #89385)
Stats

Upgrades

Infra/Transport API
  • Bump TransportVersion to the first non-release version number. Transport protocol is now versioned independently of release version. #95286
Network
  • Upgrade Netty to 4.1.92 #95575
  • Upgrade Netty to 4.1.94.Final #97112
Search
  • Upgrade Lucene to a 9.7.0 snapshot #96433
  • Upgrade to new lucene snapshot 9.7.0-snapshot-a8602d6ef88 #96741