Elasticsearch version 8.14.0

Also see Breaking changes in 8.14.

Known issues

  • If the Java library path contains directories that the Elasticsearch process does not have permission to access, the process fails to start with a NullPointerException: Cannot invoke "org.elasticsearch.nativeaccess.Systemd.notify_ready()" because "this.systemd" is null.
    This only affects on-prem installations of Elasticsearch. Environments running Elasticsearch in a container are not affected, nor is Elastic Cloud.
    The workaround is to grant Elasticsearch read access to the directory named in the java.nio.file.AccessDeniedException reported in the logs, repeating this until the process starts up properly; see here for further details. As of 8.14.1 the workaround is no longer necessary and the original permissions can be restored.
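
The check behind this workaround can be scripted. The following is a minimal sketch, assuming you already know the directories on the Java library path (for example from the java.library.path property of the Elasticsearch JVM). The function name unreadable_directories and the example path are hypothetical, and the script has to run as the same user as the Elasticsearch process for the result to be meaningful. It only reports directories and does not change any permissions.

```python
import os


def unreadable_directories(library_path, sep=os.pathsep):
    """Return directories in library_path that the current user cannot read.

    library_path is a separator-joined string such as the value of the JVM's
    java.library.path property (assumed to be supplied by the caller).
    """
    blocked = []
    for entry in library_path.split(sep):
        if not entry:
            continue  # skip empty segments such as a trailing separator
        # Listing a directory needs read and execute (search) permission on
        # POSIX systems. Directories hidden behind an inaccessible parent are
        # not detected by this simple check.
        if os.path.isdir(entry) and not os.access(entry, os.R_OK | os.X_OK):
            blocked.append(entry)
    return blocked


if __name__ == "__main__":
    # Hypothetical example value; substitute the real library path.
    example = os.pathsep.join(["/usr/lib", "/usr/local/lib"])
    for directory in unreadable_directories(example):
        print(f"Elasticsearch needs read access to: {directory}")
```

Any directory the script prints is a candidate for the read access grant described above. The java.nio.file.AccessDeniedException in the logs remains the authoritative source for which directory is failing.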

Breaking changes

Security
  • Prevent DLS/FLS if replication is assigned #108600
  • Apply stricter Document Level Security (DLS) rules for the validate query API with the rewrite parameter #105709
  • Apply stricter Document Level Security (DLS) rules for terms aggregations when min_doc_count is set to 0 #105714

Bug fixes

Aggregations
  • Cross check livedocs for terms aggs when index access control list is non-null #105714
  • ESQL: Enable VALUES agg for datetime #107016
  • Fix IOOBE in TTest aggregation when using filters #109034
  • Validate stats formatting in standard InternalStats constructor #107678 (issue: #107671)
Application
  • [Bugfix] Connector API - fix status serialisation issue in termquery #108365
  • [Connector API] Fix bug with filtering validation toXContent #107467
  • [Connector API] Fix bug with parsing *_doc_count nullable fields #108854
  • [Connector API] Fix bug with wrong target index for access control sync #109097
Authorization
  • Users with monitor privileges can access async_search/status endpoint even when setting keep_alive #107383
CAT APIs
CCR
  • Add ?master_timeout query parameter to ccr apis #105168
CRUD
  • Fix noop_update_total not being updated when using _bulk #105745 (issue: #105742)
  • Use correct system index bulk executor #106150
Cluster Coordination
  • Fix support for infinite ?master_timeout #107050
Data streams
  • Add non-indexed fields to ecs templates #106714
  • Fix bulk NPE when retrying failure redirect after cluster block #107598
  • Improve error message when rolling over DS alias #106708 (issue: #106137)
  • Only skip deleting a downsampled index if downsampling is in progress as part of DSL retention #109020
Downsampling
  • Fix downsample action request serialization #106919 (issue: #106917)
EQL
  • Use #addWithoutBreaking when adding a negative number of bytes to the circuit breaker in SequenceMatcher #107655
ES|QL
  • ESQL: Allow reusing BUCKET grouping expressions in aggs #107578
  • ESQL: Disable quoting in FROM command #108431
  • ESQL: Fix MV_DEDUPE when using data from an index #107577 (issue: #104745)
  • ESQL: Fix error message when failing to resolve aggregate groupings #108101 (issue: #108053)
  • ESQL: Fix treating all fields as MV in COUNT pushdown #106720
  • ESQL: Re-enable logical dependency check #105860
  • ESQL: median, count and count_distinct over constants #107414 (issues: #105248, #104900)
  • ES|QL fix no-length substring with supplementary (4-byte) character #107183
  • ES|QL: Fix usage of IN operator with TEXT fields #106654 (issue: #105379)
  • ES|QL: Improve support for TEXT fields in functions #106810
  • Fix docs generation of signatures for variadic functions #107865
  • [ESQL] Mark date_diff as requiring all three arguments #108834 (issue: #108383)
Health
  • Don’t stop checking if the HealthNode persistent task is present #105449 (issue: #98926)
  • Health monitor concurrency fixes #105674 (issue: #105065)
Highlighting
  • Check preTags and postTags params for empty values #106396 (issue: #69009)
  • Add fix for inconsistent text trimming in Unified Highlighter #99961 (issue: #101803)
Infra/CLI
  • Workaround G1 bug for JDK 22 and 22.0.1 #108571
Infra/Core
  • Add a check for the same feature being declared regular and historical #106285
  • Fix AffixSetting.exists to include secure settings #106745
  • Fix regression in get index settings (human=true) where the version was not displayed in human-readable format #107447
  • Nativeaccess: try to load all located libsystemds #108238 (issue: #107878)
  • Update several references to IndexVersion.toString to use toReleaseVersion #107828 (issue: #107821)
  • Update several references to TransportVersion.toString to use toReleaseVersion #107902
Infra/Logging
  • Log when updating AffixSetting using addAffixMapUpdateConsumer #97072
Infra/Node Lifecycle
  • Consider ShardRouting roles when calculating shard copies in shutdown status #106063
  • Wait indefinitely for HTTP connections on shutdown by default #106511
Infra/Scripting
  • Guard against a null scorer in painless execute #109048 (issue: #43541)
  • Painless: Apply true regex limit factor with FIND and MATCH operation #105670
Ingest Node
  • Catching StackOverflowErrors from bad regexes in GsubProcessor #106851
  • Fix uri_parts processor behaviour for missing extensions #105689 (issue: #105612)
  • Remove leading is_ prefix from Enterprise geoip docs #108518
  • Slightly better geoip databaseType validation #106889
License
Machine Learning
  • Fix NPE in ML assignment notifier #107312
  • Fix startOffset must be non-negative error in XLMRoBERTa tokenizer #107891 (issue: #104626)
  • Fix the position of spike, dip and distribution changes bucket when the sibling aggregation includes empty buckets #106472
  • Make OpenAI embeddings parser more flexible #106808
Mapping
  • Dedupe terms in terms queries #106381
  • Extend support of allowedFields to getMatchingFieldNames and getAllFields #106862
  • Fix for raw mapping merge of fields named "properties" #108867 (issue: #108866)
  • Handle infinity during synthetic source construction for scaled float field #107494 (issue: #107101)
  • Handle pass-through subfields with deep nesting #106767
  • Wrap "Pattern too complex" exception into an IllegalArgumentException #109173
Network
  • Fix HTTP corner-case response leaks #105617
Search
  • Add internalClusterTest for and fix leak in ExpandSearchPhase #108562 (issue: #108369)
  • Avoid attempting to load the same empty field twice in fetch phase #107551
  • Bugfix: Disable eager loading BitSetFilterCache on Indexing Nodes #105791
  • Cross-cluster painless/execute actions should check permissions only on target remote cluster #105360
  • Fix error 500 on invalid ParentIdQuery #105693 (issue: #105366)
  • Fix range queries for float/half_float fields when bounds are out of type’s range #106691
  • Fixing NPE when requesting [none] for stored_fields #104711
  • Fork when handling remote field-caps responses #107370
  • Handle parallel calls to createWeight when profiling is on #108041 (issues: #104131, #104235)
  • Harden field-caps request dispatcher #108736
  • Replace UnsupportedOperationException with IllegalArgumentException for non-existing columns #107038
  • Unable to retrieve multiple stored field values #106575
  • Validate model_id is required when using the learning_to_rank rescorer #107743
Security
  • Disable validate when rewrite parameter is sent and the index access control list is non-null #105709
  • Fix field caps and field level security #106731
Snapshot/Restore
  • Fix double-pausing shard snapshot #109148 (issue: #109143)
  • Treat 404 as empty register in AzureBlobStore #108900 (issue: #108504)
  • SharedBlobCacheService.maybeFetchRegion should use computeCacheFileRegionSize #106685
TSDB
  • Flip dynamic mapping condition when creating tsid #105636
Transform
  • Consolidate permissions checks #106413 (issue: #105794)
  • Disable PIT for remote clusters #107969
  • Make force-stopping the transform always remove persistent task from cluster state #106989 (issue: #106811)
  • Only trigger action once per thread #107232 (issue: #107215)
  • [Transform] Auto retry Transform start #106243
Vector Search
  • Fix multithreading copies in lib vec #108802
  • [8.14] Fix multithreading copies in lib vec #108810

Deprecations

Mapping
  • Deprecate allowing fields in scenarios where it is ignored #106031

Enhancements

Aggregations
  • Add a PriorityQueue backed by BigArrays #106361
  • Add new shard_seed parameter for random_sampler agg #104830
Allocation
  • Add allocation stats #105894
  • Add index forecasts to /_cat/allocation output #97561
Application
  • [Profiling] Add TopN Functions API #106860
  • [Profiling] Allow to override index settings #106172
  • [Profiling] Speed up serialization of flamegraph #105779
Authentication
  • Support Profile Activate with JWTs with client authn #105439 (issue: #105342)
Authorization
  • Allow users to get status of own async search tasks #106638
  • [Security Solution] Add read permission for third party agent indices for kibana_system #107046
Data streams
  • Add data stream lifecycle to kibana reporting template #106259
ES|QL
  • Add ES|QL Locate function #106899 (issue: #106818)
  • Add ES|QL signum function #106866
  • Add status for enrich operator #106036
  • Add two new OGC functions ST_X and ST_Y #105768
  • Adjust array resizing in block builder #106934
  • Bulk loading enrich fields in ESQL #106796
  • ENRICH support for TEXT fields #106435 (issue: #105384)
  • ESQL: Add timers to many status results #105421
  • ESQL: Allow grouping key inside stats expressions #106579
  • ESQL: Introduce expression validation phase #105477 (issue: #105425)
  • ESQL: Log queries at debug level #108257
  • ESQL: Regex improvements #106429
  • ESQL: Sum of constants #105454
  • ESQL: Support ST_DISJOINT #107007
  • ESQL: Support partially folding CASE #106094
  • ESQL: Use faster field caps #105067
  • ESQL: extend BUCKET with spans #107272
  • ESQL: perform a reduction on the data node #106516
  • Expand support for ENRICH to full set supported by ES ingest processors #106186 (issue: #106162)
  • Introduce ordinal bytesref block #106852 (issue: #106387)
  • Leverage ordinals in enrich lookup #107449
  • Serialize big array blocks #106373
  • Serialize big array vectors #106327
  • Specialize serialization for ArrayVectors #105893
  • Specialize serialization of array blocks #106102
  • Speed up serialization of BytesRefArray #106053
  • Support ST_CONTAINS and ST_WITHIN #106503
  • Support ST_INTERSECTS between geometry column and other geometry or string #104907 (issue: #104874)
Engine
  • Add metric for calculating index flush time excluding waiting on locks #107196
Highlighting
  • Enable encoder and tags_schema highlighting settings at field level #107224 (issue: #94028)
ILM+SLM
  • Add a flag to re-enable writes on the final index after an ILM shrink action. #107121 (issue: #106599)
Indices APIs
  • Wait forever for IndexTemplateRegistry asset installation #105985
Infra/CLI
  • Enhance search tier GC options #106526
  • Increase KDF iteration count in KeyStoreWrapper #107107
Infra/Core
  • Add pluggable BuildVersion in NodeMetadata #105757
Infra/Metrics
  • Infrastructure for metering the update requests #105063
  • DocumentParsingObserver to accept an indexName to allow skipping system indices #107041
Infra/Scripting
Ingest Node
  • Add support for the Anonymous IP database to the geoip processor #107287 (issue: #90789)
  • Add support for the Enterprise database to the geoip processor #107377
  • Adding cache_stats to geoip stats API #107334
  • Support data streams in enrich policy indices #107291 (issue: #98836)
Machine Learning
  • Add GET _inference for all inference endpoints #107517
  • Added a timeout parameter to the inference API #107242
  • Enable retrying on 500 error response from Cohere text embedding API #105797
Mapping
  • Make int8_hnsw our default index for new dense-vector fields #106836
Ranking
  • Add retrievers using the parser-only approach #105470
Search
  • Add Lucene spanish plural stemmer #106952
  • Add modelId and modelText to KnnVectorQueryBuilder #106068
  • Add a SIMD (Neon) optimised vector distance function for int8 #106133
  • Add transport version for search load autoscaling #106377
  • CCS with minimize_roundtrips performs incremental merges of each SearchResponse #105781
  • Track ongoing search tasks #107129
Security
  • Invalidating cross cluster API keys requires manage_security #107411
  • Show owner realm_type for returned API keys #105629
Snapshot/Restore
  • Add setting for max connections to S3 #107533
  • Distinguish different snapshot failures by log level #105622
Stats
  • (API+) CAT Nodes alias for shard header to match CAT Allocation #105847
  • Add total size in bytes to doc stats #106840 (issue: #97670)
TSDB
  • Improve short-circuiting downsample execution #106563
  • Support non-keyword dimensions as routing fields in TSDB #105501
  • Text fields are stored by default in TSDB indices #106338 (issue: #97039)
Transform
  • Check node shutdown before fail #107358 (issue: #100891)
  • Do not log error on node restart when the transform is already failed #106171 (issue: #106168)

New features

Application
  • Allow typed_keys for search application Search API #108007
  • [Connector API] Support cleaning up sync jobs when deleting a connector #107253
ES|QL
  • ESQL: Values aggregation function #106065 (issue: #103600)
  • ESQL: allow sorting by expressions and not only regular fields #107158
  • Support ES|QL requests through the NodeClient::execute #106244
Indices APIs
  • Add granular error list to alias action response #106514 (issue: #94478)
Machine Learning
  • Add Cohere rerank to _inference service #106378
  • Add support for Azure OpenAI embeddings to inference service #107178
  • Create default word based chunker #107303
  • Text structure endpoints to determine the structure of a list of messages and of an indexed field #105660
Mapping
Security
  • Get and Query API Key with profile uid #106531
Vector Search
  • Adding support for hex-encoded byte vectors on knn-search #105393
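
To illustrate the hex encoding mentioned in the last item, the sketch below converts a list of signed byte values into a hex string in Python. It assumes that each dimension is a signed 8-bit integer written as two hex characters in two's complement form. The helper names bytes_to_hex and hex_to_bytes are hypothetical and are not part of any Elasticsearch client, so check the kNN search documentation for the exact accepted format.

```python
def bytes_to_hex(vector):
    """Encode signed 8-bit integers (-128..127) as a hex string."""
    for value in vector:
        if not -128 <= value <= 127:
            raise ValueError(f"{value} does not fit in a signed byte")
    # Masking with 0xFF maps negative values to their two's complement byte.
    return bytes(value & 0xFF for value in vector).hex()


def hex_to_bytes(text):
    """Decode a hex string back into signed 8-bit integers."""
    return [b - 256 if b > 127 else b for b in bytes.fromhex(text)]
```

Under these assumptions, bytes_to_hex([-5, 9, -12]) returns "fb09f4", and hex_to_bytes("fb09f4") recovers the original list.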

Upgrades

Infra/Core
Ingest Node
  • Updating the tika version to 2.9.1 in the ingest attachment plugin #106315
Network
Packaging
  • Update bundled JDK to Java 22 (again) #108654