Breaking changes in 8.0

This section discusses the changes that you need to be aware of when migrating your application to Elasticsearch 8.0.

See also Release highlights and Release notes.

Note

Coming in 8.0.0.

Indices created before 7.0

Elasticsearch 8.0 can read indices created in version 7.0 or above. An Elasticsearch 8.0 node will not start in the presence of indices created in a version of Elasticsearch before 7.0.

Important

Reindex indices from Elasticsearch 6.x or before

Indices created in Elasticsearch 6.x or before will need to be reindexed with Elasticsearch 7.x in order to be readable by Elasticsearch 8.x.

Analysis changes

The nGram and edgeNGram token filter names have been removed

The nGram and edgeNGram token filter names, deprecated since version 6.4, have been removed. Since version 7.0, both token filters can only be used under their alternative names, ngram and edge_ngram.
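As a minimal sketch of the rename, the snippet below rewrites the type of custom token filter definitions in an analysis settings dictionary. The function name `rename_legacy_ngram_filters` and the sample filter `my_ngrams` are illustrative assumptions. It only touches filter definitions, so analyzers that reference the legacy names directly would need the same treatment.

```python
import copy

# Legacy token filter names and the names that replace them.
LEGACY_FILTER_TYPES = {"nGram": "ngram", "edgeNGram": "edge_ngram"}


def rename_legacy_ngram_filters(analysis):
    """Return a copy of an analysis block with legacy filter types renamed."""
    updated = copy.deepcopy(analysis)
    for definition in updated.get("filter", {}).values():
        legacy = definition.get("type")
        if legacy in LEGACY_FILTER_TYPES:
            definition["type"] = LEGACY_FILTER_TYPES[legacy]
    return updated


# Hypothetical analysis settings that still use a removed name.
old_analysis = {
    "filter": {
        "my_ngrams": {"type": "nGram", "min_gram": 2, "max_gram": 3},
        "my_stop": {"type": "stop"},
    }
}
new_analysis = rename_legacy_ngram_filters(old_analysis)
print(new_analysis["filter"]["my_ngrams"]["type"])
```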

Allocation changes

Auto-release flood-stage block no longer optional

If a node exceeds the flood-stage disk watermark, Elasticsearch adds a block to all of its indices to prevent further writes, as a last-ditch attempt to stop the node from completely exhausting its disk space. From 7.4 onwards the block is, by default, automatically removed when the node drops below the high watermark again, but this behaviour could be disabled by setting the system property es.disk.auto_release_flood_stage_block to false. This behaviour is no longer optional, and this system property must no longer be set.

Circuit breaker changes

In Flight Request Circuit Breaker

The name of the in flight requests circuit breaker in log output and diagnostic APIs (such as the node stats API) changes from in_flight_requests to inflight_requests to align it with the name of the corresponding settings.

Discovery changes

Removal of old discovery settings

All settings under the discovery.zen namespace, which existed only for backwards compatibility (BWC) reasons in 7.x, are no longer supported. In particular, this includes:

  • discovery.zen.no_master_block
  • discovery.zen.hosts_provider
  • discovery.zen.publish_timeout
  • discovery.zen.commit_timeout
  • discovery.zen.publish_diff.enable
  • discovery.zen.ping.unicast.concurrent_connects
  • discovery.zen.ping.unicast.hosts.resolve_timeout
  • discovery.zen.ping.unicast.hosts
  • discovery.zen.unsafe_rolling_upgrades_enabled
  • discovery.zen.fd.connect_on_network_disconnect
  • discovery.zen.fd.ping_interval
  • discovery.zen.fd.ping_timeout
  • discovery.zen.fd.ping_retries
  • discovery.zen.fd.register_connection_listener
  • discovery.zen.join_retry_attempts
  • discovery.zen.join_retry_delay
  • discovery.zen.max_pings_from_another_master
  • discovery.zen.send_leave_request
  • discovery.zen.master_election.wait_for_joins_timeout
  • discovery.zen.master_election.ignore_non_master_pings
  • discovery.zen.publish.max_pending_cluster_states
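
As a rough pre-upgrade check, the sketch below scans a flat settings dictionary for keys in the removed namespace. The function name `find_zen_settings` and the sample settings are illustrative assumptions, not part of Elasticsearch.

```python
def find_zen_settings(settings):
    """Return the sorted setting names under the removed discovery.zen namespace."""
    prefix = "discovery.zen."
    return sorted(key for key in settings if key.startswith(prefix))


# Hypothetical flat settings, for example as loaded from a node's configuration.
sample_settings = {
    "cluster.name": "logs",
    "discovery.zen.ping.unicast.hosts": ["10.0.0.1", "10.0.0.2"],
    "discovery.zen.fd.ping_timeout": "30s",
    "discovery.seed_hosts": ["10.0.0.1", "10.0.0.2"],
}

print(find_zen_settings(sample_settings))
```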

Mapping changes

Limiting the number of completion contexts

The number of completion contexts within a single completion field has been limited to 10.

Defining multi-fields within multi-fields

Previously, it was possible to define a multi-field within a multi-field. Defining chained multi-fields was deprecated in 7.3 and is now no longer supported. To migrate the mappings, all instances of fields that occur within a fields block should be removed, either by flattening the chained fields blocks into a single level, or by switching to copy_to if appropriate.
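
The flattening option can be sketched in Python as follows. The function name `flatten_multi_fields` and the sample mapping are illustrative assumptions. Note that lifting a sub-field changes its path (for example, title.raw.english would become title.english), so queries that use the old path need updating as well.

```python
import copy


def flatten_multi_fields(field_mapping):
    """Return a copy of one field mapping with chained `fields` blocks flattened.

    Sub-fields nested inside another sub-field are lifted into the single
    top-level `fields` block of the field.
    """
    result = copy.deepcopy(field_mapping)
    flat = {}
    pending = list(result.pop("fields", {}).items())
    while pending:
        name, sub = pending.pop(0)
        nested = sub.pop("fields", {})
        if name in flat:
            raise ValueError(f"duplicate sub-field name: {name}")
        flat[name] = sub
        pending.extend(nested.items())
    if flat:
        result["fields"] = flat
    return result


# Hypothetical mapping for one field with a chained multi-field.
chained_mapping = {
    "type": "text",
    "fields": {
        "raw": {
            "type": "keyword",
            "fields": {"english": {"type": "text", "analyzer": "english"}},
        }
    },
}
flat_mapping = flatten_multi_fields(chained_mapping)
print(flat_mapping)
```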

Packaging changes

Java 11 is required

Java 11 or higher is now required to run Elasticsearch and any of its command line tools.

Rollup changes

StartRollupJob endpoint returns success if job already started

Previously, attempting to start an already-started rollup job would result in a 500 InternalServerError: Cannot start task for Rollup Job [job] because state was [STARTED] exception.

Now, attempting to start a job that is already started will just return a successful 200 OK: started response.

Snapshot and Restore changes

Get snapshots response format is changed

It is now possible to get snapshots from multiple repositories in one request. The response format has changed and now contains a separate response for each repository.

For example, requesting one snapshot from a particular repository:

GET _snapshot/repo1/snap1

produces the following response:

{
    "responses": [
        {
            "repository": "repo1",
            "snapshots": [
                {
                    "snapshot": "snap1",
                    "uuid": "cEzdqUKxQ5G6MyrJAcYwmA",
                    "version_id": 8000099,
                    "version": "8.0.0",
                    "indices": [],
                    "include_global_state": true,
                    "state": "SUCCESS",
                    "start_time": "2019-05-10T17:01:57.868Z",
                    "start_time_in_millis": 1557507717868,
                    "end_time": "2019-05-10T17:01:57.909Z",
                    "end_time_in_millis": 1557507717909,
                    "duration_in_millis": 41,
                    "failures": [],
                    "shards": {
                        "total": 0,
                        "failed": 0,
                        "successful": 0
                    }
                }
            ]
        }
    ]
}

See Snapshot And Restore for more information.
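
Client code that reads this response needs to iterate over the per-repository entries. The following is a minimal sketch, assuming the response has already been parsed into a Python dictionary. The function name `snapshots_by_repository` is hypothetical, and treating an entry without a snapshots list as empty is an assumption of this sketch.

```python
def snapshots_by_repository(response):
    """Map each repository name to the list of snapshot names it returned."""
    result = {}
    for entry in response.get("responses", []):
        # Assumption: an entry without a "snapshots" list contributes no names.
        names = [snap["snapshot"] for snap in entry.get("snapshots", [])]
        result[entry["repository"]] = names
    return result


# Trimmed-down version of the example response shown above.
sample_response = {
    "responses": [
        {
            "repository": "repo1",
            "snapshots": [{"snapshot": "snap1", "state": "SUCCESS"}],
        }
    ]
}

print(snapshots_by_repository(sample_response))
```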

Deprecated node level compress setting removed

For shared file system repositories ("type": "fs"), the node level setting repositories.fs.compress could previously be used to enable compression for all shared file system repositories where compress was not specified. The repositories.fs.compress setting has been removed.

Instead, use the repository-specific compress setting to enable compression. See Snapshot And Restore for information on the compress setting.

Compression of meta data files is now on by default

Previously, the default value for compress was false. The default has been changed to true.

This change will affect both newly created repositories and existing repositories where compress=false has not been explicitly specified.

For more information on the compress option, see Snapshot And Restore.

The S3 repository plugin uses the DNS style access pattern by default

Starting in version 7.4 the repository-s3 plugin does not use the now-deprecated path-style access pattern by default. In versions 7.0, 7.1, 7.2 and 7.3 the repository-s3 plugin always used the path-style access pattern. This is a breaking change for deployments that only support path-style access but which are recognized as supporting DNS-style access by the AWS SDK. If your deployment only supports path-style access and is affected by this change then you must configure the S3 client setting path_style_access to true. This breaking change was made necessary by AWS’s announcement that the path-style access pattern is deprecated and will be unsupported on buckets created after September 30th 2020.

Security changes

The accept_default_password setting has been removed

The xpack.security.authc.accept_default_password setting has not had any effect since the 6.0 release of Elasticsearch. It has been removed and cannot be used.

The roles.index.cache.* settings have been removed

The xpack.security.authz.store.roles.index.cache.max_size and xpack.security.authz.store.roles.index.cache.ttl settings have been removed. These settings have been redundant and deprecated since the 5.2 release of Elasticsearch.

The elasticsearch-migrate tool has been removed

The elasticsearch-migrate tool provided a way to convert file realm users and roles into the native realm. It has been deprecated since 7.2.0. Users and roles should now be created in the native realm directly.

The transport.profiles.*.xpack.security.type setting has been removed

The transport.profiles.*.xpack.security.type setting has been removed since the Transport Client has been removed and therefore all client traffic now uses the HTTP transport. Transport profiles using this setting should be removed.

Index Lifecycle Management changes

indices.lifecycle.poll_interval must be at least 1 second

The setting indices.lifecycle.poll_interval, if set too low, can cause excessive load on a cluster. This setting must now be set to 1 second or higher.

indexlifecycle packages renamed to ilm

In the high level REST client, the indexlifecycle package has been renamed to ilm to match the package rename inside the Elasticsearch code.

Java API changes

Changes to Fuzziness

To create Fuzziness instances, use the fromString and fromEdits methods instead of the build method that used to accept both Strings and numeric values. Several fuzziness setters on query builders (e.g. MatchQueryBuilder#fuzziness) now accept only a Fuzziness instance instead of an Object. You should preferably use the available constants (e.g. Fuzziness.ONE, Fuzziness.AUTO) or build your own instance using the factory methods mentioned above.

Fuzziness used to be lenient when it comes to parsing arbitrary numeric values while silently truncating them to one of the three allowed edit distances 0, 1 or 2. This leniency is now removed and the class will throw errors when trying to construct an instance with another value (e.g. floats like 1.3 used to get accepted but truncated to 1). You should use one of the allowed values.

Changes to Repository

Repository has no dependency on IndexShard anymore. The contract of restoreShard and snapshotShard has been reduced to Store and MappingService in order to improve testability.

Network changes

Removal of old network settings

The network.tcp.connect_timeout setting was deprecated in 7.x and has been removed in 8.0. This setting was a fallback setting for transport.connect_timeout. To change the default connection timeout for client connections, modify transport.connect_timeout instead.

Node changes

Removal of node.max_local_storage_nodes setting

The node.max_local_storage_nodes setting was deprecated in 7.x and has been removed in 8.0. Nodes should be run on separate data paths to ensure that each node is consistently assigned to the same data path.

Change of data folder layout

Each node’s data is now stored directly in the data directory set by the path.data setting, rather than in ${path.data}/nodes/0, because the removal of the node.max_local_storage_nodes setting means that nodes may no longer share a data path. At startup, Elasticsearch will automatically migrate the data path to the new layout. This automatic migration will not proceed if the data path contains data for more than one node. You should move to a configuration in which each node has its own data path before upgrading.

If you try to upgrade a configuration in which a data path holds data for more than one node, the automatic migration will fail and Elasticsearch will refuse to start. To resolve this you will need to perform the migration manually. The data for the extra nodes is stored in folders named ${path.data}/nodes/1, ${path.data}/nodes/2 and so on. Move each of these folders to an appropriate location and then configure the corresponding node to use this location for its data path. If your nodes each have more than one data path in their path.data settings, move all the corresponding subfolders in parallel. Each node uses the same subfolder (e.g. nodes/2) across all its data paths.
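
Before upgrading, it may help to check whether a data path holds data for more than one node. The sketch below lists the numbered folders other than nodes/0 under a data path. The function name `extra_node_folders` is hypothetical, and the demonstration runs against a temporary directory that mimics the old layout rather than a real data path.

```python
import os
import tempfile


def extra_node_folders(data_path):
    """List numbered node folders other than nodes/0 found under a data path."""
    nodes_dir = os.path.join(data_path, "nodes")
    if not os.path.isdir(nodes_dir):
        return []
    return sorted(
        (
            name
            for name in os.listdir(nodes_dir)
            if name.isdigit()
            and name != "0"
            and os.path.isdir(os.path.join(nodes_dir, name))
        ),
        key=int,
    )


# Demonstration against a throwaway directory shaped like the old layout.
with tempfile.TemporaryDirectory() as demo_path:
    for ordinal in ("0", "1", "2"):
        os.makedirs(os.path.join(demo_path, "nodes", ordinal))
    demo_result = extra_node_folders(demo_path)

print(demo_result)
```

A non-empty result means the data path is shared and needs the manual migration described above.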

Rejection of ancient closed indices

In earlier versions a node would start up even if it had data from indices created in a version before the previous major version, as long as those indices were closed. Elasticsearch now ensures that it is compatible with every index, open or closed, at startup time.

Transport changes

Removal of old transport settings

The following settings were deprecated in 7.x and have been removed in 8.0. Each setting has a replacement setting that was introduced in 6.7.

  • transport.tcp.port replaced by transport.port
  • transport.tcp.compress replaced by transport.compress
  • transport.tcp.connect_timeout replaced by transport.connect_timeout
  • transport.tcp_no_delay replaced by transport.tcp.no_delay
  • transport.profiles.profile_name.tcp_no_delay replaced by transport.profiles.profile_name.tcp.no_delay
  • transport.profiles.profile_name.tcp_keep_alive replaced by transport.profiles.profile_name.tcp.keep_alive
  • transport.profiles.profile_name.reuse_address replaced by transport.profiles.profile_name.tcp.reuse_address
  • transport.profiles.profile_name.send_buffer_size replaced by transport.profiles.profile_name.tcp.send_buffer_size
  • transport.profiles.profile_name.receive_buffer_size replaced by transport.profiles.profile_name.tcp.receive_buffer_size
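
The renames in the list above can be applied mechanically to a flat settings dictionary. This is a minimal sketch: the function name `rename_transport_settings` and the sample settings are illustrative assumptions, and the pattern assumes that profile names contain no dots.

```python
import re

# Old-to-new names taken from the list above.
RENAMED = {
    "transport.tcp.port": "transport.port",
    "transport.tcp.compress": "transport.compress",
    "transport.tcp.connect_timeout": "transport.connect_timeout",
    "transport.tcp_no_delay": "transport.tcp.no_delay",
}
# Per-profile settings: old suffix mapped to new suffix.
PROFILE_SUFFIX = {
    "tcp_no_delay": "tcp.no_delay",
    "tcp_keep_alive": "tcp.keep_alive",
    "reuse_address": "tcp.reuse_address",
    "send_buffer_size": "tcp.send_buffer_size",
    "receive_buffer_size": "tcp.receive_buffer_size",
}
# Assumption: the profile name itself contains no dots.
PROFILE_PATTERN = re.compile(
    r"^(transport\.profiles\.[^.]+)\.(" + "|".join(PROFILE_SUFFIX) + r")$"
)


def rename_transport_settings(settings):
    """Return a copy of a flat settings dict with removed names replaced."""
    renamed = {}
    for key, value in settings.items():
        match = PROFILE_PATTERN.match(key)
        if key in RENAMED:
            key = RENAMED[key]
        elif match:
            key = f"{match.group(1)}.{PROFILE_SUFFIX[match.group(2)]}"
        renamed[key] = value
    return renamed


# Hypothetical settings that still use removed names.
old_settings = {
    "cluster.name": "logs",
    "transport.tcp.port": "9300-9400",
    "transport.profiles.client.tcp_no_delay": True,
}
print(rename_transport_settings(old_settings))
```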

HTTP changes

Removal of old HTTP settings

The http.tcp_no_delay setting was deprecated in 7.x and has been removed in 8.0. It has been replaced by http.tcp.no_delay.

Changes to Encoding Plus Signs in URLs

Starting in version 7.4, a + in a URL will be encoded as %2B by all REST API functionality. Prior versions handled a + as a single space. If your application requires handling + as a single space you can return to the old behaviour by setting the system property es.rest.url_plus_as_space to true. Note that this behaviour is deprecated and setting this system property to true will cease to be supported in version 8.
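
For clients that build request URLs by hand, the practical consequence can be sketched in Python: percent-encode values explicitly, so that a literal plus sign is sent as %2B and a space as %20, instead of relying on + to mean a space. The helper name `encode_path_segment` and the sample value are illustrative assumptions. Only the standard library's urllib.parse module is used.

```python
from urllib.parse import quote, unquote


def encode_path_segment(value):
    """Percent-encode a value for use in a URL path, including any plus sign."""
    # safe="" means no reserved characters are left unencoded.
    return quote(value, safe="")


encoded = encode_path_segment("logs+2019")
print(encoded)           # logs%2B2019
print(unquote(encoded))  # logs+2019
```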

Reindex changes

Reindex from remote previously allowed URL-encoded index names and did not re-encode them when generating the search request for the remote host. This leniency has been removed, so all index names are now correctly encoded when reindex generates remote search requests.

Specify index names without any encoding instead.

Removal of types

The /{index}/{type}/_delete_by_query and /{index}/{type}/_update_by_query REST endpoints have been removed in favour of /{index}/_delete_by_query and /{index}/_update_by_query. Since indexes no longer contain types, these typed endpoints are obsolete.

Removal of size parameter

Previously, a _reindex request had two different size specifications in the body:

  • Outer level, determining the maximum number of documents to process
  • Inside the source element, determining the scroll/batch size.

The outer level size parameter has now been renamed to max_docs to avoid confusion and clarify its semantics.

Similarly, the size parameter has been renamed to max_docs for _delete_by_query and _update_by_query to keep the 3 interfaces consistent.
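
A minimal sketch of the rename for request bodies built in client code: only the outer-level size is renamed, while the size inside source (the scroll/batch size) is left alone. The function name `rename_size_to_max_docs` and the index names are illustrative assumptions.

```python
import copy


def rename_size_to_max_docs(body):
    """Return a copy of a request body with the outer-level size renamed to max_docs."""
    updated = copy.deepcopy(body)
    if "size" in updated:
        updated["max_docs"] = updated.pop("size")
    return updated


# Hypothetical _reindex body written for an earlier version.
old_body = {
    "size": 1000,                                   # maximum documents to process
    "source": {"index": "old-index", "size": 100},  # scroll/batch size, unchanged
    "dest": {"index": "new-index"},
}
new_body = rename_size_to_max_docs(old_body)
print(new_body)
```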

Search Changes

Removal of types

The /{index}/{type}/_search, /{index}/{type}/_msearch, /{index}/{type}/_search/template and /{index}/{type}/_msearch/template REST endpoints have been removed in favour of /{index}/_search, /{index}/_msearch, /{index}/_search/template and /{index}/_msearch/template. Since indexes no longer contain types, these typed endpoints are obsolete.

The /{index}/{type}/_termvectors, /{index}/{type}/{id}/_termvectors and /{index}/{type}/_mtermvectors REST endpoints have been removed in favour of /{index}/_termvectors, /{index}/{id}/_termvectors and /{index}/_mtermvectors. Since indexes no longer contain types, these typed endpoints are obsolete.

Removal of queries

The common query, deprecated in 7.x, has been removed in 8.0. The same functionality can be achieved by the match query if the total number of hits is not tracked.

Removal of query parameters

The cutoff_frequency parameter, deprecated in 7.x, has been removed in 8.0 from match and multi_match queries. The same functionality can be achieved without any configuration provided that the total number of hits is not tracked.

Removal of sort parameters

The nested_filter and nested_path options, deprecated in 6.x, have been removed in favor of the nested context.

Shard allocation awareness in Search and Get requests

Elasticsearch will no longer prefer using shards in the same location (with the same awareness attribute values) to process _search and _get requests. Adaptive replica selection (activated by default in this version) will route requests more efficiently using the service time of prior inter-node communications.

Settings changes

The search.remote settings have been removed

In 6.5 these settings were deprecated in favor of cluster.remote. In 7.x we provided automatic upgrading of these settings to their cluster.remote counterparts. In 8.0.0, these settings have been removed. Elasticsearch will refuse to start if you have these settings in your configuration or cluster state.

pidfile setting is replaced by node.pidfile

To ensure that all settings are in a proper namespace, the pidfile setting was previously deprecated in version 7.4.0 of Elasticsearch, and is removed in version 8.0.0. Instead, use node.pidfile.

processors setting is replaced by node.processors

To ensure that all settings are in a proper namespace, the processors setting was previously deprecated in version 7.4.0 of Elasticsearch, and is removed in version 8.0.0. Instead, use node.processors.

node.processors can no longer exceed the available number of processors

Previously it was possible to set the number of processors used to set the default sizes for the thread pools to be more than the number of available processors. As this leads to more context switches and more threads but without an increase in the number of physical CPUs on which to schedule these additional threads, the node.processors setting is now bounded by the number of available processors.

Force Merge API changes

Previously, the Force Merge API allowed the parameters only_expunge_deletes and max_num_segments to be set to non-default values at the same time. However, max_num_segments was silently ignored when only_expunge_deletes was set to true, leaving the false impression that it had been applied.

The Force Merge API now rejects requests that have a max_num_segments greater than or equal to 0 when only_expunge_deletes is set to true.
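
A client can mirror this rule to catch the rejected combination before sending a request. This is a minimal sketch of the condition as described above, not the server's implementation. The function name `is_valid_force_merge` is hypothetical, and None stands for a max_num_segments that was not set.

```python
def is_valid_force_merge(only_expunge_deletes=False, max_num_segments=None):
    """Return False for the parameter combination the Force Merge API rejects."""
    conflicting = (
        only_expunge_deletes
        and max_num_segments is not None
        and max_num_segments >= 0
    )
    return not conflicting


print(is_valid_force_merge(only_expunge_deletes=True))
print(is_valid_force_merge(only_expunge_deletes=True, max_num_segments=1))
```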