Why Delete-By-Query is a plugin

The old delete-by-query API in Elasticsearch 1.x was fast but problematic. We decided to remove the feature from Elasticsearch for these reasons:

Forward compatibility
The old implementation wrote each delete-by-query request, including the query, to the transaction log. This meant that, after upgrading to a new version, the translog might contain old queries that are no longer supported and cannot be executed, causing data corruption.
Consistency and correctness
The old implementation executed the query and deleted all matching docs on the primary first. It then repeated this procedure on each replica shard. There was no guarantee that the queries on the primary and the replicas matched the same documents, so it was quite possible to end up with different documents on each shard copy.
Resiliency
The old implementation could cause out-of-memory exceptions, merge storms, and dramatic slowdowns if used incorrectly.

New delete-by-query implementation

The new implementation, provided by this plugin, is built internally using scan and scroll to return the document IDs and versions of all the documents that need to be deleted. It then uses the bulk API to do the actual deletion.
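The scroll-then-bulk-delete flow described above can be sketched with a small in-memory model. This is only an illustration, not the plugin's code: the `Index` type and the functions `scroll_hits`, `bulk_delete`, and `delete_by_query` are hypothetical names, and we assume that the collected version is used to skip documents that changed after the scroll started.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for one index: doc ID -> (version, source).
Index = Dict[str, Tuple[int, dict]]


def scroll_hits(index: Index, matches: Callable[[dict], bool],
                page_size: int) -> Iterator[List[Tuple[str, int]]]:
    """Yield pages of (doc_id, version) pairs for matching docs.

    The matching set is computed once, before the first page is returned,
    which mimics the point-in-time view of a scan/scroll search.
    """
    snapshot = [(doc_id, version)
                for doc_id, (version, source) in index.items()
                if matches(source)]
    for start in range(0, len(snapshot), page_size):
        yield snapshot[start:start + page_size]


def bulk_delete(index: Index, page: List[Tuple[str, int]]) -> int:
    """Delete docs by ID; skip any doc that is gone or has a new version.

    Assumption: a version mismatch means the doc changed after the scroll
    started, so this sketch leaves it alone.
    """
    deleted = 0
    for doc_id, version in page:
        current = index.get(doc_id)
        if current is not None and current[0] == version:
            del index[doc_id]
            deleted += 1
    return deleted


def delete_by_query(index: Index, matches: Callable[[dict], bool],
                    page_size: int = 2) -> int:
    """Scroll over the matches, then delete them page by page by ID."""
    total = 0
    for page in scroll_hits(index, matches, page_size):
        total += bulk_delete(index, page)
    return total
```

Note that only IDs and versions travel from the search phase to the delete phase; the query itself is never handed to `bulk_delete`.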

This can have performance as well as visibility implications. Delete-by-query now has the following semantics:

non-atomic
A delete-by-query may fail at any time while some documents matching the query have already been deleted.
try-once
A delete-by-query may fail at any time and will not retry its execution. All retry logic is left to the user.
syntactic sugar
A delete-by-query is equivalent to a scan/scroll search and corresponding bulk-deletes by ID.
point-in-time
A delete-by-query will only delete the documents that are visible at the point in time the delete-by-query was started, equivalent to the scan/scroll API.
consistent
A delete-by-query will yield consistent results across all replicas of a shard.
forward-compatible
A delete-by-query will only send IDs to the shards as deletes, so no queries that might be unsupported in the future are stored in the transaction logs.
visibility
The effect of a delete-by-query request will not be visible to search until the user refreshes the index, or the index is refreshed automatically.
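Because a failed delete-by-query is not retried and may already have deleted some documents, callers need their own retry loop. The following is a minimal sketch of one way to do that, not a recommended or official pattern. The `run_with_retries` helper and its parameters are hypothetical, and the `delete_by_query` argument stands for whatever function your client code uses to issue the request.

```python
import time
from typing import Callable, Optional


def run_with_retries(delete_by_query: Callable[[], int],
                     max_attempts: int = 3,
                     backoff_seconds: float = 1.0) -> int:
    """Call delete_by_query until it succeeds or attempts run out.

    A failed attempt may already have removed some documents. Re-running
    the same query is still safe in that case, because it only deletes
    the documents that still match.
    Returns the value reported by the first successful attempt.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return delete_by_query()
        except Exception as error:  # assumption: narrow to your client's errors
            last_error = error
            if attempt < max_attempts:
                # Linear backoff between attempts.
                time.sleep(backoff_seconds * attempt)
    raise RuntimeError(
        f"delete-by-query failed after {max_attempts} attempts"
    ) from last_error
```

The count returned by a successful retry covers only that attempt, so it may be lower than the number of documents that originally matched.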

The new implementation suffers from two issues, which is why we decided to move the functionality to a plugin instead of replacing the feature in core:

  • It is not as fast as the previous implementation. For most use cases, the difference should not be noticeable, but users running delete-by-query against many matching documents may be affected.
  • There is currently no way to monitor or cancel a running delete-by-query request, except for the timeout parameter.

We have plans to solve both of these issues in a later version of Elasticsearch.