collectors Section

The Collectors portion of the response shows high-level execution details. Lucene works by defining a "Collector" which is responsible for coordinating the traversal, scoring and collection of matching documents. Collectors are also how a single query can record aggregation results, execute unscoped "global" queries, execute post-query filters, etc.

Looking at the previous example:

"collector": [
    {
       "name": "SimpleTopScoreDocCollector",
       "reason": "search_top_hits",
       "time": "2.206529000ms"
    }
]

We see a single collector named SimpleTopScoreDocCollector. This is the default "scoring and sorting" Collector used by Elasticsearch. The "reason" field attempts to give a plain English description of the class name. The "time" field is similar to the time in the Query tree: a wall-clock time inclusive of all children. Similarly, the "children" field lists all sub-collectors.
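As a rough illustration of how these fields fit together, the following Python sketch walks a collector list and prints one line per collector, indenting sub-collectors. It assumes the fragment has the shape shown above (a "collector" array whose entries carry "name", "reason", "time" and optionally "children") and that "time" is a string ending in "ms"; the helper names parse_time_ms and walk_collectors are hypothetical and not part of any Elasticsearch client.

```python
import re


def parse_time_ms(value):
    """Convert a profile time string such as "2.206529000ms" to milliseconds."""
    match = re.fullmatch(r"([0-9.]+)ms", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    return float(match.group(1))


def walk_collectors(collectors, depth=0):
    """Yield (depth, name, reason, time_ms) for each collector, parents first."""
    for collector in collectors:
        yield (
            depth,
            collector["name"],
            collector["reason"],
            parse_time_ms(collector["time"]),
        )
        # "children" is absent when a collector has no sub-collectors.
        yield from walk_collectors(collector.get("children", []), depth + 1)


# The fragment from the example above, written as a Python dict.
profile_fragment = {
    "collector": [
        {
            "name": "SimpleTopScoreDocCollector",
            "reason": "search_top_hits",
            "time": "2.206529000ms",
        }
    ]
}

rows = list(walk_collectors(profile_fragment["collector"]))
for depth, name, reason, time_ms in rows:
    print("  " * depth + f"{name} [{reason}] {time_ms:.3f} ms")
```

Because each reported time is inclusive of its children, summing the times of a parent and its sub-collectors would double-count; comparing a parent's time with its children's is the safer way to read the tree.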

Note that Collector times are independent of the Query times: they are calculated, combined and normalized independently. Due to the nature of Lucene’s execution, it is impossible to "merge" the times from the Collectors into the Query section, so they are displayed in separate portions.

For reference, the various collector reasons are:

search_top_hits

A collector that scores and sorts documents. This is the most common collector and will be seen in most simple searches.

search_count

A collector that only counts the number of documents that match the query, but does not fetch the source. This is seen when size: 0 or search_type=count is specified.

search_terminate_after_count

A collector that terminates search execution after n matching documents have been found. This is seen when the terminate_after query parameter has been specified.

search_min_score

A collector that only returns matching documents that have a score greater than n. This is seen when the top-level parameter min_score has been specified.

search_multi

A collector that wraps several other collectors. This is seen when combinations of search, aggregations, global aggs and post_filters are combined in a single search.

search_timeout

A collector that halts execution after a specified period of time. This is seen when a timeout top-level parameter has been specified.

aggregation

A collector that Elasticsearch uses to run aggregations against the query scope. A single aggregation collector is used to collect documents for all aggregations, so you will see a list of aggregations in the name rather than a single aggregation.

global_aggregation

A collector that executes an aggregation against the global query scope, rather than the specified query. Because the global scope is necessarily different from the executed query, it must execute its own match_all query (which you will see added to the Query section) to collect your entire dataset.
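To make the list above more concrete, here is a minimal Python sketch of a search body that combines several of the features mentioned: a scored query, a post_filter, min_score, timeout, terminate_after, a regular aggregation and a global aggregation, with profiling switched on. The field names (message, tag, length) and the aggregation names are hypothetical. Going by the descriptions above, a request like this would be expected to report several of the listed reasons, typically wrapped by a search_multi collector, though the exact tree depends on the Elasticsearch version.

```python
import json


def build_profiled_search():
    """Build a search body that exercises several collector types at once."""
    return {
        "profile": True,                            # ask for the profile section
        "query": {"match": {"message": "search"}},  # hypothetical field
        "post_filter": {"term": {"tag": "docs"}},   # hypothetical field
        "min_score": 0.5,
        "timeout": "1s",
        "terminate_after": 1000,
        "aggs": {
            # Regular aggregation, scoped to the query.
            "by_tag": {"terms": {"field": "tag"}},
            # Global aggregation, which ignores the query scope.
            "all_docs": {
                "global": {},
                "aggs": {"avg_length": {"avg": {"field": "length"}}},
            },
        },
    }


body = build_profiled_search()
print(json.dumps(body, indent=2))
```

Sending this body to a search endpoint and inspecting the "collector" array in the profile response is one way to see which of the reasons above actually appear for a given cluster.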