Scrolling
The scrolling functionality of Elasticsearch is used to paginate over many documents in bulk, such as exporting all the documents belonging to a single user. It is more efficient than regular search because it doesn’t need to maintain an expensive priority queue ordering the documents.
Scrolling works by maintaining a "point in time" snapshot of the index, which is then used for paging.
This snapshot allows consistent paging even if documents are indexed, updated, or deleted in the background. First, you execute a search
request with scroll enabled. This returns a "page" of documents and a scroll_id, which is used to continue
paginating through the hits.
More details about scrolling can be found in the reference documentation.
This is an example which can be used as a template for more advanced operations:
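// Assumes Composer's autoloader is loaded and ClientBuilder is imported
// from the namespace used by your client version.
$client = ClientBuilder::create()->build();
$params = [
    "scroll" => "30s",      // how long to keep the search context alive between scroll requests; should be small!
    "size"   => 50,         // maximum number of hits returned in each batch
    "index"  => "my_index",
    "body"   => [
        "query" => [
            "match_all" => new \stdClass()
        ]
    ]
];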
// Execute the search
// The response will contain the first batch of documents
// and a scroll_id
$response = $client->search($params);

// Now we loop until the scroll "cursor" is exhausted
while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {

    // **
    // Do your work here, on the $response['hits']['hits'] array
    // **

    // When done, get the new scroll_id
    // You must always refresh your _scroll_id! It can change between requests
    $scroll_id = $response['_scroll_id'];

    // Execute a scroll request and repeat
    $response = $client->scroll([
        "scroll_id" => $scroll_id, // ...using our previously obtained _scroll_id
        "scroll"    => "30s"       // and the same timeout window
    ]);
}
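The loop above can also be wrapped in a generator so that calling code simply iterates over hits. The following is a minimal sketch, not part of the client library: `scrollHits` is a hypothetical helper name, and it assumes the client exposes `search()`, `scroll()`, and `clearScroll()` taking array parameters in the shape shown here (the exact parameter layout for `clearScroll()` may differ between client versions). It also releases the search context when iteration ends rather than waiting for the scroll timeout to expire.

```php
<?php

/**
 * Hypothetical helper: yields every hit of a scrolled search, one at a time.
 *
 * $client is assumed to provide search(), scroll() and clearScroll()
 * methods that accept array parameters, as in the example above.
 */
function scrollHits($client, array $params, string $keepAlive = '30s'): \Generator
{
    // Enable scrolling on the initial search request.
    $params['scroll'] = $keepAlive;
    $response = $client->search($params);
    $scrollId = $response['_scroll_id'] ?? null;

    try {
        while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
            foreach ($response['hits']['hits'] as $hit) {
                yield $hit;
            }

            // Fetch the next batch using the most recent scroll id.
            $response = $client->scroll([
                'scroll_id' => $scrollId,
                'scroll'    => $keepAlive,
            ]);

            // The scroll id can change between requests, so refresh it.
            $scrollId = $response['_scroll_id'] ?? $scrollId;
        }
    } finally {
        // Runs when the hits are exhausted or the caller stops iterating early.
        // Assumption: clearScroll() accepts a top-level 'scroll_id' parameter.
        if ($scrollId !== null) {
            $client->clearScroll(['scroll_id' => $scrollId]);
        }
    }
}
```

With a real client, `foreach (scrollHits($client, $params) as $hit) { ... }` processes each document in turn, where `$params` holds the same "index", "size", and "body" keys as the example above. Because the cleanup sits in a `finally` block, breaking out of the `foreach` early should still clear the scroll once the generator is destroyed.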