Scan/Scroll
The Scan/Scroll functionality of Elasticsearch is similar to search, but different in many ways. It works by executing a search query with a search_type of scan. This initiates a "scan window" which will remain open for the duration of the scan. This allows proper, consistent pagination.
Once a scan window is open, you may start scrolling over that window. This returns results matching your query, but returns them in random order. This random ordering is important to performance. Deep pagination is expensive when you need to maintain a sorted, consistent order across shards. By removing this obligation, Scan/Scroll can efficiently export all the data from your index.
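To give a rough sense of why sorted deep pagination gets expensive, the sketch below uses a simplified model: for a sorted page starting at offset `from`, each shard has to produce its top `from + size` hits, and the coordinating node merges all of them before discarding everything except the requested page. The function name `sortedPageCost`, the shard count, and the offsets are hypothetical values chosen for illustration. They are not part of the client or measured from a real cluster.

```php
<?php
// Hypothetical helper for illustration only; not part of the Elasticsearch client.
// Estimates how many hits the coordinating node must merge to serve one
// sorted page, assuming every shard returns its top ($from + $size) hits.
function sortedPageCost($shards, $from, $size)
{
    return $shards * ($from + $size);
}

$shards = 5;   // assumed number of primary shards
$size   = 50;  // page size

echo sortedPageCost($shards, 0, $size), "\n";       // first page: 250
echo sortedPageCost($shards, 100000, $size), "\n";  // deep page: 500250
```

Under this model the cost of a sorted page grows with the offset, whereas a scroll request only ever fetches the next batch from each shard.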
The following example runs a complete Scan/Scroll export and can be used as a template for more advanced operations:
    $client = ClientBuilder::create()->build();

    $params = [
        "search_type" => "scan",    // use search_type=scan
        "scroll" => "30s",          // how long between scroll requests. should be small!
        "size" => 50,               // how many results *per shard* you want back
        "index" => "my_index",
        "body" => [
            "query" => [
                "match_all" => []
            ]
        ]
    ];

    $docs = $client->search($params);   // Execute the search
    $scroll_id = $docs['_scroll_id'];   // The response will contain no results, just a _scroll_id

    // Now we loop until the scroll "cursors" are exhausted
    while (\true) {

        // Execute a Scroll request
        $response = $client->scroll([
                "scroll_id" => $scroll_id,  //...using our previously obtained _scroll_id
                "scroll" => "30s"           // and the same timeout window
            ]
        );

        // Check to see if we got any search hits from the scroll
        if (count($response['hits']['hits']) > 0) {
            // If yes, Do Work Here

            // Get new scroll_id
            // Must always refresh your _scroll_id!  It can change sometimes
            $scroll_id = $response['_scroll_id'];
        } else {
            // No results, scroll cursor is empty.  You've exported all the data
            break;
        }
    }
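If the same loop is needed in several places, one option is to wrap it in a generator so that callers can simply iterate over hits. The sketch below is a minimal illustration, not part of the client library: `scrollHits` is a hypothetical helper name, and it assumes only that `$client` exposes `search()` and `scroll()` methods that behave as in the example above.

```php
<?php
/**
 * Hypothetical helper (not part of the Elasticsearch PHP client).
 * Yields every hit of a scan/scroll export, one at a time.
 *
 * @param object $client Any object with search() and scroll() methods
 *                       that behave like the client in the example above.
 * @param array  $params Search parameters (index, size, body, ...).
 * @param string $scroll How long to keep the scan window open between requests.
 */
function scrollHits($client, array $params, $scroll = "30s")
{
    // Force scan mode and the scroll timeout on the initial search
    $params["search_type"] = "scan";
    $params["scroll"] = $scroll;

    // The initial scan response carries no hits, only a _scroll_id
    $response = $client->search($params);
    $scrollId = $response["_scroll_id"];

    while (true) {
        $response = $client->scroll([
            "scroll_id" => $scrollId,
            "scroll"    => $scroll,
        ]);

        // An empty page means the scroll cursor is exhausted
        if (count($response["hits"]["hits"]) === 0) {
            return;
        }

        foreach ($response["hits"]["hits"] as $hit) {
            yield $hit;
        }

        // Always refresh the scroll id; it can change between requests
        $scrollId = $response["_scroll_id"];
    }
}
```

Assuming `$client` was built as in the example above, `foreach (scrollHits($client, ["index" => "my_index", "size" => 50]) as $hit) { ... }` would process the hits one by one, and only one batch of results is held in memory at a time.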