Finally, remember that there is no rule that limits your application to using only a single index. When we issue a search request, it is forwarded to a copy (a primary or a replica) of all the shards in an index. If we issue the same search request on multiple indices, the exact same thing happens—there are just more shards involved.
Searching 1 index of 50 shards is exactly equivalent to searching 50 indices with 1 shard each: both search requests hit 50 shards.
This can be a useful fact to remember when you need to add capacity on the fly. Instead of having to reindex your data into a bigger index, you can just do the following:
- Create a new index to hold new data.
- Search across both indices to retrieve new and old data.
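As a rough illustration of the second step, a search can span several indices by listing them, comma-separated, in the request path. The sketch below only builds that path and sends nothing. The helper `search_path` and the index names `tweets_old` and `tweets_new` are hypothetical, chosen purely for the example.

```python
def search_path(*indices: str) -> str:
    """Build the path of a search request that spans one or more indices."""
    if not indices:
        raise ValueError("at least one index name is required")
    # Several index names are joined with commas in front of _search.
    return "/" + ",".join(indices) + "/_search"


# Hypothetical index names: one holds the old data, one the new data.
print("GET", search_path("tweets_old", "tweets_new"))
# prints: GET /tweets_old,tweets_new/_search
```

The application still issues a single request. Only the list of target indices grows.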
In fact, with a little forethought, adding a new index can be done in a completely transparent way, without your application ever knowing that anything has changed.
In Index Aliases and Zero Downtime, we spoke about using an index alias to point to the current version of your index. For instance, instead of naming your index tweets, name it tweets_v1. Your application would still talk to tweets, but in reality that would be an alias that points to tweets_v1. This allows you to switch the alias to point to a newer version of the index on the fly.
A similar technique can be used to expand capacity by adding a new index. It requires a bit of planning because you will need two aliases: one for searching and one for indexing.
New documents should be indexed into tweets_index, and searches should be performed against tweets_search. For the moment, these two aliases point to the same index.
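The initial alias setup might look something like the following sketch, which builds the two requests that attach both aliases to one index. The index name `tweets_1` is an assumption (the text does not name the first index), and `add_alias_request` is a hypothetical helper. Nothing is sent to a cluster.

```python
def add_alias_request(index: str, alias: str) -> tuple[str, str]:
    """Return the (method, path) of a request that adds `alias` to `index`."""
    return ("PUT", f"/{index}/_alias/{alias}")


# Assumed name for the first physical index; both aliases start out on it.
FIRST_INDEX = "tweets_1"

setup = [
    add_alias_request(FIRST_INDEX, "tweets_search"),  # alias used for searches
    add_alias_request(FIRST_INDEX, "tweets_index"),   # alias used for indexing
]

for method, path in setup:
    print(method, path)
```

From this point on, the application refers only to the two aliases and never to the underlying index name.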
When we need extra capacity, we can create a new index called tweets_2 and update the aliases.
A search request can target multiple indices, so having the search alias point to both the original index and tweets_2 is perfectly valid. However, indexing requests can target only a single index. For this reason, we have to switch the index alias to point to only the new index.
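One way to express that alias switch is a single `POST /_aliases` request carrying a list of `add` and `remove` actions. The sketch below builds such a body and applies it to a toy in-memory alias map to check the outcome. The old index name `tweets_1` is an assumption, and both helper functions are hypothetical. The toy model stands in for a cluster and is not Elasticsearch itself.

```python
import json


def expand_capacity_actions(old_index, new_index, search_alias, index_alias):
    """Build an alias-update body: search spans both indices, indexing moves."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": search_alias}},
            {"remove": {"index": old_index, "alias": index_alias}},
            {"add": {"index": new_index, "alias": index_alias}},
        ]
    }


def apply_actions(aliases, body):
    """Apply the actions to a copy of an {alias: set of indices} map (toy model)."""
    result = {alias: set(indices) for alias, indices in aliases.items()}
    for action in body["actions"]:
        kind, spec = next(iter(action.items()))
        targets = result.setdefault(spec["alias"], set())
        if kind == "add":
            targets.add(spec["index"])
        elif kind == "remove":
            targets.discard(spec["index"])
    return result


# "tweets_1" is an assumed name for the original index.
body = expand_capacity_actions("tweets_1", "tweets_2", "tweets_search", "tweets_index")
print("POST /_aliases")
print(json.dumps(body, indent=2))

before = {"tweets_search": {"tweets_1"}, "tweets_index": {"tweets_1"}}
after = apply_actions(before, body)
print(sorted(after["tweets_search"]))  # both indices are searched
print(sorted(after["tweets_index"]))   # only the new index receives writes
```

Putting the `remove` and `add` for the index alias in the same request keeps the switch in one step, so the application has no window in which the alias points at no index.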
Using multiple indices to expand index capacity on the fly is of particular benefit when dealing with time-based data such as logs or social-event streams, which we discuss in the next section.