Back in early 2010, when the first version of Elasticsearch was released, the future (we were reliably informed) was orange. Or was it black? Or green? Nobody really knew. Nobody knew how the NoSQL space would evolve, nor how Elasticsearch would end up being used.
Back then, we tried to have a horse in every race. You could talk to Elasticsearch with JSON, YAML, SMILE, CBOR, Thrift, and Memcached protocol. You could specify settings using JSON, YAML, Java properties, and environment variables. We logged to Log4J and SLF4J. There were three ways to configure mappings. And templates. And analyzers. Elasticsearch could run standalone or embedded. Your HTTP web servers could join the Elasticsearch cluster as real nodes. Plugins could override pretty much any part of Elasticsearch, replacing integral parts of the core.
One of the things that made Elasticsearch so popular was its flexibility and its familiarity. It was easy to get started, and easy to integrate Elasticsearch into your application, no matter how unusual the design.
Six years later, Elasticsearch has evolved into a powerful search and analytics engine. It has been downloaded by millions of people and is a core technology depended upon by hundreds of thousands of users and companies. Flexibility and leniency are no longer as important as:
- A reliable, stable cluster
- Predictable performance
- Clear error messages
- Warning early about problems that may occur in production
- Safety from attacks
Flexibility comes at a price: complexity. With so many alternatives it is impossible to test them all and to be sure that they actually work. Complexity interferes with the goals listed above. We can’t have it all, so we have to narrow our focus to be able to deliver a solution that can be relied on.
Elasticsearch should run as a standalone server, and should communicate with other applications via a client. This enables us to lock Elasticsearch down, to prevent it being run in an unsafe way:
As part of the bootstrap process we use the Java Security Manager to restrict the privileges available to any module to the minimum required for it to do its job. We use seccomp on Linux to provide application sandboxing. We check for JAR Hell, to ensure that only a single version of a library is present in your class path; trying to debug an issue caused by using an incorrect version of a library that you are not even aware of is hell indeed!
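To make the JAR Hell idea concrete, here is a minimal sketch of one plausible way to detect it: scan every JAR on the class path and report any class file that more than one JAR provides. This is an illustration written in Python, not the actual Elasticsearch check, and the function name `find_duplicate_classes` is hypothetical.

```python
import zipfile
from collections import defaultdict
from typing import Dict, Iterable, List


def find_duplicate_classes(jar_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Return {class entry: [jars providing it]} for classes found in 2+ JARs.

    Hypothetical helper: a JAR is a zip archive, so we only compare entry names.
    """
    owners: Dict[str, List[str]] = defaultdict(list)
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            # set() guards against a name being listed twice inside one archive
            for entry in set(jar.namelist()):
                if entry.endswith(".class"):
                    owners[entry].append(jar_path)
    # Any class owned by more than one JAR is a potential version conflict
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

A non-empty result means two archives disagree about who owns a class, which is exactly the situation that is so painful to debug after the fact.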
In version 5.0 we have added bootstrap checks which ensure (amongst other things) that you have set the heap size, provided enough file descriptors and processes, and enough virtual memory — all of these are common errors with dire consequences that we see frequently in production. These checks are logged as warnings during development mode but become hard exceptions when you switch to production mode (when you bind to anything other than localhost). These checks are there to protect you.
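The development versus production behaviour can be sketched roughly as follows. This is a simplified Python illustration, not the real implementation: `NodeInfo`, `run_bootstrap_checks`, the two checks shown, and the `min_fds` threshold are all assumptions chosen for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


class BootstrapCheckError(RuntimeError):
    """Raised when a check fails on a node that is in production mode."""


@dataclass
class NodeInfo:
    # Hypothetical stand-ins for values a real node would probe at startup
    bind_host: str
    heap_size_set: bool
    max_file_descriptors: int


def is_production(bind_host: str) -> bool:
    """Production mode: bound to anything other than localhost/loopback."""
    if bind_host == "localhost":
        return False
    try:
        return not ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal; treat any other hostname as non-local
        return True


def run_bootstrap_checks(node: NodeInfo, min_fds: int = 65536) -> List[str]:
    """Return failures as warnings in development; raise in production."""
    failures = []
    if not node.heap_size_set:
        failures.append("heap size is not set")
    if node.max_file_descriptors < min_fds:
        failures.append(
            f"max file descriptors {node.max_file_descriptors} is below {min_fds}"
        )
    if failures and is_production(node.bind_host):
        raise BootstrapCheckError("; ".join(failures))
    return failures  # development mode: the caller logs these as warnings
```

The same misconfiguration is merely reported while you experiment on your laptop, but stops the node from starting once it is reachable from the network.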
We have added more circuit breakers, to protect your cluster from too many in flight requests, or aggregations which produce too many buckets because of combinatorial explosion. We have added soft limits to enable sysadmins to protect their clusters from untrustworthy users who might request a billion hits, run an expensive aggregation on an analyzed string field, cause a mapping explosion by using IP addresses as field names, or hardcode parameters into scripts.
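The two kinds of protection differ in what they measure. A circuit breaker tracks a shared resource and refuses work that would exceed it, while a soft limit rejects a single request that is unreasonable on its face. The sketch below shows both in Python; the class and function names are hypothetical, the default of 10,000 is only an illustrative value, and the breaker is deliberately not thread-safe to keep the example short.

```python
class CircuitBreakingError(RuntimeError):
    """Raised when accepting a request would exceed the breaker's limit."""


class InFlightRequestBreaker:
    """Hypothetical breaker: tracks bytes held by requests still in flight."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def reserve(self, request_bytes: int) -> None:
        # Check before accepting, so a rejected request consumes nothing
        new_total = self.used_bytes + request_bytes
        if new_total > self.limit_bytes:
            raise CircuitBreakingError(
                f"in-flight requests would use {new_total} bytes, "
                f"limit is {self.limit_bytes}"
            )
        self.used_bytes = new_total

    def release(self, request_bytes: int) -> None:
        # Called when a request completes, successfully or not
        self.used_bytes = max(0, self.used_bytes - request_bytes)


def check_result_window(from_: int, size: int, max_result_window: int = 10_000) -> None:
    """Soft limit: reject a search that asks for too many hits at once."""
    if from_ + size > max_result_window:
        raise ValueError(
            f"result window {from_ + size} is larger than {max_result_window}"
        )
```

In both cases the request fails fast with a clear message instead of taking the node down with it.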
We have reduced the number of alternatives: configuration now uses only YAML, logging uses only Log4j (soon to be Log4j2). Index settings, mapping, and analyzers can no longer be configured via the filesystem or config files — now they must be configured at index creation time or using index templates. Bad settings are no longer silently ignored — if we see something we don’t recognise, we complain early and loudly, as well as using fuzzy matching to suggest the correct setting. This change alone will save developers countless hours of their lives otherwise wasted on debugging a problem that ultimately proves to be a typo.
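Strict validation with suggestions is simple to sketch. The Python example below uses the standard library's `difflib` as a stand-in for whatever fuzzy matching Elasticsearch uses internally, and `KNOWN_SETTINGS` is only a small illustrative subset of setting names; `validate_settings` is a hypothetical helper.

```python
import difflib
from typing import Dict, Iterable

# Illustrative subset only; a real node knows about far more settings
KNOWN_SETTINGS = (
    "cluster.name",
    "node.name",
    "network.host",
    "http.port",
    "path.data",
    "path.logs",
)


def validate_settings(
    settings: Dict[str, object], known: Iterable[str] = KNOWN_SETTINGS
) -> None:
    """Raise on the first unrecognised key, suggesting close matches if any."""
    known = list(known)
    for key in settings:
        if key in known:
            continue
        # difflib is an assumption here, swapped in for the real matcher
        suggestions = difflib.get_close_matches(key, known, n=3, cutoff=0.7)
        hint = f" Did you mean {suggestions}?" if suggestions else ""
        raise ValueError(f"unknown setting [{key}].{hint}")
```

Calling `validate_settings({"cluster.nmae": "prod"})` fails immediately and points at `cluster.name`, rather than starting a node that quietly ignores the misspelled key.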
As part of our code base cleanup, we are formalising the plugin API to provide a cleaner and more stable interface and, in the process, we are slowly reducing our reliance on Guice with the intent of removing it completely. We will also limit the scope of plugins to prevent users from ripping out core functionality. Some things, like Lucene directories or shard routing, are just too delicate to touch. If you really want to mess with the internals, then fork Elasticsearch. If you build something that is better than what is there already, then please send a pull request.
Until now, every language except Java has communicated with Elasticsearch via HTTP clients. Elasticsearch 5.0 will see the first version of a Java HTTP client. Right now, the client is pretty basic. We will be adding sugar to make it easier to use in IDEs and with your applications, but it can already be used with Elasticsearch v5 and v2. Benchmarks show that the HTTP client performance is similar to that of the Transport client. When the client is mature enough, we will deprecate and eventually remove the Java transport client. All communication with the outside world will then happen via HTTP, which makes it much easier to define the boundary between the cluster and the outside world.
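To show how little a client needs once HTTP is the only boundary, here is a minimal Python sketch that builds and sends a search request using nothing but the standard library. The node address `http://localhost:9200` and the index name `my-index` are assumptions for illustration, and the helper names are hypothetical; this is not the Java HTTP client described above.

```python
import json
import urllib.request


def build_search_request(base_url: str, index: str, query: dict) -> urllib.request.Request:
    """Build (but do not send) a search request for the given index."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url: str, index: str, query: dict) -> dict:
    """Send the request and decode the JSON response (needs a running node)."""
    request = build_search_request(base_url, index, query)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed address and index name, for illustration only:
# hits = search("http://localhost:9200", "my-index", {"match_all": {}})
```

Everything the cluster exposes to the application is visible in that one request: a URL, a method, and a JSON body.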
Some users run Elasticsearch embedded inside their own applications. We are not going to stop them from doing so, but we cannot support it. Embedding Elasticsearch bypasses the security manager, the JAR Hell checks, the bootstrap checks, and plugin loading. It is inherently unsafe and not recommended for production. For the sanity of our developers and support team, we cannot support users who disable all of the safety mechanisms which we have added for good reasons. For the same reason, we will not accept pull requests or make changes specifically to support the embedded use case.
We realise that these changes will impact some users who were relying on this aspect of Elasticsearch, and for this we apologise. The goal here is not to make life harder for you, but to make a better, more streamlined, more stable product that users can rely on.