In the recent 5.4 release of the Elastic Stack, we standardized on CentOS 7 as the base for our official Docker images (Elasticsearch, Logstash, Kibana, Beats). As we rolled out Docker support for more of our products over time, we chose base images that addressed whatever narrow concerns were pressing at that moment instead of maintaining a cohesive story across the whole product line. We hope that a single OS base will bring alignment to our stack and improve the overall experience. We've found that CentOS provides that foundation: great support for our products and friendliness to our users.
We realize the most radical departure will be for those of you who have come to depend on our Alpine Linux base. We considered the cost and feel that it's time to move on. Don't get us wrong: we greatly appreciate the good work over at Alpine. A minimalistic distribution with a focus on security and simplicity is, and has been, right up our alley. In practice, however, two things became barriers. The first is actually our fault: some of our software has heavy dependencies. Two of our flagship products depend on the JVM, which weighs down any container image, and Alpine is no different. We aren't the typical consumer of the 5MB base image; even at their best, our images are still at least 150MB.
The most commonly cited benefit of Alpine is a small image, and for our stack, we usually don't get to see it. By comparison, the CentOS 7 base image is 70MB. While that's a far cry from 5MB, it's comparable to a lightweight network install, and it includes many batteries that we had to install anyway, like bash. Once we add our stuff, we're only looking at about a 2x increase in CentOS over Alpine. Moreover, a nice property of the layered approach to storage is that the more common the base, the more those layers get reused. A host using more than one of our images probably won't even see a 2x increase overall, and it will have fewer bits to transfer over the network.
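To make the layer-reuse argument concrete, here is a rough back-of-the-envelope sketch in Python. Only the 70MB CentOS base figure comes from the text above; the per-product layer sizes and the function names are hypothetical placeholders chosen for illustration, not measurements of our images.

```python
# Back-of-the-envelope estimate of on-host image storage.
# Assumption: every image is built on the same base layer, and the
# storage driver keeps only one copy of a layer that images share.

CENTOS_BASE_MB = 70  # base image size quoted in the text

# Hypothetical sizes of the layers each product adds on top of the base.
PRODUCT_LAYERS_MB = {
    "elasticsearch": 230,
    "logstash": 230,
    "kibana": 150,
}


def shared_footprint_mb(base_mb, product_layers_mb):
    """Storage used when the base layer is stored once and reused."""
    return base_mb + sum(product_layers_mb)


def unshared_footprint_mb(base_mb, product_layers_mb):
    """Storage used if every image carried its own copy of the base."""
    return sum(base_mb + layer for layer in product_layers_mb)


shared = shared_footprint_mb(CENTOS_BASE_MB, PRODUCT_LAYERS_MB.values())
unshared = unshared_footprint_mb(CENTOS_BASE_MB, PRODUCT_LAYERS_MB.values())

print(f"shared base:   {shared} MB")
print(f"unshared base: {unshared} MB")
print(f"saved:         {unshared - shared} MB")
```

With these placeholder numbers, the base layer is paid for once rather than three times, so each additional image on the host costs only its own product layers.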
The other barrier is that our community is struggling with issues that are outside of our control. Underneath the magic of Docker is a very complex layering of software that still has to work in concert, and sometimes it doesn't. The combination of all those pieces also makes diagnosing these issues complicated.
One recent problem illustrates this very well. musl libc's implementation of getmntent_r(), a very basic libc function that Elasticsearch uses through the JNI, doesn't seem to return information about the filesystem when a running container has many layers of overlay2 filesystems. We've seen countless other issues surface in our CI environment, and while most are solvable with a hack here and there, we wonder how much benefit the hacks really provide. In an already complex distributed system, libc complications can be devastating, and most people would likely pay more megabytes for a working system. CentOS uses the more venerable GNU libc. While certainly bloated from decades of accretion, glibc has also had time to find and fix bugs in more esoteric parts of the codebase that may only get exposed in, for example, math-intensive workloads. We know that Alpine ships glibc-based packages, but using a second-class libc in a niche OS seemed like a step further away from stability rather than one toward it.
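For readers unfamiliar with what routines like getmntent_r() do, the sketch below parses mount-table text in the usual whitespace-separated format (device, mount point, filesystem type, options, and two numeric fields). It is a simplified illustration in Python, not the musl or glibc implementation: the `MountEntry` type, the `parse_mounts` helper, and the sample data are all hypothetical, and it skips details such as octal escapes in paths.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    """One line of a mount table (hypothetical type for this sketch)."""
    fsname: str
    directory: str
    fstype: str
    options: List[str]


def parse_mounts(text: str) -> List[MountEntry]:
    """Parse mount-table text into entries, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # A usable entry needs at least device, directory, type, options.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        entries.append(
            MountEntry(fields[0], fields[1], fields[2], fields[3].split(","))
        )
    return entries


# Made-up sample data. An overlay mount lists its lower directories in the
# options field, so that field grows with the number of layers.
SAMPLE = (
    "overlay / overlay rw,relatime,lowerdir=/l/a:/l/b:/l/c,"
    "upperdir=/u,workdir=/w 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec 0 0\n"
)

entries = parse_mounts(SAMPLE)
for entry in entries:
    print(entry.directory, entry.fstype, len(entry.options))
```

On a Linux host, the same function could be pointed at the contents of /proc/mounts to see the entries a container actually exposes.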
In the end, Docker does a really nice job of getting the base layer out of your way once you have a buildable image. If we can help you get there more consistently, it will be a better experience all around. But as always, please head over to our forums if you are struggling, or open an issue if you find a bug (Elasticsearch, Logstash, Kibana, Beats).
If you are brand new to Elastic, or haven't had a chance to use our Docker images yet, we provide a simple path to getting started with the whole stack. You only need Docker installed!