Customers

Elastic on Elastic: How InfoSec deploys infrastructure and stays up-to-date with ECK

This post is part of a blog series highlighting how we embrace the solutions and features of the Elastic Stack to support our business and drive customer success.

The Elastic InfoSec Security Engineering team is responsible for deploying and managing InfoSec's infrastructure and tools. At Elastic, speed, scale, and relevance is our DNA and leveraging the power of the Elastic Stack is the heart of InfoSec. Our goal is not only to test new features as they are rolled out, but to use these capabilities to protect Elastic.

We trust and believe in our products to protect our brand. If you, too, use the Elastic Stack, you’re likely familiar with its rich features and power, which are deliberately released at a rapid cadence. The intent behind our frequent releases is to enable customers to solve issues and make life easier where possible, and what better way to do this than to consistently make new features readily available?

I know what you are thinking: in theory, remaining up-to-date sounds like an excellent idea, but frequently upgrading can be time consuming, can introduce new bugs, and can disrupt production systems. That is why here at Elastic, we put theory into practice. We would not expect you to keep up with our release cadence if we could not keep up with our own pace. So let us show you how we are harnessing Elastic Cloud on Kubernetes (ECK) and Helm to deploy infrastructure and remain up-to-date. If you would like more information on how InfoSec leverages our technology to protect Elastic, check out “Securing our endpoints with Elastic Security.”

Why ECK?

Our managed Elasticsearch offering — Elasticsearch Service on Elastic Cloud — meets all of our requirements around scalability, functionality, and capability. That said, why doesn’t InfoSec use Elasticsearch Service? If you think about how you would handle operational logs, the general best practice is to ship monitoring logs to a remote cluster because you want to ensure that data is available to you in the event the cluster you are monitoring becomes unavailable. For this reason, InfoSec wanted to operate independently from the Elasticsearch Service to ensure we are able to monitor our services if issues were to arise.

A couple years ago, SecEng began exploring using Kubernetes to deploy Elasticsearch clusters and when our internal teams developed ECK, it was a no brainer for us to test and provide feedback.

What is ECK?

ECK not only makes it easier to deploy Elasticsearch and Kibana, but it enables you to quickly upgrade, manage multiple clusters, perform backups, scale, and update clusters. Plus, it handles security by default. SecEng currently manages 6 production Elasticsearch clusters that are home to audit, monitoring, and various security logs. Our largest cluster is roughly 35 GKE nodes and growing as we collect additional data for better visibility.

As of today, we receive an average of 30,000 events per second (EPS), which is over 2 billion events per day or 6TB of data a day. Infosec is also leveraging cross-cluster search, which allows our Detection and Response teams to centralize reporting and alerting. With the latest ECK release, enabling this feature has never been easier. Below, we will show you how we have added this to our Helm chart, as well as other features that Elastic Infosec normally enables.

Getting started with ECK

If you are new to ECK, please review the ECK quickstart documentation to get started. If you want to a deploy a one-node Elasticsearch cluster, your manifest would look like this:

apiVersion: elasticsearch.k8s.elastic.co/v1 
kind: Elasticsearrch 
metadata: 
  name: quickstart 
spec: 
  version: 7.8.0 
  nodeSets: 
  - name: default 
    count: 1 
    config: 
      node.master: true 
      node.data: true 
      node.ingest: true 
      node.store.allow_mmap: false

As you can see, you can easily configure the number of nodes and node types for your deployment. The Kibana manifest looks very similar:

cat <<EOF | kubectl apply -f - 
apiVersion: kibana.k8s.elastic.co/v1 
kind: Kibana 
metadata: 
  name:  
spec: 
  version: 7.8.0 
  count: 1 
  elasticsearchRef: 
    name: quickstart 
EOF

Whether you want to deploy a one-node cluster or 100 nodes, you can configure the manifest to meet your needs. When we first started leveraging ECK, we would create a separate manifest per cluster. After deploying several clusters, we noticed several things:

  1. When creating a new cluster, we would manually copy the base Elasticsearch and Kibana manifest. Note: This made deployments prone to errors and more difficult to review in GitHub because it was being treated as new code.
  2. If there were any changes to the operator, we would have to update all of our manifest files as we upgraded.
  3. While our deployments consisted of the same services — Elasticsearch, Kibana, Logstash, Beats, beats-setup, NGINX-ingress — we needed to alter the specs for each cluster: resource sizes, SAML settings, secrets, storage, services and ports, cross-cluster search, etc. This resulted in a large amount of duplicate/repetitive code, making our deployments difficult to manage
  4. Wait… We still have to manage Logstash, Beats, NGINX — oh my?

We concluded that we needed to abstract specific values relevant to each of the clusters and make it customizable. With this requirement in mind, we started experimenting with Helm, a package manager and templating engine for Kubernetes. We found that Helm allowed us to template and deploy our standard services and dependencies consistently across our clusters, while still allowing for flexibility.

Note: We already had pockets of Helm deployments in our environment so we expanded on its usage. You could achieve the same results with whatever package manager you would like to use: bring your own package manager.

ECK and Helm

We decided to create Helm charts for all of the components that made up our stack and re-use charts where possible. By doing so, we significantly reduced the amount of code that needed to be modified and reviewed. We relied on the requirements.yaml to define our dependencies and the values.yaml to pass the appropriate values for our cluster. Using this approach, we were able to reuse a single ECK manifest to deploy Elasticsearch and Kibana. If modifications needed to be made, we would submit a PR against the manifest and then update each of the cluster’s dependencies. By default, here are the basic components that make up an Infosec cluster:

NGINX

Route users to an OAuth proxy to check whether or not they are an Elastician. if true, route to the appropriate service — Elasticsearch or Kibana. We created a ConfigMap for the nginx.yml and abstracted custom values such as:

  • DNS name
  • Enable OAuth
  • Load custom certificates
  • Define backend services

This is the values.yaml:

nginx: 
hosts: 
  elasticsearch: 
    oauth: true 
    proxy: "https://oauth-url" 
    host: my-es-cluster.com 
    pki: 
      secretName: nginx-certificate 
    backend: 
      address: "infosec-demo-es-http:9200" 
      pki: 
        secretName: infosec-demo-http-certs-public 
        key: tls.crt 
  kibana: 
    oauth: true 
    proxy: "https://oauth-url" 
    host: my-kibana-instance.com 
    pki: 
      secretName: nginx-certificate 
    backend: 
      address: "infosec-demo-kb-http:5601" 
      pki: 
        secretName: infosec-demo-kb-http-certs-public 
        key: tls.crt

Logstash

This is our ingestion tier for logs. It Receives logs from Beats deployed at Elastic. We use the official Elastic Helm chart for Logstash.

Beats

We monitor our stack using Auditbeat, Filebeat, and Heartbeat. We use the official Elastic Helm chart for Filebeat, but have created custom charts for Auditbeat and Heartbeat, since they are tailored for our use. ConfigMaps are created for both auditbeat.yml and heartbeat.yml anytime we add these charts as a dependency.

Beats-Setup

We load Beats templates as we upgrade, so we don’t run into mapping conflicts. This is the values.yaml:

beats-setup: 
  elasticsearchAddress: my-es-cluster-es-http:9220 
  kibanaAddress: my-es-cluster-kb-http:5601 
  clusterName: my-es-cluster 
  versions: [ 
    7.6.0, 7.6.1, 7.6.2, 
    7.7.0, 7.7.1, 
    7.8.0 
  ] 
  credentials: 
    # The secret name where the Elasticsearch credentials are stored 
    secretName: my-es-cluster-es-elastic-user 
    passwordKey: elastic 
    usernameKey: null

job.yaml

When the beats-setup job runs, it will automatically connect to Elasticsearch, use the *beats.yml, and load any dashboards/templates based on the version specified.

apiVersion: batch/v1 
kind: Job 
metadata: 
  name: {{ $Release.Name }}-auditbeat-setup 
spec: 
  template: 
    spec: 
      restartPolicy: OnFailure 
      hostPID: true 
      containers: 
      {{- range $version :=  .Values.versions }} 
      - name: auditbeat-setup-{{ regexReplaceAllLiteral :\\." $version "-" }} 
        image: docker.elastic.co/beats/auditbeat:{{ $version }} 
        # Setup dashboards and templates. 
        command: ["auditbeat", "-e", "setup"] 
        securityContext: 
          capabilities: 
            add: ["AUDIT_CONTROL", "AUDIT_READ"] 
        resources: 
          limits: 
            memory: 200Mi 
          requests: 
            cpu: 100m 
            memory: 200Mi 
        volumeMounts: 
        - name: auditbeat-setup 
          mountPath: /usr/share/auditbeat/auditbeat.yml 
          subPath: auditbeat.yml 
        - name: elasticsearch-pki 
          mountPath: /usr/share/auditbeat/pki/elasticsearch.crt 
          subPath: elasticsearch.crt 
        - name: kibana-pki 
          mountPath: /usr/share/auditbeat/pki/kibana.crt 
          subPath: kibana.crt 
        env: 
          - name: ELASTIC_USERNAME 
            {{ if $Values.credentials.usernameKey -}} 
            valueFrom: { secretKeyRef: { name: {{ $Values.credentials.secretName }}, key: {{ $Values.credentials.usernameKey }} } } 
            {{- else -}} 
            value: elastic 
            {{- end }} 
          - name: ELASTIC_PASSWORD 
            valueFrom: { secretKeyRef: { name: {{ $Values.credentials.secretName }}, key: {{ $Values.credentials.passwordKey }} } } 
      {{- end }} 
      volumes: 
      - name: auditbeat-setup 
        configMap: 
          name: auditbeat-setup 
          items: 
          - key: auditbeat.yml 
            path: auditbeat.yml 
      - name: elasticsearch-pki 
        secret: 
          secretName: {{ .Values.clusterName }}-es-http-certs-public 
          items: 
          - key: tls.crt 
            path: elastic.crt 
   - name: kibana-pki 
        secret: 
          secretName: {{ .Values.clusterName }}-kb-http-certs-public 
          items: 
          - key: tls.crt 
            path: kibana.crt 
___

Elasticsearch / Kibana

We created an ECK chart to abstract the following values:

  • Resources: CPU, RAM and Java Heap, storage (Google SSD persistent storage)
  • Enable SAML SSO
  • Kube Secrets: Webhooks, certificates, and credentials
  • Node roles: Master, data, ingest, machine-learning, Coordination. We also allow combined roles; although, we don’t have every possible combination, just those commonly used for our clusters
  • IsBC: Is this a build candidate? InfoSec frequently deploys build candidates in production to validate them in real-world scenarios.
  • Cross-cluster search: Enables and orchestrates cross-cluster search by just passing the client server’s CA. This automatically configures transport layer services.
{{- if $config.ccs }} 
     xpack.security.transport.ssl.certificate_authorities: 
     {{- range $cert := $config.ccs.secretNames }} 
     - "/usr/share/elasticsearch/config/other/{{ $cert }}" 
     {{- end }} 
     {{- end }} 
      containers: 
       - name: elasticsearch 
         env: 
         - name: ES_JAVA_OPTS 
           value: {{ $config.java_heap }} 
         - name: NSS_SDB_USE_CACHE 
           value: "no" 
         resources: 
           requests: 
             memory: {{ $config.memory }} 
             cpu: {{ $config.cpu }} 
           limits: 
             memory: {{ $config.memory }} 
       # Loads remote certificates if Cross-Cluster Search is Enabled. 
       {{- if $config.ccs }} 
         volumeMounts: 
         - mountPath: "/usr/share/elasticsearch/config/other" 
          name: remote-certs 
       volumes: 
       - name: remote-certs 
         secret: 
           secretName: remote-certs 
       {{- end }}

Deploy a cluster using ECK and Helm

Let’s walk through a deployment. In this example, we are going to deploy a two-node Elasticsearch cluster, Kibana, and Auditbeat. One node will have the master role and the appropriate resource specs. The other node will be a data and ingest node.

values.yaml

nginx: 
 hosts: 
   elasticsearch: 
     oauth: true 
     proxy: "https://oauth-url" 
     host: my-es-cluster.com 
     pki: 
       secretName: nginx-certificate 
     backend: 
       address: "infosec-demo-es-http:9200" 
       pki: 
         secretName: infosec-demo-http-certs-public 
         key: tls.crt 
   kibana: 
     oauth: true 
     proxy: "https://oauth-url" 
     host: my-kibana-instance.com 
     pki: 
       secretName: nginx-certificate 
     backend: 
       address: "infosec-demo-kb-http:5601" 
       pki: 
         secretName: infosec-demo-kb-http-certs-public 
         key: tls.crt 
eck: 
 clusterName: infosec-demo 
 version: 7.7.0 
 enableSecrets: false 
 secrets: [ 
 ] 
 elasticsearch: 
   nodes: 
     master: 
       isBC: false 
       # Enable Single Sign-on 
       enableSSO: false 
       role: master 
       count: 1 
       storage: 5Gi 
       memory: 8Gi 
       cpu: 2 
       java_heap: -Xms4g -Xmx4g 
     data: 
       isBC: false 
       # Enable Single Sign-on 
       enableSSO: false 
       role: data+ingest 
       count: 1 
       storage: 10Gi 
       spec: 
       memory: 8Gi 
       cpu: 2 
       java_heap: -Xms4g -Xmx4g 
 kibana: 
   count: 1 
   enableSSO: false 
   enableSecrets: false 
   secrets: [ 
   ] 
   isBC: false 
auditbeat: 
 version: 7.7.0

requirements.yaml

dependencies: 
  - name: nginx 
    repository: file://../../infosec-demo/charts/nginx 
    version: 0.1.0 
  - name: eck 
    repository: file://../../infosec-demo/charts/eck 
    version: 0.1.0 
  - name: auditbeat 
    repository: file://../../infosec-demo/charts/auditbeat 
    version: 0.1.0

With the values.yml and dependencies above, we are able to quickly and consistently deploy an Elasticsearch cluster that meets all of our standards into a new K8s cluster. This helps us to efficiently meet the needs of our internal customers when a dedicated cluster is required in a new location. With this as our foundation, we are able to quickly upgrade or scale our existing clusters by simply tweaking the values.yml for each cluster. For example, if we wanted to upgrade to a newer version and add additional nodes to the above cluster, we would simply make these changes in our values.yml and then deploy:

By keeping these changes confined to the single values.yml file, it becomes simple for a teammate to quickly review the PR and validate the changes that were made prior to deployment.

Safety

Deploying build candidates can be a bit scary, but the benefits they provide our team outweighs the bad. By deploying build candidates, we enable our product teams to test and build content using real-life data. We minimize upgrade risks by scheduling hourly snapshots using snapshot lifecycle management (SLM), creating alerts (pod-activity, system performance, search latency, Metricbeat, etc., -), and having the flexibility to quickly rebuild a new cluster with our charts.

Conclusion

ECK and Helm have been a game changer for InfoSec. We went from managing dependencies and struggling to keep up with the pace of security, product releases, and human error to focusing on making our services better. We have significantly reduced our upgrade time from weeks to upgrading our entire environment in under two hours. With this approach, we no longer have to sacrifice platform stability to enhance security visibility.

If you are new to Kubernetes, download ECK and enjoy the journey. Take your time understanding the basics and incrementally create your templates. This has been an exciting six-month journey for InfoSec and I can promise you, there is light at the end of the tunnel. Deploying infrastructure and remaining up to date doesn’t have to be a small project anymore and you can enjoy the awesome features Elastic has to offer now — $helm upgrade!