Security and Alerting for Elasticsearch: A Vandis Story (Part 2)

It’s just another day when Shield, Watcher, Marvel, and Beats help Vandis identify and resolve problems before their customers know anything’s amiss. 


Recently, we brought you an excerpt from an interview with Ryan Young, Director of Engineering at Vandis, and Jason Bryan, Elastic Support Engineer. This week, we picked Ryan’s brain about their use of Elastic’s commercial products, like Shield (security for Elasticsearch).

Brief recap: Vandis is a technology reseller based in New York that does consultative architectural design and security solutions for companies requiring infrastructure services like networking and firewalls, in addition to centralized logging and syslog threat analysis.

Please note that this interview has been edited for length and clarity.
Now, to the good stuff!


Ryan, you've provided insight into what the support relationship is like, but that’s only one part of the equation. Can you describe the value you get out of Elastic products like Shield (security), Watcher (alerting for Elasticsearch), and Marvel (monitoring for Elasticsearch)?

Ryan: Watcher is instrumental to us in terms of notifying us when things are going awry inside of a customer network. When we picked up Elasticsearch, we also picked up PagerDuty, and Watcher is tightly integrated with it. Each one of our customers has a service with PagerDuty, and each one of those services has on-call tier-one engineers plus tier-two and tier-three escalation engineers. Any time there's an event, we get notified: days, nights, weekends. And really, that's because Watcher and Elasticsearch are doing their jobs.
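Editor's aside: for readers who have not seen a watch before, here is a minimal sketch of the kind of alert Ryan describes. It is not Vandis's configuration. It follows the general trigger / input / condition / actions shape of a Watcher watch; the index pattern, the `level` field, the threshold, and the action name `page_on_call` are all hypothetical, and the exact schema (including how a PagerDuty account is configured) varies by version, so check the Watcher documentation before using it.

```python
import json


def build_watch(index_pattern, error_threshold, interval="1m"):
    """Return a dict shaped like a Watcher watch definition.

    All field values are illustrative assumptions, not a real
    customer configuration.
    """
    return {
        # How often the watch runs.
        "trigger": {"schedule": {"interval": interval}},
        # What the watch looks at: recent error-level log events.
        "input": {
            "search": {
                "request": {
                    "indices": [index_pattern],
                    "body": {
                        "query": {
                            "bool": {
                                "must": [{"match": {"level": "error"}}],
                                "filter": [
                                    {"range": {"@timestamp": {"gte": "now-5m"}}}
                                ],
                            }
                        }
                    },
                }
            }
        },
        # Fire only when the number of matching events exceeds the threshold.
        "condition": {
            "compare": {"ctx.payload.hits.total": {"gt": error_threshold}}
        },
        # Who gets notified: a PagerDuty event for the on-call engineer.
        "actions": {
            "page_on_call": {
                "pagerduty": {
                    "event": {
                        "description": "Error spike detected in customer logs"
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    watch = build_watch("customer-logs-*", error_threshold=100)
    print(json.dumps(watch, indent=2))
```

The resulting JSON is what you would submit to the cluster when registering the watch; PagerDuty's own escalation policy then decides which tier gets paged.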

Can you provide an example?

Ryan: I think we're at 1.2 billion documents right now, and that’s not even 30 days’ worth of data. We have one customer who has a network operations center (NOC). They have a voice monitoring solution and because of how fast Elasticsearch is parsing the logs and alerting, we actually were on the phone with them before the NOC even called to say that they were down. That has been deeply instrumental to us.

And what about Shield or Marvel?

Ryan: Marvel is its own cluster running in our data center, watching our clusters to make sure that life continues smoothly.

Shield was a requirement for us. As we build out our deployment, I need role-based access control so that my sales team can go in and look at something without exposing data to customers or people who shouldn’t have access, whereas the engineering team needs full access to all the data.

For example, a proof-of-concept customer the other day called to say, "We have a problem. We don't know what the problem is." We built a Kibana dashboard for them that focused on the threat and almost immediately, they knew that they had a machine within their facility that had been compromised and was infecting other hosts. We were able to turn that around for them in about five to ten minutes. They responded with, "Wow, we knew about the value, but now we really see the value."

So it sounds like while your initial engagement with Elastic was oriented around support, having the commercial products in use really makes for a holistic package. Is that accurate?

Ryan: Oh, absolutely. I was actually having this debate with someone on a bus on the way to Elastic{ON}, going from the hotel to the venue. He was telling me that they had support previously, and ended up not renewing. I replied, "You might want to look at that again. It's a whole different world."

That got us talking about how he's gone and built his own software stack to pretty much do exactly what Watcher and Shield do. He was lamenting about how many cycles he's spending on that and I said, "That’s another reason you might want to look at a subscription. You don't have to support or maintain any of that. That's actually done for you. It's part of the deal."

It's awesome. Shield was probably the driving force to get my management to sign off on a subscription. Then, almost immediately after the engagement began and Jason had his sleeves rolled up, deep in it with us, they turned around and were like, "Oh, there's actually a lot more value to this."

It’s been great talking with you, Ryan. Anything I haven’t asked you that you’d like to add?

Ryan: I’ll leave you with one cool story. This is actually one of my favorite Elastic stories.

All of my engineers were out at a conference. We were running very thin. We had a customer who called and said, "We have this crazy issue. We can't figure it out. We have our firewall vendor involved. We have our load-balancer vendor involved. We have our virtualization platform vendor involved." He went through the full list of vendors. He said, "Everyone is just pointing fingers at each other. No one is telling me anything."

I happened to have the most recent versions of Elasticsearch and Packetbeat running on my laptop. I went into the office, threw Packetbeat on their network, and let it sit there for an hour. We sat and drank some coffee, talked about our kids and what we were doing on the weekend. Meanwhile, in the background, I had Kibana up, just running the Beats dashboards.

The customer came back and asked, "So, any idea what we got going on here?" We looked over at the dashboards and saw they had a 75% DNS failure rate. Their DNS servers had just gone off the reservation. Within an hour, we had the root cause and a solution.

The customer asked, "What was that?"

We replied, "Oh, that was Elastic."

Talk about winning a victory for our team.
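Editor's aside: the number in Ryan's DNS story is simply failed lookups divided by total lookups. The sketch below shows that arithmetic on a handful of made-up response codes. It assumes anything other than `NOERROR` counts as a failure, which is a simplification (an `NXDOMAIN` answer can be perfectly legitimate), and it is not how the Beats dashboards compute their figures.

```python
def dns_failure_rate(response_codes):
    """Return the fraction of DNS responses that were not NOERROR.

    Assumption for illustration: every non-NOERROR code is a failure.
    """
    if not response_codes:
        return 0.0
    failures = sum(1 for code in response_codes if code != "NOERROR")
    return failures / len(response_codes)


if __name__ == "__main__":
    # Hypothetical sample: three of four lookups failed.
    sample = ["SERVFAIL", "NOERROR", "SERVFAIL", "SERVFAIL"]
    print(f"{dns_failure_rate(sample):.0%}")
```

With the sample above, three of four responses are failures, which works out to the same 75% Ryan saw.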

Did you miss part 1 of this interview featuring Ryan and Elastic Support Engineer Jason Bryan? Don’t panic. Grab a delicious beverage, kick back, and enjoy a pleasant read.