NetApp: Finding Intelligent Operation with Embedded Elasticsearch

NetApp is one of the fastest-growing all-flash array vendors, recognized as a leader in general-purpose disk arrays, solid-state arrays, and integrated systems. NetApp created OnCommand Insight (OCI) as a monitoring tool for data centers with on-prem, private, and public cloud setups. OCI has three common use cases: intelligent operations, business insights, and ecosystem optimization (including performance forecasting, resource optimization, etc.).

Recently, NetApp expanded its focus to help customers get more value out of their data, especially when moving between on-prem and cloud data stores. It did this by harnessing the power of the hybrid cloud, building a next-generation data center, and modernizing storage through data management.

Today, OCI is an intelligent operation management system capable of diving deep into data, learning and forecasting trends, and optimizing resources. However, the tool did not always have this level of flexibility. At Elastic{ON} 2018, Karen Dagen and Francisco Rosa from the NetApp team revealed how they embedded Elasticsearch into OCI to create a more robust app for their customers.

All Roads Lead to Elasticsearch

It all started with the need to scale.

In 2014, OCI was upgraded to version 7.0, bringing new scale and functionality to the system. Until this point, NetApp had an architectural flow that moved from OCI to MySQL, where everything was stored. Lucene was used to discover objects stored in the database via their attributes. In addition, all time series events were stored in Cassandra, which, though fast and efficient, could not help customers visualize more than one time series event at a time.

To solve this problem, NetApp engineers replaced Lucene and Cassandra with Elasticsearch as an embedded solution delivered within OCI. Using Elasticsearch, they created a large index for virtual machines that included both attributes and counters. This allowed OCI to display multiple timeline visualizations, filter results, and present hypervisor averages, all features that were previously unavailable.
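
To make the idea of one index holding both attributes and counters more concrete, here is a minimal sketch in Python. It is not NetApp's schema or code: the field names (attributes.hypervisor, counters.cpu_pct, @timestamp) and both helper functions are hypothetical, chosen only for illustration. The sketch builds a plain Elasticsearch query body that filters on an attribute and averages a counter per hypervisor over time, which is roughly the kind of request that could sit behind a filtered, multi-series timeline.

```python
import json


def vm_sample(name, hypervisor, timestamp, cpu_pct, iops):
    """Build one hypothetical VM document with attributes and counters together."""
    return {
        "@timestamp": timestamp,
        "attributes": {"name": name, "hypervisor": hypervisor},
        "counters": {"cpu_pct": cpu_pct, "iops": iops},
    }


def hypervisor_average_query(counter, hypervisor=None, interval="1h"):
    """Build a query body that averages one counter per hypervisor over time.

    Assumes attributes.hypervisor is mapped as a keyword field so it can be
    used in term filters and terms aggregations.
    """
    filters = []
    if hypervisor is not None:
        # Attribute filter: restrict the timeline to a single hypervisor.
        filters.append({"term": {"attributes.hypervisor": hypervisor}})

    query = {"bool": {"filter": filters}} if filters else {"match_all": {}}

    return {
        "size": 0,  # only aggregation results are needed, not raw hits
        "query": query,
        "aggs": {
            "per_hypervisor": {
                "terms": {"field": "attributes.hypervisor"},
                "aggs": {
                    "timeline": {
                        # fixed_interval is the Elasticsearch 7.2+ spelling;
                        # older releases used "interval" instead.
                        "date_histogram": {
                            "field": "@timestamp",
                            "fixed_interval": interval,
                        },
                        "aggs": {
                            "avg_counter": {
                                "avg": {"field": "counters." + counter}
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(vm_sample("vm-001", "esx-01", "2018-02-27T10:00:00Z", 42.5, 1200), indent=2))
    print(json.dumps(hypervisor_average_query("cpu_pct", hypervisor="esx-01"), indent=2))
```

Because attributes and counters live in the same document, a single request can filter on one and aggregate on the other. The resulting body could be sent to the search endpoint of an index with any Elasticsearch client.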

Lessons Learned Along the Way

NetApp engineers learned a few lessons when creating the OnCommand Insight dashboard as it is today.

First, embedding worked for NetApp. Being able to install what was needed as the app scaled was incredibly helpful. Elasticsearch can do a lot, and where it falls short out of the box, it is easily extended with plugins. While plugin documentation is sometimes scarce, being able to read the open source Elasticsearch code makes it possible to create the plugins needed for success. In NetApp’s case, having the functionality of a plugin outweighed whatever extra work was necessary to keep up with Elasticsearch as it updated.

Second, machine learning proved its value for NetApp by revealing that dormant resources within the system, which are normally inactive, can create a small anomaly when activated: small blips on a timeline that would be alarming without an alert.

Building Solutions Customers Love

Since integrating Elasticsearch into OCI, NetApp has received great feedback from their customers. From the scale of the solution to the richness and quality of the data from the sources and API, embedding Elasticsearch in the OnCommand Insight app greatly expanded the possibilities of the product. NetApp has been working to incorporate more components of the Elastic Stack into their offering (including Logstash and Beats) and will continue adapting, and learning lessons, as the years go by and the data continues to scale.

Watch the full session from Elastic{ON} 2018 to find out how NetApp has expanded its focus to help customers get more value out of their data with an embedded Elasticsearch solution.