Internal innovation and the Elastic platform helped Verizon move from relying on an expensive log analytics tool to running a state-of-the-art operational center of excellence.
A secure way to access and visualize analysis of log data in real time.
With the help of the Elastic Stack, Verizon increased system responsiveness and remedied non-outage performance issues.
Verizon Wireless, a wholly owned subsidiary of Verizon Communications, is the largest wireless provider in the United States. Verizon Wireless was founded in 2000 as a joint venture between Bell Atlantic (Verizon Communications’ predecessor) and UK-based Vodafone.
Verizon Communications Inc. (NYSE, Nasdaq: VZ) employs a diverse workforce of 177,700 and generated nearly $132 billion in 2015 revenues. Verizon operates America’s most reliable wireless network, with more than 112 million retail connections nationwide. Headquartered in New York, the company also provides communications and entertainment services over America’s most advanced fiber-optic network, and delivers integrated business solutions.
Long before anyone ever spoke the term log analytics, Verizon Wireless knew the power of its log data and the critical insights it could provide into the system performance, behavior, and user experience that impact the entire business. Verizon Wireless built a custom solution and also deployed a vendor-supported solution to centralize, monitor, and analyze that data. This combination saw them through many years of growth and success. However, a couple of years ago, they found themselves outgrowing it due to the cost of handling their massive data volumes as well as the difficulties of using an RDBMS to search and analyze the unstructured, time-sensitive data coming out of their Customer Care applications.
Along with an organization-wide move to use more modern technologies like NoSQL, the team looked to the open source ecosystem to find a solution backed by an active community and enterprise support.
In order to keep scaling up, a better solution was needed.
When the PaaS team at Verizon evaluated the Elastic Stack, they were won over by its capabilities and the ease of both getting started and scaling up.
The Verizon PaaS team quickly built a centralized log analytics platform (see diagram) for application teams across the entirety of Verizon Wireless.
Their Elastic Stack-backed platform collects and processes over 4 TB of logs per day, including infrastructure, web server, and application server logs. It enables real-time access to analyze log data for two to four weeks, after which the metrics are archived to Hadoop using the ES-Hadoop connector. More than 200 operators across more than 50 application groups use the system in a multi-tenant fashion. Alongside Kibana and Shield, which provide a secure way to access and visualize analysis based on the log data, Verizon Wireless also built a custom user interface, the Verizon Elastic Log Intelligence Tool for Enterprise (vElite), to enable specialized workflows for some of their operators.
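The retention window described above can be illustrated with a small sketch. It only shows the bookkeeping of deciding which daily indices stay searchable and which are due for archiving. The daily index naming scheme (`logs-YYYY.MM.DD`), the 28-day cutoff, and the function names are assumptions made for illustration, not details of Verizon's deployment, and the actual transfer to Hadoop via ES-Hadoop is not shown.

```python
from datetime import date, timedelta

# Assumption: upper end of the "two to four weeks" window mentioned in the text.
RETENTION_DAYS = 28


def index_name(day: date, prefix: str = "logs-") -> str:
    """Hypothetical daily index naming scheme, e.g. logs-2016.03.01."""
    return f"{prefix}{day:%Y.%m.%d}"


def split_by_retention(days, today, retention_days=RETENTION_DAYS):
    """Partition daily indices into (still searchable, due for archiving)."""
    cutoff = today - timedelta(days=retention_days)
    live, archive = [], []
    for day in sorted(days):
        # Indices on or after the cutoff date stay in the live cluster.
        (live if day >= cutoff else archive).append(index_name(day))
    return live, archive
```

A scheduled job could call `split_by_retention` once a day and hand the second list to whatever process copies data out of the cluster.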
Rapid adoption and growth of data into the vElite platform came with its own challenges. The Verizon PaaS team recognized early that Elastic is built to scale. Growing to a small cluster was almost painless, although reaching Verizon’s scale did require more planning and design.
The move to Elastic resulted in far-reaching effects throughout Verizon Wireless’ operations. For example, an incorrectly implemented marketing campaign generated 8 times the expected traffic to the e-commerce front-end, threatening availability of critical services. Reducing resolution time was critical. Using real-time intelligence from the central logging infrastructure, the operations team was able to quickly identify the source of the offending traffic and rectify the situation within minutes.
On the whole, the team was able to reduce MTTR from 20-30 minutes to 2-3 minutes on average, a 10x improvement.
The Production Operations team quickly doubled down on this initial success by using the same Elastic-backed analytics infrastructure to help the entire business achieve its most important customer satisfaction goals. Increasing system responsiveness and remedying non-outage performance issues translates directly to providing outstanding service for their customers.
The bottom line: it’s all about enhancing the real customer experience. Even if 2 out of 100 servers are responding erratically or slowly, thanks to the Elastic Stack, Verizon can now quickly drill down to these offending servers in just a few clicks by using the vElite dashboard.
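As a rough illustration of the kind of drill-down described here, the sketch below flags hosts whose typical response time is far above the rest of the fleet. It is not how vElite works internally; the function name, the input shape (host name mapped to response-time samples in milliseconds), and the threshold of 3 times the fleet median are all assumptions chosen for the example.

```python
from statistics import median


def slow_hosts(latencies_ms, factor=3.0):
    """Return hosts whose median response time exceeds `factor` times
    the median of all per-host medians (hypothetical threshold)."""
    # Median per host, skipping hosts that reported no samples.
    per_host = {
        host: median(samples)
        for host, samples in latencies_ms.items()
        if samples
    }
    if not per_host:
        return []
    fleet_median = median(per_host.values())
    return sorted(h for h, m in per_host.items() if m > factor * fleet_median)
```

Using medians rather than means keeps a handful of misbehaving servers from shifting the baseline they are compared against.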
Verizon Wireless continues expanding its centralized logging platform by adding more teams, including those on the business side of the house, and with them comes more and more data.
Verizon continues to add other types of operational data to the system, too, including infrastructure metrics and network packets. As the data in the system becomes more comprehensive, Verizon Wireless plans to leverage the Elastic Stack as a source of real-time dashboards for audiences outside of operations, including members of the marketing and executive teams.
As the team grows to the next level of analytics, Verizon Wireless is actively evaluating Watcher as well as eagerly testing out Elasticsearch 5.0 in order to provide early warning and predictive analytics on top of their already successful Elastic platform.