How IT leaders build more proactive organizations using observability
Mining performance data for insights can be the difference-maker in positive business outcomes
- Observability can help turn your IT organization from reactive to proactive
- Identifying issues before they affect users is what prevents poor user experience
- Real-time signal is what helps you improve your MTTR (Mean Time To Resolution) metric
In the lingua franca of modern IT architecture, latency is the perfect metaphor for an unresponsive business. If an organization’s latency is problematic, operations teams are always a step behind — reacting to events rather than anticipating them.
But it doesn’t have to stay that way.
For instance, Wells Fargo, a multinational financial services company headquartered in San Francisco, CA, realized that to measure the performance of its myriad applications, it needed a modern observability and application performance monitoring (APM) solution. Eric Chho, vice president of engineering at Wells Fargo, is responsible for providing observability to the many application teams across the company’s wide-ranging IT organization. Chho says, “being able to measure golden signals [of improving application availability and reducing latency] are what will improve overall customer experience.”
“Ultimately, like the famous quote, ‘data is new oil,’” says Chho, “we’ve got all this raw material but we can’t process it fast enough. That’s where search is so important. You’re only as good or as fast as you can explore [the data].”
“Cost-effective visibility across the whole environment, across applications and a multi-cloud infrastructure stack, is mission critical,” says Sajai Krishnan, General Manager, Observability at Elastic. “Executives don’t want to hear about customer issues on social media – they want to proactively address potential trouble spots.”
Turning IT operations from reactive to proactive begins with establishing an observability solution to ingest and analyze incoming telemetry data. An observability platform can automatically generate alerts when anomalies are detected. Mining this performance data stream helps IT achieve strategic goals, such as faster mean time to resolution (MTTR) and five-nines (99.999%) system availability. Proactive monitoring can discover and help prevent issues before they impact end users, a particular concern for a highly utilized provider that relies on a CDN platform.
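To put numbers on those two goals, the following is a minimal Python sketch, not taken from any vendor's tooling. It converts an availability target into a yearly downtime budget and averages incident durations into an MTTR figure. The function names and the sample incident durations are hypothetical and chosen only for illustration.

```python
from datetime import timedelta
from typing import Sequence

# Assumption for illustration: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def allowed_downtime(availability_pct: float,
                     period_seconds: int = SECONDS_PER_YEAR) -> timedelta:
    """Return the downtime budget implied by an availability target."""
    unavailable_fraction = 1 - availability_pct / 100
    return timedelta(seconds=period_seconds * unavailable_fraction)


def mean_time_to_resolution(incident_minutes: Sequence[float]) -> float:
    """Average time, in minutes, from detection to resolution."""
    if not incident_minutes:
        raise ValueError("need at least one incident")
    return sum(incident_minutes) / len(incident_minutes)


# Five nines leaves roughly five and a quarter minutes of downtime per year.
five_nines_budget = allowed_downtime(99.999)

# Hypothetical incident durations in minutes, invented for this example.
mttr_minutes = mean_time_to_resolution([42, 18, 30])
```

Under these assumptions the five-nines budget works out to about 315 seconds per year, which is why detection and resolution speed matter so much at that availability level: a single slow incident can consume the whole budget.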
Proactive action with data
Alert management is a difference maker.
Many monitoring systems fail to adequately scale to exponentially increasing streams of performance data stemming from clouds, services, networks, the Internet of Things, and disparate systems. While using multiple performance monitoring tools may work for some organizations, Wells Fargo needed the ability to analyze data without instituting a unified data format or routing data into siloed solutions.
“Engagement is key,” says Chho. “We can’t just build the technology and expect people to adopt it. We take an approach where it’s a partnership across groups, to understand the use cases and where there’s value.” To reach the “inflection point” where observability usage becomes self-serve, alert management is a key feature to reduce friction and increase adoption. “How can we get our developers as productive as quickly as possible?” is something Chho says he thinks about constantly when managing Wells Fargo’s observability solution. “I basically want to get out of the way and deliver [these services] in an automated fashion.”
Testing is one way organizations act proactively, often by exercising user journeys such as product search, product checkout, or even basic login. Observability systems can help organizations identify the source of problems, such as underperforming B2B SaaS-delivered applications (credit checks, for example). Identifying issues in advance, and receiving instant notifications, helps prevent poor user experiences from escalating into revenue shortfalls.
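A user-journey test of this kind can be sketched in a few lines of Python. This is an illustrative outline only: the step functions `login` and `checkout` are hypothetical stand-ins for real requests, and the two-second budget is an assumed threshold, not a recommendation from the organizations quoted here.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float


def run_journey(steps: List[Tuple[str, Callable[[], None]]],
                budget_seconds: float) -> List[StepResult]:
    """Run each step, recording whether it succeeded within the time budget."""
    results = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            succeeded = True
        except Exception:
            succeeded = False
        elapsed = time.perf_counter() - start
        # A step passes only if it did not raise and stayed within budget.
        results.append(StepResult(name, succeeded and elapsed <= budget_seconds, elapsed))
    return results


# Hypothetical stand-ins for real journey steps.
def login() -> None:
    pass  # imagine an HTTP call to the login endpoint here


def checkout() -> None:
    # Simulates a failing dependency such as a slow credit check service.
    raise RuntimeError("credit check timed out")


results = run_journey([("login", login), ("checkout", checkout)], budget_seconds=2.0)
failed = [r.name for r in results if not r.ok]
```

In practice the list of failed steps would feed a notification channel, so the team hears about a broken checkout from the test rather than from customers.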
Ops teams can also become proactive by trending performance data over time. Automating a process such as monitoring CRM application performance can save Ops teams from manually reviewing dashboard reports. If an application fails to meet service-level objectives, observability can help Ops teams quickly identify the source of the problem.
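As a rough illustration of automating that review, the Python sketch below checks daily latency samples against a service-level objective instead of relying on someone reading a dashboard. The 300 ms threshold, the 75% target, and the sample data are all hypothetical values chosen for the example.

```python
from typing import Dict, List, Sequence


def slo_compliance(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Fraction of requests that finished within the latency threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for value in latencies_ms if value <= threshold_ms)
    return within / len(latencies_ms)


def days_in_breach(daily_latencies: Dict[str, Sequence[float]],
                   threshold_ms: float,
                   target: float) -> List[str]:
    """Return the days whose compliance fell below the SLO target."""
    return [
        day
        for day, samples in daily_latencies.items()
        if slo_compliance(samples, threshold_ms) < target
    ]


# Hypothetical CRM response times in milliseconds, invented for this example.
crm_latencies = {
    "mon": [120, 180, 210, 190],
    "tue": [150, 320, 400, 200],
    "wed": [250, 310, 280, 290],
}

breaches = days_in_breach(crm_latencies, threshold_ms=300, target=0.75)
```

With this data only Tuesday falls below the target, since half of its requests exceed the threshold, while Wednesday sits exactly at 75% and passes.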
Speed time to resolution
In another example, for Jaguar Land Rover, building an observability platform meant taking critical Product Lifecycle Management data and creating alerts to keep its product line humming like a finely tuned classic Jaguar E-type sports car.
“One key thread in the performance of our vehicles and production lines is the quality of the data that is available to our leadership teams and engineers,” says Andy Walker, Senior Project Manager at Jaguar Land Rover. “There’s a lot of it, and it needs to be accurate, complete, and available in an instant.”
Jaguar Land Rover will deploy Elastic to report on the efficiency and utilization of manufacturing and technology assets, including licensed tools worth hundreds of millions of dollars, infrastructure including data storage, and manufacturing equipment. The system will send alerts proactively when data anomalies are detected.
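The article does not describe how these anomalies are detected, so the following Python sketch uses a generic rolling z-score as a stand-in technique, not Elastic's or Jaguar Land Rover's actual method. The window size, the threshold, and the sample utilization readings are assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(values: Sequence[float],
                   window: int = 5,
                   z_threshold: float = 3.0) -> List[int]:
    """Return indexes whose value deviates sharply from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: treat any change at all as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical utilization readings with one obvious spike at index 6.
readings = [100, 102, 98, 101, 99, 100, 250, 101]
anomalous_indexes = find_anomalies(readings)
```

One limitation of this simple approach is that a spike enters the baseline for the readings that follow it, which temporarily makes the detector less sensitive. Production systems typically use more robust baselines for that reason.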
Achieving a proactive solution requires a real-time understanding of what’s happening in your system.
“The difference between monitoring and observability is the difference between a blood pressure monitor and a wearable device connected to cloud analytics,” says Elastic’s Krishnan. “It’s the real-time analytics that can be used to ask questions of large volumes of data. For an organization, that observability solution has to be able to scale economically with your growth.”