Elastic Observability: Driving mean time to resolution to zero


At ElasticON Global 2021, Tanya Bragin, VP Product, Observability, and the Elastic Observability team showed how ongoing innovations continue to deliver actionable insights and faster root cause detection, reducing mean time to resolution (MTTR).

The adoption of cloud, microservices, and ephemeral infrastructure is driving increased complexity, which calls for an observability solution that provides end-to-end visibility. Elastic Observability, as recognized by Gartner, GigaOm, and EMA, continues to build a comprehensive solution by delivering functionality in the following areas:

  • Unified agent for ingesting all telemetry data with centralized management
  • Integration with cloud native technologies (e.g., Kubernetes)
  • Native integrations with major cloud providers including Amazon Web Services, Microsoft Azure, and Google Cloud Platform
  • Automated root cause analysis in application performance monitoring (APM) leveraging machine learning
  • Enhanced APM troubleshooting workflows integrating logs, third-party dependencies, and backend services
  • Intuitive service maps for contextual troubleshooting
  • Support for OpenTelemetry (OTel)
  • Synthetics and real user monitoring (RUM) enhancements

Our commitment to open source communities means that Elastic Observability will always be an open and extensible platform. We are committed to adopting and contributing to open standards and open source initiatives. The goal? To give customers a comprehensive observability platform that maximizes user flexibility and minimizes vendor lock-in.

Elastic Observability - Overview and components

 

Making data relevant, contextual, and actionable

Operations and development teams are often confronted with siloed tools for metrics, logs, and traces. Even within a single tool, the data often sits in silos, lacking context or relevant metadata (dimensionality), which increases both mean time to detection (MTTD) and mean time to resolution (MTTR). Elastic Observability scales to large volumes of data with high dimensionality and cardinality, with few or no performance or cost surprises.

Frictionless onboarding via Elastic Agent and centralized management allows for simplified collection of all telemetry data, including cloud-native technologies such as Kubernetes. We’ve also added integrations with Microsoft Azure and Google Cloud Platform to natively ingest telemetry data, with additional integrations coming.

Elastic Observability - Interface

 

Context is required for efficient and quick troubleshooting of incidents. Elastic APM service maps visualize application topology and accelerate troubleshooting by showing the status of services, detected anomalies, and logs in the context of transactions. They also allow you to compare service performance against any historical baseline, making it easy to detect misbehaving services. Our recent support for performance views into third-party service dependencies eliminates blind spots in your environment. We are further expanding our APM capabilities with support for an iOS mobile agent, now in technical preview.

Elastic Observability - Application Performance Monitoring (APM) Interface

 

Our next step on the journey to connect the dots is to deliver context between your application and infrastructure. Application performance often degrades because of performance issues in the underlying infrastructure. We will be delivering the ability to view infrastructure performance in the context of application performance and related logs, delivering unified observability. We’ve also had requests for the ability to compare service performance across versions, cloud regions, availability zones, and other metadata. This future capability would help compare performance between A/B or canary deployments and allow for quick troubleshooting of deployment issues.

Elastic Observability - APM Services UI
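
To make the idea of comparing service performance across versions more concrete, here is a minimal sketch of what such a comparison computes. It is not Elastic's implementation or API: the record layout, the field names `service.version` and `duration_ms`, and the helper names `p95` and `compare_by_dimension` are all assumptions chosen for illustration, and the sample durations are made up.

```python
import math
from collections import defaultdict
from statistics import median


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]


def compare_by_dimension(transactions, dimension):
    """Group transaction durations by one metadata field and summarize each group.

    `transactions` is a list of dicts; `dimension` is the (hypothetical)
    metadata key to group on, such as a version or cloud region.
    """
    groups = defaultdict(list)
    for tx in transactions:
        groups[tx[dimension]].append(tx["duration_ms"])
    return {
        key: {
            "count": len(durations),
            "median_ms": median(durations),
            "p95_ms": p95(durations),
        }
        for key, durations in groups.items()
    }


# Illustrative, made-up data: a stable release next to a canary release.
sample = [
    {"service.version": "1.0", "duration_ms": d} for d in (100, 110, 120, 130)
] + [
    {"service.version": "1.1-canary", "duration_ms": d} for d in (100, 150, 300, 400)
]

summary = compare_by_dimension(sample, "service.version")
for version, stats in sorted(summary.items()):
    print(version, stats)
```

The same function works for any other metadata field, such as a cloud region or availability zone, by passing a different `dimension` key. A canary whose median or 95th percentile latency is clearly higher than the stable version's would be a candidate for rollback or closer investigation.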