Automate root cause analysis for an observability alert

This guide walks through building an observability workflow that responds to an alert by running an Elastic Agent Builder agent for root-cause analysis, generating a case title and description from the agent's output, opening the case, and attaching both the alert and the agent's reasoning trace as comments.

The workflow is adapted from root-cause-analysis-rca-workflow.yaml in the elastic/workflows library.

If you're new to workflows, complete Build your first workflow first.

Before you begin

Permissions. The All privilege on Analytics > Workflows and Observability > Cases, plus the Agent Builder privilege required to invoke agents in your space. Refer to Kibana privileges.
Alerting rule. A configured observability alerting rule that fires on the conditions you want to auto-investigate (metric thresholds, SLO burn rate, anomaly detection, or custom query).
SRE agent. An Elastic Agent Builder agent configured to investigate observability signals. The source workflow uses an agent named sre-agent. Substitute your agent ID.
Attach the workflow to the rule. After saving the workflow, attach it as an action on the alerting rule. Refer to Alert triggers.

How it works

The workflow runs in a single pass when an alert fires:

An alert trigger starts the workflow with the alert payload at event.
An ai.agent step runs an initial analysis of the alert and returns the agent's conversation ID so follow-up steps can continue the same conversation.
Two more ai.agent calls reuse the conversation to generate a case title and a case description.
A cases.createCase step opens the case with the agent-generated title and description.
cases.addAlerts attaches the triggering alert.
A kibana.request step fetches the agent's conversation transcript.
cases.addComment steps append the agent's reasoning trace and the raw analysis as comments for auditability.

Build the workflow

Trigger on observability alerts
```
triggers:
  - type: alert
```
Attach the workflow to the alerting rule you want to investigate.

Run initial RCA

Call the agent with the alert payload. Keep create-conversation: true so follow-up steps can continue the conversation and the agent has context when generating the title and description:

```
steps:
  - name: rca_analysis
    type: ai.agent
    agent-id: "{{ consts.agent_id }}"
    connector-id: "{{ consts.connector_id }}"
    create-conversation: true
    with:
      prompt: |
        Investigate the following alert and propose root causes.
        Keep your analysis and data exploration brief to preserve context.

        <alert>
        {{ event | json }}
        </alert>
```

The agent's response is at steps.rca_analysis.output.message, and the conversation ID is at steps.rca_analysis.output.conversation_id.
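
The snippets in this guide reference consts.agent_id and consts.connector_id. As a minimal sketch, both are defined once at the top level of the workflow, as in the complete workflow later in this guide; the connector ID is a placeholder to replace with your own. The debug step is an assumption added for illustration: it assumes a console step type that accepts a message parameter, which you could use while testing to confirm the two output paths and then remove.

```yaml
consts:
  agent_id: "sre-agent"               # replace with your agent ID
  connector_id: "your-connector-id"   # placeholder; replace with your connector ID

steps:
  # ... rca_analysis step as shown above ...

  # Hypothetical debug step (assumes a `console` step type with a `message` parameter).
  - name: log_rca_output
    type: console
    with:
      message: |
        conversation: {{ steps.rca_analysis.output.conversation_id }}
        analysis: {{ steps.rca_analysis.output.message }}
```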

Generate a case title and description

Reuse the conversation (so the agent remembers its analysis) and ask it for a title and description:

```
- name: case_title
  type: ai.agent
  agent-id: "{{ consts.agent_id }}"
  connector-id: "{{ consts.connector_id }}"
  conversation-id: "{{ steps.rca_analysis.output.conversation_id }}"
  with:
    prompt: "Based on your analysis, produce a clear case title. Output only the title."

- name: case_description
  type: ai.agent
  agent-id: "{{ consts.agent_id }}"
  connector-id: "{{ consts.connector_id }}"
  conversation-id: "{{ steps.rca_analysis.output.conversation_id }}"
  with:
    prompt: "Based on your analysis, produce a clear case description. Output only the description."
```

Reusing the conversation ID means you don't have to repeat the alert payload in each prompt, and it keeps the title and description consistent with the earlier analysis.

Open the case

Create the case with the agent-generated title and description:

```
- name: create_case
  type: cases.createCase
  with:
    title: "{{ steps.case_title.output.message }}"
    description: "{{ steps.case_description.output.message }}"
    owner: "observability"
    severity: "medium"
    tags: ["auto-rca", "ai-generated"]
```

The owner field sets which solution the case belongs to; use observability for observability cases.

Attach the alert

```
- name: attach_alert
  type: cases.addAlerts
  with:
    case_id: "{{ steps.create_case.output.id }}"
    alerts:
      - alertId: "{{ event.alerts[0]._id }}"
        index: "{{ event.alerts[0]._index }}"
        rule:
          id: "{{ event.rule.id }}"
          name: "{{ event.rule.name }}"
```

Attach the agent's analysis and reasoning

Append the raw analysis as one comment and the reasoning trace as another. Fetch the reasoning trace with a kibana.request against the Agent Builder conversations API:

```
- name: add_analysis
  type: cases.addComment
  with:
    case_id: "{{ steps.create_case.output.id }}"
    comment: "{{ steps.rca_analysis.output.message }}"

- name: get_conversation
  type: kibana.request
  with:
    method: GET
    path: /api/agent_builder/conversations/{{ steps.rca_analysis.output.conversation_id }}

- name: add_reasoning
  type: cases.addComment
  with:
    case_id: "{{ steps.create_case.output.id }}"
    comment: |
      ## AI investigation summary

      [View full conversation]({{ kibanaUrl }}/app/agent_builder/conversations/{{ steps.rca_analysis.output.conversation_id }})

      {%- for round in steps.get_conversation.output.rounds %}
      {%- for step in round.steps %}
      {%- if step.type == "reasoning" %}
      - **Reasoning:** {{ step.reasoning }}
      {%- elsif step.type == "tool_call" %}
      - **Action:** `{{ step.tool_id }}`
      {%- endif %}
      {%- endfor %}
      {%- endfor %}
```

The Liquid loop walks the conversation's rounds and formats each reasoning step and tool call as a bullet. The comment becomes an auditable record of how the agent reached its conclusion.
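
For reference, the loop assumes the get_conversation output is shaped roughly as follows. This is an illustrative sketch inferred only from the fields the template reads (rounds, steps, type, reasoning, tool_id), written as YAML for readability. The values are invented, and the actual API response likely contains more fields.

```yaml
# Illustrative shape only; values are made up and other fields are omitted.
rounds:
  - steps:
      - type: reasoning
        reasoning: "Error rate rose shortly after the latest deployment."
      - type: tool_call
        tool_id: "example.search_tool"   # hypothetical tool ID
```

Given input like this, the template would emit one Reasoning bullet and one Action bullet. Steps of any other type are skipped because neither branch of the if matches.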

Complete workflow

```
name: observability--root-cause-analysis
description: Investigate an observability alert with an AI agent, then open a case populated with the analysis and reasoning trace.
enabled: true
tags: ["rca", "ai", "observability"]

triggers:
  - type: alert

consts:
  agent_id: "sre-agent"
  connector_id: "your-connector-id"

steps:
  - name: rca_analysis
    type: ai.agent
    agent-id: "{{ consts.agent_id }}"
    connector-id: "{{ consts.connector_id }}"
    create-conversation: true
    with:
      prompt: |
        Investigate the following alert and propose root causes.
        Keep your analysis and data exploration brief.

        <alert>
        {{ event | json }}
        </alert>

  - name: case_title
    type: ai.agent
    agent-id: "{{ consts.agent_id }}"
    connector-id: "{{ consts.connector_id }}"
    conversation-id: "{{ steps.rca_analysis.output.conversation_id }}"
    with:
      prompt: "Based on your analysis, produce a clear case title. Output only the title."

  - name: case_description
    type: ai.agent
    agent-id: "{{ consts.agent_id }}"
    connector-id: "{{ consts.connector_id }}"
    conversation-id: "{{ steps.rca_analysis.output.conversation_id }}"
    with:
      prompt: "Based on your analysis, produce a clear case description. Output only the description."

  - name: create_case
    type: cases.createCase
    with:
      title: "{{ steps.case_title.output.message }}"
      description: "{{ steps.case_description.output.message }}"
      owner: "observability"
      severity: "medium"
      tags: ["auto-rca", "ai-generated"]

  - name: attach_alert
    type: cases.addAlerts
    with:
      case_id: "{{ steps.create_case.output.id }}"
      alerts:
        - alertId: "{{ event.alerts[0]._id }}"
          index: "{{ event.alerts[0]._index }}"
          rule:
            id: "{{ event.rule.id }}"
            name: "{{ event.rule.name }}"

  - name: add_analysis
    type: cases.addComment
    with:
      case_id: "{{ steps.create_case.output.id }}"
      comment: "{{ steps.rca_analysis.output.message }}"

  - name: get_conversation
    type: kibana.request
    with:
      method: GET
      path: /api/agent_builder/conversations/{{ steps.rca_analysis.output.conversation_id }}

  - name: add_reasoning
    type: cases.addComment
    with:
      case_id: "{{ steps.create_case.output.id }}"
      comment: |
        ## AI investigation summary

        [View full conversation]({{ kibanaUrl }}/app/agent_builder/conversations/{{ steps.rca_analysis.output.conversation_id }})

        {%- for round in steps.get_conversation.output.rounds %}
        {%- for step in round.steps %}
        {%- if step.type == "reasoning" %}
        - **Reasoning:** {{ step.reasoning }}
        {%- elsif step.type == "tool_call" %}
        - **Action:** `{{ step.tool_id }}`
        {%- endif %}
        {%- endfor %}
        {%- endfor %}
```

Extend this workflow

Route by service. Use a switch step on event.alerts[0].service.name to pick different agents for different services (a database-focused agent for DB alerts, a frontend-focused agent for RUM alerts, and so on).
Summarize before paging. Add an ai.summarize step that turns the analysis into a one-liner, then post it to the on-call Slack channel.
Gate destructive remediation. If you want the workflow to trigger remediation, add an if step that only runs when the agent's confidence is high, and invoke a child workflow that handles the remediation in isolation.
Correlate across signals. Add elasticsearch.esql.query steps before the agent call to pull metric and log context in the alert's time window, and feed them into the agent's prompt.
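
As one possible sketch of the "Correlate across signals" idea, the fragment below runs an ES|QL query before the agent call and passes the result into the prompt. Several details are assumptions rather than documented behavior: the step name recent_errors, the logs-* index pattern, the log.level and service.name fields, the fixed 15-minute lookback (used here in place of the alert's actual time window), the query parameter name, and the steps.recent_errors.output path. Check the step reference before relying on any of them.

```yaml
# Sketch only: index pattern, field names, parameter name, and output path are assumptions.
- name: recent_errors
  type: elasticsearch.esql.query
  with:
    query: |
      FROM logs-*
      | WHERE @timestamp > NOW() - 15 minutes AND log.level == "error"
      | STATS error_count = COUNT(*) BY service.name
      | SORT error_count DESC
      | LIMIT 10

# The rca_analysis step from this guide, with the query result added to the prompt.
- name: rca_analysis
  type: ai.agent
  agent-id: "{{ consts.agent_id }}"
  connector-id: "{{ consts.connector_id }}"
  create-conversation: true
  with:
    prompt: |
      Investigate the following alert and propose root causes.
      Keep your analysis and data exploration brief to preserve context.

      <alert>
      {{ event | json }}
      </alert>

      <recent_errors>
      {{ steps.recent_errors.output | json }}
      </recent_errors>
```

Keeping the query result small (here, ten rows of aggregated counts) follows the same reasoning as the prompt's instruction to keep exploration brief: less context is consumed before the agent starts its own investigation.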

Related pages

Observability workflows: The outcome this workflow supports.
AI steps reference: Parameters for ai.agent and related AI steps.
Elastic Agent Builder for Observability: How Agent Builder integrates with observability workflows.
Cases action steps: Full reference for cases.* steps.
elastic/workflows examples folder: More end-to-end examples.