The "Build-Push-Deploy" cycle is never simple. High-availability environments require automated guardrails: proactive checks that prevent a deployment from even starting if the target cluster is under stress. Today these checks are generally performed with APIs and scripts during the CI/CD process. Different gates are initiated along the way to ensure the application tests have passed, the artifact is clean, the infrastructure is stable, and more.
With AI and agents, these gates are becoming more sophisticated. Increasingly, they use a Model Context Protocol (MCP) server to perform the check. This is a newer, more cutting-edge "agentic" approach. It allows your CI/CD pipeline to act as an intelligent agent that "asks" your cluster for its health status before making a change.
A standard Kubernetes deployment workflow generally follows these high-level steps:
- Verification Gate: Ensuring all automated testing has passed.
- Artifact Creation: Building the Docker container image.
- Environment Gate: Verifying that the production Kubernetes environment, supporting infrastructure, and existing applications are healthy.
- Kubernetes Deployment: Triggering the final release. Modern workflows often use GitOps tools like ArgoCD or Flux, where a simple image tag update in Docker Hub automatically synchronizes the cluster.
Kubernetes health checks can range from simple to complex depending on your Service Level Objectives (SLOs) and operational maturity. Typically, the primary goal is to ensure the cluster is healthy and not nearing a resource bottleneck. Common "red flag" metrics used in these gates include:
| Red Flag | Scenario | SRE Meaning |
| --- | --- | --- |
| Pod Count > 90% | High pod density | Approaching node-level scheduling limits. |
| CPU Usage > 70% | High real-time load | Risk of CPU throttling during deployment. |
| Memory Usage > 80% | Memory pressure | High risk of Out-of-Memory (OOM) kills. |
| OOM Terminating Processes | Resource limits reached | Inadequate pod configuration or sizing. |
| Available vs. Requested | Capacity imbalance | Risk of deployment failure due to insufficient reserved space. |
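To make the gate logic concrete, here is a minimal Python sketch of how the "red flag" thresholds from the table above could be evaluated in a pre-deployment check. The metric names, values, and threshold dictionary are illustrative assumptions, not part of any Elastic API; in practice these numbers would come from your observability backend.

```python
# Illustrative sketch: evaluating the "red flag" thresholds from the table
# above in a pre-deployment gate. Metric names and sample values are
# hypothetical; real values would come from your observability backend.

THRESHOLDS = {
    "pod_count_pct": 90.0,    # Pod Count > 90%
    "cpu_usage_pct": 70.0,    # CPU Usage > 70%
    "memory_usage_pct": 80.0, # Memory Usage > 80%
}

def find_red_flags(node_metrics: dict) -> list:
    """Return a list of threshold violations for one node."""
    flags = []
    for metric, limit in THRESHOLDS.items():
        value = node_metrics.get(metric)
        if value is not None and value > limit:
            flags.append(f"{metric} = {value:.1f}% exceeds {limit:.0f}%")
    return flags

# Example: a node under memory pressure trips the 80% red flag.
node = {"pod_count_pct": 42.0, "cpu_usage_pct": 55.0, "memory_usage_pct": 86.5}
violations = find_red_flags(node)
print(violations)  # one violation: memory usage above its threshold
```

A gate script would fail the pipeline (exit non-zero) when this list is non-empty.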
I will show you how to build a CI/CD pipeline that integrates Observability AI Agents with GitHub Actions via a Model Context Protocol (MCP) server, creating automated pre-deployment health checks for Kubernetes clusters.
By introducing an observability checkpoint before deployment, we transform the pipeline into an intelligent system that:
- Queries real-time metrics from Kubernetes clusters
- Analyzes capacity using custom ES|QL queries
- Makes autonomous decisions about deployment readiness
- Prevents failures proactively rather than reacting to them
- Provides actionable feedback to engineering teams
Here is the architecture of what is being deployed in this blog and how it works.
As you can see, the flow uses Elastic Observability, which stores and analyzes Kubernetes OpenTelemetry metrics from the opentelemetry-kube-stack-cluster-stats-collector (deployed via the OpenTelemetry Operator).
GitHub Actions calls the Observability Kubernetes Agent via the Elastic MCP server. The agent has tools that check for some of the "red flag" issues identified in the table above.
Based on the results, GitHub Actions either stops the process or continues to deploy the artifact by triggering ArgoCD.
The Observability Kubernetes Agent, along with some of the tools it uses, was built using Elastic's Agent Builder capability. These are then exposed via the MCP server.
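The gate step itself can be a short script. Below is a hedged Python sketch of the idea: send a prompt to the agent through the MCP server and block the pipeline (exit non-zero) if the reply indicates a resource problem. The tool name, response shape, and the simple keyword heuristic in `should_block` are all assumptions for illustration; adapt them to your Elastic project.

```python
# Hypothetical sketch of the health-gate step a GitHub Actions job could run.
# Tool name, response shape, and the keyword heuristic are assumptions.
import json
import os
import urllib.request

def should_block(agent_answer: str, threshold_pct: int = 25) -> bool:
    """Naive gate decision based on the agent's natural-language reply."""
    text = agent_answer.lower()
    return text.startswith("yes") or f"more than {threshold_pct}%" in text

def ask_agent(prompt: str) -> str:
    """Send one JSON-RPC tools/call request to the MCP server (assumed shape)."""
    payload = {
        "jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "kubernetes_analysis_agent",  # hypothetical name
                   "arguments": {"prompt": prompt}},
    }
    req = urllib.request.Request(
        os.environ["ELASTIC_MCP_URL"],  # your project's MCP endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"ApiKey {os.environ['ELASTIC_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]["content"][0]["text"]

# Demo of the decision logic on a canned reply (no network call):
reply = 'Yes, your cluster "otel-test" has nodes using more than 25% of resources.'
print(should_block(reply))  # True -> the job would exit 1 and block the deploy
```

In the workflow, a non-zero exit from this step fails the job, which in turn skips the downstream ArgoCD deploy step.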
Hence the overall set of components used here includes:
- GitHub Actions: Orchestrates the build and deployment workflow
- Elastic MCP Server: Serverless endpoint that exposes AI agents
- Observability Kubernetes Agent: Custom agent with specialized ES|QL tools
- Kubernetes Cluster: Target deployment environment with metrics collection
- ES|QL Query Tools: Precision queries for node and pod resource analysis
What Happens When a Kubernetes Health Check Fails in GitHub Actions?
How the Pipeline Blocks a Deployment Automatically
When the cluster exceeds capacity thresholds, the workflow automatically blocks the deployment. In this scenario I didn't load the cluster; instead, I used a simple check of whether more than 25% of resources were in use to purposely stop the deployment.
The workflow shows:
- Build Docker Image (28s)
- Push to Docker Hub (5s)
- K8s Health via Elastic O11y K8s Agent (16s) - FAILED
- Deploy to otel-test Cluster - BLOCKED
Annotation: "Cluster has resource issues - blocking deployment"
What Does the AI Agent's Health Check Response Look Like?
The agent provides detailed analysis:
```text
Step 1: Finding Kubernetes analysis agent...
Found agent: Observability Kubernetes Agent (kubernetes_analysis_agent)
Step 2: Querying cluster health...
Prompt: tell me if my cluster otel-test is using more than 25% memory or CPU on any of its nodes
Agent Response:
================================================================
Yes, your cluster "otel-test" has nodes and pods using more than 25% of resources.

Node exceeding 25%:
- ip-192-168-165-175.us-west-2.compute.internal
  - Memory: 36.44%
  - CPU: 7.99% (below threshold)

All other nodes are below the 25% threshold for both CPU and memory.

While the query for pods doesn't show percentage values directly, the data indicates
normal resource usage patterns for the pods in your cluster, with none appearing to
consume excessive resources relative to their allocations.
================================================================
Cluster has resource issues - blocking deployment
Error: Process completed with exit code 1.
```
As you can see, a prompt was simply sent to the Observability Kubernetes Agent via MCP, rather than having to build custom logic or call yet another script.
This single check prevented:
- A deployment that would have failed
- Wasted CI/CD minutes
- Potential service degradation
- Manual SRE intervention
What it provided:
- Actionable intelligence for capacity planning
How to Build a Kubernetes Health Check Agent in Elastic
Building the agent isn't hard; Elastic's Agent Builder UI makes it easy to create the agent and have it running in minutes.
How to Configure the Observability Kubernetes Agent
Other than naming the agent, you need to provide it with some instructions.
Custom Instructions:
```text
# Agent Instructions

## Primary Role
You are a Kubernetes monitoring assistant that helps users analyze cluster performance
and resource utilization. Your primary goal is to provide clear, accurate information
about Kubernetes clusters using available data sources.

## Tool Selection Guidelines
1. When users ask about Kubernetes metrics, node performance, or cluster health:
   - Use ESQL tools for detailed analysis
   - Query metrics from kubeletstatsreceiver.otel-default
2. For alert-related queries:
   - Use the alerts tool to check active alerts
3. Always provide context about:
   - Time ranges queried
   - Cluster names
   - Resource thresholds
```
How to Write ES|QL Queries for Kubernetes Node and Pod Metrics
I created several tools that check node CPU and memory, pod CPU and memory, and OOM events from pods. Additionally, the Observability Kubernetes Agent uses many of the out-of-the-box (OOTB) tools, such as observability_alerts, as part of its abilities.
Here is an example of the node CPU and memory tool, which uses a simple ES|QL query against OpenTelemetry metrics to check the CPU and memory utilization in the cluster.
ES|QL Query:
```esql
FROM metrics-kubeletstatsreceiver.otel-default
| WHERE resource.attributes.k8s.cluster.name == ?cluster_name
    AND @timestamp > NOW() - 3 hours
| STATS
    avg_cpu_usage = AVG(metrics.k8s.node.cpu.usage),
    avg_memory_usage = AVG(metrics.k8s.node.memory.usage),
    avg_memory_available = AVG(metrics.k8s.node.memory.available),
    avg_memory_working_set = AVG(metrics.k8s.node.memory.working_set)
  BY resource.attributes.k8s.node.name
| EVAL
    cpu_usage_pct = avg_cpu_usage * 100,
    memory_usage_pct = (avg_memory_working_set / (avg_memory_working_set + avg_memory_available)) * 100
| SORT cpu_usage_pct DESC, memory_usage_pct DESC
| KEEP resource.attributes.k8s.node.name, cpu_usage_pct, memory_usage_pct
| LIMIT 100
```
Parameters:
- cluster_name (string): Name of the K8s cluster to analyze
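Outside of Agent Builder, the same parameterized ES|QL query can be run directly against Elasticsearch's ES|QL `_query` endpoint. The sketch below shows how the `?cluster_name` named parameter is supplied in the request body; the URL and API key are placeholders, and the trimmed query text is only a subset of the tool above.

```python
# Sketch: running a parameterized ES|QL query via Elasticsearch's _query API.
# URL and API key are placeholders; the query text is trimmed for brevity.
import json
import urllib.request

ESQL = """
FROM metrics-kubeletstatsreceiver.otel-default
| WHERE resource.attributes.k8s.cluster.name == ?cluster_name
    AND @timestamp > NOW() - 3 hours
| STATS avg_cpu_usage = AVG(metrics.k8s.node.cpu.usage)
    BY resource.attributes.k8s.node.name
| LIMIT 100
"""

def build_query_body(cluster: str) -> dict:
    """Request body for POST /_query: named parameters go in the params list."""
    return {"query": ESQL, "params": [{"cluster_name": cluster}]}

def run_esql(es_url: str, api_key: str, cluster: str) -> list:
    """Execute the query and return the result rows."""
    req = urllib.request.Request(
        f"{es_url}/_query",
        data=json.dumps(build_query_body(cluster)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"ApiKey {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["values"]  # rows: [avg_cpu_usage, node_name]
```

This is handy for testing a tool's query against real data before wiring it into the agent.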
How to Expose the Agent via the Elastic MCP Server
Once configured, the agent is automatically available via Elastic's MCP server running in your Observability project. The MCP server provides a standardized interface that any MCP-compatible client can query.
MCP Endpoint:
Authentication: Uses Elastic API keys for secure access
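As a quick check that the endpoint is reachable, any MCP-compatible client can list the tools the server exposes. The sketch below builds a JSON-RPC `tools/list` request (the method name comes from the MCP specification); the endpoint URL and API key are placeholders for your project's values.

```python
# Sketch: discovering the tools exposed by the Elastic MCP server.
# Endpoint and key are placeholders; "tools/list" is from the MCP spec.
import json
import urllib.request

def build_tools_list_request() -> dict:
    """JSON-RPC envelope for the MCP tools/list method."""
    return {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

def list_tools(mcp_url: str, api_key: str) -> list:
    """POST the request with ApiKey auth and return the tool descriptors."""
    req = urllib.request.Request(
        mcp_url,
        data=json.dumps(build_tools_list_request()).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"ApiKey {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]["tools"]
```

If the Observability Kubernetes Agent is configured correctly, it should appear in the returned tool list.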
Why Agentic CI/CD Matters for Kubernetes Operations
Agentic CI/CD represents an evolution in proactive deployment strategies. By integrating Elastic Observability AI agents with GitHub Actions via MCP, we've created a system that:
- Prevents failures before they happen
- Provides real-time cluster health insights
- Makes data-driven deployment decisions
- Reduces operational burden on SRE teams
- Improves overall deployment reliability
This approach is at the cutting edge of modern CI/CD practices. While traditional pipelines focus solely on the "Build-Push-Deploy" cycle, agentic pipelines introduce automated pre-deployment guardrails using observability data, transforming your CI/CD infrastructure into an intelligent agent that actively protects production environments.
Resources and Next Steps
Sign up for Elastic Cloud Serverless and try this out with your pipeline.
