KAOS Monkey: Kubernetes Chaos Agent
📓 Try it yourself! This example is available as an executable Jupyter notebook.
This example demonstrates building a "chaos monkey" style agent that can interact with your Kubernetes cluster using the Kubernetes MCP Server. The agent uses MCP tools to execute operations, controlled by deterministic mock responses.
Architecture
WARNING
This example demonstrates powerful capabilities: the agent can delete pods in your cluster. Use with caution in production environments.
Prerequisites
- KAOS operator installed (Installation Guide)
- kaos-cli installed
- Access to a Kubernetes cluster
Overview
We'll create an agent that can:
- List pods in a namespace using the pods_list tool
- Delete specific pods using the pods_delete tool
- Return results of operations
The agent uses mock responses for deterministic behavior - this means we control exactly what the LLM "decides" to do, making the example reproducible and testable.
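To make the idea of scripted responses concrete, here is a minimal sketch of a model stand-in that replays canned responses in order. The `ScriptedModel` class and its `complete` method are hypothetical names used for illustration and are not part of KAOS. Raising an error once the script runs out is also an assumption, since this example does not say what KAOS does in that case.

```python
from collections import deque


class ScriptedModel:
    """Hypothetical stand-in for an LLM that replays canned responses in order."""

    def __init__(self, responses):
        self._queue = deque(responses)

    def complete(self, prompt: str) -> str:
        # The prompt is ignored on purpose: the output depends only on the script.
        if not self._queue:
            # Assumption: fail loudly when the script is exhausted.
            raise RuntimeError("no mock responses left")
        return self._queue.popleft()


model = ScriptedModel([
    "I will list the pods first.",
    "Found chaos-victim pod. Deleting it now.",
    "Done!",
])

# Three calls consume the three scripted responses in order.
replies = [model.complete("Cause some chaos") for _ in range(3)]
print(replies)
```

Because the replies depend only on the script and not on the prompt, every run takes the same path, which is what makes the walkthrough reproducible.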
Setup
First, let's set up the environment and create a unique namespace for this example:
import os, time
# Set namespace as environment variable for shell commands
ns = os.environ.get("TEST_NAMESPACE", f"kaos-monkey-{int(time.time()) % 10000}")
os.environ["NS"] = ns
print(f"Using namespace: {ns}")
!kubectl create namespace $NS --dry-run=client -o yaml | kubectl apply -f -
Step 1: Create a ModelAPI
Create a ModelAPI in Proxy mode (we'll use mock responses, so no real LLM is needed):
!kaos modelapi deploy chaos-api -n $NS --mode Proxy
Wait for ModelAPI to be ready:
!kubectl wait deployment/modelapi-chaos-api -n $NS --for=condition=available --timeout=120s
Step 2: Set Up RBAC for Kubernetes MCP Server
The Kubernetes MCP server needs permissions to interact with the Kubernetes API:
!kaos system create-rbac k8s-mcp -n $NS --resources pods --verbs list,get,delete
Step 3: Create the Kubernetes MCP Server
Deploy the Kubernetes MCP server using the built-in kubernetes runtime with the service account we just created:
!kaos mcp deploy k8s-tools -n $NS --runtime kubernetes --sa k8s-mcp
Wait for MCP server to be ready:
!kubectl wait deployment/mcpserver-k8s-tools -n $NS --for=condition=available --timeout=120s
Step 4: Create a Test Pod
Create a simple test pod that our chaos agent can target:
!kubectl run chaos-victim -n $NS --image=nginx:alpine --restart=Never
Wait for pod to be running:
!kubectl wait pod/chaos-victim -n $NS --for=condition=ready --timeout=60s
Step 5: Create the Chaos Agent
Create the agent with mock responses. The --mock-response flag can be used multiple times - each response is consumed in sequence:
# Mock responses with namespace interpolation
mock1 = f'I will list the pods first.\n\n```tool_call\n{{"tool": "pods_list", "arguments": {{"namespace": "{ns}"}}}}\n```'
mock2 = f'Found chaos-victim pod. Deleting it now.\n\n```tool_call\n{{"tool": "pods_delete", "arguments": {{"namespace": "{ns}", "name": "chaos-victim"}}}}\n```'
mock3 = "Done! I have deleted the chaos-victim pod to simulate a failure scenario."
os.environ["MOCK1"], os.environ["MOCK2"], os.environ["MOCK3"] = mock1, mock2, mock3
!kaos agent deploy kaos-monkey -n $NS \
--modelapi chaos-api \
--model mock-model \
--mcp k8s-tools \
--instructions "You are KAOS Monkey, a chaos engineering agent." \
--mock-response "$MOCK1" \
--mock-response "$MOCK2" \
--mock-response "$MOCK3" \
--expose
Wait for agent to be ready:
!kubectl wait deployment/agent-kaos-monkey -n $NS --for=condition=available --timeout=120s
Step 6: Unleash the Chaos
Now invoke the chaos agent to delete the pod:
!kaos agent invoke kaos-monkey -n $NS --message "Cause some chaos by deleting a pod"
Step 7: Verify the Chaos
Check that the pod was deleted:
import time; time.sleep(2)
!kubectl get pod chaos-victim -n $NS 2>&1 || echo "SUCCESS: Pod was deleted by the chaos agent!"
Understanding Mock Responses
The mock responses include tool_call blocks that trigger real MCP tool execution - only the LLM reasoning is mocked.
This is useful for:
- Testing: Deterministic behavior in CI/CD
- Cost savings: No LLM API calls during development
- Reproducibility: Same inputs always produce same outputs
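The mock responses above embed each tool request as JSON inside a tool_call block. How KAOS itself parses these blocks is not shown in this example, so the following is only a sketch of one way such a block could be extracted. The `extract_tool_calls` helper and the regular expression are assumptions made for illustration, and the namespace value is a placeholder.

```python
import json
import re

# Built programmatically so this example nests cleanly inside a code block.
FENCE = "`" * 3
TOOL_CALL_RE = re.compile(
    re.escape(FENCE) + r"tool_call\s*\n(.*?)\n" + re.escape(FENCE),
    re.DOTALL,
)


def extract_tool_calls(text):
    """Return (tool, arguments) pairs for every tool_call block in text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        payload = json.loads(match.group(1))
        calls.append((payload["tool"], payload.get("arguments", {})))
    return calls


ns = "kaos-monkey-demo"  # hypothetical namespace, for illustration only
mock1 = (
    "I will list the pods first.\n\n"
    f'{FENCE}tool_call\n{{"tool": "pods_list", "arguments": {{"namespace": "{ns}"}}}}\n{FENCE}'
)
mock3 = "Done! I have deleted the chaos-victim pod to simulate a failure scenario."

print(extract_tool_calls(mock1))  # one pods_list call
print(extract_tool_calls(mock3))  # no tool calls: plain text ends the exchange
```

A response with a tool_call block leads to a tool invocation, while a response without one, like the third mock, is simply returned as the final answer.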
Kubernetes MCP Server Tools
The kubernetes runtime provides many useful tools:
- pods_list, pods_get, pods_delete, pods_log, pods_exec
- namespaces_list, resources_list, resources_create_or_update
- helm_install, helm_list, helm_uninstall
- And more! See the kubernetes-mcp-server documentation.
Cleanup
!kubectl delete namespace $NS --ignore-not-found
print(f"Cleaned up namespace: {os.environ['NS']}")
Next Steps
- Multi-Agent Telemetry - Add observability
- Gateway API - Secure your agent endpoints
- Agent CRD Reference - Full configuration options