Agent CRD
The Agent custom resource defines an AI agent deployment on Kubernetes.
Full Specification
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: my-agent
  namespace: my-namespace
spec:
  # Required: Reference to ModelAPI for LLM access
  modelAPI: my-modelapi
  # Optional: List of MCPServer references for tool access
  mcpServers:
    - echo-tools
    - calculator-tools
  # Optional: Wait for dependencies to be ready (default: true)
  waitForDependencies: true
  # Optional: Agent configuration
  config:
    # Human-readable description for humans and other agents for a2a delegation
    description: "My helpful agent that performs tasks X/Y"
    # System prompt instructions
    instructions: |
      You are a helpful assistant.
      Be concise and accurate.
    # Max reasoning loop iterations (1-20, default: 5)
    reasoningLoopMaxSteps: 5
    # Memory system configuration
    memory:
      enabled: true # Enable/disable memory (default: true)
      type: local # Memory type (only "local" supported)
      contextLimit: 6 # Messages for delegation context
      maxSessions: 1000 # Max sessions to keep
      maxSessionEvents: 500 # Max events per session
    # Additional environment variables
    env:
      - name: MODEL_NAME
        value: "ollama/smollm2:135m"
      - name: CUSTOM_VAR
        value: "custom-value"
  # Optional: Agent-to-Agent networking
  agentNetwork:
    # Create Service for A2A discovery (default: true)
    expose: true
    access: # Sub-agents this agent can delegate to
      - worker-1
      - worker-2
  # Optional: PodSpec override using strategic merge patch
  podSpec:
    containers:
      - name: agent
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
status:
  phase: Ready # Pending, Ready, Failed, Waiting
  ready: true
  endpoint: "http://agent-my-agent.my-namespace.svc.cluster.local:8000"
  linkedResources:
    modelAPI: my-modelapi
  message: "Deployment ready replicas: 1/1"
  deployment:
    replicas: 1
    readyReplicas: 1
    availableReplicas: 1
    updatedReplicas: 1
    conditions:
      - type: Available
        status: "True"
      - type: Progressing
        status: "True"
Spec Fields
modelAPI (required)
Reference to a ModelAPI resource in the same namespace.
spec:
  modelAPI: my-modelapi
The agent waits for the ModelAPI to become Ready before starting (see waitForDependencies).
mcpServers (optional)
List of MCPServer resource names in the same namespace.
spec:
  mcpServers:
    - echo-tools
    - calculator-tools
All referenced MCPServers must be Ready for the agent to start (see waitForDependencies).
waitForDependencies (optional)
Controls whether the agent waits for ModelAPI and MCPServers to be ready before creating the deployment.
spec:
  waitForDependencies: true # Default: true
| Value | Behavior |
|---|---|
| true (default) | Agent deployment is created only after ModelAPI and all MCPServers are Ready |
| false | Agent deployment is created immediately; agent handles unavailable dependencies gracefully at runtime |
Setting waitForDependencies to false is useful when:
- Deploying agents in any order without worrying about startup sequence
- Using the Python agent's graceful degradation for unavailable sub-agents/tools
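The two behaviors described above amount to a simple gate. The following is a minimal Python sketch of that decision, not the operator's actual code: the helper name `should_create_deployment` and the dict-of-phases input are assumptions made for illustration.

```python
def should_create_deployment(wait_for_dependencies: bool,
                             dependency_phases: dict[str, str]) -> bool:
    """Return True when the Agent deployment may be created.

    dependency_phases maps each referenced ModelAPI/MCPServer name to its
    phase string (a hypothetical input shape, for illustration only).
    """
    if not wait_for_dependencies:
        # waitForDependencies: false -> create the deployment immediately.
        return True
    # waitForDependencies: true -> every dependency must be Ready.
    return all(phase == "Ready" for phase in dependency_phases.values())


phases = {"my-modelapi": "Ready", "echo-tools": "Pending"}
print(should_create_deployment(True, phases))   # False
print(should_create_deployment(False, phases))  # True
```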
config (optional)
Agent-specific configuration.
config.description
Human-readable description shown in the agent card:
config:
  description: "A research assistant agent"
config.instructions
System prompt for the agent:
config:
  instructions: |
    You are a research assistant.
    When asked to research a topic:
    1. Search for relevant information
    2. Summarize findings concisely
    3. Cite your sources
config.reasoningLoopMaxSteps
Maximum number of reasoning loop iterations:
config:
  reasoningLoopMaxSteps: 10 # Default: 5, Range: 1-20
The reasoning loop runs tool calls and delegations until the model produces a final response or max steps is reached.
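The bounded loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the agent's implementation: `run_reasoning_loop` and the `step` callback are hypothetical names, and what the real agent returns when the step budget runs out is not specified here (the sketch returns `None`).

```python
from typing import Callable, Optional


def run_reasoning_loop(step: Callable[[int], Optional[str]],
                       max_steps: int = 5) -> Optional[str]:
    """Call step() until it returns a final answer or max_steps is reached.

    step(i) is a hypothetical callback standing in for one model turn:
    it returns None after a tool call or delegation, or a string when the
    model produces its final response.
    """
    if not 1 <= max_steps <= 20:
        raise ValueError("reasoningLoopMaxSteps must be between 1 and 20")
    for i in range(max_steps):
        answer = step(i)
        if answer is not None:
            return answer
    return None  # step budget exhausted without a final response


# A fake model that needs two tool calls before answering.
def fake_step(i: int) -> Optional[str]:
    return "done" if i == 2 else None


print(run_reasoning_loop(fake_step, max_steps=5))  # done
print(run_reasoning_loop(fake_step, max_steps=2))  # None
```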
config.memory
Memory system configuration:
config:
  memory:
    enabled: true # Enable/disable memory (default: true)
    type: local # Memory type (default: local, only option)
    contextLimit: 6 # Messages for delegation context (default: 6)
    maxSessions: 1000 # Max sessions to keep (default: 1000)
    maxSessionEvents: 500 # Max events per session (default: 500)
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable memory; when false, uses NullMemory (no-op) |
| type | string | local | Memory implementation type (only local supported) |
| contextLimit | int | 6 | Messages to include when delegating to sub-agents |
| maxSessions | int | 1000 | Maximum sessions before oldest are evicted |
| maxSessionEvents | int | 500 | Maximum events per session before eviction |
When to disable memory:
- Stateless agents that don't need conversation history
- Resource-constrained environments
- High-throughput agents where memory overhead matters
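To make the three limits concrete, here is a toy Python store that enforces them. It is a sketch only: the class `BoundedSessionStore` is hypothetical, "oldest" is taken to mean the session created first, and events are treated as the messages counted by contextLimit. The real local memory implementation may differ on all three points.

```python
from collections import OrderedDict, deque


class BoundedSessionStore:
    """Toy in-memory store bounded by max_sessions and max_session_events."""

    def __init__(self, max_sessions: int = 1000, max_session_events: int = 500):
        self.max_sessions = max_sessions
        self.max_session_events = max_session_events
        self._sessions: "OrderedDict[str, deque]" = OrderedDict()

    def add_event(self, session_id: str, event: str) -> None:
        if session_id not in self._sessions:
            if len(self._sessions) >= self.max_sessions:
                # Evict the oldest session (by creation order).
                self._sessions.popitem(last=False)
            # deque(maxlen=...) drops the oldest event once it is full.
            self._sessions[session_id] = deque(maxlen=self.max_session_events)
        self._sessions[session_id].append(event)

    def context(self, session_id: str, context_limit: int = 6) -> list:
        """Return the last context_limit events, e.g. for delegation."""
        return list(self._sessions.get(session_id, ()))[-context_limit:]


store = BoundedSessionStore(max_sessions=2, max_session_events=3)
for n in range(5):
    store.add_event("s1", f"e{n}")
print(store.context("s1"))   # ['e2', 'e3', 'e4']
store.add_event("s2", "x")
store.add_event("s3", "y")   # evicts s1, the oldest session
print(store.context("s1"))   # []
```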
config.env
Additional environment variables:
config:
  env:
    - name: MODEL_NAME
      value: "gpt-4"
    - name: API_KEY
      valueFrom:
        secretKeyRef:
          name: my-secrets
          key: api-key
agentNetwork (optional)
Agent-to-Agent networking configuration.
agentNetwork.expose
Create a Kubernetes Service for this agent (default: true):
agentNetwork:
  expose: true
When true, creates a Service that exposes:
- Port 8000
- Endpoints: /health, /ready, /.well-known/agent, /agent/invoke, /v1/chat/completions
agentNetwork.access
List of agent names this agent can delegate to:
agentNetwork:
  access:
    - worker-1
    - worker-2
The operator automatically:
- Finds the referenced Agent resources
- Sets PEER_AGENTS=worker-1,worker-2
- Sets PEER_AGENT_WORKER_1_CARD_URL=http://agent-worker-1...
- Sets PEER_AGENT_WORKER_2_CARD_URL=http://agent-worker-2...
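The naming pattern of these variables can be reproduced with a short Python sketch. The function `peer_agent_env` is hypothetical, and the upper-casing and hyphen-to-underscore rule is inferred from the two examples above rather than from the operator source. Card URLs are passed in by the caller because their full form is not shown here; the `.example` URLs below are placeholders.

```python
def peer_agent_env(card_urls: dict[str, str]) -> dict[str, str]:
    """Build PEER_AGENTS and PEER_AGENT_<NAME>_CARD_URL variables.

    card_urls maps each peer agent name to its card URL (supplied by the
    caller; this sketch does not derive the URL itself).
    """
    env = {"PEER_AGENTS": ",".join(card_urls)}
    for name, url in card_urls.items():
        # worker-1 -> WORKER_1 (upper-case, hyphens become underscores)
        key = name.upper().replace("-", "_")
        env[f"PEER_AGENT_{key}_CARD_URL"] = url
    return env


# Placeholder URLs, for illustration only.
env = peer_agent_env({
    "worker-1": "http://agent-worker-1.example",
    "worker-2": "http://agent-worker-2.example",
})
print(env["PEER_AGENTS"])  # worker-1,worker-2
print(sorted(env))
```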
podSpec (optional)
Override the generated pod spec using Kubernetes strategic merge patch.
spec:
  podSpec:
    containers:
      - name: agent # Must match the generated container name
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
    tolerations:
      - key: "gpu"
        operator: "Exists"
    nodeSelector:
      accelerator: "nvidia"
Strategic Merge Behavior:
- Container fields are merged by name (container name must be agent)
- New fields are added, existing fields are overwritten
- Useful for: resources, tolerations, nodeSelector, volumes, securityContext
Note: Replicas cannot be set via podSpec; it's a deployment-level setting (currently fixed at 1).
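The merge-by-name behavior can be approximated in a few lines of Python. This is a deliberately simplified sketch, not the Kubernetes strategic merge patch algorithm: it only special-cases the `containers` list, and the function name `merge_pod_spec` and the `agent:latest` image are made up for the example.

```python
import copy


def merge_pod_spec(base: dict, patch: dict) -> dict:
    """Simplified merge: containers are matched by name, dicts merge
    recursively, and every other patch value overwrites the base value."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if key == "containers" and isinstance(value, list):
            by_name = {c["name"]: c for c in result.setdefault("containers", [])}
            for container in value:
                existing = by_name.get(container["name"])
                if existing is not None:
                    merged = merge_pod_spec(existing, container)
                    existing.clear()
                    existing.update(merged)
                else:
                    result["containers"].append(copy.deepcopy(container))
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_pod_spec(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


generated = {"containers": [{"name": "agent", "image": "agent:latest",
                             "resources": {"requests": {"cpu": "100m"}}}]}
override = {"containers": [{"name": "agent",
                            "resources": {"requests": {"cpu": "200m"},
                                          "limits": {"memory": "512Mi"}}}],
            "nodeSelector": {"accelerator": "nvidia"}}
print(merge_pod_spec(generated, override))
```

The generated image is kept, the CPU request is overwritten, and the new limits and nodeSelector are added, which matches the "new fields are added, existing fields are overwritten" rule above.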
gatewayRoute (optional)
Configure Gateway API routing, including request timeout:
spec:
  gatewayRoute:
    # Request timeout for the HTTPRoute (Gateway API Duration format)
    # Default: "120s" for Agent (to allow multi-step reasoning)
    # Set to "0s" to use Gateway's default timeout
    timeout: "120s"
Status Fields
| Field | Type | Description |
|---|---|---|
| phase | string | Current phase: Pending, Ready, Failed, Waiting |
| ready | bool | Whether agent is ready to serve |
| endpoint | string | Service URL for A2A communication |
| linkedResources | map | References to dependencies |
| message | string | Additional status information |
| deployment | object | Deployment status for rolling update visibility |
deployment (status)
Mirrors key status fields from the underlying Kubernetes Deployment:
| Field | Type | Description |
|---|---|---|
| replicas | int32 | Total number of non-terminated pods |
| readyReplicas | int32 | Number of pods with Ready condition |
| availableReplicas | int32 | Number of available pods (ready for minReadySeconds) |
| updatedReplicas | int32 | Number of pods with desired template (rolling update progress) |
| conditions | array | Deployment conditions (Available, Progressing, ReplicaFailure) |
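As a small illustration of how the fields in this table can be read together, the Python sketch below turns a `status.deployment` object (already parsed into a dict) into a one-line progress summary. The helper `rollout_summary` is hypothetical and is not part of any KAOS tooling.

```python
def rollout_summary(deployment: dict) -> str:
    """Summarize rolling-update progress from the mirrored status fields."""
    replicas = deployment.get("replicas", 0)
    updated = deployment.get("updatedReplicas", 0)
    ready = deployment.get("readyReplicas", 0)
    # Index conditions by type so a single one can be looked up.
    conditions = {c["type"]: c["status"] for c in deployment.get("conditions", [])}
    available = conditions.get("Available")
    return f"{updated}/{replicas} updated, {ready}/{replicas} ready, Available={available}"


status_deployment = {
    "replicas": 2,
    "readyReplicas": 1,
    "availableReplicas": 1,
    "updatedReplicas": 1,
    "conditions": [
        {"type": "Progressing", "status": "True"},
        {"type": "Available", "status": "True"},
    ],
}
print(rollout_summary(status_deployment))
# 1/2 updated, 1/2 ready, Available=True
```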
Example status during a rolling update:
status:
  phase: Pending
  ready: false
  deployment:
    replicas: 2
    readyReplicas: 1
    availableReplicas: 1
    updatedReplicas: 1
    conditions:
      - type: Progressing
        status: "True"
        reason: ReplicaSetUpdated
        message: "ReplicaSet 'agent-my-agent-xyz' is progressing"
      - type: Available
        status: "True"
        reason: MinimumReplicasAvailable
Examples
Simple Agent
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: simple-agent
spec:
  modelAPI: ollama
  config:
    description: "A simple chat agent"
    instructions: "You are a helpful assistant."
Agent with Tools
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: tool-agent
spec:
  modelAPI: ollama
  mcpServers:
    - calculator
    - web-search
  config:
    description: "An agent with tools"
    instructions: |
      You have access to a calculator and web search.
      Use them when appropriate.
    reasoningLoopMaxSteps: 10
Coordinator with Workers
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: coordinator
spec:
  modelAPI: ollama
  config:
    description: "Coordinator agent"
    instructions: |
      You coordinate worker agents.
      Delegate research to researcher.
      Delegate analysis to analyst.
    reasoningLoopMaxSteps: 10
  agentNetwork:
    access:
      - researcher
      - analyst
Agent with Resource Limits
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: resource-agent
spec:
  modelAPI: ollama
  config:
    description: "Agent with custom resources"
  podSpec:
    containers:
      - name: agent
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
Agent without Waiting for Dependencies
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: eager-agent
spec:
  modelAPI: ollama
  waitForDependencies: false # Start immediately
  config:
    description: "Agent that handles unavailable dependencies gracefully"
Troubleshooting
Agent Stuck in Pending
kubectl describe agent my-agent -n my-namespace
Common causes:
- ModelAPI not Ready
- MCPServer not Ready
Agent Stuck in Waiting
The agent is waiting for dependencies. Check:
kubectl get modelapi -n my-namespace
kubectl get mcpserver -n my-namespace
Set waitForDependencies: false to allow the agent to start without waiting.
Agent Stuck in Failed
Check pod logs:
kubectl logs -l agent=my-agent -n my-namespace
Common causes:
- Invalid MODEL_API_URL
- Model not available
- Image pull errors
Sub-Agent Delegation Failing
Verify that the peer agent is accessible:
# Check if service exists
kubectl get svc agent-worker-1 -n my-namespace
# Check agent card endpoint
kubectl exec -it deploy/agent-coordinator -n my-namespace -- \
  curl http://agent-worker-1:8000/.well-known/agent