
Agentic Loop

The agentic loop is the two-phase reasoning mechanism behind each agent turn: Phase 1 collects tool results and delegation responses, and Phase 2 produces the final streamed response.

How It Works

Auto-Detection

The agent automatically detects tool calling support at initialization:

python
# Uses litellm's model registry (no HTTP calls needed)
import litellm
supports_native = litellm.supports_function_calling(model="gpt-4o")  # True
supports_native = litellm.supports_function_calling(model="ollama/smollm2:135m")  # False

  • Native: OpenAI tools API parameter, structured tool_calls in response
  • String fallback: Tool descriptions in system prompt, JSON parsed from content text

Both modes use the same unified tool call format.
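
To make the unified format concrete, here is an illustrative stand-in for the tool-call type and a parser for the JSON shape shown below. The `ToolCall` class and `parse_tool_calls` helper are assumptions for illustration, not the project's actual types:

```python
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ToolCall:
    # Field names mirror the unified JSON format: name, arguments, optional id.
    name: str
    arguments: dict = field(default_factory=dict)
    id: Optional[str] = None

def parse_tool_calls(payload: str) -> list:
    # Parse a '{"tool_calls": [...]}' JSON string into ToolCall objects.
    data = json.loads(payload)
    return [ToolCall(name=c["name"], arguments=c.get("arguments", {}), id=c.get("id"))
            for c in data.get("tool_calls", [])]
```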

Configuration

The agentic loop is controlled by the max_steps parameter passed to the Agent:

python
agent = Agent(
    name="my-agent",
    model_api=model_api,
    max_steps=5  # Maximum tool calling iterations
)

max_steps

Caps the number of Phase 1 tool-calling iterations to prevent infinite loops. When the limit is reached, the agent returns the message:

"Reached maximum reasoning steps (5)"

Guidelines:

  • Simple queries: 2-3 steps
  • Tool-using tasks: 5 steps (default)
  • Complex multi-step tasks: 10+ steps
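
A minimal sketch of how `max_steps` bounds Phase 1 (the `call_model` and `execute_tool` callables are assumptions standing in for the real model and MCP client):

```python
def run_phase_one(call_model, execute_tool, messages, max_steps=5):
    # Iterate until the model stops requesting tools or max_steps is exhausted.
    for step in range(1, max_steps + 1):
        response = call_model(messages)
        tool_calls = response.get("tool_calls") or []
        if not tool_calls:
            return messages  # no more actions: proceed to Phase 2
        for call in tool_calls:
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "content": str(result)})
    # Limit reached without a final answer.
    messages.append({"role": "assistant",
                     "content": f"Reached maximum reasoning steps ({max_steps})"})
    return messages
```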

Unified Tool Call Format

Both native and string modes use the same tool_calls array format. Delegation tools are registered with a delegate_to_ prefix.

Tool Call

json
{"tool_calls": [{"name": "calculator", "arguments": {"expression": "2 + 2"}}]}

Multiple Tool Calls (Parallel)

json
{"tool_calls": [{"name": "search", "arguments": {"q": "test"}}, {"name": "echo", "arguments": {"msg": "hi"}}]}

Delegation

json
{"tool_calls": [{"name": "delegate_to_researcher", "arguments": {"task": "Find information about quantum computing"}}]}

Tool Call Extraction

_extract_tool_calls() checks response.tool_calls first (works for both native mode and mock responses), then falls back to content JSON parsing for string mode:

python
def _extract_tool_calls(self, response):
    # Structured tool_calls take priority (native API or mock responses)
    if response.tool_calls:
        return response.tool_calls
    # String mode fallback: parse tool_calls array or single tool from content
    if not self._supports_native_tools:
        actions = self._parse_action(response.content or "")
        return [ToolCall(...) for action in actions]
    return []
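
The `_parse_action` helper is elided above; a hedged sketch of what a string-mode fallback parser might do (this regex-based approach is an assumption, not the project's implementation):

```python
import json
import re

def parse_actions_from_content(content: str) -> list:
    # Find a JSON object containing a "tool_calls" array embedded in
    # free-form model output; return [] if none is present or it is invalid.
    match = re.search(r'\{.*"tool_calls".*\}', content, re.DOTALL)
    if not match:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return data.get("tool_calls", [])
```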

Progress Blocks

During Phase 1, the agent emits progress blocks when starting tool/delegation execution:

json
{"type": "progress", "step": 1, "action": "tool_call", "target": "calculator"}
{"type": "progress", "step": 2, "action": "delegate", "target": "researcher"}

Execution Flow

Tool Execution

  1. Extract tool calls from response
  2. Emit progress block
  3. Log tool_call event to memory
  4. Execute tool via MCP client
  5. Log tool_result event to memory
  6. Add result to conversation
  7. Continue to next loop iteration
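
The steps above can be sketched as a single helper; the callable names (`run_tool`, `log_event`) are illustrative stand-ins for the MCP client and memory backend:

```python
def execute_tool_call(call, run_tool, log_event, messages, step):
    # Step 2: build the progress block for this call.
    progress = {"type": "progress", "step": step, "action": "tool_call", "target": call["name"]}
    log_event("tool_call", call)                         # step 3: log to memory
    result = run_tool(call["name"], call["arguments"])   # step 4: execute via MCP client
    log_event("tool_result", {"name": call["name"], "result": result})  # step 5
    messages.append({"role": "tool", "content": str(result)})           # step 6
    return progress, result
```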

Delegation Execution

  1. Extract delegate_to_{name} tool call from response
  2. Emit progress block
  3. Log delegation_request event to memory
  4. Invoke remote agent via A2A protocol
  5. Log delegation_response event to memory
  6. Add response to conversation
  7. Continue to next loop iteration
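
The delegation path differs only in the prefix handling and event names; a sketch (with `invoke_agent` standing in for the real A2A client):

```python
def execute_delegation(call, invoke_agent, log_event, messages):
    # Strip the delegate_to_ prefix to find the target agent's name.
    target = call["name"].removeprefix("delegate_to_")
    task = call["arguments"]["task"]
    log_event("delegation_request", {"agent": target, "task": task})   # step 3
    response = invoke_agent(target, task)                              # step 4: A2A call
    log_event("delegation_response", {"agent": target, "response": response})  # step 5
    messages.append({"role": "tool", "content": response})             # step 6
    return response
```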

Final Response (Phase 2)

  1. When no tool calls detected, exit Phase 1
  2. Add "provide your final response" prompt
  3. Call model with streaming enabled
  4. Stream tokens directly to client
  5. Log agent_response event to memory
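
Phase 2 can be sketched as a generator that yields tokens as they arrive, then logs the assembled response (the callable names are assumptions):

```python
def stream_final_response(call_model_stream, log_event, messages):
    # Step 2: append the final-response prompt.
    messages.append({"role": "user", "content": "Provide your final response."})
    chunks = []
    for token in call_model_stream(messages):  # step 3: streaming model call
        chunks.append(token)
        yield token                            # step 4: stream directly to client
    log_event("agent_response", "".join(chunks))  # step 5: log full text to memory
```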

Memory Events

The loop logs events for debugging and verification:

python
# After tool execution
events = await agent.memory.get_session_events(session_id)
# Events: [user_message, tool_call, tool_result, agent_response]

# After delegation
events = await agent.memory.get_session_events(session_id)
# Events: [user_message, delegation_request, delegation_response, agent_response]

Testing with Mock Responses

Set the DEBUG_MOCK_RESPONSES environment variable to test loop behavior deterministically.

The tool_calls format works for both native and string mode:

bash
# Test tool calling (tool call → no action → final)
export DEBUG_MOCK_RESPONSES='["{\"tool_calls\": [{\"id\": \"call_1\", \"name\": \"echo\", \"arguments\": {\"text\": \"hello\"}}]}", "No more actions.", "The echo returned: hello"]'

# Test delegation (delegation → no action → final)
export DEBUG_MOCK_RESPONSES='["{\"tool_calls\": [{\"id\": \"call_1\", \"name\": \"delegate_to_researcher\", \"arguments\": {\"task\": \"Find quantum info\"}}]}", "No more actions.", "Based on the research, quantum computing uses qubits."]'

# Simple response (no tools → Phase 2 only)
export DEBUG_MOCK_RESPONSES='["Hello! How can I help you?"]'
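
The variable holds a JSON array of strings, one per model call. A sketch of how an agent might consume it (the helper name is an assumption):

```python
import json
import os

def load_mock_responses() -> list:
    # Returns the queued mock responses, or [] when the variable is unset.
    raw = os.environ.get("DEBUG_MOCK_RESPONSES")
    return json.loads(raw) if raw else []
```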

For Kubernetes E2E tests, configure via the Agent CRD:

yaml
spec:
  container:
    env:
    - name: DEBUG_MOCK_RESPONSES
      value: '["{\"tool_calls\": [{\"id\": \"call_1\", \"name\": \"delegate_to_worker\", \"arguments\": {\"task\": \"process data\"}}]}", "No more actions.", "Done."]'

Best Practices

  1. Set appropriate max_steps - Too low may truncate reasoning, too high wastes resources
  2. Clear instructions - Tell the LLM when to use tools vs. respond directly
  3. Test with mocks - Use DEBUG_MOCK_RESPONSES with tool_calls format
  4. Monitor events - Use memory endpoints to debug complex flows
  5. Handle errors gracefully - Tool failures are fed back to the loop for recovery

Released under the Apache 2.0 License.