How to Fix 'agent infinite loop in production' in LlamaIndex (Python)

By Cyprian Aarons | Updated 2026-04-21


What the error means

The 'agent infinite loop in production' error usually means your LlamaIndex agent kept calling tools or re-entering its own reasoning loop until it hit a guardrail, timeout, or max-iteration limit. In practice, this happens when the agent cannot produce a final answer, keeps selecting the same tool, or receives tool output that pushes it back into the same decision path.

You’ll typically see repeated tool calls, an AgentRunner that never finishes, errors mentioning max_iterations or AgentWorkflow, or a run that keeps bouncing between retrieval and synthesis.

The Most Common Cause

The #1 cause is a bad tool contract: the agent calls a tool, but the tool output is not stable enough to let the agent terminate. In LlamaIndex Python, this often happens when you expose a retriever or query engine as a tool and the tool description encourages recursive use, or when the tool returns text that looks like another instruction to call tools again.

Broken vs fixed pattern

Broken pattern	Fixed pattern
Tool returns ambiguous text and agent keeps re-calling it	Tool returns bounded, final content with clear stop conditions
Tool description tells the agent to “keep searching”	Tool description tells the agent exactly when to use it
No max iteration guard	Explicit iteration limits and termination rules

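# BROKEN: the description tells the model to keep searching, and no iteration cap is set
# (assumes `query_engine` already exists, e.g. from index.as_query_engine())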
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

tool = QueryEngineTool(
    query_engine=query_engine,
    metadata=ToolMetadata(
        name="policy_search",
        description="Search policies and keep looking until you find everything relevant."
    ),
)

agent = ReActAgent.from_tools([tool], verbose=True)

response = agent.chat("What does our claims policy say about late submissions?")

# FIXED: bounded tool description plus an explicit max_iterations cap
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

tool = QueryEngineTool(
    query_engine=query_engine,
    metadata=ToolMetadata(
        name="policy_search",
        description=(
            "Use this once to retrieve policy text relevant to the user's question. "
            "Do not call repeatedly unless new information is needed."
        ),
    ),
)

agent = ReActAgent.from_tools(
    [tool],
    verbose=True,
    max_iterations=5,
)

response = agent.chat("What does our claims policy say about late submissions?")

The broken version invites recursion. The fixed version gives the model a bounded contract and a hard stop.

Other Possible Causes

1) Recursive tool wiring

If one tool calls an agent that can call the same tool again, you’ve built a loop.

# BAD: an agent wrapped as a tool; a loop forms once inner_agent can reach outer_tool too
from llama_index.core.tools import FunctionTool

inner_agent = ReActAgent.from_tools([tool], verbose=True)
outer_tool = FunctionTool.from_defaults(
    fn=lambda q: str(inner_agent.chat(q)), name="ask_inner_agent"
)
outer_agent = ReActAgent.from_tools([outer_tool], verbose=True)

Fix: keep one direction only. If you need orchestration, separate read-only tools from decision-making agents.
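
One way to catch this kind of wiring before deployment is to write down which agent can call which tool, and which tool delegates to which agent, then check that graph for cycles. The following is a minimal sketch in plain Python. The `calls` mapping, the node names, and `find_cycle` are hypothetical; nothing here is derived from LlamaIndex objects automatically.

```python
from typing import Dict, List, Optional


def find_cycle(calls: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one call cycle as a list of node names, or None if there is none.

    `calls` maps a node (agent or tool) to the nodes it can invoke.
    """
    visiting: List[str] = []  # current depth-first path
    done = set()              # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in visiting:
            # Back edge: the cycle is the path from the first occurrence onward.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in calls.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(calls):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical wiring: the inner agent can reach the tool that wraps it.
looping = {
    "outer_agent": ["ask_inner_agent"],
    "ask_inner_agent": ["inner_agent"],
    "inner_agent": ["policy_search", "ask_inner_agent"],
}
# Same wiring with the back edge removed.
one_way = {
    "outer_agent": ["ask_inner_agent"],
    "ask_inner_agent": ["inner_agent"],
    "inner_agent": ["policy_search"],
}
```

Running `find_cycle` on a hand-maintained map like this in a unit test is cheap, and it fails as soon as someone adds a tool that points back at its caller.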

2) Retriever returns near-duplicate chunks

When retrieval keeps surfacing almost identical context, the model sees no new information and repeats itself.

# TUNE RETRIEVAL
query_engine = index.as_query_engine(
    similarity_top_k=3,
    response_mode="compact",
)

If similarity_top_k is too high on repetitive data, reduce it. Also deduplicate source documents before indexing.
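
Deduplication before indexing can be as simple as dropping chunks whose normalized text is nearly identical to one already kept. The sketch below uses only the standard library. `drop_near_duplicates`, its 0.9 threshold, and the sample strings are illustrative assumptions, and the pairwise comparison is quadratic, so it suits small corpora only.

```python
from difflib import SequenceMatcher
from typing import List


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so formatting differences do not count.
    return " ".join(text.lower().split())


def drop_near_duplicates(chunks: List[str], threshold: float = 0.9) -> List[str]:
    """Keep the first occurrence of each chunk and drop later near-copies."""
    kept: List[str] = []
    kept_norm: List[str] = []
    for chunk in chunks:
        norm = _normalize(chunk)
        is_dup = any(
            SequenceMatcher(None, norm, seen).ratio() >= threshold
            for seen in kept_norm
        )
        if not is_dup:
            kept.append(chunk)
            kept_norm.append(norm)
    return kept


# Made-up sample text for illustration only.
chunks = [
    "Late submissions are accepted within 30 days of the incident.",
    "Late  submissions are accepted within 30 days of the incident. ",
    "Claims must include a signed declaration form.",
]
unique_chunks = drop_near_duplicates(chunks)
```

The threshold is a judgment call: set it too low and distinct clauses that share boilerplate get merged, so it is worth spot-checking what was dropped.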

3) Missing termination instructions in system prompt

A weak prompt makes the model over-explore.

# BETTER PROMPTING
system_prompt = (
    "Answer using at most one tool call if needed. "
    "If sufficient evidence is found, provide a final answer immediately. "
    "Do not repeat the same tool call with unchanged input."
)

This matters more than people think. A lot of “infinite loops” are just poorly constrained reasoning policies.

4) Tool output includes raw chain-of-thought-like scaffolding

If your tool returns logs, prompts, or internal instructions, the agent may treat them as actionable content.

# BAD TOOL OUTPUT
return f"Thought: search again\nAction: policy_search\nResult: {text}"

Return only factual results:

# GOOD TOOL OUTPUT
return text[:2000]

Keep outputs clean and bounded.
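
A small sanitizer in front of the tool's return value can enforce both rules. This is a sketch, not a LlamaIndex feature: `clean_tool_output` and the list of scaffold prefixes are assumptions to adapt, and the 2000-character cap simply mirrors the snippet above.

```python
import re

# Line prefixes that resemble ReAct-style scaffolding (assumed list; extend as needed).
_SCAFFOLD = re.compile(
    r"^\s*(thought|action|action input|observation)\s*:", re.IGNORECASE
)


def clean_tool_output(text: str, max_chars: int = 2000) -> str:
    """Drop scaffold-looking lines, then cap the length."""
    lines = [ln for ln in text.splitlines() if not _SCAFFOLD.match(ln)]
    return "\n".join(lines).strip()[:max_chars]


# Made-up tool output for illustration only.
raw = "Thought: search again\nAction: policy_search\nLate claims need manager approval."
cleaned = clean_tool_output(raw)
```

Filtering by line prefix is deliberately crude. It will also drop a legitimate policy sentence that happens to start with "Action:", so review the prefix list against your own documents.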

How to Debug It

• Turn on verbose tracing
  - Use verbose=True on ReActAgent, AgentRunner, or your workflow.
  - Look for repeated lines like Calling tool: policy_search with no progress.
• Check whether the same input is being sent repeatedly
  - If the exact user query plus context keeps repeating, your prompt or workflow is cycling.
  - If only retrieval context changes slightly, you likely have duplicate chunks or unstable reranking.
• Inspect tool boundaries
  - Verify each FunctionTool, QueryEngineTool, or custom wrapper has one job.
  - Make sure no tool directly invokes an agent that can invoke that same tool again.
• Add hard limits
  - Set max_iterations.
  - Set timeouts on upstream services.
  - Cap retrieved context length so one bad document cannot dominate every turn.
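
To check whether the same input is being sent repeatedly, it helps to count identical calls rather than eyeball the trace. The sketch assumes you have already collected the calls as `(tool_name, tool_input)` pairs, for example from your own logging wrapper. That record format and `find_repeated_calls` are hypothetical, not something LlamaIndex emits in this shape.

```python
from collections import Counter
from typing import Dict, List, Tuple

Call = Tuple[str, str]  # (tool_name, tool_input), an assumed record format


def find_repeated_calls(calls: List[Call], min_count: int = 2) -> Dict[Call, int]:
    """Return every identical (tool, input) pair seen at least `min_count` times."""
    counts = Counter(calls)
    return {call: n for call, n in counts.items() if n >= min_count}


# Made-up trace for illustration only.
trace = [
    ("policy_search", "late submissions"),
    ("policy_search", "late submissions"),
    ("policy_search", "late submission deadline"),
    ("policy_search", "late submissions"),
]
repeats = find_repeated_calls(trace)
```

An exact repeat points at the prompt or workflow; a stream of slightly different inputs with no repeats points more toward retrieval returning nothing new.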

A useful test is to replace all tools with a stub that returns static text. If the loop disappears, your issue is in retrieval/tooling rather than core agent logic.
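
A stub for that test can be a plain function that ignores its input. The helper below is a minimal sketch; `make_stub` and the canned text are made up for illustration, and wiring the stub into your agent in place of the real tool is left to your own setup.

```python
from typing import Callable


def make_stub(name: str, canned_text: str) -> Callable[[str], str]:
    """Build a function that ignores its input and always returns `canned_text`."""

    def stub(query: str) -> str:
        return canned_text

    # Keep the original tool name so the agent's prompt stays comparable.
    stub.__name__ = name
    stub.__doc__ = f"Stub for {name}; returns fixed text."
    return stub


policy_search_stub = make_stub("policy_search", "Late claims are reviewed case by case.")
```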

Prevention

• Keep tools deterministic and narrow.
  - A retriever should retrieve.
  - A formatter should format.
  - An orchestrator should not also be callable as a retriever.
• Put explicit stop rules in prompts and code.
  - Use max_iterations.
  - Tell the model when to stop calling tools.
  - Reject repeated identical calls if your wrapper supports it.
• Test with adversarial inputs before production.
  - Long questions.
  - Ambiguous questions.
  - Questions with missing data.

Those are the inputs that expose loops before your users do.
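
The rule about rejecting repeated identical calls can live in a thin wrapper around the function you expose as a tool. This is a minimal sketch, assuming the tool takes a single string argument. `reject_repeats`, the refusal message, and the stand-in `policy_search` function are illustrative, and in a real service the seen-set would need to be reset per request.

```python
from functools import wraps
from typing import Callable, Set


def reject_repeats(fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap `fn` so a repeated call with the same input returns a stop hint instead."""
    seen: Set[str] = set()

    @wraps(fn)  # preserve the name and docstring of the wrapped function
    def wrapper(query: str) -> str:
        key = " ".join(query.lower().split())  # ignore case and spacing differences
        if key in seen:
            return "Already searched with this input. Answer from the results you have."
        seen.add(key)
        return fn(query)

    return wrapper


def policy_search(query: str) -> str:
    """Stand-in for a real retrieval call."""
    return f"results for: {query}"


guarded_search = reject_repeats(policy_search)
first = guarded_search("late submissions")
second = guarded_search("Late  submissions")  # same input after normalization
```

Returning a message rather than raising keeps the agent's turn alive and gives the model an explicit nudge to finish; whether that or a hard error is preferable depends on how your service handles failures.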


By Cyprian Aarons, AI Consultant at Topiax.
