How to Fix 'agent infinite loop' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: agent-infinite-loop, llamaindex, python

If you’re seeing ValueError: agent infinite loop detected or a similar loop-guard error in LlamaIndex, it means the agent kept calling tools without ever producing a final answer. In practice, this usually happens when the agent is missing a stopping condition, keeps selecting the same tool, or your tool output never gives it enough signal to move forward.

This is not a model problem. It’s almost always an orchestration problem in your Python code, tool design, or agent configuration.

The Most Common Cause

The #1 cause is a tool loop: the agent calls a tool, gets back output, then decides to call the same tool again because the response does not let it resolve the task.

This shows up a lot with ReActAgent, OpenAIAgent, or any AgentRunner setup where the tool returns vague text like “done” or raw data with no final-answer path.

Broken vs fixed pattern

| Broken pattern | Fixed pattern |
| --- | --- |
| Tool returns ambiguous output and the agent re-calls it | Tool returns structured, task-complete output |
| Agent has no clear termination behavior | Agent can produce a final response after one tool call |
# BROKEN: the agent can keep looping on the same tool
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool

def lookup_customer(customer_id: str) -> str:
    # Returns vague text that doesn't help the agent stop
    return f"Customer {customer_id} found."

tool = FunctionTool.from_defaults(fn=lookup_customer)

agent = ReActAgent.from_tools(
    [tool],
    verbose=True,
)

response = agent.chat("Find customer 12345 and summarize their status.")
print(response)

# FIXED: return structured output and make the task completion obvious
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
import json

def lookup_customer(customer_id: str) -> str:
    result = {
        "customer_id": customer_id,
        "status": "active",
        "risk_level": "low",
        "last_reviewed": "2026-01-10",
    }
    return json.dumps(result)

tool = FunctionTool.from_defaults(fn=lookup_customer)

agent = ReActAgent.from_tools(
    [tool],
    verbose=True,
)

response = agent.chat("Find customer 12345 and summarize their status.")
print(response)

Why this matters:

  • The model needs enough signal to decide “I have what I need.”
  • If your tool output is generic, the planner often tries again.
  • If you are using function-calling agents, bad schemas can cause repeated retries too.

Other Possible Causes

1. Your tool signature is too loose

If the function accepts *args, **kwargs, or untyped parameters, LlamaIndex may pass arguments in a way that causes repeated failures and retries.

# BAD
def search_docs(query, *args, **kwargs):
    ...

# GOOD
def search_docs(query: str) -> str:
    ...

Use explicit types and stable parameter names. This is especially important with FunctionTool.from_defaults().
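You can catch loose signatures before wiring tools up by inspecting them programmatically. This is a hypothetical pre-flight check, not a LlamaIndex API; it just flags `*args`/`**kwargs` and missing annotations using the standard library:

```python
import inspect

def check_tool_signature(fn) -> list:
    """Flag parameter patterns that tend to confuse function-calling agents."""
    problems = []
    sig = inspect.signature(fn)
    for name, param in sig.parameters.items():
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            problems.append(f"{fn.__name__}: uses *{name}")
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            problems.append(f"{fn.__name__}: uses **{name}")
        elif param.annotation is inspect.Parameter.empty:
            problems.append(f"{fn.__name__}: parameter '{name}' has no type hint")
    return problems

def search_docs_bad(query, *args, **kwargs): ...
def search_docs_good(query: str) -> str: ...

print(check_tool_signature(search_docs_bad))   # three findings
print(check_tool_signature(search_docs_good))  # []
```

Run this over every function you pass to `FunctionTool.from_defaults()` and fix anything it flags before debugging agent behavior.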

2. Your iteration limit is too high to fail fast

Sometimes you are not fixing the loop; you are just letting it run until it hits the guardrail.

from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools(
    tools,
    max_iterations=3,   # keep this bounded during debugging
    verbose=True,
)

If you leave iteration limits too high and your tools are weakly defined, you’ll get long retry chains before hitting:

  • ValueError: Agent reached max iterations
  • ValueError: agent infinite loop detected
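The same fail-fast idea applies if you orchestrate tool calls yourself. A minimal sketch in plain Python (no LlamaIndex) of a bounded step loop that raises instead of spinning:

```python
def run_steps(step_fn, max_iterations: int = 3):
    """Run step_fn until it returns a final answer, but never more than max_iterations times."""
    for i in range(max_iterations):
        result = step_fn(i)
        if result is not None:  # a non-None result is our "final answer" signal
            return result
    raise RuntimeError(f"agent loop did not terminate within {max_iterations} iterations")

# A step function that finishes on its second call.
answer = run_steps(lambda i: "done" if i == 1 else None)
print(answer)  # done
```

A tight limit during development turns a silent token burn into an immediate, debuggable exception.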

3. Your retrieval tool keeps returning near-empty context

This happens when your vector index is poorly chunked or retrieval settings are too narrow. The agent keeps asking because each retrieval result looks incomplete.

from llama_index.core.retrievers import VectorIndexRetriever

retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=1,  # often too small for real docs
)

Try increasing recall:

retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=5,
)

If you’re wrapping retrieval in a tool, make sure the returned context includes enough content to answer directly.
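You can also guard against near-empty results explicitly inside the tool. A hypothetical wrapper (the `retrieve` callable and the character threshold are assumptions, not LlamaIndex APIs) that gives the agent a clear signal instead of a fragment to chew on:

```python
MIN_CONTEXT_CHARS = 200  # assumed threshold; tune for your corpus

def retrieval_tool(query: str, retrieve) -> str:
    """Call a retriever and return either usable context or an explicit no-answer signal."""
    chunks = retrieve(query)
    context = "\n\n".join(chunks)
    if len(context) < MIN_CONTEXT_CHARS:
        # Tell the agent plainly that retrying the same query will not help.
        return "NO_RELEVANT_CONTEXT: the index has no further information for this query."
    return context

print(retrieval_tool("refund policy", lambda q: ["Refunds are issued within 30 days. " * 10]))
print(retrieval_tool("quantum billing", lambda q: []))
```

An explicit "no results" sentinel is something the planner can act on; an empty string is an invitation to retry.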

4. You have conflicting system instructions

If your system prompt says both “always use tools” and “answer directly when possible,” some agents bounce between behaviors.

system_prompt = """
You must always call a tool.
Never answer without a tool.
Only answer after using tools.
"""

That can trap the planner in repeated tool use. Tighten it up:

system_prompt = """
Use tools only when needed.
If enough information is already available from previous steps, provide a final answer.
Do not repeat the same tool call unless new parameters are required.
"""

How to Debug It

  1. Turn on verbose tracing

    • For most agents:
      agent = ReActAgent.from_tools(tools, verbose=True)
      
    • Watch for repeated identical tool calls with identical arguments.
  2. Inspect each tool’s raw output

    • Print what your function actually returns.
    • Look for vague strings like "ok", "success", "done", or empty lists.
    • If the output does not contain decision-grade data, fix that first.
  3. Check whether the agent is retrying after parsing errors

    • Common signs include repeated attempts after malformed JSON or schema mismatch.
    • If you see parser-related failures around OpenAIAgent or function calling, simplify the schema and test one tool at a time.
  4. Reduce to one tool and one query

    • Remove every extra retriever, memory layer, and secondary tool.
    • Start with:
      response = agent.chat("Call this one function once and stop.")
      
    • If that works, reintroduce complexity incrementally until the loop returns.
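A cheap way to cover steps 1 and 2 without reading full traces is to wrap each tool function in a tracer that flags repeated identical calls, the classic loop symptom. This is a hand-rolled debugging helper, not part of LlamaIndex:

```python
import functools

def trace_tool(fn):
    """Print every call and flag repeats with identical arguments."""
    seen = set()
    repeats = []

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key in seen:
            repeats.append(key)
            print(f"[tool:{fn.__name__}] REPEATED identical call: {key}")
        seen.add(key)
        result = fn(*args, **kwargs)
        print(f"[tool:{fn.__name__}] {args} {kwargs} -> {result!r}")
        return result

    wrapper.repeats = repeats  # inspect after a run for loop evidence
    return wrapper

@trace_tool
def lookup_customer(customer_id: str) -> str:
    return f"Customer {customer_id} found."

lookup_customer("12345")
lookup_customer("12345")  # second identical call gets flagged
```

Wrap the function before passing it to `FunctionTool.from_defaults()`, run one query, and check `repeats` afterward: any entry means the agent retried with nothing new.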

Prevention

  • Keep tools deterministic and explicit.

    • Typed inputs.
    • Structured outputs.
    • No hidden side effects unless absolutely necessary.
  • Set sane iteration limits during development.

    • Use max_iterations as a guardrail.
    • Fail fast instead of letting loops burn tokens.
  • Design for termination.

    • Every multi-step workflow should have an obvious end state.
    • If the model needs to decide whether to continue, give it concrete completion signals from your tools.
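One concrete way to design for termination is to include an explicit completion flag in every tool payload, so the model never has to guess whether a step finished. The field names here are illustrative, not a standard:

```python
import json

def review_customer(customer_id: str) -> str:
    """Return a payload that tells the agent, in-band, that the task is done."""
    payload = {
        "customer_id": customer_id,
        "status": "active",
        "task_complete": True,   # explicit end-state signal
        "next_action": "none",   # nothing left for the agent to do
    }
    return json.dumps(payload)

result = json.loads(review_customer("12345"))
print(result["task_complete"])  # True
```

Paired with a system prompt that says to stop once `task_complete` is true, this gives the planner the unambiguous end state the bullet above calls for.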

If you want fewer of these incidents in production, treat agents like orchestration code first and model behavior second. Most agent infinite loop errors in LlamaIndex come from unclear tool contracts, weak retrieval results, or prompts that never let the agent stop.



By Cyprian Aarons, AI Consultant at Topiax.
