How to Fix 'agent infinite loop in production' in LangGraph (Python)

By Cyprian Aarons | Updated 2026-04-21

What this error actually means

If you’re seeing agent infinite loop in production, your LangGraph agent is cycling through nodes without ever reaching a terminal state. In practice, this usually means your graph keeps routing back to the same node, or your assistant keeps calling tools without producing a final answer.

The failure usually surfaces as a GraphRecursionError: the run keeps executing steps until it reaches the recursion limit, at which point LangGraph aborts it.

The Most Common Cause

The #1 cause is a bad conditional edge that always routes back to the agent node. In LangGraph, this usually happens when your router never returns an end condition like END, or when your tool execution path never updates state in a way that changes the next route.

Here’s the broken pattern:

| Broken | Fixed |
| --- | --- |
| Always routes back to agent | Routes to tools only when tool calls exist |
| No explicit stop condition | Returns END when the assistant is done |
| State doesn’t change meaningfully | State is updated with tool results and final messages |

# BROKEN
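from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from typing import TypedDict, Annotated
from langchain_core.messages import AnyMessage

class AgentState(TypedDict):
    # add_messages is the reducer that appends each node's messages to the list
    messages: Annotated[list[AnyMessage], add_messages]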

def route(state: AgentState):
    # Always loops back to agent
    return "agent"

builder = StateGraph(AgentState)
builder.add_node("agent", agent_node)
builder.add_node("tools", tool_node)

builder.set_entry_point("agent")
builder.add_conditional_edges("agent", route, {
    "agent": "agent",
    "tools": "tools",
    "end": END,
})

graph = builder.compile()

# FIXED
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from typing import TypedDict, Annotated
from langchain_core.messages import AnyMessage

class AgentState(TypedDict):
    # add_messages is the reducer that appends each node's messages to the list
    messages: Annotated[list[AnyMessage], add_messages]

def route(state: AgentState):
    last_msg = state["messages"][-1]

    # If model requested tools, go there
    if getattr(last_msg, "tool_calls", None):
        return "tools"

    # Otherwise stop
    return "end"

builder = StateGraph(AgentState)
builder.add_node("agent", agent_node)
builder.add_node("tools", tool_node)

builder.set_entry_point("agent")
builder.add_conditional_edges("agent", route, {
    "tools": "tools",
    "end": END,
})

graph = builder.compile()

The key difference is simple: the graph must have a real exit path. If your router can only ever return another internal node, LangGraph will keep executing until it hits something like:

langgraph.errors.GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition.

Other Possible Causes

1) Your agent keeps producing tool calls forever

This happens when the prompt never tells the model when to stop, or when the tool result doesn’t give the model what it needs, so it requests the same tool again.

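# BAD: nothing tells the model when to stop, so it can keep asking for the same tool
assistant = llm.bind_tools([search_tool])

# BETTER: add a final-answer instruction, and cap tool rounds in state as a backstop
from langchain_core.messages import SystemMessage

system_prompt = """
Use tools only when needed.
After you have enough information, answer directly.
Do not call tools again once you have sufficient evidence.
"""

def agent_node(state):
    # Prepend the instruction on every call so the model always sees it
    response = assistant.invoke([SystemMessage(content=system_prompt), *state["messages"]])
    return {"messages": [response]}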

2) Tool node does not append results back into state

If the tool output never gets added to messages, the model sees no progress and repeats itself.

# BROKEN
def tool_node(state):
    result = run_tool(state["messages"][-1])
    return {}  # nothing written back

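# FIXED
from langchain_core.messages import ToolMessage

def tool_node(state):
    last = state["messages"][-1]
    results = []
    # Answer every tool call, not just the first one
    for call in last.tool_calls:
        result = run_tool(call)  # run_tool: your own helper that executes one tool call
        results.append(ToolMessage(content=str(result), tool_call_id=call["id"]))
    # Requires a messages reducer that appends (e.g. add_messages)
    return {"messages": results}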
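
One way to catch this bug early is to check that every requested tool call has a matching result in state. The sketch below assumes messages expose `tool_calls` (a list of dicts with an `id`) and `tool_call_id`, as in the examples above. `unanswered_tool_calls` and the two `Fake*` classes are hypothetical stand-ins so the check runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAIMessage:
    """Stand-in for an AI message that may request tools."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


@dataclass
class FakeToolMessage:
    """Stand-in for a tool result tied to one tool call."""
    content: str
    tool_call_id: str


def unanswered_tool_calls(messages) -> list:
    """Return the ids of tool calls that have no matching tool result."""
    requested = []
    answered = set()
    for msg in messages:
        for call in getattr(msg, "tool_calls", None) or []:
            requested.append(call["id"])
        tool_call_id = getattr(msg, "tool_call_id", None)
        if tool_call_id is not None:
            answered.add(tool_call_id)
    return [call_id for call_id in requested if call_id not in answered]
```

Calling this on `state["messages"]` right after the tool node runs, in a test or a debug assertion, should return an empty list. Anything else means a result was dropped.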

3) Your conditional edge checks the wrong field

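A common bug is routing on state["messages"][-1].content instead of tool_calls. Message text is free-form: a final answer that happens to mention a tool gets sent around again, and a real tool request with empty content is missed.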

# BAD
def route(state):
    if "tool" in state["messages"][-1].content.lower():
        return "tools"
    return "agent"

# GOOD
def route(state):
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"
    return "end"

4) You set recursion limits too high and hide the bug

This does not cause the loop, but it makes production failures harder to spot. The graph runs longer before failing.

graph.invoke(
    {"messages": [("user", "Find policy details")]},
    config={"recursion_limit": 100}
)

Lower it during debugging so you catch bad cycles quickly:

graph.invoke(
    {"messages": [("user", "Find policy details")]},
    config={"recursion_limit": 10}
)
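
Instead of guessing a limit, you can derive one from the number of tool rounds you expect. As a rough rule of thumb for a two-node agent/tools loop, each round costs two steps (agent, then tools) plus one final agent step. `recursion_limit_for` is a hypothetical helper, and the arithmetic assumes exactly that graph shape, so treat it as a starting point and verify it against your own graph.

```python
def recursion_limit_for(max_tool_rounds: int, slack: int = 0) -> int:
    """Estimate a recursion limit for an agent <-> tools loop.

    Assumes one step per node execution: 2 steps per tool round
    (agent, tools) plus 1 final agent step that produces the answer.
    """
    if max_tool_rounds < 0 or slack < 0:
        raise ValueError("max_tool_rounds and slack must be non-negative")
    return 2 * max_tool_rounds + 1 + slack


# Budget for at most 3 tool rounds
config = {"recursion_limit": recursion_limit_for(3)}
print(config)
```

Pass the resulting `config` to `graph.invoke(...)` in place of a hand-picked number. Extra nodes in the loop add steps per round, so adjust the multiplier accordingly.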

How to Debug It

  1. Inspect the last message at every node

    • Print state["messages"][-1]
    • Confirm whether it contains tool_calls
    • Check whether each node actually changes state
  2. Trace routing decisions

    • Log what your router returns on every hop
    • If you see agent -> tools -> agent -> tools repeating with no end, your stop condition is missing or never satisfied
  3. Run with a low recursion limit

    • Use config={"recursion_limit": 5}
    • A fast failure is better than waiting for a long loop in production logs
  4. Test each node independently

    • Call your agent node once and inspect output
    • Call your tool node once and verify it appends a ToolMessage
    • Verify that after one tool round-trip, the next assistant message can terminate
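
The first two steps can be combined by wrapping the router so that every hop is logged. This is a minimal sketch using only the standard library: `log_route` and the `agent.routing` logger name are hypothetical, and `FakeAIMessage` is a stand-in for a real message.

```python
import functools
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("agent.routing")  # hypothetical logger name


@dataclass
class FakeAIMessage:
    """Stand-in for an AI message; only the fields the router reads."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def log_route(router):
    """Wrap a router so each decision is logged with the state it saw."""
    @functools.wraps(router)
    def wrapper(state):
        decision = router(state)
        last = state["messages"][-1]
        logger.info(
            "route=%s last_type=%s tool_calls=%d messages=%d",
            decision,
            type(last).__name__,
            len(getattr(last, "tool_calls", None) or []),
            len(state["messages"]),
        )
        return decision
    return wrapper


@log_route
def route(state):
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else "end"
```

The wrapped function keeps its original name and can be passed to add_conditional_edges like any other router. A healthy run shows the message count growing on every hop; a stuck one repeats the same line.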

Prevention

  • Always design graphs with an explicit terminal path using END.
  • Make routing depend on structured fields like tool_calls, not string matching on message text.
  • Add unit tests that simulate:
    • one normal completion path
    • one tool call path
    • one repeated-tool-call path that must terminate
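
The three test cases above can be sketched without a model or a compiled graph, because a router is a plain function. Everything here is an assumption for illustration: the capped `route`, the `tool_rounds` counter, the `drive` helper that replays scripted agent replies, and the `FakeAIMessage` stand-in. The tests are plain pytest-style functions.

```python
import itertools
from dataclasses import dataclass, field

MAX_TOOL_ROUNDS = 3  # hypothetical cap


@dataclass
class FakeAIMessage:
    """Stand-in for an AI message; only the fields the router reads."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def route(state):
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None) and state["tool_rounds"] < MAX_TOOL_ROUNDS:
        return "tools"
    return "end"


def drive(agent_replies, max_steps=50):
    """Replay scripted agent replies through the router.

    Returns the number of tool rounds taken before the router said "end".
    """
    state = {"messages": [], "tool_rounds": 0}
    for step, reply in enumerate(agent_replies):
        if step >= max_steps:
            raise RuntimeError("router never returned 'end'")
        state["messages"].append(reply)
        if route(state) == "end":
            return state["tool_rounds"]
        state["tool_rounds"] += 1  # pretend the tool node ran
    raise AssertionError("script ended before the router returned 'end'")


ANSWER = FakeAIMessage(content="Here is the policy.")
CALL = FakeAIMessage(tool_calls=[{"name": "search", "args": {}, "id": "call_1"}])


def test_normal_completion():
    assert drive([ANSWER]) == 0


def test_single_tool_call():
    assert drive([CALL, ANSWER]) == 1


def test_repeated_tool_calls_terminate():
    # The agent asks for the tool forever; the cap must end the run
    assert drive(itertools.repeat(CALL)) == MAX_TOOL_ROUNDS
```

If the cap is removed from `route`, the third test fails with the RuntimeError from `drive`, which is the test-time equivalent of hitting the recursion limit in production.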

If you want one rule to remember: every loop in LangGraph needs a guaranteed exit. If your agent can only bounce between nodes and never produce a final answer or hit END, you will get GraphRecursionError sooner or later.


By Cyprian Aarons, AI Consultant at Topiax.
