How to Fix 'agent infinite loop in production' in LangGraph (TypeScript)
If you’re seeing "agent infinite loop in production" in LangGraph, it usually means your graph keeps routing back into the same node without ever reaching a terminal state. In practice, this shows up when a conditional edge always returns the same branch, or when your agent node keeps asking the model to call tools forever.
In LangGraph TypeScript, this is almost always a state transition bug, not an LLM bug. The graph is doing exactly what you told it to do.
The Most Common Cause
The #1 cause is a missing exit condition in your agent/tool loop.
A common pattern is: agent -> tools -> agent -> tools ... with no reliable stop signal. If the model keeps emitting tool calls, and your routing logic always sends control back to tools, LangGraph eventually hits its recursion limit and throws an error like:
- Error: GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition
- or Error: Exceeded max iterations in RunnableSequence
- or a custom production wrapper message like: agent infinite loop in production
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Routes back to tools whenever the last AI message exists | Routes to END when there are no pending tool calls |
| No termination check | Explicit termination check on tool_calls.length |
| Agent always re-enters itself | Agent only loops when tools are actually needed |
// BROKEN
// Assumes a LangGraph version that exports MessagesAnnotation and START.
import { StateGraph, MessagesAnnotation, START, END } from "@langchain/langgraph";
import { AIMessage } from "@langchain/core/messages";

type State = typeof MessagesAnnotation.State;

function shouldContinue(state: State) {
  const last = state.messages[state.messages.length - 1];
  // This is too loose. Any AIMessage sends you back into the loop.
  return last instanceof AIMessage ? "tools" : END;
}

// Placeholder nodes: a real agent node calls the model, a real tools node runs tools.
const graph = new StateGraph(MessagesAnnotation)
  .addNode("agent", async (state) => ({ messages: state.messages }))
  .addNode("tools", async (state) => ({ messages: state.messages }))
  .addEdge(START, "agent")
  .addConditionalEdges("agent", shouldContinue, {
    tools: "tools",
    [END]: END,
  })
  .addEdge("tools", "agent");
// FIXED
// Assumes a LangGraph version that exports MessagesAnnotation and START.
import { StateGraph, MessagesAnnotation, START, END } from "@langchain/langgraph";
import { AIMessage } from "@langchain/core/messages";

type State = typeof MessagesAnnotation.State;

function shouldContinue(state: State) {
  // AIMessage already declares an optional tool_calls array.
  const last = state.messages[state.messages.length - 1] as AIMessage | undefined;
  // Only loop if the model actually requested tools.
  if (last?.tool_calls && last.tool_calls.length > 0) return "tools";
  return END;
}

// Placeholder nodes: a real agent node calls the model, a real tools node runs tools.
const graph = new StateGraph(MessagesAnnotation)
  .addNode("agent", async (state) => ({ messages: state.messages }))
  .addNode("tools", async (state) => ({ messages: state.messages }))
  .addEdge(START, "agent")
  .addConditionalEdges("agent", shouldContinue, {
    tools: "tools",
    [END]: END,
  })
  .addEdge("tools", "agent");
The important part is that AIMessage existing is not enough. You need to check whether the model is still requesting tool execution.
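To see the difference without running a model, here is a dependency-free sketch that simulates the agent -> tools loop. Nothing in it is LangGraph API: `SimMsg`, `fakeModel`, and `runLoop` are hypothetical stand-ins, and the string "__end__" stands in for END. Only the two routers mirror the broken and fixed examples above.

```typescript
// Hypothetical stand-ins for LangChain message types, for illustration only.
type SimToolCall = { name: string; args: unknown; id: string };
type SimMsg = { role: "ai" | "tool"; content: string; tool_calls?: SimToolCall[] };

const END_SENTINEL = "__end__"; // stands in for LangGraph's END

// Loose router, as in the broken example: any AI message loops.
const looseRouter = (messages: SimMsg[]): string =>
  messages[messages.length - 1]?.role === "ai" ? "tools" : END_SENTINEL;

// Strict router, as in the fixed example: loop only on pending tool calls.
const strictRouter = (messages: SimMsg[]): string => {
  const last = messages[messages.length - 1];
  return (last?.tool_calls?.length ?? 0) > 0 ? "tools" : END_SENTINEL;
};

// Scripted model: requests one tool on its first turn, then answers in plain text.
function fakeModel(messages: SimMsg[]): SimMsg {
  const alreadyUsedTool = messages.some((m) => m.role === "tool");
  return alreadyUsedTool
    ? { role: "ai", content: "final answer" }
    : { role: "ai", content: "", tool_calls: [{ name: "lookup", args: {}, id: "call_1" }] };
}

// Runs agent -> router -> tools until the router returns END_SENTINEL
// or the step limit is exhausted.
function runLoop(router: (m: SimMsg[]) => string, limit: number) {
  const messages: SimMsg[] = [];
  for (let step = 1; step <= limit; step++) {
    messages.push(fakeModel(messages));
    if (router(messages) === END_SENTINEL) return { finished: true, steps: step };
    messages.push({ role: "tool", content: "done" });
  }
  return { finished: false, steps: limit };
}

console.log(runLoop(looseRouter, 25)); // { finished: false, steps: 25 }
console.log(runLoop(strictRouter, 25)); // { finished: true, steps: 2 }
```

The model behaves identically in both runs. Only the router differs, which is why the loose version burns through the whole step budget while the strict one stops after a single tool round-trip.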
Other Possible Causes
1) Your tool node never clears or updates state
If your tool executor appends messages but never resolves the action that triggered it, the agent sees the same unresolved context and repeats itself.
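// Bad: the hard-coded tool_call_id never matches the call the model actually made
.addNode("tools", async (state) => ({
  messages: [...state.messages, new ToolMessage({ content: "done", tool_call_id: "1" })],
}));
Fix by making sure the tool result produces a message that allows the agent to stop or continue correctly. One likely fix is to answer every pending tool call with a ToolMessage that carries that call's own id:
.addNode("tools", async (state) => {
  const last = state.messages.at(-1) as AIMessage;
  // One ToolMessage per requested call, keyed by the call's own id.
  const results = (last.tool_calls ?? []).map(
    (call) => new ToolMessage({ content: "done", tool_call_id: call.id ?? "" })
  );
  return { messages: [...state.messages, ...results] };
});
Even then, the fix is usually less about the tool output itself than about how your router interprets it afterward.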
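One way to check whether tool results actually resolve the calls that triggered them is to compare ids across the message history. The sketch below is dependency-free: `TraceMsg` and `unresolvedToolCallIds` are hypothetical names, modelled on the tool_calls and tool_call_id fields, and are not part of LangGraph.

```typescript
// Hypothetical message shape mirroring the tool_calls / tool_call_id fields.
type TraceMsg =
  | { role: "ai"; tool_calls?: { id: string; name: string }[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Returns the ids of tool calls that no tool message has answered.
function unresolvedToolCallIds(messages: TraceMsg[]): string[] {
  const answered = new Set<string>();
  for (const m of messages) {
    if (m.role === "tool") answered.add(m.tool_call_id);
  }
  const pending: string[] = [];
  for (const m of messages) {
    if (m.role !== "ai") continue;
    for (const call of m.tool_calls ?? []) {
      if (!answered.has(call.id)) pending.push(call.id);
    }
  }
  return pending;
}

const history: TraceMsg[] = [
  { role: "ai", tool_calls: [{ id: "call_abc", name: "lookup" }] },
  { role: "tool", tool_call_id: "1", content: "done" }, // id matches no call
];

console.log(unresolvedToolCallIds(history)); // [ 'call_abc' ]
```

A non-empty result after the tools node has run suggests the agent will keep seeing an unanswered request, which is one plausible reason for it to repeat the same call.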
2) Your router ignores empty or malformed tool calls
Some models return empty arrays, partial calls, or malformed metadata under load. An empty array is still truthy in JavaScript, so if your code treats a truthy tool_calls value as “needs tools,” you get loops.
function route(state: State) {
const last = state.messages.at(-1) as any;
return last.tool_calls ? "tools" : END;
}
Make this stricter:
function route(state: State) {
  const last = state.messages.at(-1) as (AIMessage & { tool_calls?: unknown[] }) | undefined;
  // Optional chaining also covers an empty message list.
  return Array.isArray(last?.tool_calls) && last.tool_calls.length > 0 ? "tools" : END;
}
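If partial or malformed entries are a real concern, the check can go one step further and validate each call before routing. This is a sketch under assumptions, not LangGraph API: `isUsableToolCall` and `routeStrict` are hypothetical helpers, "__end__" stands in for END, and requiring a non-empty id is an assumption you may need to relax if your provider omits ids.

```typescript
const END_NODE = "__end__"; // stand-in for LangGraph's END

// A call counts as usable only if it is an object with a non-empty
// string name and a non-empty string id (the id rule is an assumption).
function isUsableToolCall(call: unknown): call is { name: string; id: string } {
  if (typeof call !== "object" || call === null) return false;
  const c = call as Record<string, unknown>;
  return (
    typeof c.name === "string" && c.name.length > 0 &&
    typeof c.id === "string" && c.id.length > 0
  );
}

// Routes to "tools" only when tool_calls is a non-empty array of usable calls.
function routeStrict(last: { tool_calls?: unknown } | undefined): "tools" | typeof END_NODE {
  const calls = last?.tool_calls;
  if (!Array.isArray(calls) || calls.length === 0) return END_NODE;
  return calls.every(isUsableToolCall) ? "tools" : END_NODE;
}

console.log(routeStrict({ tool_calls: [] })); // __end__
console.log(routeStrict({ tool_calls: [{ name: "lookup", id: "call_1" }] })); // tools
```

Ending the run on a malformed call is a design choice made here for brevity. Routing to a dedicated error-handling node instead would keep the failure visible rather than silent.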
3) You accidentally created a cycle with unconditional edges
This happens when you combine conditional edges with an unconditional edge from the same node. Both edges apply, so the unconditional one keeps sending control to tools even when your router returns END.
graph.addConditionalEdges("agent", route, {
tools: "tools",
[END]: END,
});
// This can trap you in a cycle if added carelessly.
graph.addEdge("agent", "tools");
Remove the unconditional edge unless you really need it.
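A small lint over your own edge list can catch this before the graph is built. The sketch below assumes you keep a plain description of your edges alongside the graph definition: `EdgeSpec` and `nodesWithMixedEdges` are hypothetical and do not inspect a real LangGraph object.

```typescript
// Hypothetical description of a graph's edges, kept separate from LangGraph itself.
type EdgeSpec =
  | { kind: "fixed"; from: string; to: string }
  | { kind: "conditional"; from: string; targets: string[] };

// Returns the nodes that have both a conditional and a fixed edge leaving them.
function nodesWithMixedEdges(edges: EdgeSpec[]): string[] {
  const fixed = new Set<string>();
  const conditional = new Set<string>();
  for (const e of edges) {
    if (e.kind === "fixed") fixed.add(e.from);
    else conditional.add(e.from);
  }
  return Array.from(fixed).filter((node) => conditional.has(node));
}

const edges: EdgeSpec[] = [
  { kind: "conditional", from: "agent", targets: ["tools", "__end__"] },
  { kind: "fixed", from: "agent", to: "tools" }, // the accidental extra edge
  { kind: "fixed", from: "tools", to: "agent" },
];

console.log(nodesWithMixedEdges(edges)); // [ 'agent' ]
```

Mixing both edge kinds on one node is sometimes intentional, so treat a hit as a prompt to double-check the graph rather than as a hard error.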
4) Your max recursion limit is too high for broken logic
Raising limits does not fix loops. It just makes them more expensive.
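const app = graph.compile();
// recursionLimit is a per-run config option, so it is passed at invoke time.
const result = await app.invoke({ messages }, { recursionLimit: 100 });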
If production logs show repeated transitions like agent -> tools -> agent -> tools, lower this during debugging so failures surface faster.
How to Debug It
- Log every node transition
  - Print node name, step count, and last message type.
  - If you see the same path repeating, your router is wrong.
- Inspect the final AI message
  - Check whether tool_calls is present and non-empty.
  - Don’t assume every AIMessage means “call tools.”
- Temporarily remove all tools
  - Run the agent with only one model call.
  - If looping stops, your issue is in tool routing or tool output handling.
- Set a low recursion limit
  - Use something like recursionLimit: 5.
  - That makes the failure deterministic and easier to reproduce locally.
Example debug wrapper:
const result = await app.invoke(
{ messages },
{ recursionLimit: 5 }
);
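For the transition-logging step, one option is to wrap each node function before registering it. This is a minimal sketch, not a LangGraph feature: `createTransitionLogger`, `DebugState`, and `isRepeating` are hypothetical names, and the state shape is simplified to messages with a `type` field.

```typescript
// Hypothetical, simplified state shape; real state holds LangChain message objects.
type DebugState = { messages: { type: string }[] };
type NodeFn = (state: DebugState) => Partial<DebugState> | Promise<Partial<DebugState>>;
type Transition = { step: number; node: string; lastType: string };

// Wraps node functions so every call records node name, step count,
// and the type of the last message before the node runs.
function createTransitionLogger() {
  const transitions: Transition[] = [];

  const wrap = (node: string, fn: NodeFn): NodeFn => (state) => {
    const last = state.messages[state.messages.length - 1];
    transitions.push({ step: transitions.length + 1, node, lastType: last?.type ?? "none" });
    return fn(state);
  };

  // True when the last `window` node names repeat the `window` names before them.
  const isRepeating = (window: number): boolean => {
    if (transitions.length < window * 2) return false;
    const names = transitions.map((t) => t.node);
    const recent = names.slice(-window).join(">");
    const previous = names.slice(-window * 2, -window).join(">");
    return recent === previous;
  };

  return { wrap, transitions, isRepeating };
}
```

In use, you would register something like `logger.wrap("agent", agentNode)` instead of the bare node function, adapting the types to your real state, and then print `logger.transitions` when a run fails. A path such as agent, tools, agent, tools makes `isRepeating(2)` return true.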
If you get GraphRecursionError in production but not locally, compare:
- model version
- prompt differences
- tool schema differences
- production middleware that mutates messages
Prevention
- Always route on explicit tool-call presence, not just message type.
- Add a hard stop condition after a maximum number of agent/tool turns.
- Write one integration test that asserts:
  - one user request
  - at most one tool round-trip
  - final state reaches END
A simple guard helps:
function safeRoute(state: { messages: any[]; turns?: number }) {
if ((state.turns ?? 0) >= 8) return END;
const last = state.messages.at(-1);
return last?.tool_calls?.length ? "tools" : END;
}
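The guard only works if something increments `turns`, which the snippet above does not show. Here is a dependency-free sketch of one way to do that, with the agent step bumping the counter on every model call. `GuardState`, `agentStep`, `guardedRoute`, and `simulateStuckModel` are hypothetical names, and "__end__" stands in for END.

```typescript
const STOP = "__end__"; // stand-in for LangGraph's END
const MAX_TURNS = 8; // same cap as the guard above

type GuardState = { messages: { tool_calls?: unknown[] }[]; turns: number };

// Agent step: appends the model's reply and counts the turn.
function agentStep(state: GuardState, reply: { tool_calls?: unknown[] }): GuardState {
  return { messages: [...state.messages, reply], turns: state.turns + 1 };
}

// Same logic as the guard above, with the cap pulled into a constant.
function guardedRoute(state: GuardState): "tools" | typeof STOP {
  if (state.turns >= MAX_TURNS) return STOP;
  const last = state.messages[state.messages.length - 1];
  return last?.tool_calls?.length ? "tools" : STOP;
}

// Simulates a model that never stops requesting tools and
// returns the number of turns taken before the guard ends the run.
function simulateStuckModel(): number {
  let state: GuardState = { messages: [], turns: 0 };
  for (;;) {
    state = agentStep(state, { tool_calls: [{ name: "lookup" }] });
    if (guardedRoute(state) === STOP) return state.turns;
  }
}

console.log(simulateStuckModel()); // 8
```

In a real graph, the agent node would return the incremented `turns` value as part of its state update. Note that this guard ends the run quietly, so it is worth logging when the cap is what stopped the agent.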
If you build LangGraph agents for production systems, treat loops as a routing bug first and an LLM behavior second. In almost every case I’ve seen, fixing the conditional edge logic removes the “infinite loop” error completely.
By Cyprian Aarons, AI Consultant at Topiax.