How to Fix 'agent infinite loop in production' in AutoGen (Python)
What the error means
agent infinite loop in production usually means your AutoGen agents keep handing control back and forth without ever reaching a terminating condition. In practice, you’ll see this when AssistantAgent and UserProxyAgent keep generating replies, or when a tool call path never produces a final answer.
The symptom is usually one of these:
- repeated "Replying as ..." messages
- the same function call being triggered over and over
- no max_turns, no stop condition, or a bad termination message format
The Most Common Cause
The #1 cause is an agent chat with no real termination criteria. In AutoGen, if your assistant can always produce another response and your user proxy keeps auto-replying, the conversation never ends.
Here’s the broken pattern versus the fixed one.
| Broken | Fixed |
|---|---|
| No termination check | Explicit stop condition |
| Unlimited auto-replies | Bounded turns |
| Tool result never marks completion | Tool returns final answer / summary |
# Broken: endless back-and-forth if the assistant never emits a stopping signal
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1000,  # too high, effectively unbounded
)

user_proxy.initiate_chat(
    assistant,
    message="Analyze this claim and keep going until done.",
)
# Fixed: bounded conversation + explicit termination logic
from autogen import AssistantAgent, UserProxyAgent

def is_termination_msg(msg):
    # content can be None or non-text (for example on a tool-call message)
    content = msg.get("content")
    return isinstance(content, str) and content.strip().endswith("TERMINATE")

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,  # hard cap on automatic replies
    is_termination_msg=is_termination_msg,
)

user_proxy.initiate_chat(
    assistant,
    message="Analyze this claim and end with TERMINATE when finished.",
)
If you’re using group chats or multi-agent orchestration, the same rule applies: every path needs a terminal state. Without one, the conversation only stops when a turn cap or your token budget cuts it off.
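Because the termination check is an ordinary function, it can be exercised on sample messages before any agent is wired up. The sketch below is a minimal standalone version. The sample messages are made up for illustration, and it assumes that content which is None or not a string (for example on a message that only carries a tool call) should never count as a stop signal.

```python
def is_termination_msg(msg: dict) -> bool:
    """Return True only when the message text ends with the stop token."""
    content = msg.get("content")
    if not isinstance(content, str):
        # Assumption: None or non-text content is never a stop signal.
        return False
    return content.strip().endswith("TERMINATE")


# Hypothetical sample messages, not real AutoGen output.
samples = [
    {"content": "Claim verified. TERMINATE"},
    {"content": "Claim verified. TERMINATE."},  # trailing period breaks the match
    {"content": None},                          # e.g. a message carrying only a tool call
    {},                                         # missing content key
]

results = [is_termination_msg(m) for m in samples]
print(results)
```

Only the first sample matches. The second one shows how a small formatting difference in the stop token is enough to keep a conversation running.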
Other Possible Causes
1) Your tool function always returns something that triggers another tool call
If the model sees a tool result that looks incomplete, it may call the same tool again.
# Bad: ambiguous tool output
def lookup_policy(claim_id: str):
    return f"Found claim {claim_id}. More details available."

# Good: return a final, structured result
def lookup_policy(claim_id: str):
    return {
        "claim_id": claim_id,
        "status": "closed",
        "final": True,
        "summary": "Claim closed on 2026-04-12"
    }
2) Your system prompt encourages endless refinement
Prompts like “keep improving until perfect” are dangerous in production.
# Bad
system_message = """
You are an insurance assistant.
Keep refining your answer until it is perfect.
"""
# Good
system_message = """
You are an insurance assistant.
Answer once, cite uncertainties clearly, and stop when complete.
If finished, end with TERMINATE.
"""
3) max_turns or equivalent limits are missing in group chat flows
In GroupChatManager, forgetting to cap turns is a classic way to create loops.
from autogen import GroupChat, GroupChatManager
groupchat = GroupChat(agents=[assistant], messages=[], max_round=50)
manager = GroupChatManager(groupchat=groupchat, llm_config={"config_list": [...]})
Make sure you set:
- max_round
- agent-level auto reply limits
- termination checks for each speaker
4) A bad handoff between agents causes ping-pong behavior
This happens when Agent A expects Agent B to finalize, while Agent B sends it back for clarification.
# Bad handoff logic
if response_needs_review:
    return "Please review and send back."

Fix it by making ownership explicit:
- one agent gathers facts
- one agent decides
- one agent terminates

# Better handoff contract
if response_needs_review:
    return {"status": "needs_review", "owner": "human", "final": True}
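One way to keep a handoff contract like this honest is to validate every handoff payload before routing it. The following is a rough sketch, not part of AutoGen: the helper names (validate_handoff, should_stop) and the owner names are hypothetical, and the required fields simply mirror the status/owner/final dictionary used above.

```python
REQUIRED_FIELDS = {"status", "owner", "final"}
ALLOWED_OWNERS = {"gatherer", "decider", "human"}  # hypothetical role names


def validate_handoff(payload):
    """Reject handoffs that do not name an owner and a final flag."""
    if not isinstance(payload, dict):
        raise TypeError("handoff must be a dict, not free text")
    missing = REQUIRED_FIELDS - set(payload)
    if missing:
        raise ValueError(f"handoff is missing fields: {sorted(missing)}")
    if payload["owner"] not in ALLOWED_OWNERS:
        raise ValueError(f"unknown owner: {payload['owner']!r}")
    if not isinstance(payload["final"], bool):
        raise TypeError("'final' must be True or False")
    return payload


def should_stop(payload) -> bool:
    """The agent loop ends as soon as a validated handoff is marked final."""
    return validate_handoff(payload)["final"]


handoff = {"status": "needs_review", "owner": "human", "final": True}
print(should_stop(handoff))
```

A free-text reply such as "Please review and send back." fails validation with a TypeError, which surfaces the ambiguous handoff immediately instead of letting two agents bounce it between them.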
How to Debug It
- Print every message and role
  - Log the raw chat transcript.
  - Look for repeated content or alternating messages with no terminal token.
- Check termination conditions first
  - Verify is_termination_msg is actually called.
  - Confirm your assistant emits the exact string you expect, such as TERMINATE.
- Lower the turn limits
  - Set max_consecutive_auto_reply=3.
  - Set group chat rounds to something small like max_round=10.
  - If the loop stops earlier, you’ve confirmed it was runaway recursion rather than a single bad tool.
- Inspect tool outputs
  - Make sure tools return final data, not vague follow-ups.
  - Search logs for repeated function names like lookup_policy, search_claims, or fetch_customer_profile.
A useful quick test is to disable tools entirely. If the loop disappears, your problem is in function-calling behavior or tool output formatting.
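Scanning a logged transcript for repeats can be automated with a few lines of plain Python. This is a minimal sketch under assumptions: it expects the transcript as a list of dicts with "name" (or "role") and "content" keys, which may not match how your logs are stored, and the function name find_repeated_messages is made up for this example.

```python
from collections import Counter


def find_repeated_messages(transcript, min_repeats=3):
    """Return (speaker, text, count) for messages seen at least min_repeats times."""
    counts = Counter()
    for msg in transcript:
        speaker = msg.get("name") or msg.get("role") or "unknown"
        # Normalize whitespace and case so near-identical replies group together.
        text = " ".join(str(msg.get("content") or "").split()).lower()
        counts[(speaker, text or "<empty>")] += 1
    return [
        (speaker, text, n)
        for (speaker, text), n in counts.most_common()
        if n >= min_repeats
    ]


# Hypothetical transcript for illustration.
transcript = [
    {"name": "assistant", "content": "Let me check the policy."},
    {"name": "user_proxy", "content": ""},
    {"name": "assistant", "content": "Let me check  the policy."},
    {"name": "user_proxy", "content": None},
    {"name": "assistant", "content": "let me check the policy."},
]

repeats = find_repeated_messages(transcript)
print(repeats)
```

Empty replies are counted under an "<empty>" placeholder, since a proxy that keeps sending blank auto-replies is itself a sign of a loop.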
Prevention
- Always define a termination contract:
  - exact stop token like TERMINATE
  - explicit JSON field like "final": true
  - bounded retries and bounded turns
- Treat agent prompts like API contracts:
  - no vague “keep going” language
  - no open-ended refinement loops
  - specify who owns completion
- Add runtime guards in production:
  - conversation length limit
  - repeated-message detector
  - circuit breaker for identical tool calls
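The runtime guards listed above can be combined into one small object that your own orchestration code consults. The sketch below is an illustration only: LoopGuard and LoopGuardTripped are hypothetical names, not AutoGen classes, and the limits are arbitrary. It assumes you call record_message and record_tool_call yourself from wherever your code handles messages and dispatches tools.

```python
import json
from collections import Counter


class LoopGuardTripped(RuntimeError):
    """Raised when a conversation exceeds one of its hard limits."""


class LoopGuard:
    def __init__(self, max_messages=40, max_identical_tool_calls=3):
        self.max_messages = max_messages
        self.max_identical_tool_calls = max_identical_tool_calls
        self.message_count = 0
        self.tool_calls = Counter()

    def record_message(self):
        """Count one message and trip once the conversation is too long."""
        self.message_count += 1
        if self.message_count > self.max_messages:
            raise LoopGuardTripped(f"more than {self.max_messages} messages")

    def record_tool_call(self, name, arguments):
        """Trip when the same tool is called with the same arguments too often."""
        key = (name, json.dumps(arguments, sort_keys=True, default=str))
        self.tool_calls[key] += 1
        if self.tool_calls[key] > self.max_identical_tool_calls:
            raise LoopGuardTripped(f"{name} called with identical arguments too many times")


guard = LoopGuard(max_messages=5, max_identical_tool_calls=2)
tripped = False
try:
    for _ in range(3):
        guard.record_tool_call("lookup_policy", {"claim_id": "C-1"})
except LoopGuardTripped:
    tripped = True
print(tripped)
```

Raising an exception rather than returning a flag means the conversation stops even if the calling code forgets to check the result. The caller can then catch LoopGuardTripped and return a fallback answer or escalate to a human.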
If you build AutoGen systems for insurance or banking workflows, assume every agent can fail to stop unless you force it to. The fix is rarely “better prompting” alone; it’s usually termination logic plus hard limits plus clean tool contracts.
By Cyprian Aarons, AI Consultant at Topiax.