How to Fix 'agent infinite loop when scaling' in CrewAI (Python)
What the error means
The 'agent infinite loop when scaling' problem in CrewAI usually means one of your agents is stuck re-running tasks without ever reaching a valid stop condition. You’ll see it most often when you scale from a single task to multiple agents, add delegation, or let an agent call tools that keep returning incomplete outputs.
In practice, it’s almost always a control-flow bug: bad task design, missing stop criteria, or an agent that keeps asking for the same missing info.
The Most Common Cause
The #1 cause is an agent being allowed to delegate or retry indefinitely because the task prompt is vague and the output format is not constrained.
A common failure mode looks like this:
| Broken pattern | Fixed pattern |
|---|---|
| Agent can delegate forever | Agent has bounded role and clear stop condition |
| Task asks for “an analysis” | Task asks for a specific deliverable |
| No structured output | Output schema or explicit format |
Broken vs fixed code
# broken.py
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool

researcher = Agent(
    role="Researcher",
    goal="Find everything about the topic",
    backstory="You are very thorough.",
    tools=[SerperDevTool()],
    allow_delegation=True,  # can keep bouncing work around
    verbose=True,
)

writer = Agent(
    role="Writer",
    goal="Write a report",
    backstory="You write detailed reports.",
    allow_delegation=True,
    verbose=True,
)

task = Task(
    description="Research scaling issues and write something useful.",
    expected_output="A good report.",
    agent=researcher,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
# fixed.py
from crewai import Agent, Task, Crew, Process

researcher = Agent(
    role="Researcher",
    goal="Identify the top 3 causes of scaling failures in CrewAI",
    backstory="You produce concise technical findings.",
    allow_delegation=False,  # no handoffs, so no recursive delegation
    verbose=True,
)

writer = Agent(
    role="Writer",
    goal="Summarize findings into a troubleshooting guide",
    backstory="You write production-focused docs.",
    allow_delegation=False,
    verbose=True,
)

task = Task(
    description=(
        "Return exactly 3 causes of CrewAI scaling loops. "
        "For each cause include: symptom, root cause, fix."
    ),
    expected_output=(
        "Three bullet points with symptom/root cause/fix. "
        "No extra commentary."
    ),
    agent=researcher,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
Why this works:
- allow_delegation=False removes one major source of recursive handoffs.
- The task has a finite deliverable.
- The expected output tells the model when it is done.
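One way to make the "finite deliverable" checkable is to validate the result text against the contract before accepting it. The sketch below is plain Python, not a CrewAI API; the name check_report and the bullet-matching rule are assumptions chosen for illustration, matched to the three-bullet contract in fixed.py.

```python
import re

# Fields every bullet must mention, taken from the task description in fixed.py.
REQUIRED_FIELDS = ("symptom", "root cause", "fix")


def check_report(text: str, expected_items: int = 3) -> list:
    """Return a list of contract violations; an empty list means the output is done."""
    problems = []
    # Treat lines starting with '-', '*', or '•' as bullet points.
    bullets = [ln.strip() for ln in text.splitlines() if re.match(r"\s*[-*•]\s+", ln)]
    if len(bullets) != expected_items:
        problems.append(f"expected {expected_items} bullets, got {len(bullets)}")
    for i, bullet in enumerate(bullets, start=1):
        lowered = bullet.lower()
        for field in REQUIRED_FIELDS:
            if field not in lowered:
                problems.append(f"bullet {i} is missing '{field}'")
    return problems
```

An empty list means the output meets the contract. Anything else can be logged or used to fail fast instead of letting the agent retry with no clearer target.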
If you’re using hierarchical workflows, make sure the manager agent also has a strict stopping policy. A manager that keeps reassigning the same unresolved task will produce the same loop.
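A strict stopping policy for a manager can be as simple as a cap on how many times the same task may be handed off. The following is a minimal sketch in plain Python; ReassignmentBudget is a hypothetical helper, not part of CrewAI, and where you call it (for example, in your own orchestration code around the crew) is left as an assumption.

```python
from collections import Counter


class ReassignmentBudget:
    """Tracks how often each task is handed off and refuses once a cap is hit."""

    def __init__(self, max_reassignments: int = 2) -> None:
        self.max_reassignments = max_reassignments
        self._counts = Counter()

    def allow(self, task_id: str) -> bool:
        """Record one handoff attempt; return False once the task has used its budget."""
        if self._counts[task_id] >= self.max_reassignments:
            return False
        self._counts[task_id] += 1
        return True


budget = ReassignmentBudget(max_reassignments=2)
for attempt in range(4):
    if not budget.allow("draft-report"):
        print(f"attempt {attempt}: budget exhausted, stop and surface the failure")
        break
```

The point of the design is that exhaustion is an explicit outcome: the caller stops and reports the unresolved task rather than reassigning it again.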
Other Possible Causes
1) Tool returns ambiguous results
If your tool returns partial data like "try again" or an empty string, the agent may keep calling it.
# bad tool behavior
def lookup_customer(customer_id: str) -> str:
    return "not found"  # too vague for an agent to terminate cleanly
Fix it by returning structured output:
def lookup_customer(customer_id: str) -> dict:
    return {
        "found": False,
        "customer_id": customer_id,
        "reason": "No record in CRM",
    }
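If the tool has to hand its result back as text, the same structure can be kept by serializing it to JSON. The sketch below assumes a hypothetical LookupResult dataclass and adds a retryable field that is not in the example above; it is there only to show how a tool can state outright that another call will not help.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LookupResult:
    found: bool
    customer_id: str
    reason: str
    retryable: bool  # hypothetical field: whether another attempt could succeed


def to_tool_output(result: LookupResult) -> str:
    """Serialize the result as JSON so the status fields survive as text."""
    return json.dumps(asdict(result), sort_keys=True)


not_found = LookupResult(
    found=False,
    customer_id="C-123",  # illustrative ID
    reason="No record in CRM",
    retryable=False,
)
print(to_tool_output(not_found))
```

A definite "found: false, retryable: false" gives the agent a terminal answer to report, where a bare "not found" string leaves room to try again.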
2) Task dependencies create a cycle
This happens when task A depends on B and task B depends on A through context passing.
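# Illustrative only: written in this order, Python raises NameError because
# task_b does not exist yet. In real code the cycle usually appears when
# context is assigned after both tasks have been created.
task_a = Task(description="Draft summary from task B", context=[task_b], agent=agent_a)
task_b = Task(description="Refine based on task A", context=[task_a], agent=agent_b)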
Break the cycle by making one task terminal:
task_a = Task(description="Draft summary", agent=agent_a)
task_b = Task(description="Refine summary from task A", context=[task_a], agent=agent_b)
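Cycles like this are easy to miss once a crew has more than a few tasks, so it can help to check the dependency graph before running anything. The sketch below uses the standard library's graphlib (Python 3.9+) on a plain dictionary of task names; building that dictionary from your own Task objects is left as an assumption, and find_context_cycle is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional


def find_context_cycle(context_graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the task names forming a cycle, or None if the graph is acyclic.

    context_graph maps each task name to the tasks it takes context from.
    """
    try:
        # static_order() raises CycleError when no valid ordering exists.
        list(TopologicalSorter(context_graph).static_order())
    except CycleError as err:
        # The second element of args holds the cycle as a list of nodes.
        return list(err.args[1])
    return None


cyclic = {"task_a": ["task_b"], "task_b": ["task_a"]}
fixed = {"task_a": [], "task_b": ["task_a"]}
print(find_context_cycle(cyclic))
print(find_context_cycle(fixed))
```

Running the check at crew construction time turns a runtime loop into an immediate, readable failure.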
3) Memory is carrying stale context across runs
Persistent memory can make an agent think an unfinished step still needs work.
crew = Crew(
    agents=[agent],
    tasks=[task],
    memory=True,  # can reintroduce old unresolved state
)
Test with memory off first:
crew = Crew(
    agents=[agent],
    tasks=[task],
    memory=False,
)
If disabling memory fixes it, your memory store is feeding back stale or duplicated state.
4) Max iterations is too high or not enforced
CrewAI agents can keep trying if you don’t cap iterations tightly enough.
agent = Agent(
    role="Analyst",
    goal="Finish analysis",
    backstory="...",
    max_iter=25,
)
Drop it while debugging:
agent = Agent(
    role="Analyst",
    goal="Finish analysis",
    backstory="...",
    max_iter=5,
)
If lowering max_iter stops the loop, your prompt or tool flow needs tightening.
How to Debug It
- Turn on verbose logging
  - Set verbose=True on both Agent and Crew.
  - Look for repeated tool calls, repeated “thinking” steps, or the same task being reassigned.
- Disable delegation
  - Set allow_delegation=False on every agent.
  - If the loop disappears, you’ve got recursive handoff logic.
- Remove tools one by one
  - Start with no tools.
  - Add them back individually until the loop returns.
  - The last tool added is usually returning ambiguous output or causing retries.
- Reduce to one task
  - Run a single Task with one Agent.
  - If that works, your issue is in multi-agent context passing or cyclic dependencies.
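Spotting repeated tool calls by eye in verbose logs gets tedious, so a small counter can flag them for you. This is a sketch in plain Python; RepeatCallDetector is a hypothetical helper, and calling record() from inside your own tool functions is an assumption about how you would wire it in.

```python
import json
from collections import Counter
from typing import Any, Dict


class RepeatCallDetector:
    """Flags a tool call once the same name and arguments recur too often."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._seen = Counter()

    def record(self, tool_name: str, arguments: Dict[str, Any]) -> bool:
        """Record one call; return True once this exact call has been seen `threshold` times."""
        # Sorting keys makes argument order irrelevant when comparing calls.
        key = (tool_name, json.dumps(arguments, sort_keys=True, default=str))
        self._seen[key] += 1
        return self._seen[key] >= self.threshold


detector = RepeatCallDetector(threshold=3)
for _ in range(3):
    looping = detector.record("lookup_customer", {"customer_id": "C-123"})
print("loop suspected:", looping)
```

Identical arguments are the useful signal here: an agent that calls the same tool with different inputs is exploring, while one that repeats the same call is stuck.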
A useful debugging checklist:
| Test | What it tells you |
|---|---|
| No delegation | Whether recursion comes from handoffs |
| No tools | Whether tool output is causing retries |
| One task only | Whether your workflow graph has cycles |
| Low max_iter | Whether iteration limits are masking bad prompts |
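The checklist can also be run as a small loop that changes one setting at a time. The sketch below is only a harness: loops_with is a hypothetical callable you would write yourself to build and run your crew with the given overrides and report whether it still loops, and the override keys (including single_task) are illustrative names, not CrewAI parameters you can pass as-is.

```python
from typing import Callable, Dict, List

# One override set per checklist row; keys are illustrative labels.
ABLATIONS: Dict[str, Dict[str, object]] = {
    "no_delegation": {"allow_delegation": False},
    "no_tools": {"tools": []},
    "one_task_only": {"single_task": True},
    "low_max_iter": {"max_iter": 5},
}


def run_checklist(loops_with: Callable[[Dict[str, object]], bool]) -> List[str]:
    """Return the names of the ablations under which the loop no longer occurs."""
    return [name for name, overrides in ABLATIONS.items() if not loops_with(overrides)]


def fake_runner(overrides: Dict[str, object]) -> bool:
    """Stand-in for a real run: pretend the loop is caused by delegation."""
    return overrides.get("allow_delegation", True) is not False


print(run_checklist(fake_runner))
```

Changing one setting per run keeps the result interpretable: if only one ablation stops the loop, that row of the table names the likely cause.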
Prevention
- Keep every task terminal: define exactly what “done” looks like in expected_output.
- Avoid enabling delegation unless you actually need multi-agent handoffs.
- Make tool outputs structured (dict, JSON-like strings, explicit status fields), not free-form text.
- Set conservative max_iter values in production and raise them only when you have evidence they’re needed.
- Test new crews with verbose=True before wiring them into larger pipelines.
If you hit AgentExecutionError, repeated Task retries, or the same crew step looping under Process.sequential, start by removing delegation and tightening output contracts. That fixes most CrewAI infinite-loop failures during scaling.
By Cyprian Aarons, AI Consultant at Topiax.