How to Fix 'token limit exceeded' in CrewAI (Python)
What the error means
token limit exceeded in CrewAI usually means one of your agents tried to send too much text to the LLM in a single request. In practice, this shows up when you stuff large documents, long chat history, or verbose tool output into an agent prompt.
You’ll typically hit it when chaining tasks, passing raw PDFs/HTML, or letting an agent keep too much memory across turns. The failure often appears after CrewAI builds the final prompt for LLM.call() or during a tool-backed task where the context ballooned.
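Before trimming anything, it helps to measure how close a prompt is. A rough rule of thumb for English text is about four characters per token; for exact counts you would use the model's own tokenizer (for example the tiktoken package for OpenAI models). A minimal sketch with hypothetical helper names:

```python
# Rough token estimate: ~4 characters per token for English text.
# For exact counts, use a real tokenizer (e.g. tiktoken for OpenAI models);
# this heuristic is only meant to spot an obviously oversized prompt.

def estimate_tokens(text: str) -> int:
    """Cheap approximation of the token count of text."""
    return max(1, len(text) // 4)

def fits_budget(prompt: str, budget: int = 8000) -> bool:
    """True if the prompt is likely under budget tokens."""
    return estimate_tokens(prompt) <= budget
```

Calling a check like this on your assembled task description before kickoff tells you immediately whether the limit error is plausible.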
The Most Common Cause
The #1 cause is passing large untrimmed input directly into a task description or expected output. CrewAI then concatenates that text with agent instructions, memory, and tool context until the model rejects it.
Here’s the broken pattern:
| Broken | Fixed |
|---|---|
| Injects full document text into the prompt | Pre-summarizes or chunks the document |
| Lets the agent process everything at once | Passes only relevant excerpts |
| Causes token limit exceeded in LLM.call() | Keeps prompts bounded |
```python
# Broken: raw document dumped into the task description
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Analyze the document",
    backstory="You are precise and concise.",
)

large_text = open("claims_policy.txt").read()

task = Task(
    description=f"""
    Analyze this policy and extract exclusions:
    {large_text}
    """,
    expected_output="A list of exclusions",
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)
```
The fix is to reduce what you send into the task. Chunk first, summarize first, or extract only relevant sections before calling CrewAI.
```python
# Fixed: trim input before handing it to CrewAI
from crewai import Agent, Task, Crew
from textwrap import shorten

researcher = Agent(
    role="Researcher",
    goal="Analyze the document",
    backstory="You are precise and concise.",
)

with open("claims_policy.txt") as f:
    large_text = f.read()

# Keep only a bounded slice for the prompt.
# Note: shorten() collapses runs of whitespace, so line breaks are lost;
# use a plain slice (large_text[:4000]) instead if the layout matters.
excerpt = shorten(large_text, width=4000, placeholder="\n...[truncated]...")

task = Task(
    description=f"""
    Analyze this policy excerpt and extract exclusions:
    {excerpt}
    """,
    expected_output="A list of exclusions",
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)
```
If the source is truly large, do not truncate blindly. Split it into chunks and run multiple tasks over smaller windows, then merge results in Python.
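The splitting step can be plain Python. Here is one sketch of a chunker that respects paragraph boundaries; you would run one Task per chunk and merge the results yourself (chunk_text and the size limit are illustrative, not a CrewAI API):

```python
def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring
    paragraph boundaries so sentences stay intact. A single
    paragraph longer than max_chars is kept whole."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = ""
        current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then becomes the input to its own bounded task, and the per-chunk outputs are merged in ordinary Python rather than in one giant prompt.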
Other Possible Causes
1. Agent memory is too large
If you enable memory and keep long conversations around, every new task inherits old context. That’s fine for short sessions but risky for production workflows with repeated iterations, where the accumulated context keeps growing.
```python
from crewai import Crew

crew = Crew(
    agents=[researcher],
    tasks=[task],
    memory=True,  # can grow until token limits break
)
```
Fix by disabling memory for batch jobs or resetting state between runs.
```python
crew = Crew(
    agents=[researcher],
    tasks=[task],
    memory=False,
)
```
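If you do need memory-like context across runs, bound it yourself before it reaches a prompt. This sliding-window trimmer is a hypothetical helper, not a CrewAI API; it keeps only the most recent messages that fit a character budget:

```python
def trim_history(messages: list[str], max_chars: int = 6000) -> list[str]:
    """Keep the most recent messages whose combined length fits
    within max_chars; older context is dropped first."""
    kept, total = [], 0
    for msg in reversed(messages):
        if total + len(msg) > max_chars:
            break
        kept.append(msg)
        total += len(msg)
    return list(reversed(kept))
```

Applying a bound like this between runs gives you the benefit of recent context without the unbounded growth.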
2. Tool output is too verbose
A tool that returns raw HTML, full API payloads, or entire database rows can blow up your prompt on the next step. This is common when using custom tools with @tool or returning unfiltered JSON.
```python
# Broken: the tool hands the agent a huge unfiltered payload
def search_claims(query: str):
    return open("search_dump.json").read()  # huge payload
```
Return only what the agent needs.
```python
# Fixed: return only the fields the agent actually needs
def search_claims(query: str):
    results = load_results(query)  # your own search backend
    return [
        {"id": r["id"], "title": r["title"], "snippet": r["snippet"]}
        for r in results[:5]
    ]
```
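For tools you don't fully control, a generic guard can clamp every string field so no single result dominates the prompt. A sketch, where compact_payload is a hypothetical helper:

```python
def compact_payload(record: dict, max_field_chars: int = 200) -> dict:
    """Return a copy of record with long string values truncated
    so one tool result cannot blow up the next prompt."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str) and len(value) > max_field_chars:
            out[key] = value[:max_field_chars] + "...[truncated]"
        else:
            out[key] = value
    return out
```

Wrapping each tool's return value in a guard like this keeps verbose upstream APIs from silently inflating your context.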
3. Task chaining accumulates too much context
Each task may include previous outputs as input to later tasks. If every task returns a long essay instead of a structured summary, later prompts get bloated fast.
```python
task1 = Task(description="Summarize this report...", expected_output="Long summary")
task2 = Task(description="Use previous summary to find risks...", expected_output="Detailed analysis")
```
Make outputs compact and structured.
```python
task1 = Task(
    description="Summarize this report into 5 bullets.",
    expected_output="5 bullets with key facts only",
)
```
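You can also enforce that contract in code by clamping an upstream output before interpolating it into the next task's description. A sketch with a hypothetical bounded_context helper:

```python
def bounded_context(previous_output: str, max_chars: int = 1500) -> str:
    """Clamp an upstream task's output before it is interpolated
    into the next task's description."""
    if len(previous_output) <= max_chars:
        return previous_output
    return previous_output[:max_chars] + "\n...[output truncated]..."

# Hypothetical wiring: build task 2's input from the clamped summary.
# summary = str(task1_result)
# task2 = Task(
#     description=f"Use this summary to find risks:\n{bounded_context(summary)}",
#     expected_output="Top 5 risks as bullets",
# )
```

Even if a model ignores the "5 bullets" instruction, the clamp guarantees later prompts stay bounded.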
4. Model context window is too small for your workload
Sometimes the code is fine and the model choice is not. If you’re using a smaller-context model through LLM, even moderate prompts can fail once system instructions and tool traces are added.
```python
from crewai import LLM

llm = LLM(model="gpt-3.5-turbo")  # ~16K-token context window
```
Switch to a larger-context model when your workflow genuinely needs it.
```python
llm = LLM(model="gpt-4o")  # ~128K-token context window
```
How to Debug It
- Print the exact prompt size. Log every task description, tool result, and memory payload before kickoff. If one field is huge, that's your culprit.
- Disable features one at a time. Run with memory=False, remove tools, and replace multi-step crews with a single task. The component that makes the error disappear is usually the cause.
- Inspect task outputs between steps. Look at what each Task returns. If step 1 outputs pages of text and step 2 reuses it verbatim, you've found the buildup point.
- Check model limits. Verify which LLM(model=...) you're using. A prompt that fits on one model may fail on another with a smaller context window.
Prevention
- Keep agent inputs bounded. Summarize documents before sending them into CrewAI, and never pass raw PDFs, logs, or API dumps directly into prompts.
- Make every tool return narrow data. Return top-N matches, short snippets, or structured fields; do not return entire records unless absolutely necessary.
- Design tasks for compression. Ask for bullets, tables, IDs, or JSON, and avoid "analyze everything" prompts that encourage long outputs.
If you treat token budget as a hard constraint from day one, this error stops being random. In most CrewAI projects, fixing prompt size solves it faster than changing models or rewriting agents.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.