How to Fix 'token limit exceeded' in CrewAI (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: token-limit-exceeded, crewai, python

What the error means

token limit exceeded in CrewAI usually means one of your agents tried to send too much text to the LLM in a single request. In practice, this shows up when you stuff large documents, long chat history, or verbose tool output into an agent prompt.

You’ll typically hit it when chaining tasks, passing raw PDFs/HTML, or letting an agent keep too much memory across turns. The failure often appears after CrewAI builds the final prompt for LLM.call() or during a tool-backed task where the context ballooned.

The Most Common Cause

The #1 cause is passing large untrimmed input directly into a task description or expected output. CrewAI then concatenates that text with agent instructions, memory, and tool context until the model rejects it.

Here’s the broken pattern:

Broken: injects full document text into the prompt. Fixed: pre-summarizes or chunks the document.
Broken: lets the agent process everything at once. Fixed: passes only relevant excerpts.
Broken: causes token limit exceeded in LLM.call(). Fixed: keeps prompts bounded.
# Broken: raw document dumped into task description
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Analyze the document",
    backstory="You are precise and concise."
)

large_text = open("claims_policy.txt").read()

task = Task(
    description=f"""
    Analyze this policy and extract exclusions:

    {large_text}
    """,
    expected_output="A list of exclusions"
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)

The fix is to reduce what you send into the task. Chunk first, summarize first, or extract only relevant sections before calling CrewAI.

# Fixed: trim input before handing it to CrewAI
from crewai import Agent, Task, Crew
from textwrap import shorten

researcher = Agent(
    role="Researcher",
    goal="Analyze the document",
    backstory="You are precise and concise."
)

large_text = open("claims_policy.txt").read()

# Keep only a bounded slice for the prompt (~4000 chars is roughly ~1000 tokens).
# Note: textwrap.shorten collapses all whitespace, including newlines.
excerpt = shorten(large_text, width=4000, placeholder=" ...[truncated]...")

task = Task(
    description=f"""
    Analyze this policy excerpt and extract exclusions:

    {excerpt}
    """,
    expected_output="A list of exclusions"
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)

If the source is truly large, do not truncate blindly. Split it into chunks and run multiple tasks over smaller windows, then merge results in Python.
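A minimal sketch of that chunk-and-merge approach (the window size and overlap here are arbitrary assumptions you should tune, and the per-chunk Task call is indicated only in comments):

```python
def chunk_text(text: str, size: int = 4000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows so that no single
    task description blows past the prompt budget."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# Each chunk then becomes its own Task (one Crew.kickoff() per chunk),
# and the partial answers are merged back together in plain Python:
#   partials = [run_task_on(chunk) for chunk in chunk_text(large_text)]
#   merged = "\n".join(partials)
```

The overlap keeps sentences that straddle a chunk boundary visible to both neighboring tasks, so exclusions near a cut point are not silently lost.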

Other Possible Causes

1. Agent memory is too large

If you enable memory and keep long conversations around, every new task inherits old context. That's fine for short sessions but risky for production workflows with many repeated iterations.

from crewai import Crew

crew = Crew(
    agents=[researcher],
    tasks=[task],
    memory=True  # can grow until token limits break
)

Fix by disabling memory for batch jobs or resetting state between runs.

crew = Crew(
    agents=[researcher],
    tasks=[task],
    memory=False
)

2. Tool output is too verbose

A tool that returns raw HTML, full API payloads, or entire database rows can blow up your prompt on the next step. This is common when using custom tools with @tool or returning unfiltered JSON.

def search_claims(query: str):
    # Broken: dumps the entire raw payload into the agent's context
    return open("search_dump.json").read()

Return only what the agent needs.

def search_claims(query: str):
    results = load_results(query)
    # Forward only the fields the agent needs, for the top 5 hits
    return [
        {"id": r["id"], "title": r["title"], "snippet": r["snippet"]}
        for r in results[:5]
    ]

3. Task chaining accumulates too much context

Each task may include previous outputs as input to later tasks. If every task returns a long essay instead of a structured summary, later prompts get bloated fast.

task1 = Task(description="Summarize this report...", expected_output="Long summary")
task2 = Task(description="Use previous summary to find risks...", expected_output="Detailed analysis")

Make outputs compact and structured.

task1 = Task(
    description="Summarize this report into 5 bullets.",
    expected_output="5 bullets with key facts only"
)
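One way to enforce that compression between tasks, sketched in plain Python (the JSON payload shape is an assumption for illustration, not a CrewAI convention):

```python
import json

def to_compact_context(raw_output: str, max_items: int = 5) -> str:
    """Turn a task's free-form bullet output into a small JSON payload
    before injecting it into the next task's description."""
    bullets = [
        line.strip("-•* ").strip()
        for line in raw_output.splitlines()
        if line.strip()
    ]
    return json.dumps({"facts": bullets[:max_items]})
```

Feeding step 2 a payload like this instead of step 1's raw prose keeps the chained prompt size roughly constant no matter how verbose the model gets.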

4. Model context window is too small for your workload

Sometimes the code is fine and the model choice is not. If you’re using a smaller-context model through LLM, even moderate prompts can fail once system instructions and tool traces are added.

from crewai import LLM

llm = LLM(model="gpt-3.5-turbo")  # ~16k-token context window

Switch to a larger-context model when your workflow genuinely needs it.

llm = LLM(model="gpt-4o")  # ~128k-token context window

How to Debug It

  1. Print the exact prompt size

    • Log every task description, tool result, and memory payload before kickoff.
    • If one field is huge, that’s your culprit.
  2. Disable features one at a time

    • Run with memory=False.
    • Remove tools.
    • Replace multi-step crews with a single task.
    • The component that makes the error disappear is usually the cause.
  3. Inspect task outputs between steps

    • Look at what each Task returns.
    • If step 1 outputs pages of text and step 2 reuses it verbatim, you found the buildup point.
  4. Check model limits

    • Verify which LLM(model=...) you’re using.
    • A prompt that fits on one model may fail on another with a smaller context window.
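Step 1 above can be as simple as a rough character-based estimate (~4 characters per token is a reasonable rule of thumb for English text; the budget number is an assumption you should set for your model):

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def audit_prompt_parts(parts: dict[str, str], budget: int = 8000) -> None:
    """Print each prompt component's approximate size before kickoff,
    flagging anything that dominates the budget."""
    for label, text in parts.items():
        tokens = approx_tokens(text)
        flag = "  <-- likely culprit" if tokens > budget else ""
        print(f"{label}: ~{tokens} tokens{flag}")

# audit_prompt_parts({
#     "task.description": task.description,
#     "tool output": tool_result,
# })
```

An approximation is enough here: you are looking for the one field that is 10x larger than everything else, not an exact count.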

Prevention

  • Keep agent inputs bounded.

    • Summarize documents before sending them into CrewAI.
    • Never pass raw PDFs, logs, or API dumps directly into prompts.
  • Make every tool return narrow data.

    • Return top-N matches, short snippets, or structured fields.
    • Do not return entire records unless absolutely necessary.
  • Design tasks for compression.

    • Ask for bullets, tables, IDs, or JSON.
    • Avoid “analyze everything” prompts that encourage long outputs.

If you treat token budget as a hard constraint from day one, this error stops being random. In most CrewAI projects, fixing prompt size solves it faster than changing models or rewriting agents.


By Cyprian Aarons, AI Consultant at Topiax.
