How to Fix 'context length exceeded' in CrewAI (TypeScript)
If you’re seeing context length exceeded in CrewAI TypeScript, the model is being fed more tokens than it can accept in a single request. In practice, this usually happens when you keep appending task output, chat history, or tool results without trimming them.
The error often shows up after a few agent turns, when a long document gets passed into a tool, or when multiple agents share the same growing context. In CrewAI terms, the failure is usually triggered inside an Agent run or a Task execution when the underlying LLM call hits the model’s token limit.
The Most Common Cause
The #1 cause is passing full conversation or task output back into the next step without truncation. In TypeScript projects, this usually looks like building a prompt from previousResult + newInput on every run.
Here’s the broken pattern:
```ts
import { Agent, Task } from "crewai";

const analyst = new Agent({
  name: "Analyst",
  role: "Summarize claims docs",
  goal: "Extract key fields",
  backstory: "You are precise.",
});

const task = new Task({
  description: (input: string) => `
Analyze this claims document and return a summary.

Previous output:
${input}
`,
  agent: analyst,
});

let context = "";
for (const chunk of largeChunks) {
  const result = await task.execute({ input: context + chunk });
  context += result; // keeps growing forever
}
```
This fails because every loop iteration appends more text to `context` and then sends all of it back into the next `task.execute()` call. Eventually you hit something like:

- `Error: context length exceeded`
- `BadRequestError: This model's maximum context length is ...`
- `OpenAIError: InvalidRequestError: maximum context length exceeded`
The fixed pattern is to keep only the minimum state you need:
```ts
import { Agent, Task } from "crewai";

const analyst = new Agent({
  name: "Analyst",
  role: "Summarize claims docs",
  goal: "Extract key fields",
  backstory: "You are precise.",
});

const task = new Task({
  description: (input: string) => `
Analyze this claims document and return only:
- claimant name
- policy number
- loss date
- one-line summary

Document:
${input}
`,
  agent: analyst,
});

for (const chunk of largeChunks) {
  const result = await task.execute({ input: chunk }); // no accumulated history
  await saveResult(result);
}
```
If you need memory, store structured state outside the prompt. Don’t keep re-sending raw transcripts unless you trim them first.
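If you do carry state forward, put a hard budget on it. This is a minimal sketch using the rough ~4-characters-per-token heuristic for English text; `estimateTokens` and `trimToBudget` are hypothetical helpers, not CrewAI APIs, and a real tokenizer (e.g. tiktoken) will be more accurate:

```ts
// Rough heuristic: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep only the most recent tail of accumulated state that fits the budget.
function trimToBudget(history: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  return history.length <= maxChars ? history : history.slice(-maxChars);
}
```

Calling something like `trimToBudget(context, 2000)` before each `task.execute()` keeps the carried state bounded no matter how many chunks you process.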
Other Possible Causes
| Cause | What happens | Typical fix |
|---|---|---|
| Tool output is too large | A tool returns a full PDF, HTML page, or JSON blob into the next agent step | Summarize or truncate tool output before returning it |
| Model window is too small | You’re using a smaller-context model for long documents | Switch to a larger context model |
| Verbose system prompt | Your role, goal, and instructions are bloated | Shorten instructions and remove repeated constraints |
| Multi-agent handoff chains | Each agent receives the full transcript from previous agents | Pass only final artifacts or compact summaries |
1) Tool output explosion
A common issue is returning raw data from a tool:
```ts
const searchTool = {
  name: "search_docs",
  execute: async () => {
    return await fetch("https://example.com/big-report").then(r => r.text());
  },
};
```
Fix it by capping the output, or better, extracting only the sections the agent actually needs:

```ts
const searchTool = {
  name: "search_docs",
  execute: async () => {
    const text = await fetch("https://example.com/big-report").then(r => r.text());
    return text.slice(0, 4000); // crude cap; prefer extracting relevant sections
  },
};
```
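A blind slice can cut mid-sentence and drop the part the agent needs. A slightly smarter sketch, assuming paragraphs separated by blank lines, keeps only paragraphs that mention the query terms (`extractRelevant` is a hypothetical helper, not part of CrewAI):

```ts
// Keep only paragraphs that mention at least one keyword, capped at maxChars.
function extractRelevant(text: string, keywords: string[], maxChars = 4000): string {
  const paragraphs = text.split(/\n{2,}/);
  const hits = paragraphs.filter(p =>
    keywords.some(k => p.toLowerCase().includes(k.toLowerCase()))
  );
  return hits.join("\n\n").slice(0, maxChars);
}
```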
2) Model context window too small
If your crew is configured with a model that has a smaller context window, long prompts fail sooner:
```ts
const llmConfig = {
  model: "gpt-4o-mini", // may be too tight for long transcripts
};
```
Use a larger window where available:
```ts
const llmConfig = {
  model: "gpt-4o",
};
```
3) System prompt bloat
This is easy to miss. A long backstory or repeated policy text can eat thousands of tokens before user input even starts.
```ts
new Agent({
  role: "Claims reviewer",
  goal: "Review claims",
  backstory: `
You are an expert.
Always be concise.
Always be accurate.
Always cite sources.
Always format in JSON.
Always validate inputs.
...`,
});
```
Trim it down to what actually changes behavior.
4) Full transcript handoff between agents
If Agent B receives everything Agent A saw and wrote, token usage compounds fast.
```ts
const nextInput = `${agentAOutput}\n\n${fullConversationHistory}`;
```
Instead, pass a compact handoff object:
```ts
const nextInput = JSON.stringify({
  claimId,
  summary: agentAOutput.summary,
  extractedFields: agentAOutput.fields,
});
```
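If a downstream agent might occasionally need the full transcript, persist it out-of-band and hand off only a pointer. A minimal in-memory sketch (the `Map` stands in for a database or object store; all names here are hypothetical):

```ts
// Out-of-band store: the prompt carries an ID, never the raw transcript.
const transcripts = new Map<string, string>();

function storeTranscript(claimId: string, transcript: string): string {
  transcripts.set(claimId, transcript);
  return claimId; // the next agent receives this ID, not the text
}

function buildHandoff(claimId: string, summary: string, fields: Record<string, string>): string {
  // Only the compact, structured parts enter the next agent's context.
  return JSON.stringify({ claimId, summary, extractedFields: fields });
}
```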
How to Debug It
- Log input sizes before each call
  - Print prompt length and approximate token count before `task.execute()`.
  - If one input suddenly jumps from small to huge, that's your culprit.
- Inspect what each tool returns
  - Log raw tool output size.
  - If a tool returns pages of HTML, PDFs converted to text, or massive JSON arrays, trim it.
- Check whether you're accumulating history
  - Look for patterns like `context += result` or `history.push(...)` followed by sending all history back into the next prompt.
  - This is usually the root cause in TypeScript loops.
- Reduce the prompt until it works
  - Remove memory, tools, and extra instructions one by one.
  - Re-run after each change until the error disappears.
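The first debugging step can be a one-line helper you call before every execution. A sketch using the ~4 chars/token approximation (`checkPromptSize` is hypothetical; swap in a real tokenizer for precision):

```ts
// Returns the approximate token count and whether the prompt is near the limit.
function checkPromptSize(prompt: string, modelLimit: number) {
  const approxTokens = Math.ceil(prompt.length / 4);
  return { approxTokens, nearLimit: approxTokens > modelLimit * 0.9 };
}
```

Log the result before each `task.execute()`; the call whose `approxTokens` jumps between iterations is the one accumulating history.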
Prevention
- Keep prompts short and structured.
- Store state outside the LLM prompt; pass IDs, summaries, and extracted fields instead of full transcripts.
- Add size checks on tool outputs and truncate anything that can grow unbounded.
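The last point can be enforced with a wrapper, so no tool can blow the budget no matter what it returns. A sketch against the plain `{ name, execute }` tool shape used earlier (the wrapper and the `[truncated]` marker are assumptions, not CrewAI features):

```ts
// Pure cap, applied to any tool result before it re-enters the prompt.
function capOutput(out: string, maxChars: number): string {
  return out.length > maxChars ? out.slice(0, maxChars) + "\n[truncated]" : out;
}

type Tool = { name: string; execute: () => Promise<string> };

// Wrap a tool so its output can never exceed maxChars (plus the marker).
function withOutputCap(tool: Tool, maxChars: number): Tool {
  return { ...tool, execute: async () => capOutput(await tool.execute(), maxChars) };
}
```

Registering every tool through `withOutputCap` at crew setup time is one bottleneck instead of a per-tool fix.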
If you build CrewAI workflows in TypeScript with long-running agents, assume token growth will happen unless you explicitly control it. The fix is almost always about reducing what gets sent into the next LLM call.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.