How to Fix 'context length exceeded' in CrewAI (TypeScript)

By Cyprian Aarons · Updated 2026-04-21
Tags: context-length-exceeded, crewai, typescript

If you’re seeing a context length exceeded error in a CrewAI TypeScript project, the model is being fed more tokens than it can accept in a single request. In practice, this usually happens when you keep appending task output, chat history, or tool results without trimming them.

The error often shows up after a few agent turns, when a long document gets passed into a tool, or when multiple agents share the same growing context. In CrewAI terms, the failure is usually triggered inside an Agent run or a Task execution when the underlying LLM call hits the model’s token limit.

The Most Common Cause

The #1 cause is passing full conversation or task output back into the next step without truncation. In TypeScript projects, this usually looks like building a prompt from previousResult + newInput on every run.

Here’s the broken pattern:

import { Agent, Task } from "crewai";

const analyst = new Agent({
  name: "Analyst",
  role: "Summarize claims docs",
  goal: "Extract key fields",
  backstory: "You are precise.",
});

const task = new Task({
  description: (input: string) => `
Analyze this claims document and return a summary.

Previous output:
${input}
`,
  agent: analyst,
});

declare const largeChunks: string[]; // pre-split pieces of a large document

let context = "";

for (const chunk of largeChunks) {
  const result = await task.execute({ input: context + chunk });
  context += result; // keeps growing forever
}

This fails because every iteration appends more text to context and then sends all of it into the next Task.execute() call. Eventually you hit something like:

  • Error: context length exceeded
  • BadRequestError: This model's maximum context length is ...
  • OpenAIError: InvalidRequestError: maximum context length exceeded

The fixed pattern is to keep only the minimum state you need:

import { Agent, Task } from "crewai";

const analyst = new Agent({
  name: "Analyst",
  role: "Summarize claims docs",
  goal: "Extract key fields",
  backstory: "You are precise.",
});

const task = new Task({
  description: (input: string) => `
Analyze this claims document and return only:
- claimant name
- policy number
- loss date
- one-line summary

Document:
${input}
`,
  agent: analyst,
});

declare function saveResult(r: string): Promise<void>; // persist output outside the prompt

for (const chunk of largeChunks) {
  const result = await task.execute({ input: chunk }); // no accumulated history
  await saveResult(result);
}

If you need memory, store structured state outside the prompt. Don’t keep re-sending raw transcripts unless you trim them first.
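
Here’s a minimal sketch of that idea. The RollingState shape is hypothetical, and it assumes the task is instructed to return JSON; the point is that the prompt stays roughly constant-size because it only ever contains one chunk plus a short summary.

interface RollingState {
  claimIds: string[];  // IDs extracted so far
  lastSummary: string; // one-line summary of the previous chunk only
}

const state: RollingState = { claimIds: [], lastSummary: "" };

for (const chunk of largeChunks) {
  const result = await task.execute({
    input: `Previous summary: ${state.lastSummary}\n\nDocument:\n${chunk}`,
  });
  const parsed = JSON.parse(result); // assumes the task returns a JSON string
  state.claimIds.push(parsed.claimId);
  state.lastSummary = parsed.summary;
}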

Other Possible Causes

  • Tool output is too large: a tool returns a full PDF, HTML page, or JSON blob into the next agent step. Fix: summarize or truncate tool output before returning it.
  • Model window is too small: you’re using a smaller-context model for long documents. Fix: switch to a larger-context model.
  • Verbose system prompt: your role, goal, and instructions are bloated. Fix: shorten instructions and remove repeated constraints.
  • Multi-agent handoff chains: each agent receives the full transcript from previous agents. Fix: pass only final artifacts or compact summaries.

1) Tool output explosion

A common issue is returning raw data from a tool:

const searchTool = {
  name: "search_docs",
  execute: async () => {
    // Returns the entire report body, which can be tens of thousands of tokens.
    return await fetch("https://example.com/big-report").then(r => r.text());
  },
};

Fix it by returning only relevant sections:

const searchTool = {
  name: "search_docs",
  execute: async () => {
    const text = await fetch("https://example.com/big-report").then(r => r.text());
    // ~4 chars per token, so 4000 chars is roughly 1000 tokens.
    return text.slice(0, 4000);
  },
};

2) Model context window too small

If your crew uses a small-context model, long prompts fail sooner:

const llmConfig = {
  model: "gpt-3.5-turbo", // ~16k-token window, tight for long transcripts
};

Use a larger window where available:

const llmConfig = {
  model: "gpt-4o", // ~128k-token window
};

3) System prompt bloat

This is easy to miss. A long backstory or repeated policy text can eat thousands of tokens before user input even starts.

new Agent({
  role: "Claims reviewer",
  goal: "Review claims",
  backstory: `
You are an expert.
Always be concise.
Always be accurate.
Always cite sources.
Always format in JSON.
Always validate inputs.
...`,
});

Trim it down to what actually changes behavior.
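
A trimmed version might look like this; the exact wording is illustrative, but every line you keep should be one that actually changes the output:

new Agent({
  role: "Claims reviewer",
  goal: "Review claims",
  // One sentence covering the constraints that matter.
  backstory: "You are a precise claims reviewer. Respond in valid JSON and cite sources.",
});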

4) Full transcript handoff between agents

If Agent B receives everything Agent A saw and wrote, token usage compounds fast.

const nextInput = `${agentAOutput}\n\n${fullConversationHistory}`;

Instead, pass a compact handoff object built from Agent A’s structured output:

const nextInput = JSON.stringify({
  claimId,
  summary: agentAOutput.summary,
  extractedFields: agentAOutput.fields,
});

How to Debug It

  1. Log input sizes before each call

    • Print the prompt length and an approximate token count before Task.execute() (see the sketch after this list).
    • If one input suddenly jumps from small to huge, that’s your culprit.
  2. Inspect what each tool returns

    • Log raw tool output size.
    • If a tool returns pages of HTML, PDFs converted to text, or massive JSON arrays, trim it.
  3. Check whether you’re accumulating history

    • Look for patterns like context += result, history.push(...), then sending all history back into the next prompt.
    • This is usually the root cause in TypeScript loops.
  4. Reduce the prompt until it works

    • Remove memory, tools, and extra instructions one by one.
    • Re-run after each change until the error disappears.
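
For step 1, a rough sketch using the common ~4-characters-per-token heuristic (estimateTokens and logPromptSize are illustrative helpers, not CrewAI APIs):

// Rough estimate: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function logPromptSize(label: string, prompt: string): void {
  console.log(`${label}: ${prompt.length} chars, ~${estimateTokens(prompt)} tokens`);
}

// Before each call:
// logPromptSize("analyst task", input);
// const result = await task.execute({ input });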

Prevention

  • Keep prompts short and structured.
  • Store state outside the LLM prompt; pass IDs, summaries, and extracted fields instead of full transcripts.
  • Add size checks on tool outputs and truncate anything that can grow unbounded (see the wrapper sketch below).
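
One way to enforce that last point is a generic wrapper. capToolOutput is a hypothetical helper, and the 4000-character cap mirrors the earlier search_docs example:

// Wraps any tool so its output can never exceed a fixed size.
function capToolOutput(
  tool: { name: string; execute: () => Promise<string> },
  maxChars = 4000,
) {
  return {
    ...tool,
    execute: async () => {
      const out = await tool.execute();
      return out.length > maxChars ? out.slice(0, maxChars) + "\n[truncated]" : out;
    },
  };
}

// Usage: const safeSearchTool = capToolOutput(searchTool);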

If you build CrewAI workflows in TypeScript with long-running agents, assume token growth will happen unless you explicitly control it. The fix is almost always about reducing what gets sent into the next LLM call.


By Cyprian Aarons, AI Consultant at Topiax.