How to Fix 'context length exceeded' in LangGraph (TypeScript)

By Cyprian Aarons · Updated 2026-04-21
Tags: context-length-exceeded, langgraph, typescript

What the error means

context length exceeded means the model was sent more tokens than its context window allows. In LangGraph, this usually happens after a few turns when your graph keeps appending messages to state and you keep passing the full history into the next LLM call.

The failure often shows up as an OpenAI or Anthropic API error bubbling through LangGraph, for example:

  • BadRequestError: 400 This model's maximum context length is 128000 tokens...
  • 400 context_length_exceeded
  • Error: Prompt too long
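The exact wording varies by provider and SDK version, so if you want to detect this failure programmatically (for example, to retry with a trimmed history), matching on the message text is a pragmatic sketch. The patterns below are assumptions drawn from the examples above, not an exhaustive list:

```typescript
// Heuristic check for context-window errors. Error shapes differ across
// SDK versions, so this matches on message text -- adjust the patterns
// to whatever your provider actually returns.
function isContextLengthError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return (
    /context[_ ]length[_ ]exceeded/i.test(message) ||
    /maximum context length/i.test(message) ||
    /prompt (is )?too long/i.test(message)
  );
}
```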

The Most Common Cause

The #1 cause is unbounded chat history in graph state.

If you store every message in messages and feed the entire array back into the model on every node execution, your prompt grows forever. That works for a few turns, then breaks hard.

Broken pattern vs fixed pattern

Broken                                    Fixed
Appends every message forever             Trims or summarizes state before each model call
Reuses full messages array as prompt      Passes only recent messages or a compact summary
No state boundary                         Uses reducers, checkpoints, or explicit truncation
// BROKEN
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const GraphState = Annotation.Root({
  messages: Annotation<any[]>({
    reducer: (left, right) => left.concat(right),
    default: () => [],
  }),
});

const graph = new StateGraph(GraphState)
  .addNode("chat", async (state) => {
    const response = await llm.invoke(state.messages);
    return { messages: [response] };
  })
  .addEdge("__start__", "chat")
  .addEdge("chat", "__end__")
  .compile();
// FIXED
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { trimMessages } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const GraphState = Annotation.Root({
  messages: Annotation<any[]>({
    reducer: (left, right) => left.concat(right),
    default: () => [],
  }),
});

const graph = new StateGraph(GraphState)
  .addNode("chat", async (state) => {
    const trimmed = await trimMessages(state.messages, {
      maxTokens: 6000,
      strategy: "last",
      tokenCounter: llm,
    });

    const response = await llm.invoke(trimmed);
    return { messages: [response] };
  })
  .addEdge("__start__", "chat")
  .addEdge("chat", "__end__")
  .compile();

If you want the graph to stay stable in production, do not rely on “the model will just handle it.” It won’t. You need explicit history control.

Other Possible Causes

1. Tool outputs are too large

A common LangGraph failure mode is dumping raw tool output back into state. This happens with search results, PDFs, database rows, or JSON blobs.

// BAD
return {
  messages: [
    {
      role: "tool",
      content: JSON.stringify(hugeApiResponse),
    },
  ],
};

Fix it by extracting only what the LLM needs.

// GOOD
return {
  messages: [
    {
      role: "tool",
      content: summarizeSearchResults(hugeApiResponse),
    },
  ],
};
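`summarizeSearchResults` is a placeholder name; one possible shape, assuming search-style results with a title, URL, and snippet, is to cap both the number of results and the snippet length before anything reaches the prompt:

```typescript
// Hypothetical result shape -- adapt to your tool's actual output.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Keep only the fields the model needs, cap how many results make it into
// the prompt, and truncate each snippet to a fixed character budget.
function summarizeSearchResults(
  results: SearchResult[],
  maxResults = 5,
  maxSnippetChars = 300
): string {
  return results
    .slice(0, maxResults)
    .map(
      (r, i) =>
        `${i + 1}. ${r.title} (${r.url})\n${r.snippet.slice(0, maxSnippetChars)}`
    )
    .join("\n\n");
}
```

Store the raw response elsewhere (a database, blob storage, or a non-message state field) if a later node needs it.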

2. Recursive loops in the graph

If your graph routes back into the same node without a stop condition, token usage grows until the model fails.

// BAD
graph.addConditionalEdges("router", (state) => "agent");

Add a termination rule based on turn count or completion state.

// GOOD
graph.addConditionalEdges("router", (state) =>
  state.turns > 6 ? "__end__" : "agent"
);
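For `state.turns` to work, the graph state needs a field each agent pass increments (in LangGraph TS, typically an `Annotation` with a summing reducer and a default of 0). Pulling the routing rule into a plain function keeps the stop condition easy to unit-test; the names here (`RouterState`, `MAX_TURNS`) are illustrative:

```typescript
// Assumed state shape: `turns` is incremented by each agent pass (e.g. the
// agent node returns { turns: 1 } and the reducer sums), and `done` is set
// when the task is complete.
interface RouterState {
  turns: number;
  done: boolean;
}

const MAX_TURNS = 6;

// Terminate on explicit completion or when the turn budget is spent,
// whichever comes first.
function route(state: RouterState): "__end__" | "agent" {
  return state.done || state.turns >= MAX_TURNS ? "__end__" : "agent";
}
```

Wired up, this becomes `graph.addConditionalEdges("router", route)`.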

3. You are using the wrong model for the prompt size

Some models have much smaller context windows than you think. If you switched providers or downgraded models, your existing prompt may no longer fit.

Model family                              Typical issue
Small/cheap chat models                   Lower context window than expected
Older Anthropic/OpenAI deployments        Different limits per deployment
Azure/OpenAI wrappers                     Deployment config may not match docs

Check your actual deployment limit, not just the marketing name.

// Pin the exact model name; the context window belongs to the deployment,
// so confirm the real limit in your provider dashboard.
const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 2,
});
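One way to make the limit explicit in code is a small pre-flight check before each call. The window sizes below are assumptions for illustration only (verify them against your provider's current docs), and the ~4 characters-per-token estimate is a rough heuristic, not a tokenizer:

```typescript
// Hand-maintained table of context windows (tokens). These values are
// examples -- confirm against your actual deployment before relying on them.
const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-4o-mini": 128_000,
  "gpt-3.5-turbo": 16_385,
};

// Rough estimate: ~4 characters per token for English prose.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns false when the prompt (plus room reserved for the reply)
// would not fit in the model's window.
function fitsInWindow(
  model: string,
  prompt: string,
  reservedForOutput = 1024
): boolean {
  const limit = CONTEXT_WINDOWS[model];
  if (limit === undefined) throw new Error(`Unknown model: ${model}`);
  return estimateTokens(prompt) + reservedForOutput <= limit;
}
```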

4. System prompts and hidden instructions are too large

Teams often build massive system prompts with policy text, examples, schemas, and formatting rules. That cost hits every single request.

// BAD
const systemPrompt = `
You are an assistant.
[200 lines of policy]
[50 examples]
[full JSON schema]
`;

Move static policy into shorter instructions and keep schemas minimal.

// GOOD
const systemPrompt = `
You are a claims assistant.
Return concise answers.
Use tools when needed.
`;

How to Debug It

  1. Log token counts before every LLM call

    • Measure messages.length and estimated tokens.
    • If the count climbs every turn without dropping, you found the problem.
  2. Print the exact payload going into llm.invoke()

    • Inspect what LangGraph is sending.
    • Look for giant tool outputs, repeated system prompts, or duplicated history.
  3. Disable nodes one by one

    • Remove tools first.
    • Then remove router loops.
    • Then remove memory/checkpointing.
    • The node that makes token growth explode is your culprit.
  4. Check provider error details

    • OpenAI usually returns BadRequestError.
    • Anthropic often returns a similar 400 with context window details.
    • The message tells you whether you exceeded input tokens or total tokens.
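A minimal version of step 1, assuming messages expose plain string content (multimodal content needs extra handling), is a logger you call at the top of every node:

```typescript
// Illustrative message shape -- real LangChain messages may carry
// structured content, which this sketch ignores.
interface SimpleMessage {
  role: string;
  content: string;
}

// ~4 chars/token heuristic; swap in a real tokenizer for accurate counts.
function estimateMessageTokens(messages: SimpleMessage[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Call before llm.invoke(). If est_tokens climbs every turn without ever
// dropping, history is unbounded -- that is the culprit.
function logTokenUsage(node: string, messages: SimpleMessage[]): void {
  console.log(
    `[${node}] messages=${messages.length} est_tokens=${estimateMessageTokens(messages)}`
  );
}
```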

Prevention

  • Trim messages at every LLM boundary using trimMessages, summaries, or last-N strategy.
  • Keep tool outputs compact; store raw data outside the prompt and pass only references or summaries.
  • Add hard limits in your graph:
    • max turns
    • max tool calls
    • max input tokens per node

If you build LangGraph agents for production systems like banking workflows or claims triage, treat context as a budget. Once that budget is explicit in code, this error stops being mysterious and starts being manageable.


By Cyprian Aarons, AI Consultant at Topiax.