How to Fix 'context length exceeded' in LangGraph (TypeScript)
What the error means
A "context length exceeded" error means the model was sent more tokens than its context window allows. In LangGraph, this usually happens after a few turns: your graph keeps appending messages to state, and you keep passing the full history into the next LLM call.
The failure often shows up as an OpenAI or Anthropic API error bubbling through LangGraph, for example:
- `BadRequestError: 400 This model's maximum context length is 128000 tokens...`
- `400 context_length_exceeded`
- `Error: Prompt too long`
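If you want to catch this failure in code rather than in logs, here is a minimal detection sketch. It assumes the provider error exposes `status` and `message` fields (the OpenAI SDK's `BadRequestError` does); adjust the pattern match for your provider.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import type { BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Heuristic: does this provider error look like a context-window overflow?
// Assumes the error carries `status` and `message`; adjust per provider.
function isContextOverflow(err: unknown): boolean {
  const status = (err as { status?: number })?.status;
  const message = String((err as { message?: string })?.message ?? "");
  return status === 400 && /context.?length|maximum context|prompt is too long/i.test(message);
}

async function safeInvoke(messages: BaseMessage[]) {
  try {
    return await llm.invoke(messages);
  } catch (err) {
    if (isContextOverflow(err)) {
      // Recover here: trim or summarize history, then retry once.
      throw new Error("Context window exceeded; trim history before retrying.");
    }
    throw err;
  }
}
```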
The Most Common Cause
The #1 cause is unbounded chat history in graph state.
If you store every message in the `messages` channel and feed the entire array back into the model on every node execution, your prompt grows without bound. That works for a few turns, then breaks hard.
Broken pattern vs fixed pattern
| Broken | Fixed |
|---|---|
| Appends every message forever | Trims or summarizes state before each model call |
| Reuses full `messages` array as prompt input | Passes only recent messages or a compact summary |
| No state boundary | Uses reducers, checkpoints, or explicit truncation |
```typescript
// BROKEN: the reducer appends forever and the node sends the full history
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import type { BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const GraphState = Annotation.Root({
  messages: Annotation<BaseMessage[]>({
    reducer: (left, right) => left.concat(right),
    default: () => [],
  }),
});

const graph = new StateGraph(GraphState)
  .addNode("chat", async (state) => {
    // Every turn sends the entire, ever-growing history.
    const response = await llm.invoke(state.messages);
    return { messages: [response] };
  })
  .addEdge("__start__", "chat")
  .addEdge("chat", "__end__")
  .compile();
```
```typescript
// FIXED: trim the history to a token budget before every model call
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { trimMessages, type BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const GraphState = Annotation.Root({
  messages: Annotation<BaseMessage[]>({
    reducer: (left, right) => left.concat(right),
    default: () => [],
  }),
});

const graph = new StateGraph(GraphState)
  .addNode("chat", async (state) => {
    // trimMessages is async and must be awaited.
    const trimmed = await trimMessages(state.messages, {
      maxTokens: 6000,
      strategy: "last", // keep the most recent messages
      tokenCounter: llm, // count with the model's own tokenizer
    });
    const response = await llm.invoke(trimmed);
    return { messages: [response] };
  })
  .addEdge("__start__", "chat")
  .addEdge("chat", "__end__")
  .compile();
```
If you want the graph to stay stable in production, do not rely on “the model will just handle it.” It won’t. You need explicit history control.
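Trimming simply drops old turns. If your workflow needs long-range context, an alternative is to summarize older messages into a single compact message before each call. Here is a minimal sketch; the 12-message threshold, the `keep` count, and the prompt wording are illustrative choices, not a LangGraph API:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage, type BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Collapse everything except the last `keep` messages into one summary
// message. Threshold and prompt text are illustrative, tune per workflow.
async function compactHistory(
  messages: BaseMessage[],
  keep = 6
): Promise<BaseMessage[]> {
  if (messages.length <= 12) return messages;

  const older = messages.slice(0, -keep);
  const recent = messages.slice(-keep);

  const summary = await llm.invoke([
    new SystemMessage(
      "Summarize this conversation in under 150 words. Keep facts, decisions, and open questions."
    ),
    new HumanMessage(older.map((m) => String(m.content)).join("\n")),
  ]);

  return [new SystemMessage(`Conversation so far: ${summary.content}`), ...recent];
}

// In a node: const history = await compactHistory(state.messages);
```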
Other Possible Causes
1. Tool outputs are too large
A common LangGraph failure mode is dumping raw tool output back into state. This happens with search results, PDFs, database rows, or JSON blobs.
```typescript
// BAD
return {
  messages: [
    {
      role: "tool",
      content: JSON.stringify(hugeApiResponse),
    },
  ],
};
```
Fix it by extracting only what the LLM needs.
```typescript
// GOOD
return {
  messages: [
    {
      role: "tool",
      content: summarizeSearchResults(hugeApiResponse),
    },
  ],
};
```
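`summarizeSearchResults` is not a library function; it stands in for whatever extraction logic fits your data. A minimal sketch, assuming hypothetical search results with `title`, `snippet`, and `url` fields:

```typescript
// Hypothetical helper: keep only the fields the model actually needs.
// The result shape (title, snippet, url) is an assumption about your API.
interface SearchResult {
  title: string;
  snippet: string;
  url: string;
}

function summarizeSearchResults(results: SearchResult[], limit = 5): string {
  return results
    .slice(0, limit)
    .map((r, i) => `${i + 1}. ${r.title}: ${r.snippet} (${r.url})`)
    .join("\n");
}
```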
2. Recursive loops in the graph
If your graph routes back into the same node without a stop condition, token usage grows until the model fails.
```typescript
// BAD
graph.addConditionalEdges("router", (state) => "agent");
```
Add a termination rule based on turn count or completion state.
```typescript
// GOOD
graph.addConditionalEdges("router", (state) =>
  state.turns > 6 ? "__end__" : "agent"
);
```
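Note that `state.turns` is not built into LangGraph; you track it yourself. A minimal sketch of a turn-counter channel whose reducer accumulates the increments returned by your nodes:

```typescript
import { Annotation } from "@langchain/langgraph";

// Turn-counter channel: nodes return { turns: 1 } and the reducer sums.
// The channel name `turns` is our own convention, not a LangGraph built-in.
const GraphState = Annotation.Root({
  turns: Annotation<number>({
    reducer: (left, right) => left + right,
    default: () => 0,
  }),
});

// Inside the agent node:
// return { messages: [response], turns: 1 };
```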
3. You are using the wrong model for the prompt size
Some models have much smaller context windows than you think. If you switched providers or downgraded models, your existing prompt may no longer fit.
| Model family | Typical issue |
|---|---|
| Small/cheap chat models | Lower context window than expected |
| Older Anthropic/OpenAI deployments | Different limits per deployment |
| Azure/OpenAI wrappers | Deployment config may not match docs |
Check your actual deployment limit, not just the marketing name.
```typescript
// Pinning the model name is not enough: the same name can map to
// different context limits across deployments. Verify the limit in
// your provider's dashboard or model docs.
const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 2,
});
```
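To fail fast instead of waiting for the provider's 400, you can estimate the prompt size yourself before invoking. A rough sketch using `getNumTokens`; it counts a flattened string, so it misses per-message framing overhead and you should leave headroom:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import type { BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const CONTEXT_LIMIT = 128_000; // replace with your deployment's real limit

// Pre-flight check: flatten messages to text and count tokens.
// Underestimates framing overhead, so keep a safety margin.
async function assertFits(messages: BaseMessage[]): Promise<void> {
  const text = messages.map((m) => String(m.content)).join("\n");
  const tokens = await llm.getNumTokens(text);
  if (tokens > CONTEXT_LIMIT * 0.9) {
    throw new Error(
      `Prompt is ~${tokens} tokens, too close to the ${CONTEXT_LIMIT}-token limit.`
    );
  }
}
```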
4. System prompts and hidden instructions are too large
Teams often build massive system prompts with policy text, examples, schemas, and formatting rules. That cost hits every single request.
```typescript
// BAD
const systemPrompt = `
You are an assistant.
[200 lines of policy]
[50 examples]
[full JSON schema]
`;
```
Move static policy into shorter instructions and keep schemas minimal.
```typescript
// GOOD
const systemPrompt = `
You are a claims assistant.
Return concise answers.
Use tools when needed.
`;
```
How to Debug It
- Log token counts before every LLM call (see the sketch after this list).
  - Measure `messages.length` and the estimated token count.
  - If the count climbs every turn without dropping, you have found the problem.
- Print the exact payload going into `llm.invoke()`.
  - Inspect what LangGraph is actually sending.
  - Look for giant tool outputs, repeated system prompts, or duplicated history.
- Disable nodes one by one.
  - Remove tools first.
  - Then remove router loops.
  - Then remove memory/checkpointing.
  - The node that makes token growth explode is your culprit.
- Check provider error details.
  - OpenAI usually returns `BadRequestError`.
  - Anthropic often returns a similar `400` with context window details.
  - The message tells you whether you exceeded input tokens or total tokens.
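Here is the logging sketch referenced in the first step, assuming the `messages` channel from the earlier examples:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import type { BaseMessage } from "@langchain/core/messages";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Log message count and an estimated token total before each LLM call.
// A number that climbs every turn and never drops is your leak.
async function logPromptSize(label: string, messages: BaseMessage[]): Promise<void> {
  const text = messages.map((m) => String(m.content)).join("\n");
  const tokens = await llm.getNumTokens(text);
  console.log(`[${label}] messages=${messages.length} est_tokens=${tokens}`);
}

// Usage inside a node, before invoking the model:
// await logPromptSize("chat", state.messages);
```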
Prevention
- Trim messages at every LLM boundary using `trimMessages`, summaries, or a last-N strategy.
- Keep tool outputs compact; store raw data outside the prompt and pass only references or summaries.
- Add hard limits in your graph (a sketch follows this list):
  - max turns
  - max tool calls
  - max input tokens per node
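A minimal sketch of those hard limits; the names and budget values are illustrative conventions, not LangGraph settings:

```typescript
import { trimMessages, type BaseMessage } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Budgets are illustrative; tune them per workflow.
const MAX_TURNS = 8;
const MAX_TOOL_CALLS = 10;
const MAX_INPUT_TOKENS = 6000;

interface AgentState {
  messages: BaseMessage[];
  turns: number;
  toolCalls: number;
}

// Routing guard: end the run once either budget is spent.
const route = (state: AgentState) =>
  state.turns >= MAX_TURNS || state.toolCalls >= MAX_TOOL_CALLS
    ? "__end__"
    : "agent";

// Token guard: enforce the input budget at every node boundary.
async function boundedInvoke(state: AgentState) {
  const trimmed = await trimMessages(state.messages, {
    maxTokens: MAX_INPUT_TOKENS,
    strategy: "last",
    tokenCounter: llm,
  });
  return llm.invoke(trimmed);
}
```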
If you build LangGraph agents for production systems like banking workflows or claims triage, treat context as a budget. Once that budget is explicit in code, this error stops being mysterious and starts being manageable.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit