How to Fix 'chain execution stuck when scaling' in LangChain (TypeScript)
When LangChain chain execution gets “stuck” during scaling, it usually means your app is not actually dead — it’s blocked on async work, backpressure, or a runaway callback loop. In TypeScript, this shows up most often when you move from a single request to concurrent traffic and the chain stops resolving, times out, or piles up pending promises.
The usual pattern is simple: it works locally with one input, then hangs under load because one step in the chain never completes or the event loop gets saturated.
The Most Common Cause
The #1 cause is mixing synchronous-looking code with async LangChain components and not awaiting the right boundary. In TypeScript, this often happens when you call .invoke() inside a loop without concurrency control, or you forget to await a tool/LLM call inside a custom Runnable.
Here’s the broken pattern:
```ts
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const prompt = PromptTemplate.fromTemplate("Summarize: {text}");

const texts = ["doc1", "doc2", "doc3"];

async function run() {
  const results = [];
  for (const text of texts) {
    const chain = prompt.pipe(llm);
    // Broken: no concurrency control, and easy to accidentally forget await in real code
    results.push(chain.invoke({ text }));
  }
  return Promise.all(results);
}
```
And here’s the fixed version:
```ts
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 2,
});
const prompt = PromptTemplate.fromTemplate("Summarize: {text}");
const chain = prompt.pipe(llm);

const texts = ["doc1", "doc2", "doc3"];

async function run() {
  const results = await Promise.all(
    texts.map((text) => chain.invoke({ text }))
  );
  return results;
}
```
If you need real scaling, don’t fire unlimited promises. Use bounded concurrency:
```ts
import pLimit from "p-limit";

const limit = pLimit(5);
const results = await Promise.all(
  texts.map((text) => limit(() => chain.invoke({ text })))
);
```
That pattern prevents the classic failure mode where LangChain logs stop progressing and Node sits there with pending requests.
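If you'd rather not add a dependency, LangChain's Runnable interface also ships a `.batch()` method whose config takes a `maxConcurrency` option. A minimal sketch using the chain from the fixed example above (verify the option against your `@langchain/core` version):

```ts
// Sketch: LangChain's built-in batching with bounded concurrency.
// `maxConcurrency` is part of RunnableConfig.
const results = await chain.batch(
  texts.map((text) => ({ text })),
  { maxConcurrency: 5 }
);
```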
Other Possible Causes
| Cause | What it looks like | Fix |
|---|---|---|
| Callback handler deadlock | Chain never resolves after logging starts | Make callbacks non-blocking |
| Tool function never returns | AgentExecutor hangs waiting for tool output | Ensure every tool returns or throws |
| Recursive agent loop | Repeated AgentExecutor iterations with no final answer | Set iteration limits and stop conditions |
| Rate limiting / connection saturation | Requests slow down until they appear stuck | Add retries, queueing, and lower concurrency |
1. Blocking callback handlers
If you use custom callbacks and do heavy work inside handleLLMEnd, handleChainEnd, or handleToolEnd, you can block completion.
```ts
import { BaseCallbackHandler } from "@langchain/core/callbacks/base";

class BadHandler extends BaseCallbackHandler {
  name = "bad-handler";

  async handleLLMEnd() {
    // Bad: synchronous CPU work (or long I/O) here blocks the event loop
    const end = Date.now() + 10_000;
    while (Date.now() < end) {}
  }
}
```
Fix it by pushing work to a queue or making it fast:
```ts
class GoodHandler extends BaseCallbackHandler {
  name = "good-handler";

  async handleLLMEnd() {
    // Fire-and-forget: chain completion never waits on metrics I/O
    void fetch("https://metrics.internal/llm-end", { method: "POST" });
  }
}
```
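If the work is genuinely heavy, buffer events in memory and flush them off the request path. A minimal sketch, where the queue and flush interval are illustrative, not a LangChain API:

```ts
class QueuedHandler extends BaseCallbackHandler {
  name = "queued-handler";
  private events: Array<{ at: number }> = [];

  constructor() {
    super();
    // Flush the buffer on a timer, off the chain's critical path
    setInterval(() => {
      const batch = this.events.splice(0);
      if (batch.length > 0) {
        void fetch("https://metrics.internal/llm-end", {
          method: "POST",
          body: JSON.stringify(batch),
        });
      }
    }, 5000);
  }

  async handleLLMEnd() {
    // Enqueueing is O(1); the handler returns immediately
    this.events.push({ at: Date.now() });
  }
}
```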
2. A tool that never resolves
This is common with DynamicStructuredTool or custom tools.
```ts
import { DynamicStructuredTool } from "@langchain/core/tools";
import { z } from "zod";

const badTool = new DynamicStructuredTool({
  name: "lookupCustomer",
  description: "Fetch customer data",
  schema: z.object({}),
  func: async () => {
    return new Promise<string>(() => {}); // never resolves
  },
});
```
Return a value or throw on timeout:
```ts
const goodTool = new DynamicStructuredTool({
  name: "lookupCustomer",
  description: "Fetch customer data",
  schema: z.object({}),
  func: async () => {
    // Abort the request if it takes longer than 5 seconds
    const controller = new AbortController();
    const timeout = setTimeout(() => controller.abort(), 5000);
    try {
      const res = await fetch("https://api.internal/customers", {
        signal: controller.signal,
      });
      return await res.text();
    } finally {
      clearTimeout(timeout);
    }
  },
});
```
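Tools are Runnables, so you can exercise one directly, outside the agent, to confirm it settles within its budget:

```ts
// Quick isolation test: the call should resolve or throw within ~5s
const output = await goodTool.invoke({});
console.log("tool output:", output);
```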
3. Agent recursion without a stop condition
If you’re using AgentExecutor, an agent can keep looping until it hits explicit limits such as:

- `maxIterations`
- `earlyStoppingMethod`
- tool output constraints
```ts
import { AgentExecutor } from "langchain/agents";

// `agent` and `tools` defined elsewhere
const executor = AgentExecutor.fromAgentAndTools({ agent, tools });

// Risky under bad prompts/tools:
await executor.invoke({ input: "Do the task" });
```
Make the ceiling explicit:
```ts
const executor = AgentExecutor.fromAgentAndTools({
  agent,
  tools,
  maxIterations: 5,
});
```
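When the ceiling is hit, LangChain tends to return a forced-stop answer rather than throw, so treating that answer as a failure is the safer default. A sketch, assuming the stop text starts with "Agent stopped" (the exact wording varies by version, so check yours):

```ts
const result = await executor.invoke({ input: "Do the task" });

// Assumption: a forced stop surfaces as an "Agent stopped..." string
// in `output`; confirm the wording against your langchain version.
if (typeof result.output === "string" && result.output.startsWith("Agent stopped")) {
  throw new Error("Agent hit maxIterations without a final answer");
}
```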
4. Connection pool exhaustion
Under scale, your OpenAI client, vector store client, or internal HTTP client may run out of sockets. The symptom is not always an error; sometimes requests just stall.
Typical signs:
- many concurrent `.invoke()` calls
- slow DNS/connect timeouts
- Node process memory grows while throughput drops
Fix by reducing concurrency and tuning the underlying HTTP client. If you’re using fetch-based wrappers behind LangChain, make sure keep-alive and timeout settings are sane.
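As a sketch, assuming the OpenAI v4 SDK behind `@langchain/openai`, which accepts a Node `httpAgent` through the client `configuration` (verify both options against your versions):

```ts
import { Agent } from "node:https";
import { ChatOpenAI } from "@langchain/openai";

// Reuse sockets instead of opening one per request, and cap the pool
const keepAliveAgent = new Agent({ keepAlive: true, maxSockets: 20 });

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  timeout: 30_000, // fail fast instead of stalling forever
  configuration: { httpAgent: keepAliveAgent },
});
```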
How to Debug It
1. Turn on LangChain tracing
   - Set `LANGCHAIN_TRACING_V2=true`
   - Check where execution stops: prompt formatting, model call, tool call, or callback handling
2. Add timestamps around each boundary
   - Log before and after `.invoke()`, e.g. `console.time("chain"); await chain.invoke(input); console.timeEnd("chain");`
   - Log inside each custom tool and callback (see the timing-handler sketch after this list)
3. Isolate one component at a time
   - Run the LLM alone
   - Run the tool alone
   - Run the full chain last
   - This tells you whether the hang is in LangChain orchestration or downstream I/O.
4. Clamp concurrency
   - Replace `Promise.all(texts.map(...))` with `p-limit`
   - If the issue disappears, you’re hitting resource saturation rather than a logic bug
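Here’s a minimal timing handler you can pass per-invocation to see which boundary stalls; the handler name and log format are illustrative:

```ts
import { BaseCallbackHandler } from "@langchain/core/callbacks/base";

// Logs a timestamp at each boundary so you can see where execution stops
class TimingHandler extends BaseCallbackHandler {
  name = "timing-handler";

  async handleLLMStart() { console.log("llm:start", Date.now()); }
  async handleLLMEnd() { console.log("llm:end", Date.now()); }
  async handleToolStart() { console.log("tool:start", Date.now()); }
  async handleToolEnd() { console.log("tool:end", Date.now()); }
}

await chain.invoke({ text: "doc1" }, { callbacks: [new TimingHandler()] });
```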
Prevention
- Keep every custom tool strictly bounded:
  - timeout every network call (a reusable helper sketch follows this list)
  - always return or throw
- Put hard limits on agents:
  - `maxIterations`
  - request timeouts
  - bounded concurrency per worker
- Treat callbacks as observability hooks only:
  - no blocking I/O
  - no CPU-heavy parsing inside handlers
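One way to enforce "timeout every network call" everywhere is a small wrapper; `withTimeout` here is a hypothetical helper, not a LangChain API:

```ts
// Hypothetical helper: reject any promise that outlives its budget
async function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Usage inside a tool:
// return withTimeout(fetch(url).then((r) => r.text()), 5000, "lookupCustomer");
```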
If your LangChain TypeScript app only fails under load, assume it’s a scaling bug first, not an LLM bug. In practice, “stuck” almost always means one unresolved promise, one blocked callback, or too many concurrent executions hitting the same bottleneck.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.