How to Fix 'chain execution stuck when scaling' in LangChain (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

When LangChain chain execution gets “stuck” during scaling, it usually means your app is not actually dead — it’s blocked on async work, backpressure, or a runaway callback loop. In TypeScript, this shows up most often when you move from a single request to concurrent traffic and the chain stops resolving, times out, or piles up pending promises.

The usual pattern is simple: it works locally with one input, then hangs under load because one step in the chain never completes or the event loop gets saturated.

The Most Common Cause

The #1 cause is mixing synchronous-looking code with async LangChain components and not awaiting the right boundary. In TypeScript, this often happens when you call .invoke() inside a loop without concurrency control, or you forget to await a tool/LLM call inside a custom Runnable.

Here’s the broken pattern:

import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const prompt = PromptTemplate.fromTemplate("Summarize: {text}");

const texts = ["doc1", "doc2", "doc3"];

async function run() {
  const results = [];

  for (const text of texts) {
    // Broken: rebuilds the chain on every iteration, fires every request
    // at once with no concurrency cap, and in real code it's easy to
    // drop the await at this boundary entirely
    const chain = prompt.pipe(llm);
    results.push(chain.invoke({ text }));
  }

  return Promise.all(results);
}

And here’s the fixed version:

import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 2,
});

const prompt = PromptTemplate.fromTemplate("Summarize: {text}");
const chain = prompt.pipe(llm);

const texts = ["doc1", "doc2", "doc3"];

async function run() {
  const results = await Promise.all(
    texts.map((text) => chain.invoke({ text }))
  );

  return results;
}

If you need real scaling, don’t fire unlimited promises. Use bounded concurrency:

import pLimit from "p-limit";

const limit = pLimit(5); // at most 5 chain executions in flight at once

const results = await Promise.all(
  texts.map((text) => limit(() => chain.invoke({ text })))
);

That pattern prevents the classic failure mode where LangChain logs stop progressing and Node sits there with pending requests.

Other Possible Causes

| Cause | What it looks like | Fix |
| --- | --- | --- |
| Callback handler deadlock | Chain never resolves after logging starts | Make callbacks non-blocking |
| Tool function never returns | AgentExecutor hangs waiting for tool output | Ensure every tool returns or throws |
| Recursive agent loop | Repeated AgentExecutor iterations with no final answer | Set iteration limits and stop conditions |
| Rate limiting / connection saturation | Requests slow down until they appear stuck | Add retries, queueing, and lower concurrency |

1. Blocking callback handlers

If you use custom callbacks and do heavy work inside handleLLMEnd, handleChainEnd, or handleToolEnd, you can block completion.

import { BaseCallbackHandler } from "@langchain/core/callbacks/base";

class BadHandler extends BaseCallbackHandler {
  name = "bad_handler"; // BaseCallbackHandler requires a name

  async handleLLMEnd() {
    // Bad: synchronous CPU work (or long blocking I/O) here stalls the
    // event loop and delays chain completion
    const end = Date.now() + 10_000;
    while (Date.now() < end) {}
  }
}

Fix it by pushing work to a queue or making it fast:

class GoodHandler extends BaseCallbackHandler {
  name = "good_handler";

  async handleLLMEnd() {
    // Fire-and-forget: never await slow I/O inside a handler
    void fetch("https://metrics.internal/llm-end", { method: "POST" });
  }
}
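
If the observability work is heavier than one fire-and-forget request, push events onto a queue and drain it outside the handler. Here is a minimal sketch of that approach; the in-process array, the one-second drain interval, and the metrics endpoint are all illustrative, not LangChain APIs:

import { BaseCallbackHandler } from "@langchain/core/callbacks/base";

// Hypothetical in-process queue: the handler only pushes, a separate
// timer drains, so chain completion never waits on metrics I/O.
const metricsQueue: Array<{ event: string; at: number }> = [];

class QueueingHandler extends BaseCallbackHandler {
  name = "queueing_handler";

  async handleLLMEnd() {
    metricsQueue.push({ event: "llm_end", at: Date.now() });
  }
}

setInterval(() => {
  const batch = metricsQueue.splice(0, metricsQueue.length);
  if (batch.length === 0) return;
  // Endpoint is illustrative; swap in your metrics sink
  void fetch("https://metrics.internal/llm-events", {
    method: "POST",
    body: JSON.stringify(batch),
  });
}, 1000);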

2. A tool that never resolves

This is common with DynamicStructuredTool or custom tools.

import { DynamicStructuredTool } from "@langchain/core/tools";
import { z } from "zod";

const badTool = new DynamicStructuredTool({
  name: "lookupCustomer",
  description: "Fetch customer data",
  schema: z.object({ customerId: z.string() }), // schema is required for structured tools
  func: async () => {
    return new Promise<string>(() => {}); // never resolves, so the agent hangs forever
  },
});

Return a value or throw on timeout:

const goodTool = new DynamicStructuredTool({
  name: "lookupCustomer",
  description: "Fetch customer data",
  schema: z.object({ customerId: z.string() }),
  func: async () => {
    // Abort the request after 5 seconds so the tool always settles
    const controller = new AbortController();
    const timeout = setTimeout(() => controller.abort(), 5000);

    try {
      const res = await fetch("https://api.internal/customers", {
        signal: controller.signal,
      });
      return await res.text();
    } finally {
      clearTimeout(timeout);
    }
  },
});
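
On recent Node versions, AbortSignal.timeout does the controller-plus-setTimeout dance for you; the snippet below is a drop-in replacement for the body of func above:

// Inside the same `func`: equivalent timeout with less ceremony
const res = await fetch("https://api.internal/customers", {
  signal: AbortSignal.timeout(5000), // aborts the request after 5 seconds
});
return await res.text();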

3. Agent recursion without a stop condition

If you’re using AgentExecutor, an agent can loop indefinitely unless you bound it with limits such as:

  • maxIterations
  • earlyStoppingMethod
  • tool output constraints

import { AgentExecutor } from "langchain/agents";

// Risky under bad prompts/tools: no explicit iteration ceiling.
// Assumes `agent` and `tools` are constructed elsewhere.
const executor = AgentExecutor.fromAgentAndTools({ agent, tools });
await executor.invoke({ input: "Do the task" });

Make the ceiling explicit:

const executor = AgentExecutor.fromAgentAndTools({
  agent,
  tools,
  maxIterations: 5,
});

4. Connection pool exhaustion

Under scale, your OpenAI client, vector store client, or internal HTTP client may run out of sockets. The symptom is not always an error; sometimes requests just stall.

Typical signs:

  • many concurrent .invoke() calls
  • slow DNS/connect timeouts
  • Node process memory grows while throughput drops

Fix by reducing concurrency and tuning the underlying HTTP client. If you’re using fetch-based wrappers behind LangChain, make sure keep-alive and timeout settings are sane.
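
As a concrete example with @langchain/openai: the model accepts a timeout in milliseconds, and its configuration object is forwarded to the OpenAI SDK, which on Node accepts a keep-alive httpAgent. Treat this as a sketch and verify both options against your installed versions:

import { Agent } from "node:https";
import { ChatOpenAI } from "@langchain/openai";

// Sketch: reuse sockets and cap them so load can't exhaust the pool.
// `timeout` and `configuration.httpAgent` are forwarded to the OpenAI
// SDK; confirm they exist in your installed versions.
const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  timeout: 30_000, // fail fast instead of stalling silently
  maxRetries: 2,
  configuration: {
    httpAgent: new Agent({ keepAlive: true, maxSockets: 50 }),
  },
});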

How to Debug It

  1. Turn on LangChain tracing

    • Set LANGCHAIN_TRACING_V2=true
    • Check where execution stops: prompt formatting, model call, tool call, or callback handling
  2. Add timestamps around each boundary

    • Log before and after .invoke()
    • Log inside each custom tool and callback
    console.time("chain");
    await chain.invoke(input);
    console.timeEnd("chain");
  3. Isolate one component at a time

    • Run the LLM alone
    • Run the tool alone
    • Run the full chain last
    This tells you whether the hang is in LangChain orchestration or downstream I/O.
  4. Clamp concurrency

    • Replace Promise.all(texts.map(...)) with p-limit, or use LangChain’s built-in batch concurrency cap (sketched below)
    • If the issue disappears, you’re hitting resource saturation rather than a logic bug
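
Depending on your @langchain/core version you may not even need an extra library: RunnableConfig accepts a maxConcurrency option that batch() respects. A minimal sketch, reusing chain and texts from the earlier examples:

// Bounded concurrency without p-limit: LangChain caps parallel runs.
// Assumes `chain` and `texts` from the examples above.
const results = await chain.batch(
  texts.map((text) => ({ text })),
  { maxConcurrency: 5 } // cap from RunnableConfig, honored by batch()
);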

Prevention

  • Keep every custom tool strictly bounded:
    • timeout every network call
    • always return or throw
  • Put hard limits on agents:
    • maxIterations
    • request timeouts (see the deadline sketch after this list)
    • bounded concurrency per worker
  • Treat callbacks as observability hooks only:
    • no blocking I/O
    • no CPU-heavy parsing inside handlers
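
One plain-TypeScript way to enforce the “always return or throw” rule at the call site is a deadline wrapper; withDeadline below is a hypothetical helper, not a LangChain API:

// Hypothetical helper: a chain call either settles within its deadline
// or fails loudly, so a stuck step becomes an error instead of a hang.
async function withDeadline<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, deadline]);
  } finally {
    clearTimeout(timer!);
  }
}

// Usage: resolves normally, or rejects after 30 seconds
// const summary = await withDeadline(chain.invoke({ text }), 30_000);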

If your LangChain TypeScript app only fails under load, assume it’s a scaling bug first, not an LLM bug. In practice, “stuck” almost always means one unresolved promise, one blocked callback, or too many concurrent executions hitting the same bottleneck.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
