How to Fix 'chain execution stuck in production' in LangChain (TypeScript)
What the error means
When a LangChain chain gets “stuck” in production, it usually means the promise never resolves: a step is waiting on an async dependency that never returns, or the chain is blocked by retry behavior with no timeout. In TypeScript apps, this shows up most often when you mix callback-style code with promises, forget an await, leave a streaming handler open, or call an external API that hangs under load.
The symptom is simple: your request starts, logs show the chain entered execution, and then nothing. You won’t always get a clean exception like Error: Chain run timed out; sometimes you just see an open HTTP request and a worker tied up until your platform kills it.
The Most Common Cause
The #1 cause is returning a promise that never resolves because one of the tools, retriever calls, or custom Runnable steps does not return or await correctly.
In LangChain TypeScript, this often happens inside RunnableLambda, custom tools, or wrapper functions around chain.invoke(). The chain looks fine at compile time, but at runtime the execution hangs because the async boundary is broken.
| Broken pattern | Fixed pattern |
|---|---|
| Missing return / unresolved promise | Explicit return await or direct return of the async result |
import { RunnableLambda } from "@langchain/core/runnables";
import { ChatOpenAI } from "@langchain/openai";
const model = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });
const broken = RunnableLambda.from(async (input: string) => {
// Fire-and-forget bug: nothing is returned to the chain
model.invoke([{ role: "user", content: input }]);
});
const fixed = RunnableLambda.from(async (input: string) => {
// Chain can complete because the promise is returned
return await model.invoke([{ role: "user", content: input }]);
});
A more realistic version is inside an Express handler:
app.post("/ask", async (req, res) => {
const answerPromise = chain.invoke({ question: req.body.question });
// Broken: response never waits for the chain result
res.json({ ok: true });
});
app.post("/ask", async (req, res) => {
const answer = await chain.invoke({ question: req.body.question });
res.json({ answer });
});
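The fixed handler still has a failure mode: in Express 4, a rejection thrown inside an async handler is not forwarded to the error middleware, so a chain error can also leave the request open. Below is a sketch of a defensive handler, extracted as a plain function with illustrative chain/req/res shapes (the names are assumptions, not LangChain or Express APIs):

```typescript
// Illustrative response shape; only the methods the handler needs.
type Res = { status(code: number): Res; json(body: unknown): void };

// Sketch: always catch the chain's rejection and send *some* response,
// so a failed invoke cannot leave the HTTP request hanging.
async function askHandler(
  chain: { invoke(input: { question: string }): Promise<string> },
  req: { body: { question: string } },
  res: Res
): Promise<void> {
  try {
    const answer = await chain.invoke({ question: req.body.question });
    res.json({ answer });
  } catch {
    // Always respond on failure instead of letting the socket stay open.
    res.status(500).json({ error: "chain failed" });
  }
}
```

In Express 5 async rejections are forwarded automatically, but an explicit try/catch keeps the behavior obvious either way.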
If you use Promise.all, make sure every branch resolves:
// Broken
await Promise.all([
retriever.getRelevantDocuments(query),
model.invoke(messages), // if this hangs, whole request hangs
]);
// Fixed with timeout wrapper
await Promise.all([
withTimeout(retriever.getRelevantDocuments(query), 5000),
withTimeout(model.invoke(messages), 15000),
]);
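The withTimeout helper used above is not part of LangChain; it is something you define yourself. A minimal sketch of one possible implementation:

```typescript
// Hypothetical helper: rejects if the wrapped promise takes longer than ms.
// Not a LangChain API; shown as one possible implementation.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms}ms`)),
      ms
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}
```

Note that this rejects your side of the call but does not cancel the underlying work; for real cancellation you need an abortable API such as fetch with an AbortSignal.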
Other Possible Causes
1) A tool function never resolves
This is common with custom tools built using DynamicStructuredTool or tool().
import { tool } from "@langchain/core/tools";
import { z } from "zod";
const brokenTool = tool(
async ({ customerId }) => {
fetch(`https://api.example.com/customers/${customerId}`); // missing await/return
return "done";
},
{
name: "get_customer",
description: "Fetch customer data",
schema: z.object({ customerId: z.string() }),
}
);
const fixedTool = tool(
async ({ customerId }) => {
const resp = await fetch(`https://api.example.com/customers/${customerId}`);
return await resp.text();
},
{
name: "get_customer",
description: "Fetch customer data",
schema: z.object({ customerId: z.string() }),
}
);
2) Streaming handler never closes
If you use stream() or token callbacks and don’t finalize the response, the request stays open.
// Broken
const stream = await chain.stream(input);
for await (const chunk of stream) {
res.write(chunk.content);
}
// missing res.end()
// Fixed
const stream = await chain.stream(input);
for await (const chunk of stream) {
res.write(chunk.content);
}
res.end();
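Even the fixed version can leak the request if the stream throws mid-iteration. A defensive sketch, with the chain and response reduced to minimal illustrative shapes, wraps the loop so res.end() always runs:

```typescript
// Sketch: guarantee the HTTP response is closed even if streaming fails.
// The chain/res shapes are minimal stand-ins for the surrounding example.
async function streamAnswer(
  chain: { stream(input: string): Promise<AsyncIterable<{ content: string }>> },
  res: { write(chunk: string): void; end(): void },
  input: string
): Promise<void> {
  try {
    const stream = await chain.stream(input);
    for await (const chunk of stream) {
      res.write(chunk.content);
    }
  } finally {
    res.end(); // always close, even on a mid-stream error
  }
}
```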
3) Retries hide a hanging dependency
LangChain retries can make a bad upstream call look like a stuck chain. If your OpenAI-compatible endpoint or internal model gateway hangs without timeout, retries just extend the wait.
const model = new ChatOpenAI({
model: "gpt-4o-mini",
timeout: 10000,
maxRetries: 2,
});
If you omit timeout handling entirely, you can end up with requests waiting forever behind load balancers or dead sockets.
4) Misconfigured callbacks or tracing handlers block execution
Custom callback handlers in classes like BaseCallbackHandler can accidentally block if they perform slow I/O synchronously.
class BrokenHandler /* extends BaseCallbackHandler */ {
handleLLMEnd() {
// Bad idea: synchronous heavy work here blocks completion path
while (true) {}
}
}
Keep handlers non-blocking and offload logging to background queues.
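One way to keep the completion path fast, sketched below, is to enqueue log records in the handler and flush them off the hot path. This is an illustrative pattern, not a LangChain-prescribed one, and it uses Node's setImmediate:

```typescript
// Sketch: a non-blocking handler that defers slow logging work.
// The class and method names mirror the example above; the queue/flush
// mechanics are illustrative, not LangChain APIs.
class NonBlockingHandler {
  private queue: unknown[] = [];

  get pending(): number {
    return this.queue.length;
  }

  handleLLMEnd(output: unknown): void {
    // O(1) work on the hot path: just enqueue.
    this.queue.push(output);
    // Flush on a later tick so the chain's completion is never blocked.
    setImmediate(() => this.flush());
  }

  private flush(): void {
    while (this.queue.length > 0) {
      const record = this.queue.shift();
      // Replace with a real log shipper or background worker.
      console.log("llm_end", JSON.stringify(record));
    }
  }
}
```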
How to Debug It
- Isolate the exact step that hangs. Add logs before and after each major call: retriever, prompt formatting, model invocation, parser/tool execution. For example:
console.log("before retriever");
const docs = await retriever.getRelevantDocuments(q);
console.log("after retriever");
- Wrap every external call with a timeout. If adding timeouts turns the silent hang into an explicit error, you have found the hanging dependency. Use AbortController for fetch-based tools and SDKs that support abort signals.
- Disable callbacks and streaming temporarily. Remove custom handlers, LangSmith hooks, and streaming response code. If the chain finishes normally after that, your bug is in observability or response handling.
- Run the same code locally against production config: same environment variables, same model endpoint, same vector store credentials. Production-only hangs are often network/DNS/auth issues rather than LangChain itself.
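For fetch-based tools, AbortController (a standard Web/Node API) gives you real cancellation rather than an abandoned promise. A sketch with an illustrative helper name:

```typescript
// Sketch: abort a fetch call after a deadline instead of letting it hang.
// fetchWithDeadline is an illustrative helper, not a library API.
async function fetchWithDeadline(url: string, ms: number): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    const resp = await fetch(url, { signal: controller.signal });
    return await resp.text();
  } finally {
    clearTimeout(timer); // avoid a stray timer if the fetch finishes first
  }
}
```

When the deadline fires, fetch rejects with an abort error, so the caller gets a fast failure it can log and retry instead of an indefinitely open socket.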
Prevention
- Always put timeouts on every network-bound step: LLM calls, retrievers, tools, and vector DB queries.
- Treat every custom tool and runnable as production code: return promises explicitly, avoid fire-and-forget side effects, and test failure paths.
- Add request-level tracing around chains:
const start = Date.now();
try {
  const result = await chain.invoke(input);
} finally {
  console.log("chain_ms", Date.now() - start);
}
If you see “chain execution stuck in production” in LangChain TypeScript, assume one of two things first: a promise isn’t being awaited correctly, or an upstream dependency has no timeout. Fix those before chasing framework-level bugs.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.