How to Fix 'chain execution stuck' in LangChain (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: chain-execution-stuck, langchain, python

If you see a chain execution get stuck in LangChain (Python), it usually means the chain started running but never returned control. In practice, that happens when a runnable hangs, a tool call never completes, or your code is waiting on an async/sync mismatch that blocks the event loop.

The error often shows up around LLMChain, SequentialChain, RunnableSequence, or agent/tool execution when one step never finishes. In production, I’ve seen it most often with custom tools, nested chains, and callbacks that accidentally recurse.

The Most Common Cause

The #1 cause is a tool or callable that blocks forever or waits on something that never resolves. In LangChain terms, the chain is fine; one of the steps inside it is not returning.

Here’s the broken pattern:

Broken                                             Fixed
Tool function does blocking I/O without timeout    Tool function uses timeout and returns deterministically
Async chain calls sync blocking code               Async-safe implementation with await or thread offload
# Broken: this can hang forever
from langchain_core.tools import tool
import requests

@tool
def fetch_customer_profile(customer_id: str) -> str:
    # No timeout: if the upstream service stalls, the chain stalls too.
    resp = requests.get(f"https://api.internal/profiles/{customer_id}")
    return resp.text


# Somewhere in your chain/agent:
# AgentExecutor(...) calls the tool and never gets a response.

# Fixed: add timeout and fail fast
from langchain_core.tools import tool
import requests

@tool
def fetch_customer_profile(customer_id: str) -> str:
    resp = requests.get(
        f"https://api.internal/profiles/{customer_id}",
        timeout=10,
    )
    resp.raise_for_status()
    return resp.text

If you’re using an async chain, don’t call blocking libraries directly inside an async def function. That’s another common way to get a “stuck” execution.

# Broken: blocks the event loop
from langchain_core.runnables import RunnableLambda
import requests

async_chain = RunnableLambda(lambda x: requests.get(x["url"]).text)

# Fixed: use async HTTP client or offload blocking work
from langchain_core.runnables import RunnableLambda
import httpx

async def fetch(url_dict):
    async with httpx.AsyncClient(timeout=10) as client:
        r = await client.get(url_dict["url"])
        r.raise_for_status()
        return r.text

async_chain = RunnableLambda(fetch)
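
If you have to keep a blocking client, the other option from the table above is to offload the call to a worker thread. A minimal sketch using asyncio.to_thread (Python 3.9+), with the same illustrative url_dict input:

# Alternative: offload the blocking call so the event loop stays responsive
from langchain_core.runnables import RunnableLambda
import asyncio
import requests

async def fetch_in_thread(url_dict):
    # requests.get runs in a worker thread; the event loop is never blocked
    return await asyncio.to_thread(
        lambda: requests.get(url_dict["url"], timeout=10).text
    )

async_chain = RunnableLambda(fetch_in_thread)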

Other Possible Causes

1) Recursive agent/tool loops

An agent can keep calling the same tool if your prompt encourages retries without a stop condition. You’ll see repeated tool calls and no final answer.

# Example symptom: AgentExecutor keeps looping on the same action
from langchain.agents import AgentExecutor

executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=3,
    early_stopping_method="force",
)

If max_iterations is too high or unset, a bad prompt/tool combo can look like a stuck chain.
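
AgentExecutor also accepts a wall-clock cap, max_execution_time, which turns a silent hang into a forced stop. A sketch extending the executor above:

executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=3,
    max_execution_time=30,  # seconds of wall-clock time before forcing a stop
    early_stopping_method="force",
)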

2) Callback handler recursion

A custom callback that triggers another chain run inside on_chain_start or on_llm_end can recurse into itself. That usually ends in hanging behavior or runaway execution.

from langchain_core.callbacks import BaseCallbackHandler

class BadHandler(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        # Don't invoke another chain here.
        other_chain.invoke(inputs)

Fix it by logging only, or pushing follow-up work to a separate queue/job.
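
For contrast, a side-effect-free handler might look like this (a sketch; the logger name is illustrative):

import logging

from langchain_core.callbacks import BaseCallbackHandler

logger = logging.getLogger("chain_events")

class SafeHandler(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        # Observe and record only; hand follow-up work to a queue/job runner.
        logger.info("chain started: %s", inputs)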

3) Missing stop tokens or unbounded generation

Some models keep generating because nothing tells them to stop. This is common with raw LLM calls or poorly configured agents.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    max_tokens=200,
)

# For some workflows, also set stop sequences explicitly.
result = llm.invoke("Draft the response")
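
# Sketch: stop sequences can also be passed per call; the stop string
# here ("\n\nEND") is purely illustrative.
result = llm.invoke("Draft the response", stop=["\n\nEND"])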

If you’re using an older LLMChain, verify your prompt includes clear termination instructions and your model wrapper supports stop sequences.

4) Deadlocks from sync/async misuse in notebook or server code

Calling the blocking .invoke() from code that is already running on an event loop, while other parts of the app use .ainvoke(), can produce hard-to-reproduce stalls. This is especially visible in FastAPI handlers and Jupyter notebooks.

# Bad pattern in async context:
result = chain.invoke({"question": "..."})  # blocking call inside async route

# Better:
result = await chain.ainvoke({"question": "..."})
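
In a FastAPI app, the async route should await the chain end to end. A minimal sketch (the route path and payload shape are illustrative, and chain is assumed to be defined elsewhere):

from fastapi import FastAPI

app = FastAPI()

@app.post("/ask")
async def ask(payload: dict):
    # Await the async path; a blocking chain.invoke() here would
    # stall the event loop for every concurrent request.
    answer = await chain.ainvoke({"question": payload["question"]})
    return {"answer": answer}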

How to Debug It

  1. Turn on LangChain tracing

    • Set verbose logging and inspect where execution stops.
    • If you use LangSmith, check which node last emitted an event.
    import logging
    logging.basicConfig(level=logging.INFO)
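
    # Optional: flip LangChain's global debug flag to print every step as it runs.
    from langchain.globals import set_debug
    set_debug(True)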
    
  2. Isolate each runnable

    • Run each step independently.
    • If RunnableSequence hangs at step 3, test step 3 alone with static input (see the sketch after this list).
  3. Add hard timeouts

    • Put timeouts on HTTP calls, database queries, and external APIs.
    • A “stuck” chain is often just waiting on upstream I/O.
  4. Reduce agent complexity

    • Temporarily remove tools, memory, and callbacks.
    • If the issue disappears, reintroduce components one by one until it breaks again.
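
A minimal sketch combining steps 2 and 3; my_sequence and sample_input are illustrative names:

import asyncio

# RunnableSequence exposes its parts via .steps, so you can run one in isolation.
step3 = my_sequence.steps[2]
print(step3.invoke(sample_input))  # does this step return on its own?

# Wrap the full run in a hard deadline so a hang raises TimeoutError
# instead of stalling silently.
async def run_with_deadline():
    return await asyncio.wait_for(my_sequence.ainvoke(sample_input), timeout=30)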

Prevention

  • Put timeouts on every external dependency used by tools: HTTP clients, DB drivers, queues.
  • Keep callbacks side-effect free; never start new chains from inside callback handlers.
  • Cap agent loops with max_iterations, and make termination conditions explicit in prompts.
  • Prefer .ainvoke() end-to-end in async apps instead of mixing sync and async execution paths.

By Cyprian Aarons, AI Consultant at Topiax.
