How to Fix 'connection timeout during development' in LangGraph (Python)
A connection timeout during development in LangGraph usually means your graph tried to call something external and never got a response before the client or server timed out. In practice, this shows up when you’re running a local graph, invoking a tool, or streaming from an API that is slow, unreachable, misconfigured, or blocked by networking issues.
The annoying part is that the error often looks like a LangGraph problem, but the root cause is usually in your model client, tool code, or runtime configuration.
The Most Common Cause
The #1 cause is an LLM or tool call that hangs because the request has no sane timeout, uses the wrong base URL, or points to a service that isn't actually reachable from your dev process.
This happens a lot with ChatOpenAI, AzureChatOpenAI, local Ollama endpoints, proxy layers, or custom tools inside a StateGraph. The graph waits, the underlying HTTP client waits longer, and eventually you get something like:
- `httpx.ConnectTimeout`
- `httpx.ReadTimeout`
- `openai.APITimeoutError`
- `langgraph.errors.GraphExecutionError: Error while executing node ...`
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| No explicit timeout | Set explicit connect/read timeouts |
| Tool/LLM endpoint assumed reachable | Verify base URL and health first |
| Graph node blocks forever | Fail fast and surface the real exception |
```python
# BROKEN: no timeout control, and the endpoint may not be reachable
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph

class State(TypedDict):
    messages: list

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://localhost:11434/v1",  # wrong for your environment? it will hang/fail later
)

def call_model(state: State):
    response = llm.invoke(state["messages"])
    return {"messages": state["messages"] + [response]}

graph = StateGraph(State)
graph.add_node("call_model", call_model)
graph.set_entry_point("call_model")
graph.set_finish_point("call_model")
app = graph.compile()

result = app.invoke({"messages": [{"role": "user", "content": "hello"}]})
```
```python
# FIXED: explicit timeouts + validate connectivity before compiling the graph
from typing import TypedDict

import httpx
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph

class State(TypedDict):
    messages: list

# Bound every phase of the request: connect, read, write, and pool acquisition.
client = httpx.Client(
    timeout=httpx.Timeout(connect=5.0, read=30.0, write=10.0, pool=5.0)
)

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://localhost:11434/v1",
    http_client=client,
    # ChatOpenAI also accepts timeout= and max_retries= directly
    # if you'd rather not manage your own httpx client.
)

def call_model(state: State):
    response = llm.invoke(state["messages"])
    return {"messages": state["messages"] + [response]}

# Fail fast outside the graph: if the endpoint is down, find out now.
health = httpx.get("http://localhost:11434/api/tags", timeout=5.0)
health.raise_for_status()

graph = StateGraph(State)
graph.add_node("call_model", call_model)
graph.set_entry_point("call_model")
graph.set_finish_point("call_model")
app = graph.compile()

result = app.invoke({"messages": [{"role": "user", "content": "hello"}]})
```
If you’re using LangGraph Cloud or a remote runtime, this same pattern applies. A node that calls out to another service should have its own timeout and retry policy instead of relying on the default socket behavior.
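For example, a node can own its deadline and retry loop instead of inheriting whatever the transport defaults to. Here is a minimal sketch, assuming a synchronous node and a hypothetical enrichment service; the URL, the `query`/`enrichment` state keys, and the backoff values are all placeholders, not part of any LangGraph API:

```python
import time

import httpx

def call_enrichment_service(state):
    # Node with its own timeout and retry policy (sketch).
    last_exc = None
    for attempt in range(3):
        try:
            r = httpx.get(
                "http://enrichment-service/v1/enrich",  # hypothetical service URL
                params={"q": state["query"]},
                timeout=httpx.Timeout(connect=5.0, read=15.0, write=5.0, pool=5.0),
            )
            r.raise_for_status()
            return {"enrichment": r.json()}
        except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
            last_exc = exc
            if attempt < 2:
                time.sleep(2 ** attempt)  # 1s, then 2s backoff before retrying
    raise last_exc  # fail fast with the real cause instead of hanging
```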
Other Possible Causes
1) Your tool function is blocking on network I/O
A tool that calls an internal API without a timeout will stall the whole node.
```python
# BAD: no timeout, so a stalled API stalls the whole node
import requests

def lookup_customer(customer_id: str):
    return requests.get(f"https://internal-api/customers/{customer_id}").json()
```

```python
# GOOD: bounded connect/read timeouts and explicit error handling
import requests

def lookup_customer(customer_id: str):
    r = requests.get(
        f"https://internal-api/customers/{customer_id}",
        timeout=(3.0, 10.0),  # (connect timeout, read timeout) in seconds
    )
    r.raise_for_status()
    return r.json()
```
2) You’re streaming from a model server that doesn’t support your client settings
This often shows up as httpx.ReadTimeout when using local inference servers behind Docker or reverse proxies.
```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="http://127.0.0.1:8000/v1",
    streaming=True,
)
```
If the server buffers responses or drops chunked transfer encoding, streaming can look like a timeout even though the endpoint is alive.
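One way to separate server behavior from graph behavior is to watch the raw stream with `httpx` alone. A minimal sketch, assuming an OpenAI-compatible chat endpoint; the URL, model name, and payload are illustrative:

```python
import httpx

# If chunks arrive incrementally, the server streams fine and the problem
# is in your client settings; if everything lands at once (or times out),
# the proxy or server is buffering.
with httpx.stream(
    "POST",
    "http://127.0.0.1:8000/v1/chat/completions",
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "hello"}],
        "stream": True,
    },
    timeout=httpx.Timeout(connect=5.0, read=60.0, write=5.0, pool=5.0),
) as response:
    for line in response.iter_lines():
        print(line)  # SSE lines should appear one by one, not in a single burst
```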
3) Docker networking is wrong during development
If LangGraph runs inside one container and your model server runs on the host machine, localhost points to the container itself.
```yaml
# docker-compose.yml snippet
services:
  app:
    build: .
    environment:
      OPENAI_BASE_URL: http://localhost:11434/v1  # wrong inside the container
```
Use the host gateway or service name instead:
```yaml
services:
  app:
    build: .
    environment:
      OPENAI_BASE_URL: http://host.docker.internal:11434/v1
```
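Note that on Linux, `host.docker.internal` is not defined by default; a common fix is mapping it to the host gateway in Compose (supported since Docker Engine 20.10):

```yaml
services:
  app:
    build: .
    extra_hosts:
      - "host.docker.internal:host-gateway"  # resolve the name to the host on Linux
    environment:
      OPENAI_BASE_URL: http://host.docker.internal:11434/v1
```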
4) Your graph node does too much work synchronously
Long CPU-bound work inside a node can make it look like a connection issue because nothing returns before upstream timeouts expire.
```python
def expensive_node(state):
    # BAD: blocking CPU work before any response path exists
    data = run_full_pdf_parse_and_embedding_pipeline(state["file_path"])
    return {"data": data}
```
Move heavy work out of request-time execution, or wrap it in background jobs with status polling, as in the sketch below.
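Here is a minimal in-process sketch of that pattern, reusing the placeholder pipeline function from above; a production setup would typically use a real task queue (Celery, RQ, Arq) rather than a process pool:

```python
import uuid
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor()
jobs: dict = {}  # job_id -> Future

def submit_node(state):
    # Kick off the heavy work and return immediately with a handle.
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(
        run_full_pdf_parse_and_embedding_pipeline, state["file_path"]
    )
    return {"job_id": job_id}

def status_node(state):
    # A later invocation (or a separate graph) polls for completion.
    future = jobs[state["job_id"]]
    if future.done():
        return {"status": "done", "data": future.result()}
    return {"status": "pending"}
```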
How to Debug It

1) Isolate the failing node

- Run each LangGraph node as a plain Python function.
- If `app.invoke()` fails with `langgraph.errors.GraphExecutionError`, inspect the nested exception (see the sketch at the end of this section).
- You want the real cause: `httpx.ConnectTimeout`, `ReadTimeout`, or a DNS failure.

2) Test every external dependency directly

- Call your LLM endpoint with `curl` or `httpx`.
- Call your tool's backend API outside LangGraph.
- If the direct call hangs, LangGraph is just reporting it later.

3) Print timing around each node

```python
import time

def timed_node(state):
    start = time.perf_counter()
    try:
        return some_node(state)
    finally:
        print(f"some_node took {time.perf_counter() - start:.2f}s")
```

- This tells you whether the delay is in model inference, network I/O, or post-processing.

4) Turn on transport-level logging

```python
import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)
```

- Look for DNS failures, refused connections, proxy errors, and read stalls.
- If you see retries with no progress, your timeout is too generous or missing entirely.
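As promised in step 1, here is one way to dig the real exception out of a wrapped graph error. A minimal sketch assuming standard Python exception chaining; the exact wrapper type depends on your LangGraph version:

```python
try:
    result = app.invoke({"messages": [{"role": "user", "content": "hello"}]})
except Exception as exc:
    # Walk the chain down to the root cause (e.g. httpx.ConnectTimeout).
    cause = exc
    while cause.__cause__ or cause.__context__:
        cause = cause.__cause__ or cause.__context__
    print(f"wrapper: {type(exc).__name__}, root cause: {type(cause).__name__}")
    raise
```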
Prevention

- Set explicit timeouts on every outbound HTTP client used by tools and model wrappers.
- Validate service health before compiling or invoking your graph.
- Keep nodes small and deterministic; push long-running work into background jobs.
- In Docker-based dev environments, never assume `localhost` means what you think it means.
- Log node duration and nested exceptions so `langgraph.errors.GraphExecutionError` doesn't hide the root cause.
If you fix only one thing first, fix timeouts on every network hop inside your graph. That resolves most “connection timeout during development” cases in LangGraph Python projects.
Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies

By Cyprian Aarons, AI Consultant at Topiax.