How to Fix 'connection timeout' in LangChain (Python)

By Cyprian Aarons
Updated 2026-04-21

If you’re seeing connection timeout in LangChain, the request is leaving your Python process but never getting a response back in time. In practice, this usually happens when LangChain is calling an LLM provider, embedding API, vector store, or internal service over the network and the default timeout is too low, the host is wrong, or the network path is blocked.

The annoying part: the stack trace often points at langchain_core, httpx, requests, or an SDK wrapper like openai, not your actual bug. The fix depends on which client LangChain is using under the hood.

The Most Common Cause

The #1 cause is a timeout that’s too aggressive for the request you’re making. This shows up a lot with long prompts, slow models, large embeddings batches, or first-call cold starts.

In LangChain, people often instantiate a model without setting any timeout or retry policy, then hit the client's default limits and see httpx.ReadTimeout or openai.APITimeoutError.

Broken vs fixed pattern

  • Broken: uses library defaults, gives no control over request duration, and fails on transient latency spikes.
  • Fixed: sets an explicit timeout and retry count, gives slow requests room to complete, and retries before failing hard.

# Broken
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke("Summarize this 20-page policy document.")
print(response.content)
# Fixed
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    timeout=60,      # seconds
    max_retries=3,
)

response = llm.invoke("Summarize this 20-page policy document.")
print(response.content)

If you’re using older LangChain integrations or a different provider, the same idea applies: set the timeout on the underlying client. For example, ChatOpenAI, OpenAIEmbeddings, and many vector store clients all accept timeout-related configuration somewhere in their constructor chain.

Other Possible Causes

1) Wrong base URL or endpoint

This happens when you point LangChain at a local proxy, Azure endpoint, private gateway, or custom OpenAI-compatible server and the URL is wrong.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.mycompany.internal/v1",  # check this
    api_key="...",
)

Common failure mode: DNS resolves, but the path hangs until timeout because /v1/chat/completions isn’t actually exposed there.
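You can check for exactly this before blaming LangChain by probing the endpoint directly. A minimal stdlib sketch; the URL is a placeholder for your own base_url:

```python
import urllib.error
import urllib.request

def probe(url: str, timeout: float = 5.0) -> str:
    """Return a one-line verdict on whether `url` is actually routed anywhere."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"reachable: HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # A 401/404 still proves the host and path are routed somewhere.
        return f"endpoint responded: HTTP {err.code}"
    except (urllib.error.URLError, TimeoutError) as err:
        # DNS failure, refused connection, or a hang until timeout.
        return f"unreachable: {err}"

# Placeholder URL: swap in your gateway's base_url plus /models.
print(probe("https://api.mycompany.internal/v1/models"))
```

An HTTP error code here is actually good news: it means the path exists and the problem is auth or routing, not the network.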

2) Network egress blocked by firewall/VPC rules

Your code works locally but times out in ECS, EKS, Lambda, Azure Functions, or a locked-down VM. That usually means outbound traffic to the provider is blocked.

curl -v https://api.openai.com/v1/models

If that hangs from the runtime environment but works on your laptop, it’s not a LangChain bug. It’s network policy.
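If curl isn't available in the runtime (slim containers often ship without it), a few lines of stdlib Python answer the same question from inside the deployment environment:

```python
import socket

def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, refused connection, and timeout alike.
        return False

print(can_reach("api.openai.com"))
```

If this returns False from the container but True from your laptop, you have your answer: egress policy, not LangChain.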

3) Proxy configuration missing or incorrect

Corporate networks often require HTTP(S) proxy settings. Python clients used by LangChain will hang if they can’t reach the internet directly.

export HTTPS_PROXY=http://proxy.company.local:8080
export HTTP_PROXY=http://proxy.company.local:8080

If your app runs inside Docker or Kubernetes, make sure those env vars are passed into the container.
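If you'd rather not depend on the shell environment being configured correctly, you can set the variables from Python before any HTTP client is created; both requests and httpx read proxy settings from the environment by default. The proxy host below is a placeholder:

```python
import os

# Placeholder proxy address: substitute your company's proxy.
PROXY = "http://proxy.company.local:8080"

# Must run before the first httpx/requests client is constructed,
# since proxy settings are read at client creation time.
os.environ.setdefault("HTTPS_PROXY", PROXY)
os.environ.setdefault("HTTP_PROXY", PROXY)
# Hosts that should bypass the proxy (e.g. internal services):
os.environ.setdefault("NO_PROXY", "localhost,127.0.0.1")
```

setdefault keeps any values already injected by the platform, so this is safe to leave in place across environments.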

4) Batch size too large for embeddings or document ingestion

Large batches can trigger slow responses from embedding APIs or vector stores.

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    chunk_size=1000,  # too large for some workloads
)

Try smaller batches:

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    chunk_size=64,
)
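If you ingest a corpus in one call, you can also batch it yourself and keep each request small. A sketch with a generic helper; the embed_documents call is commented out because it needs a live client:

```python
def batched(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Hypothetical usage against any embeddings client:
# vectors = []
# for batch in batched(docs, 64):
#     vectors.extend(embeddings.embed_documents(batch))

print(list(batched(["a", "b", "c", "d", "e"], 2)))
# → [['a', 'b'], ['c', 'd'], ['e']]
```

Smaller batches also make retries cheaper: a timeout costs you 64 documents of work, not 10,000.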

How to Debug It

  1. Read the exact exception class

    • Look for httpx.ReadTimeout, httpx.ConnectTimeout, requests.exceptions.Timeout, openai.APITimeoutError, or openai.APIConnectionError.
    • ConnectTimeout means the client never established a connection at all (DNS, firewall, or wrong host).
    • ReadTimeout means the connection was established but the server didn’t send a response in time (slow model, large payload).
  2. Isolate LangChain from your app

    • Run one direct call with minimal code.
    • Remove chains, tools, retrievers, agents, and memory.
    • If this still fails, it’s transport/config/network related.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(timeout=60)
print(llm.invoke("Say hello").content)
  3. Test connectivity outside Python

    • Use curl against the same endpoint.
    • Check DNS resolution and proxy behavior.
    • If you’re using Azure/OpenAI-compatible endpoints, verify the full base URL path.
  4. Turn on verbose logs

    • Enable LangChain’s debug mode (from langchain.globals import set_debug; set_debug(True)) and inspect request timing.
    • For HTTP clients like httpx, enable debug logs to see whether it fails on connect or read.
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)

Prevention

  • Set explicit timeouts everywhere you call external services.
    • Don’t rely on library defaults for production workloads.
  • Use retries with backoff for transient failures.
    • A single slow response should not take down an agent flow.
  • Keep batch sizes small for embeddings and ingestion jobs.
    • Large payloads are one of the fastest ways to hit read timeouts.
  • Validate network access in deployment environments before shipping.
    • Local success means nothing if your container can’t reach the provider.
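The retry advice can be made concrete with a small stdlib wrapper. This is a sketch, not a substitute for the retry support built into clients like ChatOpenAI (max_retries); in production you would catch the specific timeout classes rather than bare Exception:

```python
import random
import time

def with_backoff(call, attempts=3, base=0.5, max_delay=8.0):
    """Retry `call` on failure, sleeping base * 2**attempt (plus jitter) between tries."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the real error
            delay = min(max_delay, base * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay / 10))

# Hypothetical usage with an already-constructed llm:
# response = with_backoff(lambda: llm.invoke("Say hello"))
```

The jitter matters in fleets: without it, every worker retries at the same instant and the spike that caused the timeout repeats itself.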

If you want one rule to remember: most LangChain timeout errors are not “LangChain problems.” They’re either bad network reachability or missing timeout tuning on the underlying client.



By Cyprian Aarons, AI Consultant at Topiax.
