How to Fix 'connection timeout during development' in AutoGen (Python)
If you’re seeing connection timeout during development in AutoGen, it usually means your agent tried to reach an LLM endpoint, tool server, or local backend and never got a response before the timeout window expired. In practice, this shows up most often during local development when the API base URL is wrong, the service isn’t running, or the timeout is too aggressive for the request path.
The annoying part is that AutoGen often wraps the underlying exception, so you may see a generic failure from AssistantAgent, OpenAIChatCompletionClient, or ConversableAgent while the real issue is one layer below.
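When that happens, the fastest way to see the real error is to walk the exception chain yourself. A minimal stdlib sketch (the `root_cause` helper is illustrative, not an AutoGen API; the nested `raise ... from` below just simulates the kind of wrapping an agent layer does):

```python
def root_cause(exc: BaseException) -> BaseException:
    """Walk the __cause__/__context__ chain down to the original exception."""
    while exc.__cause__ is not None or exc.__context__ is not None:
        exc = exc.__cause__ or exc.__context__
    return exc

# Simulated wrapped failure, similar in shape to what an agent layer raises:
try:
    try:
        raise TimeoutError("connection timed out")
    except TimeoutError as inner:
        raise RuntimeError("agent run failed") from inner
except RuntimeError as wrapped:
    print(type(root_cause(wrapped)).__name__)  # TimeoutError
```

Logging `repr(root_cause(e))` in your own error handler usually tells you immediately whether you are looking at a connect timeout, a read timeout, or something else entirely.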
The Most Common Cause
The #1 cause is a bad endpoint configuration: wrong base_url, wrong port, or pointing AutoGen at a service that isn’t actually listening.
This happens a lot when developers switch between OpenAI, Azure OpenAI, Ollama, LiteLLM, or a local proxy and keep an old config around.
Broken vs fixed

Broken pattern:

```python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:9999/v1",  # nothing is listening here
)

agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
)
```

Fixed pattern:

```python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",  # example: Ollama/OpenAI-compatible server
)

agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
)
```
If the endpoint is wrong, AutoGen may surface errors like:
- `httpx.ConnectTimeout`
- `httpx.ReadTimeout`
- `openai.APIConnectionError`
- `ConnectionError: connection timeout during development`
If you’re using an OpenAI-compatible local server, verify the exact URL with a direct request first:
```bash
curl http://localhost:11434/v1/models
```

If that hangs or fails, AutoGen will fail too.
Other Possible Causes
1. Your timeout is too low
Some models take longer on first request because they’re cold-starting or loading weights.
```python
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",
    timeout=5,  # too short for cold starts
)
```
Fix it by increasing the timeout:

```python
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",
    timeout=60,  # generous enough for a cold start
)
```
2. The local model server is overloaded or still starting
This is common with Ollama, vLLM, LM Studio, and custom FastAPI wrappers.
```bash
# Server starts but isn't ready yet
ollama serve
# Your AutoGen script runs immediately after this
```
Wait until the health check passes before creating the agent. For example:
```python
import time
import requests

for _ in range(30):
    try:
        r = requests.get("http://localhost:11434/v1/models", timeout=2)
        if r.ok:
            break
    except requests.RequestException:
        pass
    time.sleep(1)  # sleep on any non-ready response, not just exceptions
else:
    raise RuntimeError("Model server never became ready")
```
3. DNS, proxy, or firewall issues
If you’re calling a remote endpoint from inside a corporate network or containerized dev environment, traffic may be blocked.
```python
base_url = "https://api.example-llm.com/v1"
# works on laptop browser but fails in Docker due to proxy/firewall rules
```
Check these environment variables:

```bash
echo $HTTP_PROXY
echo $HTTPS_PROXY
echo $NO_PROXY
```

Also verify your container can resolve and reach the host:

```bash
curl -I https://api.example-llm.com/v1/models
```
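You can run the same checks from inside Python, which is useful when the container has no `curl`. A stdlib-only sketch (the `diagnose` helper is illustrative; pass the host of your actual endpoint):

```python
import os
import socket

def diagnose(host: str) -> None:
    """Print the proxy settings this process sees and whether `host` resolves."""
    for var in ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
        print(f"{var}={os.environ.get(var, '<unset>')}")
    # Check DNS resolution independently of any HTTP client
    try:
        addrs = sorted({info[4][0] for info in socket.getaddrinfo(host, 443)})
        print(f"{host} resolves to: {addrs}")
    except socket.gaierror as e:
        print(f"DNS resolution failed for {host}: {e}")

diagnose("localhost")  # in practice, pass your API host, e.g. "api.example-llm.com"
```

If DNS resolves but the `curl` above still hangs, the problem is almost always a proxy or firewall rule rather than AutoGen.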
4. You’re mixing incompatible client and provider settings
A common mistake is using an OpenAI client against an Azure endpoint without Azure-specific configuration.
```python
import os

from autogen_ext.models.openai import OpenAIChatCompletionClient

# Wrong: plain OpenAI client pointed at an Azure-style deployment URL
OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
)
```
Use the provider-specific client/config instead of forcing everything through one class.
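For Azure specifically, recent `autogen-ext` releases ship an Azure-aware client. The class and parameter names below are a sketch from memory and may differ by version, so verify against the documentation for your installed release:

```python
import os

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

model_client = AzureOpenAIChatCompletionClient(
    model="gpt-4o-mini",
    azure_deployment="my-gpt4o-mini-deployment",  # your deployment name (placeholder)
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-06-01",  # check which version your resource supports
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
```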
How to Debug It

- Reproduce outside AutoGen. Hit the same endpoint with `curl` or a tiny Python script. If that fails, stop blaming AutoGen; it’s network or server-side.
- Print the final resolved config. Log `base_url`, `model`, and `timeout` before instantiating `AssistantAgent`. I’ve seen plenty of “timeout” bugs caused by loading the wrong `.env`.
- Increase logging around the HTTP client. If you’re using `httpx`, enable debug logs:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
```

- Test with a known-good endpoint. Point the same code at a stable provider. If it works there but not locally, your issue is local networking or server readiness.
Prevention

- Use explicit health checks before creating `AssistantAgent` or `ConversableAgent`.
- Keep provider configs separate per environment: `.env.local`, `.env.dev`, `.env.prod`.
- Set sane defaults for timeouts and retry behavior in your model client wrapper.
- Validate endpoints in CI with a small smoke test that calls `/v1/models` or an equivalent health route.
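That smoke test can be a few lines of stdlib Python. A sketch, assuming an OpenAI-compatible `/models` route (the `endpoint_healthy` name is illustrative):

```python
import urllib.request

def endpoint_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, connection refused, timeouts
        return False

# In CI, fail the build early instead of debugging agent-level timeouts later:
# assert endpoint_healthy("http://localhost:11434/v1"), "model endpoint not ready"
```

A failed assertion here points straight at the backend, which is a much better starting point than a wrapped timeout buried in an agent stack trace.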
If you want to avoid wasting time on this class of bug entirely, treat your LLM backend like any other dependency: verify it’s reachable, verify it’s ready, then wire AutoGen into it. That’s what keeps “connection timeout during development” from becoming a daily ritual.
Keep learning

- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.