How to Fix 'connection timeout during development' in AutoGen (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: connection-timeout-during-development · autogen · python

If you’re seeing connection timeout during development in AutoGen, it usually means your agent tried to reach an LLM endpoint, tool server, or local backend and never got a response before the timeout window expired. In practice, this shows up most often during local development when the API base URL is wrong, the service isn’t running, or the timeout is too aggressive for the request path.

The annoying part is that AutoGen often wraps the underlying exception, so you may see a generic failure from AssistantAgent, OpenAIChatCompletionClient, or ConversableAgent while the real issue is one layer below.
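When that happens, it helps to walk the exception chain yourself. Here's a minimal sketch; the agent invocation inside the `try` block is illustrative, so adapt it to however you actually call your agent:

```python
def print_exception_chain(exc: BaseException) -> None:
    """Print each exception in the chain, outermost first, to find the root cause."""
    current: BaseException | None = exc
    while current is not None:
        print(f"{type(current).__module__}.{type(current).__name__}: {current}")
        current = current.__cause__ or current.__context__

try:
    ...  # your agent call, e.g. asyncio.run(agent.run(task="hello"))
except Exception as exc:
    print_exception_chain(exc)
    raise
```

Usually the innermost entry is an httpx connect/read timeout that names the endpoint actually being hit.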

The Most Common Cause

The #1 cause is a bad endpoint configuration: wrong base_url, wrong port, or pointing AutoGen at a service that isn’t actually listening.

This happens a lot when developers switch between OpenAI, Azure OpenAI, Ollama, LiteLLM, or a local proxy and keep an old config around.

Broken vs fixed

Broken pattern:

```python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:9999/v1",  # nothing is listening here
)

agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
)
```

Fixed pattern:

```python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",  # example: Ollama/OpenAI-compatible server
)

agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
)
```

If the endpoint is wrong, AutoGen may surface errors like:

- `httpx.ConnectTimeout`
- `httpx.ReadTimeout`
- `openai.APIConnectionError`
- `ConnectionError: connection timeout during development`

If you’re using an OpenAI-compatible local server, verify the exact URL with a direct request first:

```bash
curl http://localhost:11434/v1/models
```

If that hangs or fails, AutoGen will fail too.
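
You can run the same check from Python if that's more convenient. A minimal sketch using httpx, the HTTP library the OpenAI SDK uses under the hood:

```python
import httpx

# Probe the endpoint directly before involving AutoGen.
try:
    r = httpx.get("http://localhost:11434/v1/models", timeout=5)
    print(r.status_code, r.text[:200])
except httpx.HTTPError as e:
    print(f"Endpoint unreachable: {type(e).__name__}: {e}")
```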

Other Possible Causes

1. Your timeout is too low

Some models take longer on first request because they’re cold-starting or loading weights.

```python
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",
    timeout=5,  # too short for cold starts
)
```

Fix it by increasing the timeout:

```python
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="sk-test",
    base_url="http://localhost:11434/v1",
    timeout=60,  # allows for model load on the first request
)
```

2. The local model server is overloaded or still starting

This is common with Ollama, vLLM, LM Studio, and custom FastAPI wrappers.

```bash
# Server starts but isn't ready yet
ollama serve
# Your AutoGen script runs immediately after this
```

Wait until the health check passes before creating the agent. For example:

```python
import time
import requests

# Poll the health endpoint for up to ~30 seconds.
for _ in range(30):
    try:
        r = requests.get("http://localhost:11434/v1/models", timeout=2)
        if r.ok:
            break
    except requests.RequestException:
        pass
    time.sleep(1)  # sleep on errors and non-2xx responses alike
else:
    raise RuntimeError("Model server never became ready")
```

3. DNS, proxy, or firewall issues

If you’re calling a remote endpoint from inside a corporate network or containerized dev environment, traffic may be blocked.

```python
base_url = "https://api.example-llm.com/v1"
# works from the laptop's browser but fails in Docker due to proxy/firewall rules
```

Check these variables:

```bash
echo $HTTP_PROXY
echo $HTTPS_PROXY
echo $NO_PROXY
```

Also verify your container can resolve and reach the host:

```bash
curl -I https://api.example-llm.com/v1/models
```
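
To test whether proxy environment variables are what's breaking the connection, compare requests with and without them. A minimal sketch (httpx honors `HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY` when `trust_env` is true):

```python
import httpx

URL = "http://localhost:11434/v1/models"  # or your remote endpoint

# Compare behavior with proxy env vars applied vs. ignored.
for trust_env in (True, False):
    try:
        r = httpx.get(URL, timeout=5, trust_env=trust_env)
        print(f"trust_env={trust_env}: HTTP {r.status_code}")
    except httpx.HTTPError as e:
        print(f"trust_env={trust_env}: {type(e).__name__}: {e}")
```

If the request only succeeds with `trust_env=False`, your proxy settings are the culprit; add the host to `NO_PROXY` or fix the proxy config.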

4. You’re mixing incompatible client and provider settings

A common mistake is using an OpenAI client against an Azure endpoint without Azure-specific configuration.

```python
import os

from autogen_ext.models.openai import OpenAIChatCompletionClient

# Wrong: plain OpenAI client pointed at an Azure-style deployment URL
OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
)
```

Use the provider-specific client/config instead of forcing everything through one class.
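
For example, with AutoGen's Azure-specific client (a sketch; the deployment name and API version are placeholders for your own Azure resource):

```python
import os

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

model_client = AzureOpenAIChatCompletionClient(
    model="gpt-4o-mini",
    azure_deployment="my-gpt4o-mini-deployment",  # placeholder: your deployment name
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # placeholder: use your resource's API version
)
```

The Azure client takes the deployment name and API version explicitly, which the plain OpenAI client has no clean way to express through `base_url` alone.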

How to Debug It

1. Reproduce outside AutoGen

   - Hit the same endpoint with curl or a tiny Python script (see the sketch after this list).
   - If this fails, stop blaming AutoGen; the problem is network- or server-side.

2. Print the final resolved config

   - Log `base_url`, `model`, and `timeout` before instantiating `AssistantAgent`.
   - I've seen plenty of "timeout" bugs caused by loading the wrong `.env`.

3. Increase logging around the HTTP client

   - If you're using httpx, enable debug logs:

   ```python
   import logging

   logging.basicConfig(level=logging.DEBUG)
   logging.getLogger("httpx").setLevel(logging.DEBUG)
   ```

4. Test with a known-good endpoint

   - Point the same code at a stable provider.
   - If it works there but not locally, your issue is local networking or server readiness.
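
For step 1, here's a tiny standalone script that exercises the same request path AutoGen would use. A minimal sketch using the plain OpenAI SDK; swap in your own base URL, key, and model:

```python
from openai import OpenAI

# Point this at exactly the same endpoint your AutoGen config uses.
client = OpenAI(
    api_key="sk-test",
    base_url="http://localhost:11434/v1",
    timeout=30,
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # substitute your server's model name
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```

If this times out too, the problem is the endpoint, not AutoGen.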

Prevention

- Use explicit health checks before creating `AssistantAgent` or `ConversableAgent`.
- Keep provider configs separate per environment: `.env.local`, `.env.dev`, `.env.prod`.
- Set sane defaults for timeouts and retry behavior in your model client wrapper.
- Validate endpoints in CI with a small smoke test that calls `/v1/models` or an equivalent health route (see the sketch below).
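
For that last bullet, a minimal pytest-style smoke test (a sketch; `LLM_BASE_URL` is a hypothetical environment variable you'd set per environment):

```python
import os

import requests

def test_llm_endpoint_is_reachable():
    """Fail fast in CI if the configured LLM backend is down or unreachable."""
    base_url = os.environ.get("LLM_BASE_URL", "http://localhost:11434/v1")
    r = requests.get(f"{base_url}/models", timeout=10)
    assert r.ok, f"LLM endpoint unhealthy: HTTP {r.status_code}"
```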

If you want to avoid wasting time on this class of bug entirely, treat your LLM backend like any other dependency: verify it’s reachable, verify it’s ready, then wire AutoGen into it. That’s what keeps “connection timeout during development” from becoming a daily ritual.


By Cyprian Aarons, AI Consultant at Topiax.