How to Fix 'connection timeout during development' in CrewAI (Python)

By Cyprian Aarons · Updated 2026-04-21

What the error means

connection timeout during development in CrewAI usually means one of your agents tried to call an external service and never got a response before the request deadline. In practice, this shows up when you’re using Crew, Agent, Task, or tools that depend on network calls like OpenAI, Serper, Firecrawl, or internal APIs.

It typically happens during local development when your network is flaky, your API key is wrong, the model endpoint is slow, or your tool code blocks the event loop with a long-running request.

The Most Common Cause

The #1 cause is a tool or LLM call without an explicit timeout and retry strategy. CrewAI will surface the failure inside a task run, but the real issue is usually a slow downstream dependency.

Here’s the broken pattern I see most often:

Broken → Fixed

  • Tool makes a raw request with no timeout → Tool sets a timeout and handles retries
  • Agent runs against default settings only → Agent uses explicit model config
  • Task hangs until the connection dies → Task fails fast with useful errors

# BROKEN: no timeout, no retry, blocking network call
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
import requests

class CompanyLookupTool(BaseTool):
    name: str = "company_lookup"
    description: str = "Fetch company data"

    def _run(self, domain: str) -> str:
        # This can hang indefinitely if the API is slow
        response = requests.get(f"https://api.example.com/company?domain={domain}")
        return response.text

agent = Agent(
    role="Researcher",
    goal="Find company info",
    backstory="You research companies.",
    tools=[CompanyLookupTool()],
)

task = Task(
    description="Look up company details for example.com",
    expected_output="A short summary of the company",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task])
crew.kickoff()
# FIXED: explicit timeout + retries + clearer failure mode
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=[429, 500, 502, 503, 504],
)
session.mount("https://", HTTPAdapter(max_retries=retries))

class CompanyLookupTool(BaseTool):
    name: str = "company_lookup"
    description: str = "Fetch company data"

    def _run(self, domain: str) -> str:
        try:
            response = session.get(
                f"https://api.example.com/company?domain={domain}",
                timeout=10,
            )
            response.raise_for_status()
            return response.text
        except requests.Timeout as e:
            raise RuntimeError("company_lookup timed out after 10s") from e
        except requests.RequestException as e:
            raise RuntimeError(f"company_lookup failed: {e}") from e

agent = Agent(
    role="Researcher",
    goal="Find company info",
    backstory="You research companies.",
    tools=[CompanyLookupTool()],
)

task = Task(
    description="Look up company details for example.com",
    expected_output="A short summary of the company",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task])
crew.kickoff()

If you’re calling an LLM through an environment variable like OPENAI_API_KEY, the same pattern applies. A bad key or a dead endpoint often presents as a timeout before you ever see a clean auth error.
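You can verify the key and endpoint directly before blaming CrewAI. A rough sketch, assuming the standard OpenAI REST endpoint (adjust the URL if you route through a proxy or a compatible gateway):

```python
# Sanity-check the API key outside CrewAI: a 200 means auth and network are fine,
# a timeout points at the network path, not at your crew.
import os
import requests

def check_openai_key(timeout: float = 10.0) -> bool:
    """Return True if the key authenticates against /v1/models within the timeout."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        print("OPENAI_API_KEY is not set")
        return False
    try:
        resp = requests.get(
            "https://api.openai.com/v1/models",
            headers={"Authorization": f"Bearer {key}"},
            timeout=timeout,
        )
        print(f"status: {resp.status_code}")
        return resp.status_code == 200
    except requests.Timeout:
        print(f"timed out after {timeout}s -- network path, not auth")
        return False
```

If this returns False with a non-200 status, fix credentials first; if it times out, look at DNS, proxy, or firewall rules.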

Other Possible Causes

1. Wrong model provider or missing API key

If OPENAI_API_KEY, ANTHROPIC_API_KEY, or your provider-specific env vars are missing, CrewAI may fail while trying to resolve the model backend.

export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1"

If you’re using Ollama locally:

llm_config = {
    "model": "ollama/llama3.1",
    "base_url": "http://localhost:11434",
}

If Ollama isn’t running, you’ll get a timeout that looks like networking trouble but is really just a dead local server.
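A cheap pre-flight check avoids that wait entirely. A sketch assuming Ollama's default port (11434); adjust `base_url` if yours differs:

```python
# Fail fast if the local Ollama server is down, instead of waiting out a timeout
# deep inside an agent run.
import requests

def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama server answers on its root endpoint."""
    try:
        return requests.get(base_url, timeout=timeout).ok
    except requests.RequestException:
        return False
```

Call this before `crew.kickoff()` so a dead local server is reported in two seconds, not after a full request deadline.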

2. A tool that blocks too long

A custom tool doing file scans, browser automation, or database work can exceed the task window.

class SlowTool(BaseTool):
    name: str = "slow_tool"
    description: str = "Does slow work"

    def _run(self) -> str:
        # bad: sleeps or loops too long without progress
        import time
        time.sleep(120)
        return "done"

Fix it by reducing runtime or moving heavy work outside the agent path.
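One way to bound a blocking tool is to push the heavy call into a worker thread with a hard deadline. A sketch, not CrewAI-specific; note that `shutdown(wait=False)` only stops waiting — a truly hung thread is not killed, so this converts a hang into a fast, visible error rather than freeing the resource:

```python
# Run a blocking function with a hard deadline so a slow tool fails fast
# instead of stalling the whole task.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def run_with_deadline(fn, *args, deadline: float = 15.0):
    """Run fn(*args) in a worker thread; raise RuntimeError past the deadline."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=deadline)
    except FutureTimeout:
        raise RuntimeError(f"tool call exceeded {deadline}s deadline") from None
    finally:
        pool.shutdown(wait=False)  # don't block on a hung worker
```

Inside a tool's `_run`, wrap the slow inner call: `return run_with_deadline(scan_files, path, deadline=15.0)` (where `scan_files` is a hypothetical stand-in for your heavy work).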

3. Proxy / VPN / firewall interference

Corporate networks often block outbound calls to model providers.

curl -I https://api.openai.com/v1/models

If this hangs locally but works on mobile hotspot, your network path is the problem. Set proxy variables if needed:

export HTTPS_PROXY=http://proxy.company.local:8080
export HTTP_PROXY=http://proxy.company.local:8080

4. Rate limiting disguised as timeout

Some providers throttle hard enough that your client just waits and eventually times out.

# Example symptom: repeated 429s under load
agent = Agent(
    role="Analyst",
    goal="Summarize data",
    backstory="You analyze reports.",
)

Add backoff and lower concurrency in your app layer. Don’t fire multiple crews at once against the same account limits.
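A minimal backoff helper for the app layer might look like this. It is a sketch: real code should catch only the provider's rate-limit exception rather than bare `Exception`, and the wrapped call (`call_llm` or similar) is whatever your client invokes:

```python
# Retry a flaky call with exponential backoff: waits 0.5s, 1s, 2s, ... between
# attempts, then re-raises if every attempt fails.
import time

def with_backoff(fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Call fn() with exponential backoff; re-raise after max_attempts failures."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Pair this with a concurrency cap (e.g. one crew at a time per API account) so retries don't amplify the very load that triggered the 429s.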

How to Debug It

  1. Isolate the failing layer

    • Run the LLM call outside CrewAI.
    • Run the tool function directly.
    • If a plain requests.get(..., timeout=10) call times out on its own, the problem isn’t CrewAI.
  2. Turn on verbose logging

    • Check whether the failure happens in Agent, Task, or inside a custom tool.
    • Look for messages like ReadTimeout, ConnectTimeout, 429 Too Many Requests, or provider-specific SDK errors.
  3. Test connectivity manually

    curl -v https://api.openai.com/v1/models \
      -H "Authorization: Bearer $OPENAI_API_KEY"
    

    If this fails outside Python, fix DNS, proxy, firewall rules, or credentials first.

  4. Reduce to one agent and one task

    • Remove all tools.
    • Use a single simple prompt.
    • Then add components back one by one until it breaks again.

Prevention

  • Always set explicit timeouts on every network call inside tools.
  • Add retries with exponential backoff for transient failures like 429 and 503.
  • Keep local dependencies healthy: verify model servers like Ollama before starting your crew.
  • Treat external APIs as unreliable by default; design tasks to fail fast and surface the exact dependency that broke.

By Cyprian Aarons, AI Consultant at Topiax.