CrewAI Tutorial (Python): Testing Agents Locally for Intermediate Developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to run and test a CrewAI agent locally in Python without wiring up your full production stack. You need this when you want fast feedback on prompts, tools, and task behavior before shipping anything behind an API or into a larger workflow.

What You'll Need

  • Python 3.10+
  • A virtual environment tool like venv
  • crewai
  • crewai-tools
  • python-dotenv
  • An OpenAI API key set as OPENAI_API_KEY
  • Basic familiarity with CrewAI agents, tasks, and crews

Step-by-Step

  1. Start with a clean project and install the packages you need. Keep this isolated so you can change models and tools without breaking other projects.
mkdir crewai-local-testing
cd crewai-local-testing
python -m venv .venv
source .venv/bin/activate

pip install crewai crewai-tools python-dotenv
  2. Add your API key to a local .env file. This keeps credentials out of source control and makes local testing repeatable.
cat > .env << 'EOF'
OPENAI_API_KEY=your_openai_api_key_here
EOF
  3. Create a small agent, task, and crew in one file. The important part for local testing is that the setup is minimal, deterministic, and easy to run from the terminal.
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process
from crewai.llm import LLM

load_dotenv()  # pull OPENAI_API_KEY from .env into the environment

llm = LLM(model="gpt-4o-mini")  # swap the model string here to test others

researcher = Agent(
    role="Research Analyst",
    goal="Summarize the given topic clearly",
    backstory="You are precise and concise.",
    llm=llm,
    verbose=True,
)

task = Task(
    description="Explain what CrewAI is in 3 bullet points.",
    expected_output="Three concise bullet points.",
    agent=researcher,
)

crew = Crew(
    agents=[researcher],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)
  4. Run the crew locally and print the result. For intermediate testing, this is enough to validate prompt quality, model access, and basic execution flow.
if __name__ == "__main__":
    result = crew.kickoff()
    print("\n=== Crew Output ===\n")
    print(result)
  5. Add a local smoke test so you can rerun it quickly after changing prompts or tools. This avoids guessing whether a failure came from your code or from model output drift.
def smoke_test():
    output = crew.kickoff()
    text = str(output).lower()

    # A bare length check would pass on any garbage output, so also
    # require a topical keyword to catch empty or off-task responses.
    assert len(text) > 20 and ("crew" in text or "agent" in text)
    print("Smoke test passed.")

# Replace the earlier __main__ block with this one when iterating.
if __name__ == "__main__":
    smoke_test()
  6. If you want to test tool usage locally, attach a simple tool and verify the agent calls it correctly. This is where most real-world CrewAI bugs show up: bad tool signatures, unclear instructions, or outputs that are hard to parse. Note that SerperDevTool reads a SERPER_API_KEY from your environment, so add one to your .env file before running this step.
from crewai_tools import SerperDevTool

search_tool = SerperDevTool()

tool_agent = Agent(
    role="Search Assistant",
    goal="Find relevant information using search",
    backstory="You verify facts before answering.",
    llm=llm,
    tools=[search_tool],
    verbose=True,
)

tool_task = Task(
    description="Search for recent CrewAI updates and summarize them.",
    expected_output="A short summary with sources.",
    agent=tool_agent,
)

tool_crew = Crew(
    agents=[tool_agent],
    tasks=[tool_task],
    process=Process.sequential,
    verbose=True,
)

Testing It

Run the script with python your_file.py and confirm two things: the process completes without errors, and the output matches the shape the task asked for. If you enabled verbose=True, you should also see step-by-step traces from each agent as it executes.

When debugging locally, change only one variable at a time: prompt text, model name, or tool list. That makes it obvious whether a failure comes from orchestration or from model behavior.
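One way to keep runs single-variable is to read the thing you're testing from the environment instead of editing the script. A minimal sketch, assuming a hypothetical CREW_MODEL variable:

import os

from crewai.llm import LLM

# Swap models per run without touching the code:
#   CREW_MODEL=gpt-4o python your_file.py
llm = LLM(model=os.getenv("CREW_MODEL", "gpt-4o-mini"))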

If the output is unstable, tighten the task instructions and make the expected output more specific. In practice, better task contracts beat longer prompts.
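As a sketch of what a tighter contract looks like, compare the earlier task with one that pins the output shape explicitly (the exact wording here is illustrative):

strict_task = Task(
    description=(
        "Explain what CrewAI is in exactly 3 bullet points. "
        "Each bullet starts with '- ' and is under 20 words."
    ),
    expected_output=(
        "Exactly three lines, each starting with '- ', each under "
        "20 words, with no preamble or closing remarks."
    ),
    agent=researcher,
)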

Next Steps

  • Add unit tests around task output structure using pytest (see the sketch after this list)
  • Test multiple agents with Process.hierarchical once sequential runs are stable
  • Wrap your crew in a small CLI so you can run local checks before deploying
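For the pytest idea in the first bullet, a minimal sketch might look like this, assuming your crew lives in your_file.py and uses the three-bullet task from earlier:

# test_crew_output.py  (run with: pytest test_crew_output.py)
from your_file import crew


def test_output_has_three_bullets():
    output = str(crew.kickoff())
    bullets = [
        line for line in output.splitlines()
        if line.strip().startswith(("-", "•", "*"))
    ]
    assert len(bullets) == 3, f"expected 3 bullets, got {len(bullets)}"

Note this calls the live model, so treat it as an integration-style check rather than a fast unit test.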

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

