AutoGen Tutorial (Python): testing agents locally for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to run and test an AutoGen agent locally in Python without wiring it into a full app first. You need this when you want to validate prompts, tool calls, message flow, and basic multi-agent behavior before you put the agent behind an API or UI.

What You'll Need

  • Python 3.10+
  • autogen-agentchat
  • autogen-ext
  • python-dotenv if you want to load keys from a .env file
  • An LLM API key for the model provider you choose
  • Basic familiarity with Python virtual environments
  • A local terminal and editor

Step-by-Step

  1. Create a clean project and install the packages. Keep this isolated so you can test agent behavior without dependency noise from other projects.
mkdir autogen-local-test
cd autogen-local-test

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

pip install autogen-agentchat autogen-ext python-dotenv
  2. Set your API key locally. For beginners, a .env file is the easiest way to keep secrets out of your code while still making the script runnable.
cat > .env << 'EOF'
OPENAI_API_KEY=your_api_key_here
EOF
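Every script that follows assumes this key is present. A small stdlib-only guard can fail fast with a readable error if the key is missing or still the placeholder; `require_api_key` is a hypothetical helper for illustration, not part of AutoGen:

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Fail fast with a clear message if the key is missing or a placeholder."""
    value = os.environ.get(name, "")
    if not value or value == "your_api_key_here":
        raise RuntimeError(f"{name} is not set; add it to your .env file.")
    return value
```

Calling this at the top of each script turns a cryptic authentication traceback into a one-line fix.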
  3. Create a minimal agent that can answer one question. This uses AutoGen’s modern agent-chat API and a real OpenAI client wrapper.
import asyncio
import os

from dotenv import load_dotenv
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

load_dotenv()

async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key=os.environ["OPENAI_API_KEY"],
    )

    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        system_message="You are a concise assistant that answers directly.",
    )

    result = await agent.run(task="Say hello in one short sentence.")
    print(result.messages[-1].content)

if __name__ == "__main__":
    asyncio.run(main())
  4. Run the script and confirm the agent responds locally from your terminal. If this works, your environment, auth, and model client are all wired correctly.
python main.py
  5. Add a second agent so you can test local multi-agent coordination. This is useful when you want to see whether agents hand off work cleanly instead of just answering single prompts.
import asyncio
import os

from dotenv import load_dotenv
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

load_dotenv()

async def main() -> None:
    client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key=os.environ["OPENAI_API_KEY"],
    )

    planner = AssistantAgent(
        name="planner",
        model_client=client,
        system_message="You create short task plans.",
    )
    writer = AssistantAgent(
        name="writer",
        model_client=client,
        system_message="You turn plans into short final answers.",
    )

    team = RoundRobinGroupChat([planner, writer], max_turns=4)
    result = await team.run(task="Draft a 2-step plan for testing an agent locally.")
    print(result.messages[-1].content)

if __name__ == "__main__":
    asyncio.run(main())
  6. Test a tool call locally so you can verify the agent is not just chatting but also using functions correctly. Start with something deterministic like reading the current directory or doing simple math.
import asyncio
import os

from dotenv import load_dotenv
from autogen_agentchat.agents import AssistantAgent
from autogen_core.tools import FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient

load_dotenv()

def add(a: int, b: int) -> int:
    return a + b

async def main() -> None:
    client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key=os.environ["OPENAI_API_KEY"],
    )

    tool = FunctionTool(add, description="Add two integers.")
    agent = AssistantAgent(
        name="calculator",
        model_client=client,
        tools=[tool],
        system_message="Use tools when helpful.",
    )

    result = await agent.run(task="What is 12 + 30?")
    print(result.messages[-1].content)

if __name__ == "__main__":
    asyncio.run(main())

Testing It

Run each script separately and check three things: the process exits cleanly, the output matches what you asked for, and tool-based answers are actually using the function instead of guessing. If you get auth errors, re-check OPENAI_API_KEY; if you get import errors, confirm your virtual environment is active and packages are installed in that environment.
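One way to check the tool path mechanically rather than by eyeballing output is to scan the returned messages for tool events. autogen-agentchat surfaces tool activity as `ToolCallRequestEvent` and `ToolCallExecutionEvent` entries in `result.messages`; the sketch below matches on class names so it also works against simple stand-in objects in tests. The helper name `used_tool` is an illustration, not AutoGen API:

```python
def used_tool(messages) -> bool:
    """True if any message in an agent run looks like a tool call event.

    autogen-agentchat reports tool activity as ToolCallRequestEvent and
    ToolCallExecutionEvent entries in result.messages; matching on the
    class name keeps this helper decoupled from the library.
    """
    tool_events = {"ToolCallRequestEvent", "ToolCallExecutionEvent"}
    return any(type(m).__name__ in tool_events for m in messages)
```

With the calculator example, `used_tool(result.messages)` returning False while the answer is still "42" tells you the model guessed instead of calling `add`.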

For multi-agent tests, look for turn-taking behavior rather than perfect prose. You want to confirm each agent gets a chance to respond and that the final output reflects collaboration, not just one agent doing all the work.
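Turn-taking can also be checked programmatically. Each chat message in a team run carries a `source` attribute naming the agent that produced it, so a quick pass over `result.messages` shows who actually spoke. `speakers` and `took_turns` are hypothetical helper names, a sketch rather than library API:

```python
def speakers(messages) -> list[str]:
    # Each chat message in a team run carries a `source` attribute naming
    # the agent (or "user") that produced it.
    return [m.source for m in messages if hasattr(m, "source")]


def took_turns(messages, expected=("planner", "writer")) -> bool:
    # Confirm every expected agent spoke at least once during the run.
    seen = set(speakers(messages))
    return all(name in seen for name in expected)
```

If `took_turns(result.messages)` is False for the planner/writer team, one agent monopolized the conversation and your system messages or `max_turns` setting need another look.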

A good local test loop is: change the system message, rerun, inspect output, repeat. That’s how you catch prompt issues early before they become production bugs.

Next Steps

  • Add structured output validation so your agents return JSON you can parse safely.
  • Test memory and state persistence across multiple turns.
  • Wrap these scripts in pytest so prompt changes become regression tests.
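For the structured-output item above, a small parsing helper is a natural first test target. This is a sketch under one assumption: models often wrap JSON replies in a ```json fence, so strip that before parsing. The name `parse_json_reply` and the fence-stripping heuristic are illustrative, not AutoGen API:

```python
import json


def parse_json_reply(text: str) -> dict:
    """Parse a model reply that should be JSON, tolerating ```json fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.split("\n", 1)[1]    # drop the opening fence line
        cleaned = cleaned.rsplit("```", 1)[0]  # drop the closing fence
    return json.loads(cleaned)
```

A helper like this is also exactly the kind of deterministic unit you can cover with pytest before wiring in live model calls.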

By Cyprian Aarons, AI Consultant at Topiax.