AutoGen Tutorial (Python): running agents in parallel for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to run multiple AutoGen agents in parallel from Python and collect their outputs in one place. You need this when one task can be split into independent sub-tasks, like comparing vendors, checking multiple documents, or generating several candidate answers at the same time.

What You'll Need

  • Python 3.10+
  • autogen-agentchat
  • autogen-ext
  • An OpenAI API key
  • A terminal and a virtual environment
  • Basic familiarity with AutoGen agents and messages

Install the packages:

pip install "autogen-agentchat" "autogen-ext[openai]"

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Create a shared model client and define a simple agent factory.
    The important part is that each agent is independent, but they all share one model client instance, so credentials and model settings are configured in a single place.
import asyncio
import os

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

MODEL = "gpt-4o-mini"

client = OpenAIChatCompletionClient(
    model=MODEL,
    api_key=os.environ["OPENAI_API_KEY"],
)

def make_agent(name: str, system_message: str) -> AssistantAgent:
    return AssistantAgent(
        name=name,
        model_client=client,
        system_message=system_message,
    )
  2. Define a task runner that executes one agent call.
    Each call is just an async function, which makes it easy to fan out work with asyncio.gather().
async def run_agent(agent: AssistantAgent, task: str) -> str:
    result = await agent.run(task=task)
    return result.messages[-1].content
  3. Build three agents with different roles and run them in parallel.
    Here we ask each agent to produce a different output from the same input so you can see parallelism clearly.
async def main():
    topic = "Design a basic fraud detection workflow for card payments."

    analyst = make_agent(
        "analyst",
        "You are a risk analyst. Give concise operational analysis.",
    )
    architect = make_agent(
        "architect",
        "You are a solutions architect. Focus on system design.",
    )
    writer = make_agent(
        "writer",
        "You are a technical writer. Summarize clearly for engineers.",
    )

    tasks = [
        run_agent(analyst, f"Analyze this topic: {topic}"),
        run_agent(architect, f"Propose an architecture for: {topic}"),
        run_agent(writer, f"Write a short summary for: {topic}"),
    ]

    outputs = await asyncio.gather(*tasks)

    for name, output in zip(["analyst", "architect", "writer"], outputs):
        print(f"\n=== {name.upper()} ===\n{output}")
  4. Add error handling so one failed agent does not kill the whole batch.
    In production, this matters more than the happy path because one timeout or rate limit should not take down every branch of work.
async def safe_run_agent(agent: AssistantAgent, task: str) -> dict:
    try:
        result = await agent.run(task=task)
        return {"agent": agent.name, "ok": True, "output": result.messages[-1].content}
    except Exception as e:
        return {"agent": agent.name, "ok": False, "error": str(e)}

async def main_safe():
    agents = [
        make_agent("a1", "You are concise."),
        make_agent("a2", "You are concise."),
        make_agent("a3", "You are concise."),
    ]

    jobs = [safe_run_agent(agent, "List 3 benefits of parallel execution.") for agent in agents]
    results = await asyncio.gather(*jobs)

    for item in results:
        print(item)
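If you do not need the per-agent metadata that a wrapper like safe_run_agent provides, asyncio.gather also has a built-in alternative: with return_exceptions=True, a raised exception is returned in that slot of the results list instead of cancelling the whole batch. A minimal self-contained sketch, using plain coroutines as stand-ins for agent calls:

```python
import asyncio

async def fake_agent(name: str, fail: bool) -> str:
    # Stand-in for an agent call; the sleep simulates network latency.
    await asyncio.sleep(0.01)
    if fail:
        raise RuntimeError(f"{name} hit a rate limit")
    return f"{name}: ok"

async def main() -> list:
    jobs = [fake_agent("a1", False), fake_agent("a2", True), fake_agent("a3", False)]
    # return_exceptions=True keeps one failure from cancelling the others;
    # results come back in submission order, with exceptions in place.
    return await asyncio.gather(*jobs, return_exceptions=True)

results = asyncio.run(main())
for r in results:
    if isinstance(r, Exception):
        print("failed:", r)
    else:
        print(r)
```

The wrapper approach is still preferable when you want structured records (agent name, ok flag) rather than raw exception objects.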
  5. Run everything from one entry point and close the model client cleanly.
    This avoids leaking connections. Note that in a notebook, where an event loop is already running, you should await the coroutine directly instead of calling asyncio.run().
async def entrypoint():
    # A single asyncio.run keeps main() and the client shutdown on the
    # same event loop; calling asyncio.run twice would close the client
    # on a different loop than the one its connections were created on.
    try:
        await main()
    finally:
        await client.close()

if __name__ == "__main__":
    asyncio.run(entrypoint())

Testing It

Run the script and confirm you get three separate outputs back in one execution. The easiest check is to add timestamps before and after asyncio.gather(); if the agents are truly running in parallel, total runtime should be closer to the slowest single call than the sum of all calls.
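That timing check can be sketched in isolation, using asyncio.sleep as a stand-in for the model calls (real latency depends on the API, so the 0.1-second delays here are illustrative):

```python
import asyncio
import time

async def fake_call(delay: float) -> float:
    # Stand-in for one agent call with a known duration.
    await asyncio.sleep(delay)
    return delay

async def main() -> float:
    start = time.perf_counter()
    # Three "calls" of 0.1 s each; run in parallel they take
    # roughly 0.1 s total, not the 0.3 s a sequential loop would.
    await asyncio.gather(fake_call(0.1), fake_call(0.1), fake_call(0.1))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"elapsed: {elapsed:.2f}s")
```

In your real script, wrap the asyncio.gather call in main() with the same perf_counter pair.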

If you want a stronger test, give each agent a different prompt and verify that each response matches its role. Also try breaking one prompt or temporarily using an invalid model name to confirm your error-handling path returns partial results instead of crashing the whole run.

Next Steps

  • Add a router agent that decides which specialist agents should run for a given request.
  • Persist outputs to JSON so downstream services can consume them.
  • Wrap this pattern in FastAPI so your backend can fan out work on demand.
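For the JSON persistence idea, a minimal sketch: the dict shape matches what safe_run_agent returns, and the filename is just an example.

```python
import json

# Example results in the same shape safe_run_agent returns.
results = [
    {"agent": "a1", "ok": True, "output": "Parallel runs cut wall-clock time."},
    {"agent": "a2", "ok": False, "error": "rate limited"},
]

# Write the whole batch to disk for downstream consumers.
with open("agent_outputs.json", "w") as f:
    json.dump(results, f, indent=2)

# A downstream service can reload the file and keep only the successes.
with open("agent_outputs.json") as f:
    loaded = json.load(f)
successes = [r for r in loaded if r["ok"]]
print(len(successes))
```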

By Cyprian Aarons, AI Consultant at Topiax.
