AutoGen Tutorial (Python): parsing structured output for intermediate developers
This tutorial shows you how to make an AutoGen assistant return structured data, then parse that data safely in Python. You need this when you want agents to produce machine-readable output for downstream code like ticket routing, claim triage, or policy extraction.
What You'll Need
- Python 3.10+
- `autogen-agentchat`
- `autogen-ext`
- `pydantic`
- An OpenAI-compatible API key
- Set `OPENAI_API_KEY` in your environment
- Basic familiarity with AutoGen agents and async Python
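If you haven't installed these yet, one pip command covers the Python packages; the `[openai]` extra pulls in the OpenAI-compatible client used below (pin versions to match your project):

```bash
pip install -U "autogen-agentchat" "autogen-ext[openai]" "pydantic"
```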
Step-by-Step
- Start by defining the schema you want the model to return. Use Pydantic so your parser is strict and your downstream code gets typed fields instead of fragile string parsing.
```python
from typing import Literal

from pydantic import BaseModel, Field


class SupportTicket(BaseModel):
    category: Literal["billing", "technical", "account", "other"]
    priority: Literal["low", "medium", "high"]
    summary: str = Field(min_length=10)
    action_required: bool
```
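To see the strictness in action, validate a payload by hand: an out-of-range category or a too-short summary raises `ValidationError` immediately. A quick sketch, no AutoGen involved yet:

```python
from pydantic import ValidationError

good = {
    "category": "billing",
    "priority": "high",
    "summary": "Customer charged twice for one subscription.",
    "action_required": True,
}
print(SupportTicket.model_validate(good))

try:
    # "refunds" is not one of the allowed Literal values, so this raises.
    SupportTicket.model_validate({**good, "category": "refunds"})
except ValidationError as e:
    print(e)
```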
- Next, create an AutoGen assistant that is instructed to output only JSON matching that schema. The key here is not hoping the model behaves; it’s constraining the prompt so the response is predictable enough to validate.
```python
import asyncio
import os

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=os.environ["OPENAI_API_KEY"],
)

agent = AssistantAgent(
    name="ticket_parser",
    model_client=model_client,
    system_message=(
        "You extract support ticket fields. "
        "Return ONLY valid JSON with keys: category, priority, summary, action_required."
    ),
)
```
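Depending on your `autogen-agentchat` version, you may be able to skip prompt-only formatting: recent releases let `AssistantAgent` take an `output_content_type`, which asks the model client for structured output matching your Pydantic model. Treat the parameter below as an assumption and check it against your installed version's docs before relying on it:

```python
# ASSUMPTION: output_content_type is supported by your autogen-agentchat version.
structured_agent = AssistantAgent(
    name="ticket_parser_structured",
    model_client=model_client,
    system_message="You extract support ticket fields.",
    output_content_type=SupportTicket,
)
```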
- Send a realistic input and parse the response into your Pydantic model. In production, this is where you stop treating LLM output as text and start treating it as an untrusted payload that must be validated.
```python
async def main() -> None:
    result = await agent.run(task=(
        "Customer says their card was charged twice for the same subscription. "
        "They want a refund and are upset because support has not replied."
    ))
    raw_text = result.messages[-1].content
    print("RAW OUTPUT:", raw_text)


if __name__ == "__main__":
    asyncio.run(main())
```
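One thing worth guarding against: depending on the message type, `content` is not guaranteed to be a plain string (multimodal and tool-call messages can carry list-shaped content). A small defensive coercion keeps the parser's input predictable; this is a sketch, not an official AutoGen helper:

```python
def as_text(content: object) -> str:
    # Defensive coercion: pass strings through, join list-shaped content,
    # and stringify anything else before it reaches the parser.
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "\n".join(str(part) for part in content)
    return str(content)
```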
- Add a JSON extraction and validation layer. This handles common failures like extra prose, markdown fences, or malformed payloads without crashing your application immediately.
```python
import json


def parse_ticket(raw_text: str) -> SupportTicket:
    cleaned = raw_text.strip()
    # Strip markdown fences (``` or ```json) that models often wrap around JSON.
    if cleaned.startswith("```"):
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0].strip()
    data = json.loads(cleaned)
    # Pydantic enforces the Literal values, summary length, and required keys.
    return SupportTicket.model_validate(data)


async def main() -> None:
    result = await agent.run(task=(
        "Customer says their card was charged twice for the same subscription. "
        "They want a refund and are upset because support has not replied."
    ))
    ticket = parse_ticket(result.messages[-1].content)
    print(ticket.model_dump())
```
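You can exercise the fence-stripping path without calling the model at all, which makes the parser easy to unit test:

```python
fenced = (
    '```json\n'
    '{"category": "billing", "priority": "high", '
    '"summary": "Double charge on subscription.", "action_required": true}\n'
    '```'
)
ticket = parse_ticket(fenced)
assert ticket.category == "billing"
```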
- If you want stronger reliability, wrap parsing in a retry loop and feed validation errors back into the agent. That gives the model a chance to correct format issues instead of forcing your app to fail on the first bad output.
```python
async def generate_ticket(text: str) -> SupportTicket:
    result = await agent.run(task=text)
    raw_text = result.messages[-1].content
    try:
        return parse_ticket(raw_text)
    except Exception as e:
        # One repair pass: show the model its own output plus the validation error.
        repair_prompt = (
            f"Fix this JSON so it validates against SupportTicket.\n"
            f"Error: {e}\n"
            f"Output:\n{raw_text}"
        )
        repaired = await agent.run(task=repair_prompt)
        return parse_ticket(repaired.messages[-1].content)


async def main() -> None:
    ticket = await generate_ticket(
        "Customer says their card was charged twice for the same subscription."
    )
    print(ticket.category, ticket.priority)
```
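The version above only repairs once. If you want a bounded loop instead, a sketch like this caps the attempts and surfaces the last error (`max_attempts` is an illustrative knob, not an AutoGen setting):

```python
async def generate_ticket_with_retries(text: str, max_attempts: int = 3) -> SupportTicket:
    task = text
    last_error: Exception | None = None
    for _ in range(max_attempts):
        result = await agent.run(task=task)
        raw_text = result.messages[-1].content
        try:
            return parse_ticket(raw_text)
        except Exception as e:
            last_error = e
            # The next attempt becomes a repair request carrying the error.
            task = (
                f"Fix this JSON so it validates against SupportTicket.\n"
                f"Error: {e}\nOutput:\n{raw_text}"
            )
    raise ValueError(f"No valid ticket after {max_attempts} attempts") from last_error
```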
Testing It
Run the script several times with different customer messages and confirm you always get a valid SupportTicket object back. Test one clean case, one ambiguous case, and one case where the model returns extra text or markdown fences.
Check that invalid values fail fast through Pydantic instead of silently passing through your pipeline. If you’re using this in a service, log both the raw model output and the validation error so you can debug prompt drift later.
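In a service, that logging can be as simple as pairing the raw payload with the exception before re-raising; `parse_ticket_logged` is an illustrative wrapper name, not part of any library:

```python
import logging

logger = logging.getLogger("ticket_parser")


def parse_ticket_logged(raw_text: str) -> SupportTicket:
    try:
        return parse_ticket(raw_text)
    except Exception as e:
        # Keep the raw output next to the error so prompt drift is debuggable later.
        logger.error("ticket parse failed: %s | raw=%r", e, raw_text)
        raise
```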
A good smoke test is to intentionally ask for nonsense input like “banana invoice portal broken” and verify the schema still forces one of your allowed categories. That tells you whether your prompt plus parser are doing real work.
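As an executable version of that smoke test (assuming `pytest` and `pytest-asyncio`, which are not part of this tutorial's requirements):

```python
import pytest


@pytest.mark.asyncio
async def test_nonsense_input_still_yields_valid_schema():
    ticket = await generate_ticket("banana invoice portal broken")
    # Pydantic's Literal already guarantees this; the assert documents the contract.
    assert ticket.category in {"billing", "technical", "account", "other"}
    assert len(ticket.summary) >= 10
```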
Next Steps
- Add `TypeAdapter` or stricter Pydantic constraints for nested objects and lists (see the sketch below)
- Move from prompt-only formatting to AutoGen tool use when extraction needs external lookup
- Add observability around raw output, validation failures, and retry counts
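For the `TypeAdapter` route, validating a JSON array of tickets looks like this in Pydantic v2:

```python
from pydantic import TypeAdapter

TicketBatch = TypeAdapter(list[SupportTicket])
tickets = TicketBatch.validate_json(
    '[{"category": "billing", "priority": "low", '
    '"summary": "Duplicate charge reported by customer.", "action_required": true}]'
)
print(len(tickets), tickets[0].priority)
```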
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit