How to Fix 'output parsing error during development' in AutoGen (Python)
What this error means
An “output parsing error during development” in AutoGen usually means the framework expected a structured response from an agent, but the model returned something it could not parse. You’ll see this most often when using tools, function calling, structured outputs, or a UserProxyAgent/AssistantAgent flow that expects a specific format.
In practice, this is rarely a “Python bug.” It’s usually a mismatch between what AutoGen expects and what your LLM actually returned.
The Most Common Cause
The #1 cause is prompting the model to return structured output, but not enforcing the structure. AutoGen agents can be configured to parse tool calls, JSON, or code blocks. If the model responds with extra text, malformed JSON, or the wrong format, you get errors like:
- OutputParserException
- ValueError: Failed to parse LLM output
- output parsing error during development
Here’s the broken pattern versus the fixed one.
| Broken pattern | Fixed pattern |
|---|---|
| Prompts ask for JSON but don’t constrain the response | Use strict instructions and a parser-friendly format |
| Model returns natural language plus JSON | Model returns only valid JSON |
| Tool call schema doesn’t match agent config | Align function_map, tool signatures, and expected output |
```python
# BROKEN: prompt says "return JSON", but nothing enforces it.
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini"}]},
)
user = UserProxyAgent(name="user", human_input_mode="NEVER")

user.initiate_chat(
    assistant,
    message="""
Give me customer risk data as JSON:
{"customer_id": "...", "risk_score": 0-100}
""",
)
```
```python
# FIXED: make the output contract explicit and parse-friendly.
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini"}]},
    system_message=(
        "Return ONLY valid JSON. "
        "No markdown, no explanation, no code fences."
    ),
)
user = UserProxyAgent(name="user", human_input_mode="NEVER")

user.initiate_chat(
    assistant,
    message='Return {"customer_id": string, "risk_score": integer}.',
)
```
If you’re using tool calling, make sure your function signature matches exactly what AutoGen expects. A common failure is expecting a function call but letting the model free-form its answer.
Other Possible Causes
1) Tool schema mismatch
If you register tools with function_map, but the model emits arguments that don’t match your Python function signature, parsing fails.
```python
def create_policy(customer_id: str, premium: float):
    return {"status": "ok"}

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini"}]},
)
user = UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    function_map={"create_policy": create_policy},
)
```
If the model sends premium_amount instead of premium, AutoGen can’t bind the arguments cleanly.
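One way to surface that mismatch early, instead of letting it fail deep inside argument binding, is a small wrapper around each tool. The guard_args helper below is hypothetical (not part of AutoGen): it checks the arguments the model emitted against the real function signature and returns a structured error the agent can read.

```python
import inspect

def guard_args(func):
    """Reject tool arguments that don't match the function signature.

    Hypothetical helper -- not an AutoGen API. Wrap functions before
    putting them in function_map so mismatches fail with a clear message.
    """
    sig = inspect.signature(func)

    def wrapper(**kwargs):
        expected = set(sig.parameters)
        unexpected = set(kwargs) - expected
        missing = {
            name for name, p in sig.parameters.items()
            if p.default is inspect.Parameter.empty and name not in kwargs
        }
        if unexpected or missing:
            # Return a readable error instead of raising inside AutoGen.
            return {
                "error": f"unexpected={sorted(unexpected)}, "
                         f"missing={sorted(missing)}"
            }
        return func(**kwargs)

    return wrapper

def create_policy(customer_id: str, premium: float):
    return {"status": "ok"}

safe_create_policy = guard_args(create_policy)
```

Registering safe_create_policy in function_map instead of the raw function means a premium_amount typo comes back as a legible error message the model can correct on its next turn.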
2) Model returns extra text around structured output
This happens when you ask for JSON but the model adds commentary.
```
Sure — here is the JSON:
{"claim_id":"C123","status":"approved"}
```
That first line is enough to break strict parsing in many setups. If you need machine-readable output, force “JSON only” and validate before passing it downstream.
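A defensive extraction step can often recover from that commentary. The extract_json function below is a sketch (the fence-stripping regex and the single-top-level-object assumption are mine, not AutoGen behavior):

```python
import json
import re

def extract_json(text: str):
    """Pull the first JSON object out of a model reply and validate it.

    Defensive sketch: strips markdown fences and surrounding commentary
    before parsing, so replies like 'Sure -- here is the JSON: {...}'
    still load. Assumes a single top-level JSON object.
    """
    # Drop markdown code fences if the model added them.
    text = re.sub(r"```(?:json)?", "", text)
    # Grab the first {...} span (greedy, so nested braces are included).
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```

For example, extract_json('Sure -- here is the JSON:\n{"claim_id":"C123","status":"approved"}') parses cleanly despite the leading sentence. Still prefer forcing “JSON only” at the prompt level; this is a safety net, not a fix.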
3) Wrong LLM config or unsupported model behavior
Some models are weaker at following strict formatting rules. If your llm_config points to a model that doesn’t reliably support tool calls or structured output, parsing errors will show up intermittently.
```python
llm_config = {
    "config_list": [
        {
            "model": "some-chat-model",
            "api_key": "...",
        }
    ],
    "temperature": 0,
}
```
If this works sometimes and fails other times, check whether the model supports the exact interaction style you’re using.
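While you verify model support, a retry loop that validates the reply before accepting it makes intermittent formatting failures survivable. call_with_retries is a hypothetical helper, where generate stands in for whatever actually calls your model:

```python
import json

def call_with_retries(generate, max_attempts=3):
    """Retry a model call until the reply parses as JSON.

    'generate' is any callable returning raw model text -- a stand-in
    for your real AutoGen/LLM call (hypothetical, not an AutoGen API).
    """
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            last_error = e  # malformed output: try again
    raise ValueError(
        f"no valid JSON after {max_attempts} attempts: {last_error}"
    )
```

With temperature at 0 a model that fails once usually fails the same way every time, so if retries help, the failures really are nondeterministic formatting drift.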
4) Code execution output being parsed as agent content
When using UserProxyAgent with code execution enabled, stdout/stderr can get mixed into content that another agent tries to parse.
```python
user_proxy = UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "tmp", "use_docker": False},
)
```
If executed code prints debugging lines before emitting structured data, downstream parsing can break. Keep execution output separate from machine-readable payloads.
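One simple way to keep them separate is a convention: only the last non-empty line of execution output is the machine-readable payload, and everything above it is debug noise. last_line_payload below sketches that (the convention itself is an assumption, not an AutoGen feature):

```python
import json

def last_line_payload(stdout: str):
    """Treat only the final non-empty line of execution output as data.

    Convention sketch: executed code may print debug lines freely, but
    must emit its JSON result as the last line. Everything above the
    last line is ignored by the parser.
    """
    lines = [line for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty execution output")
    return json.loads(lines[-1])
```

So output like "loading data...\nrows processed\n{\"rows\": 3}" still yields a clean payload, as long as the executed code honors the last-line rule.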
How to Debug It
- Inspect the raw assistant response
  - Log exactly what the model returned before AutoGen parses it.
  - Look for markdown fences, explanations, trailing commas, or malformed JSON.
- Turn temperature down
  - Set temperature=0.
  - This reduces formatting drift and makes failures easier to reproduce.
- Remove tools temporarily
  - Run the same prompt without function_map, code execution, or nested agents.
  - If it stops failing, your issue is likely schema or tool-related.
- Validate output manually
  - If you expect JSON, run it through json.loads() before handing it back to AutoGen.
  - If parsing fails there too, the model output is invalid; if not, your AutoGen config is likely off.
```python
import json

raw = '{"customer_id":"C123","risk_score":82}'
parsed = json.loads(raw)  # catches malformed JSON immediately
```
Prevention
- Use explicit system messages like: “Return ONLY valid JSON. No prose.”
- Keep tool signatures simple and stable; avoid optional parameter ambiguity.
- Add a validation layer before downstream parsing:
  - json.loads() for JSON
  - Pydantic models for structured responses
  - assertions for required keys
If you’re building production agents for banking or insurance workflows, treat LLM output like untrusted input. Parse defensively first; let AutoGen orchestrate second.
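That validation layer can be sketched with the standard library alone (Pydantic would express the same checks declaratively). The schema below mirrors the earlier customer-risk example and is purely illustrative:

```python
import json

# Illustrative schema: required keys and their expected types.
REQUIRED = {"customer_id": str, "risk_score": int}

def validate_payload(raw: str) -> dict:
    """Minimal stdlib validation layer (a lighter stand-in for Pydantic).

    Checks that the reply is valid JSON and that required keys exist
    with the expected types before anything downstream consumes it.
    """
    data = json.loads(raw)  # raises JSONDecodeError on malformed input
    for key, expected_type in REQUIRED.items():
        if key not in data:
            raise ValueError(f"missing required key: {key}")
        if not isinstance(data[key], expected_type):
            raise TypeError(f"{key} should be {expected_type.__name__}")
    return data
```

Run every model reply through a gate like this before it reaches business logic; a loud failure at the boundary beats a silent one three agents downstream.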
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit