# How to Fix 'output parsing error when scaling' in AutoGen (Python)
## What this error means
An `output parsing error when scaling` in AutoGen usually means one agent returned text that the framework expected to parse as structured output, but the response did not match the schema or format required by the downstream agent. You’ll typically see this when using `AssistantAgent`, `UserProxyAgent`, or a custom reply function with tool calls, JSON output, or nested agents.
In practice, this shows up during multi-agent orchestration, especially when you add stricter prompts, function calling, or a `GroupChatManager` that expects consistent message shapes.
## The Most Common Cause
The #1 cause is a mismatch between what your prompt asks for and what AutoGen is trying to parse.
If you tell an agent to return JSON, then later feed that output into code expecting valid JSON, one extra sentence or malformed quote is enough to trigger parsing failures. The error often surfaces as something like:
- `ValueError: Failed to parse model output`
- `output parsing error when scaling`
- `JSONDecodeError: Expecting value`
- `autogen.exception.InvalidChatFormat`
### Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Free-form assistant output | Strictly constrained structured output |
| Downstream parser assumes JSON | Output validated before parsing |
| Prompt says “return JSON” but no enforcement | Prompt + parser + fallback handling |
```python
# BROKEN
import json
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)
user_proxy = UserProxyAgent(name="user")

msg = user_proxy.initiate_chat(
    assistant,
    message="Return customer risk data as JSON only.",
)

# This breaks if the model adds markdown fences or extra text.
data = json.loads(msg.chat_history[-1]["content"])
print(data["risk_score"])
```
```python
# FIXED
import json
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)
user_proxy = UserProxyAgent(name="user")

msg = user_proxy.initiate_chat(
    assistant,
    message=(
        "Return ONLY valid JSON with keys: risk_score (int), reason (string). "
        "No markdown, no commentary."
    ),
)

raw = msg.chat_history[-1]["content"].strip()
try:
    data = json.loads(raw)
except json.JSONDecodeError as exc:
    raise ValueError(f"Assistant returned invalid JSON: {raw}") from exc
print(data["risk_score"])
```
If you’re using AutoGen’s structured outputs or tools, make the contract explicit. Don’t rely on “please return JSON” and hope the model obeys under load.
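One way to make that contract enforceable is a small validator that rejects anything other than the exact shape the next agent expects. This is a sketch, not an AutoGen API; the key names (`risk_score`, `reason`) mirror the prompt in the fixed example above:

```python
import json

def validate_risk_payload(raw: str) -> dict:
    """Parse the assistant's reply, then verify the exact keys and types promised downstream."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Not valid JSON: {raw!r}") from exc
    if not isinstance(data, dict):
        raise ValueError(f"Expected a JSON object, got {type(data).__name__}")
    if not isinstance(data.get("risk_score"), int):
        raise ValueError("Missing or non-integer 'risk_score'")
    if not isinstance(data.get("reason"), str):
        raise ValueError("Missing or non-string 'reason'")
    return data
```

A reject-early validator like this keeps a malformed reply from propagating into the next agent's context, where it is much harder to trace.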
## Other Possible Causes
### 1) Mixed message types in group chat
If one agent emits a message shape that another agent can’t consume, the manager may fail while trying to scale messages across participants.
```python
# Problematic: custom content shape sneaks into chat history
groupchat.messages.append({
    "role": "assistant",
    "content": {"risk_score": 7},  # not a string in many flows
})
```
Fix it by keeping content consistently serializable:
```python
groupchat.messages.append({
    "role": "assistant",
    "content": json.dumps({"risk_score": 7}),
})
```
### 2) Tool/function output not matching declared schema
When using function calling, the tool result must match what your prompt and downstream code expect. A common failure is returning a plain string where your parser expects an object.
```python
def get_policy_status(policy_id: str):
    return "active"  # too vague if downstream expects structured fields
```
Use a stable schema:
```python
def get_policy_status(policy_id: str) -> dict:
    return {
        "policy_id": policy_id,
        "status": "active",
        "effective_date": "2026-01-01",
    }
```
### 3) Prompt injection from previous turns
A previous assistant turn may include markdown fences, explanation text, or a half-finished object. When the next step tries to parse it, you get a scaling/parsing failure.
```
Here is the result:
{"score": 9}
```

The model wrapped the JSON in a markdown fence and prefixed it with prose. That looks harmless to humans. It breaks strict parsers expecting raw JSON only.
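If you can't fully control the model's formatting, a defensive pre-parse step that tolerates a fenced block is cheap insurance. `extract_json` below is a hypothetical helper, not part of AutoGen:

```python
import json
import re

# Grab the contents of the first fenced block, if any; fall back to the raw text.
_FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)

def extract_json(text: str) -> dict:
    """Best-effort: pull JSON out of a fenced block or raw text, then parse strictly."""
    match = _FENCE_RE.search(text)
    candidate = match.group(1) if match else text
    return json.loads(candidate.strip())
```

This tolerates the common "prose plus fenced JSON" reply shape while still failing loudly when no parseable JSON exists anywhere in the message.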
### 4) Model configuration mismatch
If one agent uses a model that supports tool calls and another doesn’t, you can get inconsistent behavior during orchestration.
```python
llm_config = {
    "config_list": [
        {"model": "gpt-4o-mini", "api_key": "..."},
        {"model": "text-davinci-003", "api_key": "..."},  # incompatible with your flow
    ]
}
```
Keep models aligned across agents in the same workflow unless you’ve tested mixed capability behavior.
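One low-effort way to keep models aligned is to define a single shared config and hand each agent its own copy, so capabilities can't silently diverge between agents. A minimal sketch with placeholder values (the names `SHARED_LLM_CONFIG` and `make_llm_config` are assumptions, not AutoGen APIs):

```python
import copy

# One shared config keeps every agent on the same model and capabilities.
SHARED_LLM_CONFIG = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}],
    "temperature": 0,  # deterministic output is kinder to strict parsers
}

def make_llm_config() -> dict:
    # Deep-copy so one agent's tweaks can't mutate another agent's config.
    return copy.deepcopy(SHARED_LLM_CONFIG)
```

Pass `make_llm_config()` to each agent's `llm_config` instead of redefining the dict inline per agent.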
## How to Debug It
- Print the exact raw assistant output.
  - Don’t inspect the parsed object first.
  - Log `chat_history[-1]["content"]` before any `json.loads()` or schema validation.
- Check whether the failure happens before or after tool execution.
  - If it fails immediately after an assistant response, it’s usually formatting.
  - If it fails after a tool call, inspect the tool return value and serialization.
- Reduce to two agents.
  - Strip out `GroupChatManager`, nested chats, and extra tools.
  - Reproduce with one `AssistantAgent` and one `UserProxyAgent`.
- Validate against the exact expected schema.
  - If you expect `{"risk_score": 5, "reason": "..."}`, then reject anything else.
  - Add strict checks before passing data into the next agent.
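The first debugging step can be wrapped into a tiny helper that always logs the exact raw string before parsing, so every failure is reproducible. A sketch, assuming Python's standard `logging` module rather than any AutoGen-specific logger:

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("parse_debug")

def parse_with_logging(raw: str) -> dict:
    """Log the exact raw string before parsing so failures are reproducible."""
    log.debug("raw assistant output: %r", raw)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        log.error("parse failed on: %r", raw)
        raise
```

Using `%r` rather than `%s` matters here: it makes invisible problems like trailing newlines, smart quotes, or markdown fences show up in the log.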
## Prevention
- Make output contracts explicit in prompts and enforce them in code.
  - If you need JSON, validate JSON.
  - If you need fields, validate fields.
- Keep agent message content consistent.
  - Use strings for chat content unless your flow explicitly supports richer objects.
  - Serialize dictionaries with `json.dumps()` before storing them in message history.
- Add guardrails around every parsing boundary.
  - Parse once.
  - Validate once.
  - Fail fast with a useful error message instead of letting AutoGen scale bad content through the workflow.
If you’re seeing `output parsing error when scaling`, assume the problem is not “AutoGen being flaky.” In most cases it’s a contract mismatch between agents, tools, and parsers. Fix that contract first.
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.