How to Fix 'output parsing error' in AutoGen (Python)

By Cyprian Aarons. Updated 2026-04-21.
Tags: output-parsing-error, autogen, python

What the error means

An “output parsing error” in AutoGen usually means the framework asked an agent for a structured response, then failed to parse the model’s output into the format it expected. You’ll see it most often with AssistantAgent, UserProxyAgent, tool calls, JSON mode, or any setup where you’ve attached an output schema and the LLM returns extra text.

In practice, this is rarely a model problem. It’s usually a prompt/format mismatch, a schema mismatch, or a tool-response shape that AutoGen can’t deserialize.

The Most Common Cause

The #1 cause is asking for structured output but letting the model return free-form text. AutoGen expects strict JSON or a tool-call payload, then gets something like markdown, explanation text, or malformed JSON.

Broken vs fixed pattern

Broken → Fixed

  • Model returns prose plus JSON → Model returns only valid JSON
  • Prompt says “respond in JSON” but doesn’t enforce it → Use explicit format constraints and a parseable schema
  • Tool/function signature doesn’t match the returned shape → Align schema, prompt, and function return type
# BROKEN: the assistant is asked for JSON, but nothing enforces it.
from autogen import AssistantAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={
        "config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}],
        "temperature": 0,
    },
)

message = """
Return user profile as JSON:
name, age, country
"""

result = assistant.generate_reply(messages=[{"role": "user", "content": message}])
print(result)

A typical failure looks like this:

ValueError: output parsing error: Expecting value: line 1 column 1 (char 0)

Or in tool-driven flows:

autogen.oai.client_utils.OpenAIClientError: output parsing error

The fix is to make the output contract explicit and keep the response strictly machine-readable.

# FIXED: enforce strict JSON-shaped output in the prompt and keep temperature low.
from autogen import AssistantAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={
        "config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}],
        "temperature": 0,
    },
)

message = """
Return ONLY valid JSON with these keys:
{
  "name": string,
  "age": number,
  "country": string
}
No markdown. No explanation.
"""

result = assistant.generate_reply(messages=[{"role": "user", "content": message}])
print(result)

If you’re using a schema-based setup, make sure the schema matches exactly what you expect back. Don’t ask for age as a number in one place and parse it as a string elsewhere.
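Whatever schema you settle on, it helps to validate the reply the moment it comes back, before anything downstream touches it. Here is a minimal stdlib-only sketch; the expected key/type table and the parse_profile helper are assumptions for illustration, not AutoGen APIs:

```python
import json

# Hypothetical post-generation check: the keys and types we told the
# model to produce in the prompt above.
EXPECTED_TYPES = {"name": str, "age": (int, float), "country": str}

def parse_profile(raw: str) -> dict:
    """Parse the agent's reply as JSON and type-check it, failing fast."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output parsing error: not valid JSON: {raw!r}") from exc
    for key, expected in EXPECTED_TYPES.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"{key} has wrong type: {type(data[key]).__name__}")
    return data

print(parse_profile('{"name": "Ada", "age": 34, "country": "KE"}'))
```

Failing here, with the raw string in the error message, is far easier to debug than a parsing error deep inside an agent workflow.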

Other Possible Causes

1) Tool/function signature mismatch

If your function expects one shape and your agent returns another, AutoGen will fail during parsing or tool dispatch.

# BROKEN
def create_ticket(title: str, priority: int):
    return {"ticket_id": "T-1001"}

# The model returns:
# {"title":"Login issue","priority":"high"}
# but priority must be int.

Fix by aligning types and constraints:

# FIXED
def create_ticket(title: str, priority: int):
    return {"ticket_id": "T-1001"}

# Prompt:
# priority must be one of: 1, 2, 3
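If you can’t fully trust the model to respect the enum, a small coercion layer in front of the tool keeps dispatch from failing on the common cases. A sketch, assuming a hypothetical normalize_priority helper (not an AutoGen feature):

```python
# Hypothetical pre-dispatch guard: map the string priorities models
# like to emit onto the ints the tool signature requires.
PRIORITY_MAP = {"high": 1, "medium": 2, "low": 3}

def normalize_priority(value) -> int:
    """Accept 1/2/3 or the strings high/medium/low; reject anything else."""
    if isinstance(value, int) and value in (1, 2, 3):
        return value
    if isinstance(value, str) and value.lower() in PRIORITY_MAP:
        return PRIORITY_MAP[value.lower()]
    raise ValueError(f"priority must be 1, 2, 3 or high/medium/low, got {value!r}")

def create_ticket(title: str, priority: int):
    return {"ticket_id": "T-1001", "title": title, "priority": priority}

# The model's {"title": "Login issue", "priority": "high"} now dispatches cleanly.
print(create_ticket("Login issue", normalize_priority("high")))
```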

2) Extra commentary around JSON

The model outputs something like:

Here is the result:
{"name":"Ada","age":34,"country":"KE"}

That breaks parsers expecting raw JSON. Tighten the instruction and remove examples that include prose.

llm_config = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}],
    "temperature": 0,
}
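If stray prose still slips through despite a tight prompt, a defensive fallback can often salvage the JSON before parsing. This is an assumption about your own pipeline, not a built-in AutoGen feature; a stdlib-only sketch:

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Strip markdown fences and surrounding prose, then parse the JSON object."""
    # Remove ```json ... ``` fences if the model added them.
    raw = re.sub(r"```(?:json)?", "", raw)
    # Grab everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object found in: {raw!r}")
    return json.loads(match.group(0))

print(extract_json('Here is the result:\n{"name":"Ada","age":34,"country":"KE"}'))
```

Treat this as a safety net, not a license to loosen the prompt: the strict instruction still does most of the work.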

3) Invalid nested structure from multi-agent handoff

In GroupChat or routed workflows, one agent may emit text another agent expects as structured data. That’s common when an analyst agent talks to a worker agent through plain language instead of a contract.

# Example of bad handoff content
{
    "role": "assistant",
    "content": "Use customer_id=123 and status=open."
}

If downstream code expects JSON:

{"customer_id":123,"status":"open"}

you’ll get parsing failures.
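One way to catch this early is to enforce the contract at the handoff boundary itself, before the message reaches the downstream agent. A sketch, assuming a hypothetical customer_id/status contract:

```python
import json

# Hypothetical contract for this handoff: the keys downstream code needs.
REQUIRED_KEYS = {"customer_id", "status"}

def validate_handoff(message: dict) -> dict:
    """Require the handoff content to be JSON with the agreed keys."""
    content = message.get("content", "")
    try:
        payload = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"handoff content is not JSON (got prose?): {content!r}") from exc
    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"handoff missing keys: {sorted(missing)}")
    return payload

good = {"role": "assistant", "content": '{"customer_id":123,"status":"open"}'}
print(validate_handoff(good))
```

With this in place, the prose handoff above fails immediately with a message that names the offending content, instead of surfacing later as an opaque parsing error.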

4) Model settings encourage creative formatting

High temperature increases the chance of markdown wrappers, explanations, or malformed objects.

llm_config = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}],
    "temperature": 0.8,  # risky for structured output
}

For structured tasks, keep it at 0 or very close to it.

How to Debug It

  1. Print the raw model output

    • Don’t inspect only the parsed object.
    • Log exactly what AutoGen received before parsing.
  2. Check whether you’re expecting text or structure

    • If your code uses json.loads(), Pydantic validation, or tool dispatching, then free-form responses will fail.
    • Verify whether the agent is supposed to return plain chat text or machine-readable data.
  3. Validate your schema against real outputs

    • Compare field names, types, optional fields, and enums.
    • A single mismatch like "high" vs 3 can trigger parsing errors.
  4. Reduce variables

    • Set temperature=0.
    • Remove memory/retrieval layers temporarily.
    • Test with one message and one agent before reintroducing GroupChat, tools, or nested agents.
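Steps 1 and 2 can be combined into a thin wrapper that logs the raw string before any parsing runs. The wrapper is an assumption about your own code, not part of AutoGen; a minimal sketch:

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("autogen.debug")

def parse_with_logging(raw: str) -> dict:
    """Log exactly what was received, then parse; re-raise on failure."""
    log.debug("raw model output before parsing: %r", raw)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        log.error("parse failed; inspect the raw output logged above")
        raise

print(parse_with_logging('{"name": "Ada"}'))
```

Wrap whatever produces the reply (for example, the return value of generate_reply) with this, and the failing payload is always in your logs.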

Prevention

  • Keep structured-output prompts strict:

    • “Return ONLY valid JSON”
    • “No markdown”
    • “No explanation”
  • Match prompt contract to parser contract:

    • If downstream code parses JSON, make sure upstream agents emit exact JSON.
    • If you use tools/functions, keep signatures and enums aligned.
  • Add validation early:

    • Validate with Pydantic or json.loads() right after generation.
    • Fail fast with clear logs instead of letting bad payloads flow deeper into your pipeline.


By Cyprian Aarons, AI Consultant at Topiax.
