How to Fix 'output parsing error in production' in CrewAI (Python)
What this error means
output parsing error in production usually means CrewAI got a response back from an agent, but couldn’t convert it into the structured output your task expected. In practice, this shows up when you use output_json, output_pydantic, or a task description that implies a strict format, and the LLM returns extra text, invalid JSON, or fields that don’t match the schema.
The stack trace often includes something like:
ValueError: Invalid response format
OutputParserException: Could not parse LLM output into expected schema
The Most Common Cause
The #1 cause is asking the model for structured output, then giving it room to freestyle. CrewAI is strict here: if your task expects JSON or a Pydantic model, the response must match exactly.
Here’s the broken pattern:
from crewai import Agent, Task, Crew
from pydantic import BaseModel

class LeadScore(BaseModel):
    company: str
    score: int

researcher = Agent(
    role="Researcher",
    goal="Analyze leads",
    backstory="You are precise and concise."
)

task = Task(
    description="""
    Analyze this lead and return JSON with company and score.
    Also explain your reasoning.

    Lead: Acme Corp
    """,
    agent=researcher,
    output_pydantic=LeadScore,
)
And here’s the fixed version:
from crewai import Agent, Task, Crew
from pydantic import BaseModel, Field

class LeadScore(BaseModel):
    company: str = Field(..., description="Company name")
    score: int = Field(..., ge=0, le=100)

researcher = Agent(
    role="Researcher",
    goal="Analyze leads",
    backstory="You return only valid structured data."
)

task = Task(
    description="""
    Analyze this lead and return ONLY valid JSON matching this schema:

    {
      "company": "Acme Corp",
      "score": 85
    }

    Rules:
    - No markdown
    - No explanation
    - No extra keys
    """,
    agent=researcher,
    output_pydantic=LeadScore,
)
The difference is simple:
- The broken version asks for JSON and an explanation at the same time
- The fixed version forces a single output shape
- The schema is explicit enough for the model to follow
If you want a stronger guardrail, use an explicit parser-friendly instruction like “return ONLY valid JSON” and keep the task description short.
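You can sanity-check this contract locally before wiring it into a crew. A minimal sketch using plain Pydantic v2; the raw strings here are hand-written stand-ins for model output, not real CrewAI responses:

```python
from pydantic import BaseModel, Field, ValidationError

class LeadScore(BaseModel):
    company: str = Field(..., description="Company name")
    score: int = Field(..., ge=0, le=100)

# Stand-ins for what the model might return
good = '{"company": "Acme Corp", "score": 85}'
bad = 'Here is the result: {"company": "Acme Corp", "score": 85}'

parsed = LeadScore.model_validate_json(good)
print(parsed.company, parsed.score)  # Acme Corp 85

# The prose-wrapped reply is not valid JSON, so validation raises
try:
    LeadScore.model_validate_json(bad)
    bad_rejected = False
except ValidationError:
    bad_rejected = True
print("prose-wrapped reply rejected:", bad_rejected)
```

Running this shows exactly why the broken prompt fails: the payload is fine, but anything outside the braces kills the parse.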
Other Possible Causes
1. Extra prose around valid JSON
This is common when the model returns something like:
Sure — here is the result:
{"company":"Acme Corp","score":85}
CrewAI will often fail because of the leading text.
Fix it by tightening the prompt:
task = Task(
    description="Return ONLY JSON. No prose. No markdown.",
    agent=researcher,
    output_json=LeadScore,  # output_json takes the Pydantic model class, not a schema dict
)
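If you can't fully prevent stray prose, a last-resort cleanup step before parsing can salvage responses like the one above. This extract_json helper is a hypothetical sketch using only the standard library, not a CrewAI feature:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Best-effort: strip code fences and surrounding prose, then parse
    the first {...} span. A fallback, not a substitute for a tight prompt."""
    text = re.sub(r"`{3}(?:json)?", "", text)  # drop markdown fences if any
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

raw = 'Sure, here is the result:\n{"company": "Acme Corp", "score": 85}'
print(extract_json(raw))  # {'company': 'Acme Corp', 'score': 85}
```

Keep the strict prompt anyway; this only papers over occasional drift, and the greedy `{.*}` match assumes a single JSON object in the reply.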
2. Schema mismatch with output_pydantic
If your Pydantic model says score: int, but the LLM returns "score": "high", parsing fails.
Broken:
class LeadScore(BaseModel):
    company: str
    score: int

# Model returns:
# {"company": "Acme Corp", "score": "high"}
Fixed:
class LeadScore(BaseModel):
    company: str
    score: int = Field(..., description="Integer from 0 to 100")
If the field can be uncertain, make it explicit:
from typing import Optional

class LeadScore(BaseModel):
    company: str
    score: Optional[int] = None
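If you would rather coerce label-style answers than fail on them, Pydantic v2's field_validator can normalize the value before type checking runs. The label-to-score mapping below is an invented assumption; tune it to your own rubric:

```python
from pydantic import BaseModel, field_validator

# Invented label-to-score mapping; adjust to your own scoring rubric.
LABEL_SCORES = {"low": 25, "medium": 50, "high": 75}

class LeadScore(BaseModel):
    company: str
    score: int

    @field_validator("score", mode="before")
    @classmethod
    def coerce_label(cls, value):
        # Map "high" -> 75 etc.; pass everything else through unchanged
        if isinstance(value, str) and value.lower() in LABEL_SCORES:
            return LABEL_SCORES[value.lower()]
        return value

lead = LeadScore.model_validate_json('{"company": "Acme Corp", "score": "high"}')
print(lead.score)  # 75
```

This keeps the schema strict for downstream code while absorbing a known failure mode of the model.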
3. Using tools that inject non-JSON output
Some tools return verbose text or logs that get mixed into the final answer. If your agent uses tools, keep tool outputs separate from final structured responses.
Example:
agent = Agent(
    role="Analyst",
    goal="Summarize customer data",
    backstory="Return only structured outputs.",
    tools=[some_tool],
)
If some_tool emits raw text like:
Tool result: customer segment = enterprise; confidence = high
the agent may echo that instead of clean JSON. Make tool instructions explicit and keep final response formatting strict.
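One defensive pattern is to normalize the tool's loose text into structured data yourself before the agent composes its final answer. A sketch assuming the semicolon-delimited key = value format shown above; parse_tool_result is a hypothetical helper, not part of CrewAI:

```python
def parse_tool_result(raw: str) -> dict:
    """Turn a loose 'key = value; key = value' tool string into a dict."""
    raw = raw.removeprefix("Tool result:").strip()
    pairs = (part.split("=", 1) for part in raw.split(";"))
    return {key.strip(): value.strip() for key, value in pairs}

result = parse_tool_result("Tool result: customer segment = enterprise; confidence = high")
print(result)  # {'customer segment': 'enterprise', 'confidence': 'high'}
```

Structured tool output gives the agent less raw prose to echo back into its final response.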
4. Too much context in one task
Long prompts increase drift. If you stuff instructions, examples, business rules, and edge cases into one task, the model may ignore formatting constraints.
Bad pattern:
Task(
    description="""
    Analyze churn risk.
    Use these 12 business rules...
    Consider all CRM fields...
    Return JSON with exact keys...
    Also provide reasoning in plain English...
    """,
)
Better pattern:
Task(
    description="""
    Analyze churn risk for one customer record.

    Return ONLY JSON with:
    - customer_id
    - churn_risk_score
    - reason_code

    No reasoning text outside JSON.
    """,
)
How to Debug It
- Print the raw LLM output before parsing. In CrewAI runs, inspect what came back from the agent before it hit output_pydantic or output_json. You're looking for extra text, code fences, or malformed JSON.
- Remove structure enforcement temporarily. Run the same task without output_pydantic or output_json. If the model response looks fine as plain text but breaks under parsing, you've confirmed it's a format issue.
- Validate against your schema locally. Take the raw string and run it through Pydantic yourself:

from pydantic import ValidationError

try:
    parsed = LeadScore.model_validate_json(raw_output)
except ValidationError as e:
    print(e)

This tells you whether the issue is invalid JSON or a schema mismatch.
- Reduce prompt complexity. Strip your task down to:
  - input data
  - required fields
  - exact output format

If it starts working after simplification, your prompt was causing format drift.
Prevention
- Use strict schemas with clear field types and constraints in BaseModel
- Tell agents to return only valid JSON when using output_pydantic or output_json
- Keep tasks small; avoid mixing "explain your reasoning" with structured output requirements
If you want reliability in production crews, treat formatting as an API contract. The model is not “trying its best”; it either matches your schema or it doesn’t.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit