# How to Fix 'output parsing error' in LangChain (Python)
When LangChain throws an output parsing error, it means the model returned text that your parser could not convert into the structure your code expected. In practice, this usually happens when you ask an LLM for JSON, a Pydantic object, or a specific schema, and the model returns extra prose, malformed JSON, or the wrong fields.
You’ll see this most often with `StructuredOutputParser`, `PydanticOutputParser`, agents, and chains that raise `OutputParserException`.
## The Most Common Cause
The #1 cause is asking the model for structured output but not constraining it tightly enough. The model then responds with natural language instead of valid JSON or the exact schema your parser expects.
Here’s the broken pattern next to its fix:
| Broken | Fixed |
|---|---|
| You tell the model “return JSON” in plain English | You pass format instructions from the parser |
| You parse raw text directly | You make the prompt match the parser contract |
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field


class UserInfo(BaseModel):
    name: str = Field(description="User full name")
    age: int = Field(description="User age")


parser = PydanticOutputParser(pydantic_object=UserInfo)

# BROKEN: hopes the model guesses the schema
broken_prompt = PromptTemplate.from_template(
    "Extract the user's name and age from this text:\n{text}\nReturn JSON."
)

# FIXED: injects the parser's own format instructions
fixed_prompt = PromptTemplate.from_template(
    "Extract the user's name and age from this text.\n"
    "{format_instructions}\n"
    "Text: {text}"
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# BROKEN usage -- often raises:
# langchain_core.exceptions.OutputParserException:
#   Failed to parse UserInfo from completion ...
broken_chain = broken_prompt | llm | parser

# FIXED usage
fixed_chain = fixed_prompt.partial(
    format_instructions=parser.get_format_instructions()
) | llm | parser

text = "John Doe is 34 years old."
result = fixed_chain.invoke({"text": text})
print(result)
```
The important part is `parser.get_format_instructions()`. Without it, you are hoping the model guesses your schema correctly. In production, that’s not a strategy.
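If you want to see exactly what gets injected, print the instructions; they spell out the JSON schema the model is told to follow:

```python
# Shows the schema text the parser appends to your prompt.
print(parser.get_format_instructions())
```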
## Other Possible Causes
### 1. The model returned extra text around valid JSON
This is common when the response starts with explanations like “Sure, here is the JSON:”.
```python
# BROKEN: the model wrapped valid JSON in conversational prose
response = """
Sure — here is the result:
{"name": "John Doe", "age": 34}
"""

# FIXED: enforce raw structured output only
prompt = PromptTemplate.from_template(
    "{format_instructions}\n{text}"
)
```
If you’re using `JsonOutputParser` or `PydanticOutputParser`, any extra prefix or suffix can trigger errors like:

- `OutputParserException`
- `Invalid json output`
- `Expecting value: line 1 column 1`
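If you can’t fully prevent the stray prose, one pragmatic fallback is LangChain’s `OutputFixingParser`, which catches the parse failure and asks an LLM to repair the malformed output. A minimal sketch, reusing the `parser` and `llm` objects from the first example:

```python
from langchain.output_parsers import OutputFixingParser

# Wraps the strict parser; when it raises OutputParserException,
# the wrapper sends the bad text back to the LLM to be reformatted.
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=llm)

messy = 'Sure, here is the result:\n{"name": "John Doe", "age": 34}'
print(fixing_parser.parse(messy))  # -> UserInfo(name='John Doe', age=34)
```

Note the trade-off: fixing costs an extra LLM call whenever the primary parse fails, so prevention via format instructions is still the first line of defense.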
### 2. Your schema does not match what the model can infer
If your Pydantic model requires fields that aren’t present in the source text, parsing fails.
```python
class Invoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: str  # required

# If the source text doesn't mention currency,
# parsing can fail because currency is missing.
```
Fix it by making fields optional when appropriate:
```python
from typing import Optional

class Invoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: Optional[str] = None
```
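A quick way to catch schema mismatches early is to sanity-check the parser against a hand-written payload before wiring it into a chain (the sample values here are made up):

```python
invoice_parser = PydanticOutputParser(pydantic_object=Invoice)

# Hypothetical sample with no currency field; parsing still
# succeeds because currency is now optional.
sample = '{"invoice_id": "INV-001", "total_amount": 99.5}'
print(invoice_parser.parse(sample))
```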
### 3. Tool / agent output is being parsed as if it were plain completion text
Agents often return intermediate tool calls or final answers in a format different from what your parser expects.
```python
from langchain.agents import AgentExecutor

# BROKEN pattern: parsing agent output with a strict
# JSON parser after tool use.
```
If you are using agents, decide exactly which of these you want:

- final answer text only
- structured tool call output
- intermediate steps
Do not attach a strict output parser to an agent unless you know exactly what it emits.
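Before attaching any parser, inspect what the agent actually returns. A sketch, assuming `agent` and `tools` are already constructed elsewhere (both are placeholders here):

```python
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    return_intermediate_steps=True,  # expose tool calls alongside the answer
)

result = executor.invoke({"input": "John Doe is 34 years old. Extract his info."})
print(result["output"])              # final answer text
print(result["intermediate_steps"])  # (AgentAction, observation) pairs
```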
### 4. Temperature is too high for structured extraction
A high temperature increases formatting drift.
```python
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)  # more likely to break parsing
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)    # better for extraction
```
For extraction tasks, keep temperature at 0 unless you have a very specific reason not to.
## How to Debug It
### 1. Print the raw LLM response before parsing

Don’t guess. Inspect exactly what came back from the model.

```python
raw = (prompt | llm).invoke({"text": "John Doe is 34 years old."})
print(raw.content if hasattr(raw, "content") else raw)
```
### 2. Check whether your prompt includes parser instructions

If you’re using `PydanticOutputParser`, confirm `get_format_instructions()` is injected into the prompt. Missing instructions are one of the most common causes of malformed output.
### 3. Validate your schema against real inputs

- Look at required fields.
- Make optional anything that may be absent in the source text.
- If possible, test with known examples before wiring them into a chain.
### 4. Reduce complexity

Remove tools, memory, retries, and multi-step prompts. Reproduce with a minimal chain first.

```python
sample_text = "John Doe is 34 years old."

chain = prompt.partial(
    format_instructions=parser.get_format_instructions()
) | llm | parser
print(chain.invoke({"text": sample_text}))
```
If that minimal chain works but your full app fails, the issue is usually not LangChain itself; it’s the surrounding prompt logic, agent behavior, or post-processing.
## Prevention
- Use explicit format instructions from `PydanticOutputParser` or `StructuredOutputParser`.
- Keep extraction prompts short and deterministic; set `temperature=0`.
- Make schemas realistic: required only for fields guaranteed to exist, optional for everything else.
- Log raw responses in non-production environments so you can see parse failures immediately.
- If you need strict JSON every time, prefer models and APIs that support native structured output instead of relying only on prompt discipline (see the sketch below).
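For example, `ChatOpenAI` supports `with_structured_output()`, which constrains the model itself rather than parsing free text after the fact. A minimal sketch, reusing the `UserInfo` model from the first example:

```python
structured_llm = ChatOpenAI(
    model="gpt-4o-mini", temperature=0
).with_structured_output(UserInfo)

# No text-parsing step that can fail on stray prose; the result
# is already a validated UserInfo instance.
user = structured_llm.invoke("John Doe is 34 years old.")
print(user)  # name='John Doe' age=34
```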
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit