How to Fix 'output parsing error' in LangChain (Python)

By Cyprian Aarons. Updated 2026-04-21.

When LangChain throws an output parsing error, it means the model returned text that your parser could not convert into the structure your code expected. In practice, this usually happens when you ask an LLM for JSON, a Pydantic object, or a specific schema, and the model returns extra prose, malformed JSON, or the wrong fields.

You’ll see this most often with StructuredOutputParser, PydanticOutputParser, agents, and chains that raise OutputParserException.

The Most Common Cause

The #1 cause is asking the model for structured output but not constraining it tightly enough. The model then responds with natural language instead of valid JSON or the exact schema your parser expects.

Here’s the broken pattern:

Broken: you tell the model “return JSON” in plain English, then parse the raw text directly.
Fixed: you pass format instructions from the parser, so the prompt matches the parser contract.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class UserInfo(BaseModel):
    name: str = Field(description="User full name")
    age: int = Field(description="User age")

parser = PydanticOutputParser(pydantic_object=UserInfo)

# BROKEN
broken_prompt = PromptTemplate.from_template(
    "Extract the user's name and age from this text:\n{text}\nReturn JSON."
)

# FIXED
fixed_prompt = PromptTemplate.from_template(
    "Extract the user's name and age from this text.\n"
    "{format_instructions}\n"
    "Text: {text}"
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# BROKEN usage
broken_chain = broken_prompt | llm | parser

# FIXED usage
fixed_chain = fixed_prompt.partial(
    format_instructions=parser.get_format_instructions()
) | llm | parser

text = "John Doe is 34 years old."

# The BROKEN chain often raises:
# langchain_core.exceptions.OutputParserException:
# Failed to parse UserInfo from completion ...
# broken_chain.invoke({"text": text})

# The FIXED chain returns a validated UserInfo instance.
result = fixed_chain.invoke({"text": text})
print(result)

The important part is parser.get_format_instructions(). Without it, you are hoping the model guesses your schema correctly. In production, that’s not a strategy.

Other Possible Causes

1. The model returned extra text around valid JSON

This is common when the response starts with explanations like “Sure, here is the JSON:”.

# BROKEN
response = """
Sure — here is the result:
{"name": "John Doe", "age": 34}
"""

# FIXED: enforce raw structured output only
prompt = PromptTemplate.from_template(
    "{format_instructions}\n{text}"
)

If you’re using JsonOutputParser or PydanticOutputParser, any extra prefix/suffix can trigger:

  • OutputParserException
  • Invalid json output
  • Expecting value: line 1 column 1
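If you cannot fully stop the model from adding prose, a defensive fallback is to extract the JSON substring before parsing. This is a hand-rolled sketch using only the standard library, not a LangChain API, and it only handles a single flat JSON object; fixing the prompt should still come first.

```python
import json
import re

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a response that may be
    wrapped in prose like 'Sure, here is the result:'."""
    # Greedily match from the first '{' to the last '}'.
    # Good enough for a single object; not a full JSON scanner.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"No JSON object found in: {text!r}")
    return json.loads(match.group(0))

response = """
Sure, here is the result:
{"name": "John Doe", "age": 34}
"""
print(extract_json(response))  # {'name': 'John Doe', 'age': 34}
```

Keep this as a fallback behind the prompt fix, not a replacement for it: silently tolerating prose hides the fact that your prompt and parser disagree.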

2. Your schema does not match what the model can infer

If your Pydantic model requires fields that aren’t present in the source text, parsing fails.

class Invoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: str  # required

# If the source text doesn't mention currency,
# parsing can fail because currency is missing.

Fix it by making fields optional when appropriate:

from typing import Optional

class Invoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: Optional[str] = None
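You can verify the difference directly, before any LLM is involved, by validating a realistic payload against both schemas. This sketch assumes Pydantic; `StrictInvoice` and `RelaxedInvoice` are illustrative names, not anything from LangChain.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError

class StrictInvoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: str  # required

class RelaxedInvoice(BaseModel):
    invoice_id: str
    total_amount: float
    currency: Optional[str] = None  # absent in many source texts

# A payload the model might realistically produce: no currency mentioned.
payload = {"invoice_id": "INV-001", "total_amount": 99.5}

try:
    StrictInvoice(**payload)
except ValidationError as exc:
    print("strict schema rejects it:", exc.errors()[0]["loc"])

print(RelaxedInvoice(**payload).currency)  # None
```

Running schema checks like this against known sample texts catches mismatches long before they surface as an OutputParserException in a chain.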

3. Tool / agent output is being parsed as if it were plain completion text

Agents often return intermediate tool calls or final answers in a format different from what your parser expects.

from langchain.agents import AgentExecutor

# BROKEN pattern:
# Parsing agent output with a strict JSON parser after tool use.

If you are using agents, inspect whether you want:

  • final answer text only
  • structured tool call output
  • intermediate steps

Do not attach a strict output parser to an agent unless you know exactly what it emits.
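If you must post-process agent output, handle it defensively rather than feeding it straight into a strict parser. This is a hand-rolled sketch using only the standard library: it returns structured data when the final answer happens to be valid JSON and wraps the raw text otherwise. Adapt it to whatever your agent actually emits.

```python
import json

def parse_agent_output(raw: str) -> dict:
    """Return {'structured': True, 'data': ...} when the agent's final
    answer is a JSON object, otherwise fall back to the raw text."""
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, dict):
            return {"structured": True, "data": parsed}
    except json.JSONDecodeError:
        pass
    return {"structured": False, "text": raw.strip()}

print(parse_agent_output('{"answer": 42}'))
print(parse_agent_output("The answer is 42."))
```

The explicit `structured` flag forces calling code to decide what to do with free-text answers instead of crashing inside a parser.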

4. Temperature is too high for structured extraction

A high temperature increases formatting drift.

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)  # more likely to break parsing

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # better for extraction

For extraction tasks, keep temperature at 0 unless you have a very specific reason not to.

How to Debug It

  1. Print the raw LLM response before parsing
    • Don’t guess.
    • Inspect exactly what came back from the model.
raw = (prompt | llm).invoke({"text": "John Doe is 34 years old."})
print(raw.content if hasattr(raw, "content") else raw)
  2. Check whether your prompt includes parser instructions

    • If you’re using PydanticOutputParser, confirm get_format_instructions() is injected into the prompt.
    • Missing instructions are one of the most common causes of malformed output.
  3. Validate your schema against real inputs

    • Look at required fields.
    • Make optional anything that may be absent in source text.
    • If possible, test with known examples before wiring into a chain.
  4. Reduce complexity

    • Remove tools, memory, retries, and multi-step prompts.
    • Reproduce with a minimal chain first.
sample_text = "John Doe is 34 years old."

chain = prompt.partial(
    format_instructions=parser.get_format_instructions()
) | llm | parser

print(chain.invoke({"text": sample_text}))

If that works but your full app fails, the issue is usually not LangChain itself. It’s usually surrounding prompt logic, agent behavior, or post-processing.

Prevention

  • Use explicit format instructions from PydanticOutputParser or StructuredOutputParser.
  • Keep extraction prompts short and deterministic; set temperature=0.
  • Make schemas realistic:
    • required only for fields guaranteed to exist
    • optional for everything else
  • Log raw responses in non-production environments so you can see parse failures immediately.
  • If you need strict JSON every time, prefer models and APIs that support native structured output instead of relying only on prompt discipline.
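The logging point above is worth making concrete. This is a minimal stdlib sketch, not a LangChain utility: `parser` is any object with a `.parse(text)` method (LangChain output parsers fit that shape), and `JsonParser` is a stand-in used only for the demo.

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("llm.raw")

def parse_with_logging(raw_text: str, parser):
    """Log the raw model output before parsing so that parse
    failures are diagnosable from the logs alone."""
    logger.debug("raw model output: %r", raw_text)
    try:
        return parser.parse(raw_text)
    except Exception:
        logger.exception("failed to parse model output")
        raise

class JsonParser:
    """Stand-in parser for the demo; any .parse(text) object works."""
    def parse(self, text: str):
        return json.loads(text)

result = parse_with_logging('{"name": "John"}', JsonParser())
print(result)  # {'name': 'John'}
```

When a parse failure does happen, the offending raw response is already in the log, so you skip the "add a print and reproduce" round trip entirely.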

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
