How to Fix 'JSON parsing error in production' in LangChain (Python)

By Cyprian Aarons. Updated 2026-04-21.

When you see JSON parsing error in production in a LangChain Python app, it usually means one thing: an LLM returned text that your parser expected to be valid JSON, but it wasn’t. This typically shows up when using JsonOutputParser, StructuredOutputParser, tool calling wrappers, or any chain that assumes strict machine-readable output.

In production, this is rarely a model bug. It’s usually a prompt contract problem, a schema mismatch, or a streaming/transport issue that turns valid JSON into invalid JSON by the time your code reads it.

The Most Common Cause

The #1 cause is prompting the model to return JSON without actually enforcing it. The model adds markdown fences, extra commentary, trailing commas, or partial objects, and LangChain raises errors like:

  • OutputParserException: Invalid json output
  • json.decoder.JSONDecodeError: Expecting value
  • langchain_core.exceptions.OutputParserException
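You can reproduce the underlying failure with nothing but the standard library: fenced model output is not valid JSON, so `json.loads` fails on the very first character.

```python
import json

# Fenced model output is not valid JSON: the decoder fails on the first backtick.
fenced = '```json\n{"name": "John", "age": 32}\n```'

try:
    json.loads(fenced)
except json.JSONDecodeError as e:
    print(e.msg)  # "Expecting value"
```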

Here’s the broken pattern and the fixed pattern side by side.

Broken:

```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

prompt = PromptTemplate.from_template(
    "Return a JSON object with name and age.\n"
    "{format_instructions}\nUser: {user_input}"
)

chain = prompt | llm | parser

result = chain.invoke({
    "user_input": "John is 32",
    "format_instructions": parser.get_format_instructions(),
})
```

Fixed:

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate

class Person(BaseModel):
    name: str = Field(...)
    age: int = Field(...)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = PydanticOutputParser(pydantic_object=Person)

prompt = PromptTemplate.from_template(
    "Extract the person data.\n"
    "{format_instructions}\nUser: {user_input}"
)

chain = prompt | llm | parser

result = chain.invoke({
    "user_input": "John is 32",
    "format_instructions": parser.get_format_instructions(),
})
```


The broken version asks for JSON, but doesn’t constrain the schema enough. The fixed version uses `PydanticOutputParser`, which gives the model a tighter contract and gives you validation for free.

If you want even stronger guarantees, use OpenAI-style structured output where available:

```python
structured_llm = llm.with_structured_output(Person)
result = structured_llm.invoke("John is 32")
```

That removes a lot of parsing ambiguity.

Other Possible Causes

1) Markdown fences around the JSON

A lot of models return this:

````
```json
{
  "name": "John",
  "age": 32
}
```
````

That looks fine to a human, but raw parsers often choke on the triple backticks.

```python
import json
import re

# Broken output from the model: fenced, so json.loads fails on it directly.
text = '```json\n{"name": "John", "age": 32}\n```'

# Fix: strip fences before parsing if you control the post-processing layer.
text = re.sub(r"^```json\s*|\s*```$", "", text.strip())
data = json.loads(text)
```

Better fix: stop asking for fenced JSON in the prompt.
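If you cannot change the prompt, a more defensive extractor helps. This is a sketch, and `extract_json` is a hypothetical helper, not a LangChain API: it tolerates fences with or without a language tag and commentary around the block.

```python
import json
import re

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a model reply.

    Handles bare JSON, fenced blocks (with or without a language tag),
    and surrounding commentary. Hypothetical helper, not part of LangChain.
    """
    # Prefer the contents of a fenced block if one exists.
    m = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    candidate = m.group(1) if m else text
    # Fall back to the outermost {...} span.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(candidate[start : end + 1])
```

Treat this as a last line of defense, not a substitute for a tighter prompt contract.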

2) Partial streaming output being parsed too early

If you parse while tokens are still streaming, you’ll get incomplete JSON and errors like:

  • JSONDecodeError: Unterminated string starting at
  • OutputParserException: Invalid json output

```python
# Broken: parsing on every chunk, before the stream completes
chunks = []
async for chunk in chain.astream(input_data):
    chunks.append(chunk.content)
    parser.parse("".join(chunks))  # too early: the JSON is still incomplete

# Fix: parse only after the stream has finished
full_text = ""
async for chunk in chain.astream(input_data):
    full_text += chunk.content

parsed = parser.parse(full_text)
```

If you need streaming + structure, use a parser designed for incremental handling rather than naive concatenation; LangChain's `JsonOutputParser`, for example, can emit progressively larger partial objects when streamed as part of a chain.
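The failure mode is easy to reproduce without a model. In this plain-`json` sketch, simulated token chunks are invalid JSON until the stream ends (real chains yield message chunks, not strings):

```python
import json

# Simulated token stream: every prefix is invalid JSON until the last chunk.
chunks = ['{"na', 'me": "Jo', 'hn", "age"', ': 32}']

buffer = ""
early_failures = 0
for chunk in chunks:
    buffer += chunk
    try:
        json.loads(buffer)  # what "parse on every chunk" effectively does
    except json.JSONDecodeError:
        early_failures += 1

print(early_failures)      # 3 -- every partial prefix fails
print(json.loads(buffer))  # parsing the complete text succeeds
```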

3) Tool/function call schema mismatch

With tool calling, the model may produce arguments that don’t match your expected schema. LangChain then throws validation-related failures rather than plain JSON errors.

```python
from pydantic import BaseModel

class Ticket(BaseModel):
    id: int
    priority: str

# Broken if the model emits {"id": "abc", "priority": 1}:
# pydantic raises a ValidationError, surfaced as a parser failure.
structured_llm = llm.with_structured_output(Ticket)
```

Fix by tightening the schema and making field types explicit. If your upstream system sends strings for numeric IDs, coerce them before validation or change the schema to match reality.
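A small pre-validation step can normalize the common mismatches before the data reaches your schema. This is a sketch, and `coerce_ticket` is a hypothetical helper; adjust the rules to match what your model actually emits:

```python
def coerce_ticket(raw: dict) -> dict:
    """Coerce common type mismatches before schema validation.

    Hypothetical helper: the coercion rules here are examples only.
    """
    fixed = dict(raw)
    # Numeric IDs often arrive as strings.
    if isinstance(fixed.get("id"), str) and fixed["id"].isdigit():
        fixed["id"] = int(fixed["id"])
    # Priorities sometimes arrive as numbers instead of labels.
    if isinstance(fixed.get("priority"), int):
        fixed["priority"] = {1: "high", 2: "medium", 3: "low"}.get(fixed["priority"], "low")
    return fixed

print(coerce_ticket({"id": "42", "priority": 1}))  # {'id': 42, 'priority': 'high'}
```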

4) Bad prompt formatting or variable injection

Sometimes your own template injects braces or malformed text into what should be valid JSON instructions.

```python
# Broken: raw user text interpolated into a JSON-shaped template
template = """
Return JSON:
{{
  "query": "{user_query}"
}}
"""

# If user_query contains quotes, newlines, or braces, the model sees a
# malformed example and parsing can fail downstream.
```

Fix by separating instructions from user content and never embedding raw user text inside a fake JSON example unless you escape it properly.
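When you do need user text inside a JSON-shaped prompt, `json.dumps` escapes it safely, quotes, newlines, and braces included:

```python
import json

user_query = 'He said "hello"\nthen typed {weird} stuff'

# json.dumps produces a correctly quoted and escaped JSON string literal,
# so the surrounding structure stays valid no matter what the user typed.
instruction = '{"query": %s}' % json.dumps(user_query)

# Round-trips cleanly:
assert json.loads(instruction)["query"] == user_query
```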

How to Debug It

  1. Log the exact raw model output

    • Don’t log only parsed objects.
    • Capture the response before JsonOutputParser or PydanticOutputParser touches it.
    • You want to see whether the issue is fences, commentary, truncation, or invalid types.
  2. Check whether the failure is parse-time or validation-time

    • json.decoder.JSONDecodeError means invalid JSON syntax.
    • pydantic.ValidationError means valid JSON with wrong field types/shape.
    • langchain_core.exceptions.OutputParserException often wraps both.
  3. Disable streaming temporarily

    • If the error disappears when streaming is off, you’re probably parsing partial output.
    • Run the same prompt with .invoke() instead of .stream() or .astream().
  4. Reduce temperature to zero and simplify the prompt

    • Set temperature=0.
    • Remove examples, extra prose, and nested formatting.
    • Test with one minimal input until you get stable output.
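Step 2 above can be mechanized. This sketch (names are hypothetical) tells you whether a raw completion has a syntax problem or a shape problem:

```python
import json

def classify_failure(raw: str, required_fields=frozenset({"name", "age"})):
    """Return 'syntax' for invalid JSON, 'shape' for valid JSON missing
    required fields, and 'ok' otherwise. Hypothetical debugging helper."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "syntax"
    if not isinstance(data, dict) or not required_fields <= set(data):
        return "shape"
    return "ok"

print(classify_failure('{"name": "John",}'))             # syntax (trailing comma)
print(classify_failure('{"name": "John"}'))              # shape (missing age)
print(classify_failure('{"name": "John", "age": 32}'))   # ok
```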

Prevention

  • Use schema-backed outputs instead of “please return JSON.”
    • Prefer PydanticOutputParser or .with_structured_output(...).
  • Keep user text out of JSON examples unless escaped.
    • Separate instructions from payload data.
  • Add a raw-output fallback path in production.
    • If parsing fails, store the raw completion for inspection and retry with stricter instructions.
  • Write one integration test per critical parser.
    • Feed it real prompts and assert on both valid and invalid outputs.
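The fallback path above can be as simple as catching the parse error, logging the raw completion, and retrying once. This is a sketch; the `retry` callable is a stand-in for re-invoking your chain with stricter format instructions:

```python
import json
import logging

logger = logging.getLogger("llm.parsing")

def parse_with_fallback(raw: str, retry=None):
    """Parse JSON; on failure, log the raw completion and retry once.

    `retry` is a stand-in for your own chain call with stricter
    instructions. Hypothetical helper, not a LangChain API.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        logger.error("Unparseable completion, raw text: %r", raw)
        if retry is not None:
            return json.loads(retry())
        raise
```

In production you would also persist the raw text somewhere queryable, so failures can be replayed against prompt changes.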

If this error only happens in production and not locally, assume one of three things first: longer inputs causing truncation, streaming being parsed too early, or prompts being altered by upstream middleware. Those are the usual suspects in LangChain Python systems.


By Cyprian Aarons, AI Consultant at Topiax.