# How to Fix 'JSON parsing error' in LangChain (Python)
If you’re seeing a JSON parsing error in LangChain, it usually means the model returned text that your parser expected to be valid JSON, but it wasn’t. This shows up a lot with `JsonOutputParser`, structured outputs, and agents that expect tool-call style responses.
In practice, the failure is rarely “JSON is broken” in the abstract. It’s usually one of a few concrete issues: the model added extra prose, returned malformed JSON, or your prompt didn’t constrain the output tightly enough.
## The Most Common Cause
The #1 cause is asking an LLM for JSON without enforcing a strict schema or format instructions.
A typical failure looks like this:
| Broken | Fixed |
|---|---|
| Model returns free-form text plus JSON | Model is explicitly instructed to return only valid JSON |
| Parser receives invalid payload | Parser receives schema-aligned output |
### Broken pattern
```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

# The template asks for JSON but doesn't forbid surrounding prose,
# so the model is free to wrap the JSON in commentary.
prompt = PromptTemplate(
    template="Return customer info as JSON: {text}",
    input_variables=["text"],
)

chain = prompt | llm | parser
result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
print(result)
```
This often fails with errors like:
- `langchain_core.exceptions.OutputParserException: Invalid json output`
- `JSONDecodeError: Expecting value`
- `Could not parse LLM output`
The issue is that the model may respond with:
```text
Sure — here you go:

{"name": "Alice", "age": 32, "city": "Nairobi"}
```
That leading text breaks strict JSON parsing.
### Fixed pattern
```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

prompt = PromptTemplate(
    template="""
Return ONLY valid JSON.
Do not include markdown, code fences, or commentary.

{format_instructions}

Input: {text}
""",
    input_variables=["text"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser
result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
print(result)
```
If you’re extracting structured data, binding the model to a schema is even better:
```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class Customer(BaseModel):
    name: str
    age: int = Field(ge=0)
    city: str

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(Customer)

# Returns a validated Customer instance instead of raw text to parse.
result = structured_llm.invoke("Alice is 32 and lives in Nairobi.")
print(result)
```
That removes most of the parsing ambiguity: the model is forced into the schema, and `result` comes back as a validated `Customer` instance rather than text you have to parse yourself.
## Other Possible Causes
### 1) The model wrapped JSON in markdown fences
Some models return fenced blocks even when asked not to.
Broken output:

````text
```json
{"name":"Alice"}
```
````
If your parser expects raw JSON, those backticks can trigger:
- `JSONDecodeError`
- `OutputParserException: Invalid json output`
Fix by tightening instructions or stripping fences before parsing.
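If you want a defensive fallback, you can strip the fences yourself before handing the text to `json.loads`. This is a minimal sketch; `strip_fences` is a hypothetical helper written for this post, not a LangChain API:

```python
import json
import re

def strip_fences(text: str) -> str:
    """Hypothetical helper: pull the body out of a fenced ```json block if present."""
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    return match.group(1) if match else text.strip()

raw = '```json\n{"name": "Alice"}\n```'
print(json.loads(strip_fences(raw)))  # {'name': 'Alice'}
```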
### 2) Temperature is too high
Higher temperature increases formatting drift.
```python
# Risky
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Safer
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
```
For parsing tasks, keep temperature at 0 unless you have a strong reason not to.
### 3) You used the wrong parser for the output shape
`JsonOutputParser` expects valid JSON. If your agent/tool chain emits partial tool calls or non-JSON text, use the right abstraction.
```python
# Wrong if output isn't strict JSON
parser = JsonOutputParser()

# Better for schema-bound outputs
structured_llm = llm.with_structured_output(Customer)
```
If you are using tools, make sure you are not trying to parse natural language as if it were JSON.
### 4) Your prompt includes examples that are invalid JSON
This happens more than people think. A single trailing comma or comment in an example can confuse the model.
```python
# Bad prompt example
"""
Example:
{
  "name": "Alice",
  "age": 32,
}
"""
```
JSON does not allow trailing commas. If your few-shot examples are malformed, the model may imitate them.
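The fixed version of the same example is identical except for the comma:

```python
# Good prompt example: valid JSON, no trailing comma
"""
Example:
{
  "name": "Alice",
  "age": 32
}
"""
```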
## How to Debug It
- **Print the raw model response before parsing.**

  ```python
  raw = llm.invoke("Return customer info as JSON")
  print(raw.content)
  ```

  If you see prose, markdown fences, or malformed syntax, the bug is upstream of the parser.

- **Remove the parser temporarily.** Run only `prompt | llm` and inspect the exact output (see the sketch after this list). If the raw output is bad, fix prompting or structured output first.

- **Check the exception class.** Look for:

  - `langchain_core.exceptions.OutputParserException`
  - `json.decoder.JSONDecodeError`
  - provider-specific tool-call validation errors

  The exception type tells you whether this is prompt formatting, schema mismatch, or transport-level corruption.

- **Test with a minimal schema.** Reduce your target shape to something tiny:

  ```python
  class TestModel(BaseModel):
      name: str
  ```

  If that works, your full schema may be too complex or inconsistent with the prompt.
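For the parser-removal step, here is a minimal sketch that reuses `prompt` and `llm` from the fixed pattern above:

```python
# Run the chain without the parser to see exactly what the model returns.
raw_chain = prompt | llm
raw = raw_chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
print(repr(raw.content))  # repr() makes hidden whitespace and fences visible
```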
## Prevention
- Use `with_structured_output()` or function/tool calling when you need machine-readable data.
- Set `temperature=0` for all extraction and classification chains.
- Always include explicit format instructions from `JsonOutputParser.get_format_instructions()`.
- Log raw LLM output in non-production environments so you can see exactly what broke (one option shown below).
- Keep prompts free of malformed JSON examples and avoid mixing prose with expected structured output.
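For the logging point, LangChain's global debug flag is one low-effort option in recent versions (check that it exists in the version you run):

```python
from langchain.globals import set_debug

# Dumps full chain inputs and outputs, including raw LLM text, to stdout.
# Enable in development only; it is very verbose.
set_debug(True)
```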
If you want reliable parsing in LangChain, don’t treat JSON as a “best effort” format. Make it a contract between your prompt and your parser.
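In code, the contract mindset means treating a parse failure as a real error. A minimal sketch, reusing `chain` from the fixed pattern and the exception class shown earlier:

```python
from langchain_core.exceptions import OutputParserException

try:
    result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
except OutputParserException as exc:
    # Surface the failure instead of silently returning garbage;
    # the exception message usually includes the offending output.
    raise RuntimeError(f"Model broke the JSON contract: {exc}") from exc
```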
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.