How to Fix 'JSON parsing error' in LangChain (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: json-parsing-error, langchain, python

If you’re seeing a JSON parsing error in LangChain, it usually means the model returned text that your parser expected to be valid JSON, but it wasn’t. This shows up a lot with JsonOutputParser, structured outputs, and agents that expect tool-call style responses.

In practice, the failure is rarely “JSON is broken” in the abstract. It’s usually one of a few concrete issues: the model added extra prose, returned malformed JSON, or your prompt didn’t constrain the output tightly enough.

The Most Common Cause

The #1 cause is asking an LLM for JSON without enforcing a schema or providing explicit format instructions.

A typical failure looks like this:

  • Broken: Model returns free-form text plus JSON → Fixed: Model is explicitly instructed to return only valid JSON
  • Broken: Parser receives invalid payload → Fixed: Parser receives schema-aligned output

Broken pattern

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

parser = JsonOutputParser()

prompt = PromptTemplate(
    template="Return customer info as JSON: {text}",
    input_variables=["text"],
)

chain = prompt | llm | parser

result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
print(result)

This often fails with errors like:

  • langchain_core.exceptions.OutputParserException: Invalid json output
  • JSONDecodeError: Expecting value
  • Could not parse LLM output

The issue is that the model may respond with:

Sure — here you go:
{"name": "Alice", "age": 32, "city": "Nairobi"}

That leading text breaks strict JSON parsing.
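
You can reproduce the failure outside LangChain with nothing but the standard library; the parser is only surfacing what json.loads would also reject:

import json

# The model's reply starts with prose, so strict JSON parsing fails immediately
reply = 'Sure — here you go:\n{"name": "Alice", "age": 32, "city": "Nairobi"}'

json.loads(reply)  # raises json.decoder.JSONDecodeError: Expecting value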

Fixed pattern

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

prompt = PromptTemplate(
    template="""
Return ONLY valid JSON.
Do not include markdown, code fences, or commentary.

{format_instructions}

Input: {text}
""",
    input_variables=["text"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser

result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
print(result)

If you need the output as structured data, with_structured_output is even better:

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class Customer(BaseModel):
    name: str
    age: int = Field(ge=0)
    city: str

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(Customer)

result = structured_llm.invoke("Alice is 32 and lives in Nairobi.")
print(result)

That removes most of the parsing ambiguity.
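
In recent versions, with_structured_output also accepts include_raw=True, which returns the raw message and any parsing error alongside the parsed object instead of raising. Treat this parameter and the return keys below as an assumption to verify against your installed version:

# Assumes a recent langchain-openai where include_raw=True returns a dict
structured_llm = llm.with_structured_output(Customer, include_raw=True)

result = structured_llm.invoke("Alice is 32 and lives in Nairobi.")
print(result["parsed"])         # Customer instance, or None if parsing failed
print(result["parsing_error"])  # the exception, or None on success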

Other Possible Causes

1) The model wrapped JSON in markdown fences

Some models return fenced blocks even when asked not to.

# Broken output:
```json
{"name":"Alice"}
```

If your parser expects raw JSON, those backticks can trigger:

  • JSONDecodeError
  • OutputParserException: Invalid json output

Fix by tightening instructions or stripping fences before parsing.
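
If you cannot fully stop the fences, stripping them before parsing is a pragmatic fallback. The strip_fences helper below is illustrative, not a LangChain API; it only uses the standard library:

import json
import re

def strip_fences(text: str) -> str:
    # Remove a leading ```json (or ```) fence and a trailing ``` fence, if present
    return re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())

raw = '```json\n{"name": "Alice"}\n```'
print(json.loads(strip_fences(raw)))  # {'name': 'Alice'}

In a chain you could slot this between the model and the parser (for example with RunnableLambda), but tightening the prompt should still be the first fix.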

2) Temperature is too high

Higher temperature increases formatting drift.

# Risky
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Safer
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

For parsing tasks, keep temperature at 0 unless you have a strong reason not to.

3) You used the wrong parser for the output shape

JsonOutputParser expects valid JSON. If your agent/tool chain emits partial tool calls or non-JSON text, use the right abstraction.

# Wrong if output isn't strict JSON
parser = JsonOutputParser()

# Better for schema-bound outputs
structured_llm = llm.with_structured_output(Customer)

If you are using tools, make sure you are not trying to parse natural language as if it were JSON.
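
With OpenAI-style tool calling, the arguments arrive already parsed on the message's tool_calls attribute, so there is no text to JSON-decode at all. A rough sketch, assuming the @tool decorator and bind_tools available in recent LangChain releases (the save_customer tool is made up for illustration):

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def save_customer(name: str, age: int, city: str) -> str:
    """Persist a customer record."""
    return "saved"

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools([save_customer])

msg = llm_with_tools.invoke("Alice is 32 and lives in Nairobi.")
# Tool arguments come back as a dict; msg.content may be empty or plain prose
print(msg.tool_calls[0]["args"] if msg.tool_calls else msg.content)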

4) Your prompt includes examples that are invalid JSON

This happens more than people think. A single trailing comma or comment in an example can confuse the model.

# Bad prompt example
"""
Example:
{
  "name": "Alice",
  "age": 32,
}
"""

JSON does not allow trailing commas. If your few-shot examples are malformed, the model may imitate them.
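
A cheap guard is to run every literal example through json.loads before it ever reaches a prompt, so a malformed few-shot sample fails fast at startup instead of quietly teaching the model bad formatting:

import json

few_shot_examples = [
    '{"name": "Alice", "age": 32}',    # valid
    '{"name": "Bob", "age": 41,}',     # trailing comma: raises JSONDecodeError
]

for example in few_shot_examples:
    json.loads(example)  # any malformed example blows up here, not in production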

How to Debug It

  1. Print the raw model response before parsing

    raw = llm.invoke("Return customer info as JSON")
    print(raw.content)
    

    If you see prose, markdown fences, or malformed syntax, the bug is upstream of the parser.

  2. Remove the parser temporarily. Run only prompt | llm and inspect the exact output. If the raw output is bad, fix prompting or structured output first.
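
    For example, reusing the prompt and llm from the fixed pattern above:

    chain_no_parser = prompt | llm
    raw = chain_no_parser.invoke({"text": "Alice is 32 and lives in Nairobi."})
    print(raw.content)  # inspect this before any parser touches it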

  3. Check the exception class. Look for:

    • langchain_core.exceptions.OutputParserException
    • json.decoder.JSONDecodeError
    • provider-specific tool call validation errors

    The exception type tells you whether this is prompt formatting, schema mismatch, or transport-level corruption.
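
    A minimal way to see which one you are hitting, using the chain from earlier:

    import json
    from langchain_core.exceptions import OutputParserException

    try:
        result = chain.invoke({"text": "Alice is 32 and lives in Nairobi."})
    except OutputParserException as e:
        print("Parser rejected the model output:", e)
    except json.JSONDecodeError as e:
        print("Raw payload is not valid JSON:", e)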

  4. Test with a minimal schema. Reduce your target shape to something tiny:

    class TestModel(BaseModel):
        name: str
    

    If that works, your full schema may be too complex or inconsistent with the prompt.

Prevention

  • Use with_structured_output() or function/tool calling when you need machine-readable data.
  • Set temperature=0 for all extraction and classification chains.
  • Always include explicit format instructions from JsonOutputParser.get_format_instructions().
  • Log raw LLM output in non-production environments so you can see exactly what broke.
  • Keep prompts free of malformed JSON examples and avoid mixing prose with expected structured output.

If you want reliable parsing in LangChain, don’t treat JSON as a “best effort” format. Make it a contract between your prompt and your parser.
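
If you still want a safety net for the occasional malformed reply, LangChain also ships an OutputFixingParser that feeds the rejected output back to a model and asks it to repair the formatting. A hedged sketch; double-check the import path against your installed version:

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
fixing_parser = OutputFixingParser.from_llm(parser=JsonOutputParser(), llm=llm)

# The broken text (and the parser's format instructions) go back to the model
print(fixing_parser.parse('{"name": "Alice", "age": 32,}'))

Treat this as a fallback, not a substitute for the tighter prompting and structured output shown above.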


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
