# How to Fix 'JSON parsing error in production' in LlamaIndex (Python)
## What this error usually means
A "JSON parsing error in production" in LlamaIndex almost always means one of your components expected structured JSON but the model returned something else: extra prose, malformed JSON, truncated output, or a tool-call payload that didn't match the schema.
You’ll typically see it when using StructuredOutputParser, PydanticProgramExtractor, function calling, or any agent workflow that depends on strict JSON from the LLM.
## The Most Common Cause
The #1 cause is asking the model for JSON without forcing a strict structured-output path. In practice, people prompt for "return JSON" and then parse the raw text with `json.loads()`, or they use a plain `llm.predict()` call where the model adds markdown fences, explanations, or trailing commas.
Here’s the broken pattern and the fixed pattern:

**Broken:**

```python
from llama_index.llms.openai import OpenAI
import json

llm = OpenAI(model="gpt-4o-mini")

prompt = """Extract customer details as JSON: name, policy_number, claim_amount"""
response = llm.complete(prompt)
data = json.loads(response.text)  # JSONDecodeError in production
```

**Fixed:**

```python
from pydantic import BaseModel
from llama_index.core.program import LLMTextCompletionProgram
from llama_index.llms.openai import OpenAI

class CustomerClaim(BaseModel):
    name: str
    policy_number: str
    claim_amount: float

llm = OpenAI(model="gpt-4o-mini")
program = LLMTextCompletionProgram.from_defaults(
    output_cls=CustomerClaim,
    prompt_template_str=(
        "Extract customer details from this text:\n"
        "{input}\n"
    ),
    llm=llm,
)

result = program(input="Jane Doe, policy number P-1234, claim amount 1200.50")
print(result.model_dump())
```
The broken version fails because `response.text` is not guaranteed to be pure JSON. In production you’ll often see errors like:
- `json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)`
- `ValueError: Invalid JSON format`
- `ValidationError` from Pydantic after partial parsing
If you need structured output, use LlamaIndex’s schema-backed APIs instead of hand-parsing raw text.
## Other Possible Causes
### 1) The model returned markdown fences or extra text
This is common when your prompt says “return only JSON,” but the model still wraps it in code fences.
```python
raw = '```json\n{"name": "Jane Doe", "policy_number": "P-1234", "claim_amount": 1200.5}\n```'
```
Fix by stripping fences before parsing, or better, use structured output classes so you never parse raw text manually.
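If you do have to parse raw text, one defensive approach is to strip any fences before calling `json.loads()`. A minimal sketch (the helper name and regex are my own, not a LlamaIndex API):

```python
import json
import re

def strip_code_fences(text: str) -> str:
    """Pull the payload out of a fenced ```json ... ``` block, if one is present."""
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    return match.group(1) if match else text.strip()

raw = '```json\n{"name": "Jane Doe", "claim_amount": 1200.5}\n```'
data = json.loads(strip_code_fences(raw))
```

Unfenced output passes through unchanged, so the helper is safe to apply to every response.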
### 2) Truncated responses from token limits
If the response gets cut off mid-object, parsing fails immediately.
```python
from llama_index.core import Settings

Settings.llm.max_tokens = 64  # too low for your schema-rich output
```
Typical symptoms:

- `JSONDecodeError: Unterminated string starting at ...`
- Output ends halfway through an object
Increase token budget or reduce schema size. If you’re using agents with multi-step tool calls, watch cumulative context growth.
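One way to tell truncation apart from other malformed output is to check whether the text ever reaches a closing brace. A rough sketch (the helper and its heuristic are illustrative, not part of LlamaIndex):

```python
import json

def parse_llm_json(text: str) -> dict:
    """Parse model output, flagging likely truncation separately."""
    stripped = text.strip()
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        # A response cut off by max_tokens usually never reaches the
        # closing brace, so the text ends mid-object.
        if not stripped.endswith(("}", "]")):
            raise ValueError("output looks truncated; increase max_tokens")
        raise

# A response cut off mid-string, as happens when max_tokens is too low:
truncated = '{"name": "Jane Doe", "policy_number": "P-12'
```

Logging the two cases separately makes it obvious whether to fix the prompt or the token budget.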
### 3) Tool/function schema mismatch
When using tool calling with FunctionTool, OpenAIAgent, or similar classes, the function signature must match what the runtime expects.
```python
from llama_index.core.tools import FunctionTool

def create_claim(name: str, policy_number: str):
    return {"ok": True}

tool = FunctionTool.from_defaults(fn=create_claim)
```
If your function returns non-serializable objects like dataclasses, custom classes without `.dict()`/`.model_dump()`, or bytes, downstream JSON serialization can fail.
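One pattern that avoids this is converting to plain JSON-safe types before returning from the tool function. A sketch using a hypothetical `Claim` dataclass:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Claim:
    name: str
    policy_number: str

def create_claim(name: str, policy_number: str) -> dict:
    # Return a plain dict, not the dataclass instance itself,
    # so downstream JSON serialization cannot fail on the type.
    return asdict(Claim(name=name, policy_number=policy_number))

payload = create_claim("Jane Doe", "P-1234")
serialized = json.dumps(payload)  # succeeds: only dict/str values
```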
### 4) Provider-specific response formatting issues
Some providers are stricter than others. A config that works with one model may fail with another because of different tool-call formatting or unsupported structured-output behavior.
```python
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(model="claude-3-5-sonnet-latest")
```
If you switch providers and start seeing parsing failures, check whether that provider supports the exact structured-output mode you’re using in LlamaIndex.
## How to Debug It
- **Print the raw model output before parsing**
  - Log `response.text` or the agent/tool payload.
  - Look for markdown fences, commentary, missing braces, or truncation.
- **Check which LlamaIndex class is failing**
  - If it’s `LLMTextCompletionProgram`, inspect the prompt and schema.
  - If it’s `PydanticProgramExtractor`, verify the source text contains enough signal.
  - If it’s an agent path like `OpenAIAgent`, inspect tool call arguments and returned values.
- **Validate against a minimal schema**
  - Replace your full Pydantic model with two fields.
  - If that works, your original schema is too large or ambiguous.
- **Run the same input outside production**
  - Replay one failing request locally with identical prompt and model settings.
  - Compare temperature, max tokens, provider version, and system prompt.
A practical debug loop looks like this:
```python
print("RAW OUTPUT:")
print(response.text)

try:
    parsed = MySchema.model_validate_json(response.text)
except Exception as e:
    print(type(e).__name__, str(e))
```
That tells you whether this is a prompt issue, a truncation issue, or a serialization issue.
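The same loop can be folded into a small classifier that names the failure mode for your logs. A rough stdlib-only sketch (the labels and heuristics are my own, not a LlamaIndex API):

```python
import json

def classify_parse_failure(text: str) -> str:
    """Return a rough label for why model output failed to parse."""
    stripped = text.strip()
    if stripped.startswith("```"):
        return "prompt issue: output wrapped in markdown fences"
    try:
        json.loads(stripped)
        return "ok: valid JSON, check schema validation instead"
    except json.JSONDecodeError:
        if not stripped.endswith(("}", "]")):
            return "truncation: output ends mid-object"
        return "malformed JSON: check prompt and temperature"
```

Attaching this label to each failed parse makes production logs far easier to triage than a bare `JSONDecodeError`.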
## Prevention
- Use schema-backed generation paths like `LLMTextCompletionProgram` or Pydantic-based extractors instead of manual `json.loads()` on free-form completions.
- Keep prompts explicit: say which fields to return, forbid extra prose, and set temperature low for extraction tasks.
- Add a contract test for every production parser:
  - feed it real sample inputs
  - assert valid JSON/schema compliance
  - fail CI if the model output drifts
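A minimal contract test might look like this (pytest-style; the `CustomerClaim` model and the sample strings are illustrative stand-ins for your real schema and logged outputs):

```python
from pydantic import BaseModel, ValidationError

class CustomerClaim(BaseModel):
    name: str
    policy_number: str
    claim_amount: float

# Sample outputs captured from production logs (illustrative values)
SAMPLES = [
    '{"name": "Jane Doe", "policy_number": "P-1234", "claim_amount": 1200.5}',
    '{"name": "John Roe", "policy_number": "P-5678", "claim_amount": 80}',
]

def test_samples_match_schema():
    for raw in SAMPLES:
        try:
            CustomerClaim.model_validate_json(raw)
        except ValidationError as exc:
            raise AssertionError(f"schema drift in sample: {exc}") from exc
```

Re-running this in CI against freshly captured samples catches drift before it reaches production parsers.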
If you’re building bank or insurance workflows, treat LLM output like an untrusted integration boundary. Parse defensively, validate strictly, and log raw outputs whenever a parse step fails.
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.