# How to Fix 'JSON parsing error during development' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-21

When you see JSON parsing error during development in LlamaIndex, it usually means one of the components in your pipeline expected valid JSON but got plain text, malformed JSON, or an empty response instead. In practice, this shows up when you use structured outputs, tool calling, response synthesizers, or a custom LLM wrapper that returns the wrong format.

The fix is usually not in LlamaIndex itself. It’s almost always in the prompt, parser, model config, or a wrapper that’s stripping or corrupting the JSON before LlamaIndex can read it.

## The Most Common Cause

The #1 cause is asking an LLM to return JSON without enforcing a strict schema or output format. LlamaIndex then tries to parse something like:

```
ResponseValidationError: Invalid JSON object: Expecting value at line 1 column 1
```

or

```
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

This happens a lot with PydanticOutputParser, StructuredOutputParser, JSONQueryEngine, or any custom prompt that says “return JSON” but doesn’t constrain the model enough.
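You can reproduce the failure mode without LlamaIndex at all: `json.loads`, which these parsers rely on under the hood, rejects anything that is not pure JSON, including a reply with a conversational preamble.

```python
import json

# A typical "helpful" model reply that is not valid JSON
reply = 'Sure! Here is the summary: {"priority": "high", "category": "auth"}'

try:
    json.loads(reply)
except json.JSONDecodeError as exc:
    # Fails at the very first character, just like the errors above
    print(f"{exc.msg}: line {exc.lineno} column {exc.colno} (char {exc.pos})")
```

The embedded JSON is fine; the prose around it is what breaks the parser.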

### Broken vs fixed pattern

| Broken | Fixed |
| --- | --- |
| LLM returns free-form text | LLM is forced into a schema |
| Parser expects JSON | Parser gets valid structured output |
| Prompt says "respond in JSON" only | Prompt includes explicit formatting instructions |
```python
# BROKEN
from llama_index.core.prompts import PromptTemplate
from llama_index.core.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class TicketSummary(BaseModel):
    priority: str
    category: str

parser = PydanticOutputParser(output_cls=TicketSummary)

prompt = PromptTemplate(
    "Summarize this support ticket and return JSON:\n{ticket}"
)

# llm is your configured LLM (e.g. Settings.llm)
response = llm.predict(prompt, ticket="Customer can't log in.")
data = parser.parse(response)  # often fails with JSONDecodeError
```
```python
# FIXED
from llama_index.core.prompts import PromptTemplate
from llama_index.core.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class TicketSummary(BaseModel):
    priority: str
    category: str

parser = PydanticOutputParser(output_cls=TicketSummary)

prompt = PromptTemplate(
    "Summarize this support ticket.\n"
    "{format_instructions}\n"
    "Ticket: {ticket}"
)

response = llm.predict(
    prompt,
    ticket="Customer can't log in.",
    format_instructions=parser.format_string,  # property, not a method call
)
data = parser.parse(response)
```

If you’re using an OpenAI-compatible chat model through LlamaIndex, prefer structured output APIs where available (for example `llm.structured_predict(...)` or `llm.as_structured_llm(...)`) instead of hoping the model follows a loose instruction.

## Other Possible Causes

### 1) Your custom LLM wrapper strips the raw JSON

If you wrapped `CustomLLM` or `FunctionCallingLLM`, check whether your wrapper returns only part of the response or adds extra commentary.

```python
# broken
class MyLLM(CustomLLM):
    def complete(self, prompt, **kwargs):
        raw = call_model(prompt)
        return raw["text"]  # may include markdown fences or preamble
```

```python
# fixed
class MyLLM(CustomLLM):
    def complete(self, prompt, **kwargs):
        raw = call_model(prompt)
        text = raw["content"].strip()
        # strip markdown fences if the model wrapped the JSON
        if text.startswith("```"):
            text = text.strip("`").removeprefix("json").strip()
        return text
```

If the model returns:

```json
{"priority":"high","category":"auth"}
```

but your wrapper turns it into:

````
Sure — here is the JSON:
```json
{"priority":"high","category":"auth"}
```
````

the parser will fail unless you strip the fences first.
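One defensive option is to pull the first JSON object out of the reply before handing it to a parser. This is a hypothetical helper, not a LlamaIndex API, and the greedy regex assumes one JSON object per reply:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Extract the JSON object from a reply that may include
    fences or conversational preamble (hypothetical helper)."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

reply = 'Sure, here is the JSON:\n```json\n{"priority": "high", "category": "auth"}\n```'
print(extract_json(reply))  # {'priority': 'high', 'category': 'auth'}
```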

### 2) You are passing non-JSON text into a parser expecting JSON

This often happens with `StructuredOutputParser`, `PydanticOutputParser`, or custom extraction logic.

```python
# broken input to parser
response_text = "Priority: high, Category: auth"
parser.parse(response_text)
```

Fix it by either:

- changing the prompt to emit valid JSON only
- using a looser text parser if you do not need strict structure

```python
# fixed input to parser
response_text = '{"priority": "high", "category": "auth"}'
parser.parse(response_text)
```
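If you take the looser route, the fallback can be as simple as splitting `Key: value` pairs out of the text. A minimal sketch; the function name and the assumed reply format are illustrative, not part of LlamaIndex:

```python
import re

def parse_loose(text: str) -> dict:
    """Parse 'Key: value' pairs from free-form model output (hypothetical fallback)."""
    pairs = re.findall(r"(\w+)\s*:\s*([^,\n]+)", text)
    return {key.lower(): value.strip() for key, value in pairs}

print(parse_loose("Priority: high, Category: auth"))
# {'priority': 'high', 'category': 'auth'}
```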

### 3) Your tool/function calling setup is mismatched

LlamaIndex agents can fail when the model supports tool calling but your wrapper/config doesn’t expose it correctly. You’ll often see errors around `FunctionAgent`, `ReActAgent`, or tool metadata serialization.

```python
# broken: model does not actually support function calling here
llm = OpenAI(model="gpt-3.5-turbo")  # depending on config/version, may not be enough for tools

# fixed: use a tool-capable model/config and proper agent class
llm = OpenAI(model="gpt-4o-mini", temperature=0)
agent = FunctionAgent(tools=[...], llm=llm)
```

If the agent expects tool-call JSON and gets plain assistant text instead, parsing breaks downstream.
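The distinction is visible in the raw message shape. In the OpenAI-style chat format, a working tool call arrives as a structured `tool_calls` payload with JSON-encoded arguments, while a failed one is just prose in `content`. The helper below is hypothetical and the message dicts are illustrative:

```python
import json

# OpenAI-style assistant message when tool calling works (illustrative shape)
tool_call_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"function": {"name": "lookup_ticket", "arguments": '{"ticket_id": 42}'}}
    ],
}

# What you get when the model ignores the tool schema and answers in prose
plain_message = {"role": "assistant", "content": "Call lookup_ticket with id 42"}

def get_tool_args(message: dict):
    """Return parsed tool arguments, or None if the model answered in prose."""
    calls = message.get("tool_calls")
    if not calls:
        return None
    return json.loads(calls[0]["function"]["arguments"])

print(get_tool_args(tool_call_message))  # {'ticket_id': 42}
print(get_tool_args(plain_message))      # None
```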

### 4) The response contains invalid escape characters or truncated JSON

This happens when responses are cut off by token limits or contain unescaped quotes/newlines from source data.

```python
# broken example from truncation / bad escaping
response_text = '{"title": "Login issue", "details": "User said "password reset" failed"}'
```

Fix:

- increase `max_tokens`
- sanitize source strings before prompting
- ensure quotes inside strings are escaped properly

```python
import json

payload = {
    "title": "Login issue",
    "details": 'User said "password reset" failed',
}
response_text = json.dumps(payload)
```
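`json.JSONDecodeError` also reports where parsing broke, which quickly distinguishes truncation (failure at the end of the string) from bad escaping (failure mid-string):

```python
import json

truncated = '{"title": "Login issue", "details": "User said'  # cut off by max_tokens

try:
    json.loads(truncated)
except json.JSONDecodeError as exc:
    # An "Unterminated string" near the end of the input points to truncation
    print(exc.msg, "at char", exc.pos, "of", len(truncated), "total")
```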

## How to Debug It

1. Print the raw model output before parsing
   - Don’t inspect only the parsed object.
   - Log the exact string passed into `parser.parse()`.
2. Check whether the output is fenced markdown
   - Look for:
     - markdown code fences around the JSON
     - explanatory prose before/after the JSON
     - empty responses
   - If present, strip them before parsing or fix the prompt.
3. Verify which class is failing
   - Common failure points:
     - `PydanticOutputParser`
     - `StructuredOutputParser`
     - `JSONQueryEngine`
     - agent/tool execution paths like `FunctionAgent`
   - The stack trace tells you whether parsing failed at generation time or during post-processing.
4. Test the same prompt outside LlamaIndex
   - Call your model directly.
   - If direct output is also invalid JSON, the problem is prompting/model behavior.
   - If direct output is valid but LlamaIndex fails, inspect wrappers and parsers.
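Steps 1 and 3 can be combined in a small wrapper that logs the exact string a parser receives before attempting to parse it. This is a hypothetical helper (shown with plain `json.loads` standing in for the parser), not a LlamaIndex API:

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)

def parse_with_logging(raw: str):
    """Log the exact raw string before parsing so failures are reproducible."""
    logging.debug("raw model output: %r", raw)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        logging.exception("JSON parsing failed on: %r", raw)
        raise

data = parse_with_logging('{"priority": "high"}')
```

With the raw string in your logs, a parsing failure becomes a reproducible test case instead of a mystery.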

## Prevention

- Use schema-backed output whenever possible:
  - `PydanticOutputParser`
  - structured response models
  - tool/function calling instead of free-form JSON prompts
- Keep prompts explicit:
  - say “return only valid JSON”
  - include format instructions from the parser itself
- Add a validation step before parsing:
  - log raw output
  - run `json.loads()` in tests against sample completions
  - reject fenced code blocks and repair them early
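The validation step can live in your test suite. A minimal sketch; the recorded sample completions here are made up:

```python
import json

# Recorded model completions to guard against regressions (samples are made up)
SAMPLE_COMPLETIONS = [
    '{"priority": "high", "category": "auth"}',
    '{"priority": "low", "category": "billing"}',
]

def test_completions_are_valid_json():
    for sample in SAMPLE_COMPLETIONS:
        data = json.loads(sample)  # raises JSONDecodeError on bad samples
        assert {"priority", "category"} <= set(data)
```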

## Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

