# How to Fix "JSON parsing error during development" in LlamaIndex (Python)
When you see JSON parsing error during development in LlamaIndex, it usually means one of the components in your pipeline expected valid JSON but got plain text, malformed JSON, or an empty response instead. In practice, this shows up when you use structured outputs, tool calling, response synthesizers, or a custom LLM wrapper that returns the wrong format.
The fix is usually not in LlamaIndex itself. It’s almost always in the prompt, parser, model config, or a wrapper that’s stripping or corrupting the JSON before LlamaIndex can read it.
## The Most Common Cause
The #1 cause is asking an LLM to return JSON without enforcing a strict schema or output format. LlamaIndex then tries to parse something like:
```
ResponseValidationError: Invalid JSON object: Expecting value at line 1 column 1
```

or

```
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
This happens a lot with `PydanticOutputParser`, `StructuredOutputParser`, `JSONQueryEngine`, or any custom prompt that says "return JSON" but doesn't constrain the model enough.
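You can reproduce the second error with nothing but the standard library; any chatty prefix before the JSON trips the parser at the very first character:

```python
import json

# A typical chatty completion that should have been pure JSON
completion = 'Sure! Here is the JSON you asked for: {"priority": "high"}'

try:
    json.loads(completion)
    error_message = None
except json.JSONDecodeError as exc:
    # The leading prose means char 0 is not a valid JSON value
    error_message = str(exc)

print(error_message)  # Expecting value: line 1 column 1 (char 0)
```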
### Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| LLM returns free-form text | LLM is forced into a schema |
| Parser expects JSON | Parser gets valid structured output |
| Prompt says “respond in JSON” only | Prompt includes explicit formatting instructions |
```python
# BROKEN
from llama_index.core.prompts import PromptTemplate
from llama_index.core.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class TicketSummary(BaseModel):
    priority: str
    category: str

parser = PydanticOutputParser(output_cls=TicketSummary)

prompt = PromptTemplate(
    "Summarize this support ticket and return JSON:\n{ticket}"
)

# assumes `llm` is an already-configured LlamaIndex LLM
response = llm.predict(prompt, ticket="Customer can't log in.")
data = parser.parse(response)  # often fails with JSONDecodeError
```
```python
# FIXED
from llama_index.core.prompts import PromptTemplate
from llama_index.core.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class TicketSummary(BaseModel):
    priority: str
    category: str

parser = PydanticOutputParser(output_cls=TicketSummary)

prompt = PromptTemplate(
    "Summarize this support ticket.\n"
    "{format_instructions}\n"
    "Ticket: {ticket}"
)

# assumes `llm` is an already-configured LlamaIndex LLM
response = llm.predict(
    prompt,
    ticket="Customer can't log in.",
    format_instructions=parser.format_string,  # a property in recent versions
)
data = parser.parse(response)
```
If you’re using an OpenAI-compatible chat model through LlamaIndex, prefer structured output APIs where available instead of hoping the model follows a loose instruction.
## Other Possible Causes
### 1) Your custom LLM wrapper strips the raw JSON
If you wrapped `CustomLLM` or `FunctionCallingLLM`, check whether your wrapper returns only part of the response or adds extra commentary.
```python
# broken
class MyLLM(CustomLLM):
    def complete(self, prompt, **kwargs):
        raw = call_model(prompt)
        return raw["text"]  # may include markdown fences or preamble

# fixed
class MyLLM(CustomLLM):
    def complete(self, prompt, **kwargs):
        raw = call_model(prompt)
        text = raw["content"].strip()
        # strip markdown fences like ```json ... ``` before returning
        if text.startswith("```"):
            text = text.split("\n", 1)[1].rsplit("```", 1)[0].strip()
        return text
```
If the model returns:

```json
{"priority":"high","category":"auth"}
```

but your wrapper turns it into:

````
Sure — here is the JSON:
```json
{"priority":"high","category":"auth"}
```
````

the parser will fail unless you strip the fences first.
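A small helper along these lines (a sketch, not part of LlamaIndex) can recover the JSON from a fenced or chatty reply before handing it to a parser:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Strip markdown fences and surrounding prose, then parse the JSON."""
    # Prefer the contents of a ```json ... ``` (or bare ```) fence if present
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    # Fall back to the first {...} span when prose surrounds bare JSON
    if not candidate.lstrip().startswith("{"):
        brace = re.search(r"\{.*\}", candidate, re.DOTALL)
        if brace:
            candidate = brace.group(0)
    return json.loads(candidate)

reply = 'Sure, here is the JSON:\n```json\n{"priority": "high", "category": "auth"}\n```'
data = extract_json(reply)
print(data)  # {'priority': 'high', 'category': 'auth'}
```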
### 2) You are passing non-JSON text into a parser expecting JSON
This often happens with `StructuredOutputParser`, `PydanticOutputParser`, or custom extraction logic.
```python
# broken input to parser
response_text = "Priority: high, Category: auth"
parser.parse(response_text)  # fails: not JSON
```
Fix it by either:

- changing the prompt to emit valid JSON only, or
- using a looser text parser if you do not need strict structure
```python
# fixed input to parser
response_text = '{"priority": "high", "category": "auth"}'
parser.parse(response_text)
```
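If strict structure is not required, a tolerant parser for simple `Key: value` text (a hypothetical fallback, not a LlamaIndex API) sidesteps the problem entirely:

```python
import re

def parse_loose(text: str) -> dict:
    """Parse 'Priority: high, Category: auth' style text into a dict."""
    pairs = re.findall(r"(\w+)\s*:\s*([^,]+)", text)
    return {key.lower(): value.strip() for key, value in pairs}

result = parse_loose("Priority: high, Category: auth")
print(result)  # {'priority': 'high', 'category': 'auth'}
```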
### 3) Your tool/function calling setup is mismatched
LlamaIndex agents can fail when the model supports tool calling but your wrapper/config doesn't expose it correctly. You'll often see errors around `FunctionAgent`, `ReActAgent`, or tool metadata serialization.
```python
# broken: model does not actually support function calling here
llm = OpenAI(model="gpt-3.5-turbo")  # depending on config/version, may not be enough for tools

# fixed: use a tool-capable model/config and proper agent class
llm = OpenAI(model="gpt-4o-mini", temperature=0)
agent = FunctionAgent.from_tools(tools=[...], llm=llm)
```
If the agent expects tool-call JSON and gets plain assistant text instead, parsing breaks downstream.
### 4) The response contains invalid escape characters or truncated JSON
This happens when responses are cut off by token limits or contain unescaped quotes/newlines from source data.
```python
# broken example from truncation / bad escaping
response_text = '{"title": "Login issue", "details": "User said "password reset" failed"}'
```
Fix:

- increase `max_tokens`
- sanitize source strings before prompting
- ensure quotes inside strings are escaped properly
```python
import json

payload = {
    "title": "Login issue",
    "details": 'User said "password reset" failed',
}
response_text = json.dumps(payload)  # quotes are escaped automatically
```
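To catch truncated or mis-escaped responses before they reach a parser, a guard like this (a stdlib-only sketch) turns a cryptic downstream failure into an actionable error:

```python
import json

def ensure_json(text: str) -> dict:
    """Parse JSON, surfacing a clearer error for truncated/garbled responses."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # An unterminated string or an error at the very end usually means
        # the completion was cut off by the token limit
        truncated = "Unterminated" in exc.msg or exc.pos >= len(text.rstrip()) - 1
        hint = "response may be truncated (raise max_tokens?)" if truncated else "malformed JSON"
        raise ValueError(f"{hint}: {exc.msg} at char {exc.pos}; tail={text[-30:]!r}") from exc

data = ensure_json('{"title": "Login issue", "details": "password reset failed"}')
print(data["title"])  # Login issue
```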
## How to Debug It
- **Print the raw model output before parsing.** Don't inspect only the parsed object; log the exact string passed into `parser.parse()`.
- **Check whether the output is fenced markdown.** Look for markdown code fences, explanatory prose before/after the JSON, and empty responses. If present, strip them before parsing or fix the prompt.
- **Verify which class is failing.** Common failure points are `PydanticOutputParser`, `StructuredOutputParser`, `JSONQueryEngine`, and agent/tool execution paths like `FunctionAgent`. The stack trace tells you whether parsing failed at generation time or during post-processing.
- **Test the same prompt outside LlamaIndex.** Call your model directly. If the direct output is also invalid JSON, the problem is prompting/model behavior. If the direct output is valid but LlamaIndex fails, inspect your wrappers and parsers.
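The first two steps can be automated with a thin wrapper (a stdlib-only sketch; `parse_fn` stands in for whatever parser you actually use):

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("llm-debug")

def debug_parse(raw_output: str, parse_fn=json.loads):
    """Log the exact string a parser sees, then parse it."""
    logger.debug("raw output (%d chars): %r", len(raw_output), raw_output)
    if raw_output.lstrip().startswith("```"):
        logger.warning("output is fenced markdown; strip fences before parsing")
    return parse_fn(raw_output)

data = debug_parse('{"priority": "high"}')
print(data)  # {'priority': 'high'}
```

When a parse fails, the log line shows you exactly what the parser saw, which is usually enough to spot fences, preambles, or empty strings.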
## Prevention
- **Use schema-backed output whenever possible:** `PydanticOutputParser`, structured response models, or tool/function calling instead of free-form JSON prompts.
- **Keep prompts explicit:** say "return only valid JSON" and include the format instructions generated by the parser itself.
- **Add a validation step before parsing:** log raw output, run `json.loads()` in tests against sample completions, and reject or repair fenced code blocks early.
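The last point can live in your test suite. A minimal sketch (the sample strings are illustrative stand-ins for completions captured from real runs):

```python
import json

# Representative completions saved from real runs (illustrative samples)
SAMPLE_COMPLETIONS = [
    '{"priority": "high", "category": "auth"}',
    '{"priority": "low", "category": "billing"}',
]

def test_completions_are_valid_json():
    for sample in SAMPLE_COMPLETIONS:
        assert not sample.lstrip().startswith("```"), "fenced output leaked through"
        parsed = json.loads(sample)  # raises JSONDecodeError on malformed output
        assert isinstance(parsed, dict)

test_completions_are_valid_json()
print("all sample completions parsed")
```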
By Cyprian Aarons, AI Consultant at Topiax.