# How to Fix 'JSON parsing error when scaling' in LlamaIndex (Python)
When LlamaIndex throws a JSON parsing error when scaling, it usually means the framework tried to parse structured output from an LLM, tool, or node payload and got back malformed JSON. In practice, this shows up during response synthesis, structured extraction, agent tool calls, or when you scale from a single test prompt to batch processing or larger documents.
The key point: this is rarely a “JSON is broken” problem. It’s usually a prompt, model-output, or schema mismatch problem that only becomes visible once the workload gets bigger.
## The Most Common Cause
The #1 cause is asking an LLM to return JSON but not constraining the output tightly enough. LlamaIndex then tries to parse something that looks like JSON but contains markdown fences, extra commentary, trailing commas, or truncated content.
Typical failure pattern:
- `ValueError: Could not parse output as JSON`
- `json.decoder.JSONDecodeError: Expecting value`
- `ResponseValidationError` when using structured outputs
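The first two errors are easy to reproduce without any LLM in the loop: `json.loads` fails the moment a payload carries markdown fences or stops mid-object.

```python
import json

# A fenced payload: json.loads rejects it at the first backtick
fenced = '```json\n{"name": "Alice", "balance": 1200}\n```'
try:
    json.loads(fenced)
except json.JSONDecodeError as e:
    print(e.msg)  # Expecting value

# A truncated payload: the closing quote and brace never arrived
truncated = '{"name": "Ali'
try:
    json.loads(truncated)
except json.JSONDecodeError as e:
    print(e.msg)  # Unterminated string starting at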
Broken vs fixed pattern

**Broken:**

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")
Settings.llm = llm

prompt = """
Return JSON for this customer:
Name: Alice
Balance: 1200
"""

resp = llm.complete(prompt)
data = resp.text  # later parsed with json.loads(...)
```

**Fixed:**

```python
import json

from pydantic import BaseModel

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI


class Customer(BaseModel):
    name: str
    balance: int


llm = OpenAI(model="gpt-4o-mini", temperature=0)
Settings.llm = llm

prompt = """Return ONLY valid JSON matching this schema:
{"name": string, "balance": number}

Customer:
Name: Alice
Balance: 1200
"""

resp = llm.complete(prompt)
data = json.loads(resp.text)
customer = Customer.model_validate(data)
```
If you’re using LlamaIndex structured prediction APIs, prefer schema-first output instead of free-form text. For example:
```python
from pydantic import BaseModel

from llama_index.core.program import LLMTextCompletionProgram


class Customer(BaseModel):
    name: str
    balance: int


program = LLMTextCompletionProgram.from_defaults(
    output_cls=Customer,
    prompt_template_str="Extract customer data from: {text}",
)
result = program(text="Name: Alice. Balance: 1200.")
```
That removes most parsing drift because the model is guided toward a typed object instead of raw text.
## Other Possible Causes
### 1) The model is returning markdown fences or extra prose
This is common when the prompt says “return JSON” but doesn’t forbid explanation.
```python
# Problematic output:
# ```json
# {"name": "Alice", "balance": 1200}
# ```
```
Fix by forcing strict output:
```python
prompt = """
Return ONLY raw JSON.
No markdown.
No explanation.
No code fences.
"""
```
If you still see fenced output, strip it before parsing:
```python
text = resp.text.strip()
text = text.removeprefix("```json").removesuffix("```").strip()
```
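If fences appear in several variants (` ```json `, bare ` ``` `, extra whitespace), a slightly more defensive helper is safer than two `removeprefix` calls. `strip_json_fences` below is a hypothetical name, not a LlamaIndex utility:

```python
import re


def strip_json_fences(text: str) -> str:
    """Remove a leading ``` or ```json fence and a trailing ``` fence, if present."""
    text = text.strip()
    # Drop an opening fence such as ``` or ```json, plus any trailing whitespace
    text = re.sub(r"^```[a-zA-Z]*\s*", "", text)
    # Drop a closing fence at the very end
    text = re.sub(r"\s*```$", "", text)
    return text.strip()


print(strip_json_fences('```json\n{"name": "Alice"}\n```'))  # {"name": "Alice"}
print(strip_json_fences('{"name": "Alice"}'))                # unchanged
```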
### 2) Temperature is too high for structured extraction
Higher temperature increases formatting variance. That's fine for creative writing but bad for parsers.
```python
llm = OpenAI(model="gpt-4o-mini", temperature=0)
```
If you’re doing extraction, classification, or routing, keep temperature at 0 or very close to it.
### 3) Context window truncation during scaling
When you scale from one document to many chunks, the model may truncate its response or lose part of the schema. That often produces errors like:
- `json.decoder.JSONDecodeError: Unterminated string starting at`
- `Expecting ',' delimiter`
- Partial object output missing closing braces
Mitigation:
```python
from llama_index.core import PromptTemplate

template = PromptTemplate(
    "Extract fields from the text below.\n"
    "Return compact JSON only.\n\n"
    "{text}"
)
```
Also reduce chunk size and avoid stuffing too much source text into one call.
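LlamaIndex's node parsers handle chunk sizing for you, but the idea is easy to sketch in plain Python. The sizes below are illustrative, not recommendations:

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping windows so no single call overflows the model."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks


doc = "x" * 5000
pieces = chunk_text(doc)
print(len(pieces))  # 3: each piece is small enough to extract from in one call
```

Smaller calls mean each response stays well inside the output budget, so the model is far less likely to emit a half-finished object.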
### 4) Tool calling / function schema mismatch
If you’re using agents and tools, the tool signature must match what the model emits. A mismatch between expected arguments and actual keys can surface as parsing failures.
```python
def create_claim(claim_id: str, amount: float):
    return {"claim_id": claim_id, "amount": amount}
```
Bad tool descriptions often confuse the model into emitting wrong keys:
```python
from llama_index.core.tools import FunctionTool

# Bad: vague description leads to wrong args in tool call payloads
FunctionTool.from_defaults(fn=create_claim, description="Create a claim somehow")
```
Use explicit parameter names and descriptions so the tool-call payload matches exactly.
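You can also catch key mismatches early by checking a tool-call payload against the function's own signature before invoking it. `validate_tool_args` is a hypothetical helper for illustration, not part of LlamaIndex:

```python
import inspect


def create_claim(claim_id: str, amount: float):
    return {"claim_id": claim_id, "amount": amount}


def validate_tool_args(fn, payload: dict) -> list[str]:
    """Return the payload keys the function does not accept."""
    accepted = set(inspect.signature(fn).parameters)
    return sorted(set(payload) - accepted)


# A model that emitted "id" instead of "claim_id" produces a visible mismatch
print(validate_tool_args(create_claim, {"id": "C-1", "amount": 50.0}))        # ['id']
print(validate_tool_args(create_claim, {"claim_id": "C-1", "amount": 50.0}))  # []
```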
## How to Debug It
- **Print the raw model output before parsing.** Don't inspect only the final exception. Log `resp.text` or the agent/tool payload exactly as returned:

  ```python
  print(repr(resp.text))
  ```

- **Validate against a schema outside LlamaIndex.** Use `json.loads()` first, then validate with Pydantic. This tells you whether the issue is malformed JSON or a schema mismatch:

  ```python
  import json

  data = json.loads(resp.text)
  ```

- **Check whether the failure happens only on larger inputs.** If small prompts work and large batches fail, suspect truncation. Reduce chunk size and inspect one failing chunk directly.

- **Turn off variability.** Set `temperature=0`, remove chat history, and disable retries temporarily so you can see the first bad payload instead of a masked failure.
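Putting the first two steps together, a small wrapper can surface the exact bad payload at the moment parsing fails. `parse_or_dump` is an illustrative name, not a LlamaIndex API:

```python
import json


def parse_or_dump(raw: str):
    """Parse JSON, printing the exact raw payload if parsing fails."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # repr() exposes hidden characters: fences, newlines, BOMs, truncation
        print("Unparseable payload:", repr(raw))
        raise


parse_or_dump('{"name": "Alice"}')   # returns the parsed dict
# parse_or_dump('```json\n{}\n```')  # would print the payload, then re-raise
```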
## Prevention
- **Use schema-first generation** for anything that must be parsed later. Prefer structured prediction programs (e.g. `LLMTextCompletionProgram`) or explicit response schemas over free-form completion text.
- **Keep extraction prompts strict.** Say "ONLY valid JSON" and forbid markdown fences, explanations, and trailing text.
- **Log raw outputs in staging.** Store failed completions so you can inspect the exact malformed payloads when scaling breaks.
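The last point is cheap to implement: keep a dead-letter store of payloads that failed to parse, so a scaling failure leaves evidence behind. A minimal in-memory sketch (a real system would write to disk or a queue, and the names here are illustrative):

```python
import json

failed_payloads: list[str] = []  # dead-letter store; swap for a file or queue


def parse_with_capture(raw: str):
    """Parse JSON; on failure, keep the raw payload for later inspection."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        failed_payloads.append(raw)
        return None


parse_with_capture('{"ok": true}')
parse_with_capture('```json\n{"ok": true}\n```')
print(len(failed_payloads))  # 1: the fenced payload was captured, not lost
```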
If this error appears only after moving from a single request to batch jobs or multi-document ingestion, assume it’s an output-shape problem first. In LlamaIndex workflows, that’s usually where JSON parsing failures start.
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.