How to Fix 'output parsing error during development' in LlamaIndex (Python)
When you see ValueError: output parsing error during development, LlamaIndex is telling you the LLM returned text that did not match the structure the framework expected. This usually shows up when you use structured outputs, query engines with response schemas, or agents/tools that expect JSON-like data and the model returns extra prose instead.
In practice, this error is almost always a contract mismatch: your prompt, parser, or response model expects one shape, and the model produced another.
The Most Common Cause
The #1 cause is asking LlamaIndex to parse structured output while the prompt still leaves room for free-form text. The model then returns something like "Sure, here's the answer..." instead of valid JSON or a schema-compliant object.
Here’s the broken pattern versus the fixed pattern:
| Broken | Fixed |
|---|---|
| Prompt says “answer naturally” but parser expects JSON | Prompt explicitly requires strict JSON matching the schema |
| Uses PydanticOutputParser or structured response mode without constraints | Uses a tight system prompt plus schema-aligned instructions |
```python
# BROKEN
from pydantic import BaseModel
from llama_index.llms.openai import OpenAI
from llama_index.core.output_parsers import PydanticOutputParser

class ClaimDecision(BaseModel):
    approved: bool
    reason: str

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=ClaimDecision)

# The prompt invites free-form prose, but the parser expects strict JSON
prompt = """
Review this insurance claim and explain your decision naturally.
Claim: customer reports water damage after a storm.
"""

response = llm.complete(prompt)
parsed = parser.parse(response.text)  # ValueError: output parsing error during development
```
```python
# FIXED
from pydantic import BaseModel, Field
from llama_index.llms.openai import OpenAI
from llama_index.core.output_parsers import PydanticOutputParser

class ClaimDecision(BaseModel):
    approved: bool = Field(description="True if claim should be approved")
    reason: str = Field(description="Short explanation")

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=ClaimDecision)

# format_string is a property that renders the schema instructions
prompt = f"""
You must return ONLY valid JSON matching this schema:
{parser.format_string}
No markdown. No extra text.
Claim: customer reports water damage after a storm.
"""

response = llm.complete(prompt)
parsed = parser.parse(response.text)
```
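Even with strict instructions, some models still wrap their JSON in markdown fences or prepend a sentence of prose. A defensive pre-parse cleanup can rescue those cases; the sketch below is stdlib-only, and `extract_json` is a hypothetical helper, not a LlamaIndex API:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Strip markdown fences and surrounding prose, then parse JSON."""
    # Pull the body out of a ```json ... ``` fence if one is present
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    candidate = match.group(1) if match else text
    # Fall back to the first {...} span if prose surrounds the object
    start, end = candidate.find("{"), candidate.rfind("}")
    if start != -1 and end != -1:
        candidate = candidate[start : end + 1]
    return json.loads(candidate)

raw = 'Sure, here is the decision:\n```json\n{"approved": true, "reason": "storm damage"}\n```'
print(extract_json(raw))  # {'approved': True, 'reason': 'storm damage'}
```

Treat this as a safety net, not a substitute for tight prompting: if you need the cleanup often, the prompt is the real problem.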
If you are using QueryEngine or ResponseSynthesizer, the same issue applies. The retriever may be fine; the failure happens at the final parsing step when LlamaIndex tries to coerce the answer into your expected format.
Other Possible Causes
1) Tool output does not match the tool schema
If you are using an agent with tools, make sure the tool returns what the agent expects. A common failure is returning a plain string when downstream logic expects structured data.
```python
# BAD: tool returns ambiguous text
def get_policy_status(policy_id: str):
    return f"Policy {policy_id} is active"

# BETTER: return explicit structured content
def get_policy_status(policy_id: str):
    return {
        "policy_id": policy_id,
        "status": "active",
    }
```
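To catch shape drift at the boundary, you can wrap tool calls in a small guard that fails fast when the return value stops matching what downstream logic expects. A minimal sketch; `call_tool_checked` and `EXPECTED_KEYS` are illustrative names, not LlamaIndex features:

```python
def get_policy_status(policy_id: str) -> dict:
    return {"policy_id": policy_id, "status": "active"}

EXPECTED_KEYS = {"policy_id", "status"}

def call_tool_checked(fn, *args):
    """Run a tool and raise immediately if its return shape drifts."""
    result = fn(*args)
    if not isinstance(result, dict) or not EXPECTED_KEYS <= result.keys():
        raise TypeError(f"tool {fn.__name__} returned unexpected shape: {result!r}")
    return result

print(call_tool_checked(get_policy_status, "P-1001"))
# {'policy_id': 'P-1001', 'status': 'active'}
```

Failing at the tool boundary gives you a clear TypeError naming the tool, instead of a parsing error several layers later inside the agent loop.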
2) Streaming responses are parsed too early
If you parse before the full completion arrives, you can end up feeding incomplete text to the parser.
```python
# BAD
stream = llm.stream_complete("Return JSON for claim decision")
partial_text = next(stream).delta  # incomplete chunk
parser.parse(partial_text)

# BETTER: consume the whole stream before parsing
stream = llm.stream_complete("Return JSON for claim decision")
full_text = "".join(chunk.delta for chunk in stream)
parser.parse(full_text)
```
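You can exercise the accumulation pattern without calling an LLM by faking the stream. `FakeChunk` below stands in for the chunk objects a streaming completion yields; it is an assumption for illustration, not the real response type:

```python
import json
from dataclasses import dataclass

@dataclass
class FakeChunk:
    delta: str

def fake_stream():
    # Simulates a streaming completion yielding partial deltas
    for piece in ['{"approved"', ': true, ', '"reason": "ok"}']:
        yield FakeChunk(piece)

full_text = "".join(chunk.delta for chunk in fake_stream())
print(json.loads(full_text))  # {'approved': True, 'reason': 'ok'}
```

Any single delta here is invalid JSON on its own; only the joined text parses, which is exactly why premature parsing fails.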
3) The model is not following formatting instructions
Some models are weaker at strict formatting. If your prompt is brittle, switch models or tighten instructions.
```python
llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0,
)
```
Low temperature helps, but it does not guarantee valid structure. If parsing matters, prefer models with strong instruction following and keep prompts explicit.
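When valid structure matters more than latency, a common pattern is to retry generation until the output parses. A generic stdlib sketch; `parse_with_retry` is not a LlamaIndex API, and you would pass it any callable that returns model text:

```python
import json

def parse_with_retry(generate, max_attempts=3):
    """Call generate() until its output parses as JSON, up to max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        text = generate()
        try:
            return json.loads(text)
        except json.JSONDecodeError as e:
            last_error = e  # remember the failure and try again
    raise ValueError(f"no valid JSON after {max_attempts} attempts: {last_error}")

# Demo with a flaky generator: fails once, then returns valid JSON
attempts = iter(["Sure! Here you go...", '{"approved": false, "reason": "no coverage"}'])
print(parse_with_retry(lambda: next(attempts)))
# {'approved': False, 'reason': 'no coverage'}
```

Cap the attempts and surface the last error; an unbounded retry loop hides the underlying prompt problem and burns tokens.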
4) Your schema changed but cached code still uses the old parser
This happens when you update a BaseModel but keep an old PydanticOutputParser, old prompt template, or stale app state.
```python
from pydantic import BaseModel

class OldSchema(BaseModel):
    answer: str

class NewSchema(BaseModel):
    answer: str
    confidence: float  # new required field

# If the prompt still asks only for "answer", parsing against NewSchema fails.
```
When a field becomes required, every prompt and parser that touches the schema needs to change with it. Otherwise validation fails on the missing key and surfaces as a parsing error.
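The mismatch is easy to reproduce: output that satisfied the old schema is simply missing the new required key. A stdlib sketch of the check Pydantic performs; `validate_keys` and `NEW_REQUIRED` are illustrative, not library code:

```python
import json

NEW_REQUIRED = {"answer", "confidence"}

def validate_keys(payload: str) -> dict:
    """Parse JSON and reject it if any required field is absent."""
    data = json.loads(payload)
    missing = NEW_REQUIRED - data.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return data

old_output = '{"answer": "approved"}'                      # matches OldSchema only
new_output = '{"answer": "approved", "confidence": 0.9}'   # matches NewSchema

try:
    validate_keys(old_output)
except ValueError as e:
    print(e)  # missing required fields: ['confidence']
print(validate_keys(new_output))
```

If you see this failure only in a long-running process, suspect stale state: the code defining the prompt was updated, but the process still holds the old parser.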
How to Debug It
- Inspect raw model output first. Print response.text before parsing, and look for markdown fences, leading prose, trailing commentary, or malformed JSON.
- Check which parser is failing. Search for PydanticOutputParser, StructuredOutputParser, or custom parse hooks. The stack trace usually points to parse() or a response synthesis step inside LlamaIndex.
- Reduce temperature and remove ambiguity. Set temperature=0, remove phrases like “explain briefly” if you need strict machine-readable output, and force “ONLY valid JSON” in the prompt.
- Validate against the schema outside LlamaIndex. Take response.text and run it through your Pydantic model directly. If Pydantic rejects it, LlamaIndex will reject it too.
```python
print("RAW OUTPUT:")
print(response.text)

try:
    parsed = ClaimDecision.model_validate_json(response.text)
except Exception as e:
    print("SCHEMA ERROR:", e)
```
Prevention
- Use explicit structured-output prompts whenever you expect JSON or Pydantic objects.
- Keep temperature=0 for any workflow that depends on deterministic parsing.
- Add a test that asserts raw LLM output validates against your schema before shipping.
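That pre-ship assertion can be a few lines in your test suite. A sketch; `validate_claim_output` is a hypothetical helper, and in CI you would feed it captured LLM output rather than a hardcoded string:

```python
import json

def validate_claim_output(raw_text: str) -> None:
    """Assert the raw model output parses and has the expected shape."""
    data = json.loads(raw_text)
    assert isinstance(data.get("approved"), bool), "approved must be a bool"
    assert isinstance(data.get("reason"), str), "reason must be a string"

validate_claim_output('{"approved": true, "reason": "storm damage verified"}')
print("schema check passed")
```

Running this against a handful of recorded model outputs catches schema drift before users do.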
If this error appears during development, treat it as a contract bug, not an LLM mystery. The fix is usually in one of three places: stricter prompting, correct schema alignment, or stopping premature parsing of partial output.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.