# How to Fix 'output parsing error' in LlamaIndex (Python)
If you’re seeing ValueError: Error parsing output or OutputParserException in LlamaIndex, the model returned text that did not match the schema LlamaIndex expected. This usually happens when you use structured outputs, Pydantic models, query engines with response synthesis, or any parser that expects strict formatting.
The good news: this is usually not a “LlamaIndex is broken” problem. It’s almost always a mismatch between the prompt, the output parser, and the model response.
## The Most Common Cause
The #1 cause is asking the LLM for structured output without constraining it tightly enough. In practice, this shows up when you use PydanticOutputParser, StructuredOutputParser, or a QueryEngine configured to return JSON-like data, and the model adds extra prose or markdown fences, or omits required fields.
Here’s the broken pattern:

```python
from pydantic import BaseModel
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.llms.openai import OpenAI

class Invoice(BaseModel):
    invoice_id: str
    total: float

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=Invoice)

# The prompt never tells the model what the JSON must look like
prompt = """
Extract invoice data from this text:
Invoice #A123 total $42.50
Return JSON.
"""

response = llm.complete(prompt)
invoice = parser.parse(response.text)  # fails when the model adds prose or fences
print(invoice)
```

And the fixed version, which injects the parser's own format instructions into the prompt:

```python
from pydantic import BaseModel
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.llms.openai import OpenAI

class Invoice(BaseModel):
    invoice_id: str
    total: float

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=Invoice)

# parser.format() appends the schema's format instructions to the query
prompt = parser.format(
    "Extract invoice data from this text.\n"
    "Text: Invoice #A123 total $42.50\n"
    "Return ONLY valid JSON matching the schema."
)

response = llm.complete(prompt)
invoice = parser.parse(response.text)
print(invoice)
```
The failure mode is usually one of these:
- The model returns markdown fences like ```json
- The model adds explanation text before or after JSON
- A required field is missing
- A field has the wrong type, like `"total": "42.50 USD"` instead of a float
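The first two failure modes (fences and surrounding prose) can often be caught with a thin pre-cleaning step before the parser ever runs. This is a sketch of my own, not a LlamaIndex API: it slices from the first `{` to the last `}`, assuming the payload is a single JSON object.

```python
import json

def extract_json(raw: str) -> dict:
    """Best-effort cleanup before handing text to a strict parser.

    Slices from the first '{' to the last '}', which drops markdown
    fences and commentary around the payload. A pragmatic sketch,
    not a full JSON repair tool.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError(f"No JSON object found in output: {raw!r}")
    return json.loads(raw[start : end + 1])

# Handles the classic failure: fences plus commentary around the payload
messy = 'Sure! Here is the data:\n```json\n{"invoice_id": "A123", "total": 42.5}\n```'
print(extract_json(messy))  # prints {'invoice_id': 'A123', 'total': 42.5}
```

This does not fix missing fields or wrong types, but it turns "model was chatty" from a hard failure into a non-event.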
Typical error messages look like:

```text
ValueError: Error parsing output: Expecting value: line 1 column 1 (char 0)
```

or:

```text
llama_index.core.output_parsers.base.OutputParserException: Failed to parse output into Invoice
```

If you’re using an agent, you may also see tool-related parsing failures such as:

```text
ValueError: Could not parse LLM output into a valid tool call
```
## Other Possible Causes
### 1. Your prompt allows extra natural language

Even if the schema is correct, the model may prepend commentary.

```python
# Bad
prompt = "Give me JSON for this customer."

# Better
prompt = """
Return ONLY valid JSON.
No markdown.
No explanation.
No code fences.
"""
```
When using parsers, be explicit about output constraints. Models will happily violate weak instructions.
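One pragmatic backstop when instructions alone aren't enough is to retry with the parse error fed back to the model. Here is a minimal, library-agnostic sketch (`parse_with_retry` is a hypothetical helper of my own, not a LlamaIndex API); with LlamaIndex you would pass something like `lambda p: llm.complete(p).text` as `call_llm` and `parser.parse` as `parse`.

```python
import json

def parse_with_retry(call_llm, parse, prompt, max_retries=2):
    """Re-ask the model with the parse error appended to the prompt.

    call_llm and parse are plain callables so this works with any
    client; a sketch, not production-grade error handling.
    """
    last_err = None
    for _ in range(max_retries + 1):
        text = call_llm(prompt)
        try:
            return parse(text)
        except Exception as err:  # OutputParserException, ValueError, ...
            last_err = err
            prompt = (
                f"{prompt}\n\nYour previous reply failed to parse ({err}). "
                "Return ONLY valid JSON. No prose, no fences."
            )
    raise last_err

# Demo with a fake model that fails once, then complies
replies = iter(["Sure! Here you go.", '{"invoice_id": "A123", "total": 42.5}'])
print(parse_with_retry(lambda p: next(replies), json.loads, "Extract the invoice."))
```

Cap the retries: if the model still can't comply after two corrections, the prompt or model choice is the real problem.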
### 2. You are parsing a response stream too early

If you call `.stream_complete()` and try to parse partial chunks, you’ll get malformed output.

```python
# Bad: the first chunk is almost never complete JSON
stream = llm.stream_complete(prompt)
partial_text = next(stream).text
parser.parse(partial_text)

# Better: drain the stream first, then parse the final accumulated text
full_text = ""
for chunk in llm.stream_complete(prompt):
    full_text = chunk.text  # .text accumulates; .delta holds only the new piece
parser.parse(full_text)
```
Parse only after the full response is assembled.
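The partial-chunk failure is easy to reproduce without calling a real model. The sketch below fakes a stream whose chunks carry incremental text in `.delta`, mirroring how LlamaIndex streaming responses expose per-chunk deltas (check your version's actual attributes):

```python
import json
from types import SimpleNamespace

# Fake stream standing in for llm.stream_complete(); each chunk carries
# its incremental text in .delta
def fake_stream():
    for piece in ['{"invoice_id"', ': "A123", ', '"total": 42.5}']:
        yield SimpleNamespace(delta=piece)

# Parsing the first chunk alone fails: it is not valid JSON yet
try:
    json.loads(next(fake_stream()).delta)
except ValueError:
    print("partial chunk is not parseable")

# Assemble the full text, then parse
full = "".join(chunk.delta for chunk in fake_stream())
print(json.loads(full))  # prints {'invoice_id': 'A123', 'total': 42.5}
```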
### 3. The model is not strong enough for strict formatting

Smaller models often drift from exact schemas under load or ambiguous prompts.

```python
# Risky for strict parsing
llm = OpenAI(model="gpt-4o-mini")

# More reliable for structured extraction
llm = OpenAI(model="gpt-4.1")
```
If your task needs exact JSON, use a stronger model or a native structured-output path if your provider supports it.
### 4. Your schema does not match the actual data

This happens when your `BaseModel` expects fields that don’t exist in the source text.

```python
class Customer(BaseModel):
    name: str
    email: str  # but source text has no email
# Parser fails because a required field is missing.
```

Fix it by making optional fields truly optional:

```python
from typing import Optional

class Customer(BaseModel):
    name: str
    email: Optional[str] = None
```
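You can verify the schema behavior directly in Pydantic, with no LLM involved (assuming Pydantic is installed, as it is alongside LlamaIndex):

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class Customer(BaseModel):
    name: str
    email: Optional[str] = None  # optional: source text may lack an email

# Missing optional field: validates fine
print(Customer(name="Ada"))

# Missing required field: the same error your output parser would surface
try:
    Customer(email="ada@example.com")
except ValidationError as err:
    print("missing required field:", len(err.errors()), "error(s)")
```

Testing the model class in isolation like this tells you immediately whether a parse failure is a schema problem or a generation problem.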
### 5. Tool calling and output parsing are mixed up

Some agents expect tool calls; others expect plain text. If you wrap an agent response in a parser meant for raw JSON, it fails.

```python
# Bad: parsing agent narration as JSON
agent_response = agent.chat("Find invoice A123")
parser.parse(agent_response.response)

# Better: use the right agent/response mode for structured extraction,
# or extract from tool outputs directly.
```
In LlamaIndex, check whether your component returns:

- plain text completions,
- tool call objects,
- or structured response objects.
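A small defensive accessor helps when you aren't sure which shape you got. The helper below is my own sketch; the attribute names it tries (`.response`, `.text`) match common LlamaIndex response objects, but verify your component's actual return type.

```python
from types import SimpleNamespace  # stand-in response objects for the demo

def response_text(resp) -> str:
    """Pull raw text out of whatever object a component returned.

    Tries common attribute names, falling back to str(). Inspect the
    result before assuming it is parseable JSON.
    """
    for attr in ("response", "text"):
        value = getattr(resp, attr, None)
        if isinstance(value, str):
            return value
    return str(resp)

print(response_text(SimpleNamespace(response="Invoice A123 found.")))
print(response_text(SimpleNamespace(text='{"invoice_id": "A123"}')))
```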
## How to Debug It

1. **Print the raw model output.** Before parsing anything, log `response.text`. Look for markdown fences, extra prose, truncated content, or invalid JSON.
2. **Compare output against the expected schema.** Check every required field in your Pydantic model, and verify types exactly:
   - `float` vs `"float"`
   - `int` vs `"12 items"`
   - `List[str]` vs a comma-separated string
3. **Disable streaming and retry with deterministic settings.** Set a low temperature (`llm = OpenAI(model="gpt-4o-mini", temperature=0)`) and remove streaming while debugging so you see full responses.
4. **Test the parser independently.** Hardcode a known-good payload:

   ```python
   good_json = '{"invoice_id": "A123", "total": 42.5}'
   print(parser.parse(good_json))
   ```

   If this passes, your parser is fine and the issue is upstream in prompting or generation.
## Prevention

- Use explicit format instructions every time you expect structured output.
- Prefer optional fields where source data may be missing.
- Keep temperature low for extraction tasks and avoid streaming until the workflow is stable.
- Log raw LLM output before parsing in production so failures are diagnosable fast.
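The last point is worth a concrete shape: log the raw payload before parsing, so a production failure leaves evidence. A minimal sketch with plain `json.loads` standing in for your parser (the `parse_logged` wrapper is hypothetical):

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("extraction")

def parse_logged(raw: str) -> dict:
    """Log raw model output before parsing so failures are diagnosable."""
    log.info("raw model output: %r", raw)
    try:
        return json.loads(raw)
    except ValueError:
        log.error("parse failed; the raw output logged above is the evidence")
        raise
```

Swap `json.loads` for `parser.parse` in a real pipeline; the point is that the raw text hits the log before any exception fires.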
If you want this to stop happening in real systems, treat parsing as an interface contract. The prompt generates data; the parser enforces shape; your code should assume both will fail unless you test them separately.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.