How to Fix 'output parsing error' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-21

If you’re seeing ValueError: Error parsing output or OutputParserException in LlamaIndex, the model returned text that did not match the schema LlamaIndex expected. This usually happens when you use structured outputs, Pydantic models, query engines with response synthesis, or any parser that expects strict formatting.

The good news: this is usually not a “LlamaIndex is broken” problem. It’s almost always a mismatch between the prompt, the output parser, and the model response.

The Most Common Cause

The #1 cause is asking the LLM for structured output without constraining it tightly enough. In practice, this shows up when you use PydanticOutputParser, StructuredOutputParser, or a QueryEngine configured to return JSON-like data, and the model adds extra prose or markdown fences, or omits required fields.

Here’s the broken pattern:

```python
from pydantic import BaseModel
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.llms.openai import OpenAI

class Invoice(BaseModel):
    invoice_id: str
    total: float

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=Invoice)

# Asks for "JSON" but never pins down the schema or format
prompt = """
Extract invoice data from this text:
Invoice #A123 total $42.50
Return JSON.
"""

response = llm.complete(prompt)
invoice = parser.parse(response.text)
print(invoice)
```

And here is the fixed version:

```python
from pydantic import BaseModel
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.llms.openai import OpenAI

class Invoice(BaseModel):
    invoice_id: str
    total: float

llm = OpenAI(model="gpt-4o-mini")
parser = PydanticOutputParser(output_cls=Invoice)

# parser.format() appends the schema and format instructions to the query
prompt = parser.format(
    "Extract invoice data from this text.\n"
    "Text: Invoice #A123 total $42.50\n"
    "Return ONLY valid JSON matching the schema."
)

response = llm.complete(prompt)
invoice = parser.parse(response.text)
print(invoice)
```


The failure mode is usually one of these:

- The model wraps the JSON in markdown code fences (`` ```json `` blocks)
- The model adds explanation text before or after JSON
- A required field is missing
- A field has the wrong type, like `"total": "42.50 USD"` instead of a float
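The first two failure modes leave text that a JSON parser cannot load even though valid JSON is buried inside. A small pre-parse step makes them visible and often recoverable; `strip_fences` here is a hypothetical helper written for this article, not a LlamaIndex API:

```python
import json
import re

def strip_fences(raw: str) -> str:
    """Remove a wrapping markdown code fence, if present.

    Hypothetical helper -- shown only to make the failure mode concrete.
    """
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", raw, re.DOTALL)
    return match.group(1) if match else raw.strip()

# A typical fenced model reply that json.loads() would reject as-is
raw = '```json\n{"invoice_id": "A123", "total": 42.5}\n```'
clean = strip_fences(raw)
print(json.loads(clean))
```

Logging what the helper stripped is also a cheap way to see how often your prompt is being violated.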

Typical error messages look like:

```text
ValueError: Error parsing output: Expecting value: line 1 column 1 (char 0)
```

or:

```text
llama_index.core.output_parsers.base.OutputParserException: Failed to parse output into Invoice
```

If you’re using an agent, you may also see tool-related parsing failures such as:

```text
ValueError: Could not parse LLM output into a valid tool call
```

Other Possible Causes

1. Your prompt allows extra natural language

Even if the schema is correct, the model may prepend commentary.

```python
# Bad: no format constraints at all
prompt = "Give me JSON for this customer."

# Better: state the task, then pin down the format
prompt = """
Give me JSON for this customer.
Return ONLY valid JSON.
No markdown. No explanation. No code fences.
"""
```

When using parsers, be explicit about output constraints. Models will happily violate weak instructions.

2. You are parsing a response stream too early

If you call .stream_complete() and try to parse partial chunks, you’ll get malformed output.

```python
# Bad: parsing a single partial chunk
stream = llm.stream_complete(prompt)
partial_text = next(stream).text
parser.parse(partial_text)

# Better: assemble the full response first
full_text = "".join(chunk.text for chunk in llm.stream_complete(prompt))
parser.parse(full_text)
```

Parse only after the full response is assembled.
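You can reproduce this failure without an LLM at all. The generator below stands in for `stream_complete()`; the chunk shape loosely mirrors LlamaIndex's (an object with a `.text` attribute), but the names are illustrative:

```python
import json
from types import SimpleNamespace

def fake_stream():
    """Stand-in for llm.stream_complete(): yields partial chunks."""
    for piece in ['{"invoice_id": ', '"A123", ', '"total": 42.5}']:
        yield SimpleNamespace(text=piece)

# A single chunk is not complete JSON
try:
    json.loads(next(fake_stream()).text)
    partial_ok = True
except json.JSONDecodeError:
    partial_ok = False

# Joining all chunks first succeeds
full_text = "".join(chunk.text for chunk in fake_stream())
parsed = json.loads(full_text)
print(partial_ok, parsed)
```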

3. The model is not strong enough for strict formatting

Smaller models often drift from exact schemas under load or ambiguous prompts.

```python
# Risky for strict parsing
llm = OpenAI(model="gpt-4o-mini")

# More reliable for structured extraction
llm = OpenAI(model="gpt-4.1")
```

If your task needs exact JSON, use a stronger model or a native structured-output path if your provider supports it.

4. Your schema does not match the actual data

This happens when your BaseModel expects fields that don’t exist in source text.

```python
class Customer(BaseModel):
    name: str
    email: str   # but source text has no email

# Parser fails because a required field is missing.
```

Fix it by making optional fields truly optional:

```python
from typing import Optional

class Customer(BaseModel):
    name: str
    email: Optional[str] = None
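A quick way to confirm this diagnosis is to validate a sample payload directly with Pydantic (v2 API assumed here) and read the error, which names the missing field:

```python
from pydantic import BaseModel, ValidationError

class Customer(BaseModel):
    name: str
    email: str  # required, but absent from the payload below

try:
    Customer.model_validate_json('{"name": "Ada"}')
    error_message = None
except ValidationError as exc:
    error_message = str(exc)

print(error_message)  # the message names the exact missing field
```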

5. Tool calling and output parsing are mixed up

Some agents expect tool calls; others expect plain text. If you wrap an agent response in a parser meant for raw JSON, it fails.

```python
# Bad: parsing agent narration as JSON
agent_response = agent.chat("Find invoice A123")
parser.parse(agent_response.response)

# Better: use the right agent/response mode for structured extraction,
# or extract from tool outputs directly.
```

In LlamaIndex, check whether your component returns:

- plain text completions,
- tool call objects,
- or structured response objects.
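The distinction is easy to see with a stand-in object. LlamaIndex agent responses typically keep narration in `.response` and tool results in `.sources` (each with a `.raw_output`); treat those attribute names as an assumption to verify against your version:

```python
import json
from types import SimpleNamespace

# Illustrative stand-in for an agent response: narration is prose,
# but the tool output carries the structured payload.
agent_response = SimpleNamespace(
    response="I found invoice A123; the total is $42.50.",
    sources=[SimpleNamespace(raw_output='{"invoice_id": "A123", "total": 42.5}')],
)

# Parsing the narration fails...
try:
    json.loads(agent_response.response)
    narration_is_json = True
except json.JSONDecodeError:
    narration_is_json = False

# ...but the tool output parses cleanly
invoice = json.loads(agent_response.sources[0].raw_output)
print(narration_is_json, invoice)
```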

How to Debug It

1. Print the raw model output.
   - Before parsing anything, log `response.text`.
   - Look for markdown fences, extra prose, truncated content, or invalid JSON.
2. Compare the output against the expected schema.
   - Check every required field in your Pydantic model.
   - Verify types exactly: `float` vs `"float"`, `int` vs `"12 items"`, `List[str]` vs a comma-separated string.
3. Disable streaming and retry with deterministic settings.
   - Set a low temperature: `llm = OpenAI(model="gpt-4o-mini", temperature=0)`.
   - Remove streaming while debugging so you see full responses.
4. Test the parser independently.
   - Hardcode a known-good payload:

     ```python
     good_json = '{"invoice_id":"A123","total":42.5}'
     print(parser.parse(good_json))
     ```

   - If this passes, your parser is fine and the issue is upstream in prompting or generation.

Prevention

- Use explicit format instructions every time you expect structured output.
- Prefer optional fields where source data may be missing.
- Keep temperature low for extraction tasks and avoid streaming until the workflow is stable.
- Log the raw LLM output before parsing in production so failures are diagnosable fast.

If you want this to stop happening in real systems, treat parsing as an interface contract. The prompt generates data; the parser enforces shape; your code should assume both will fail unless you test them separately.
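One way to encode that contract is a retry loop that feeds the parse error back to the model. The sketch below uses fake stand-ins for the LLM and parser so the control flow is self-contained and testable; with real objects, `llm.complete()` and `parser.parse()` slot straight in (you may also need to catch LlamaIndex's own parser exception in addition to `ValueError`):

```python
import json
from types import SimpleNamespace

def parse_with_retry(llm, parser, prompt, max_attempts=3):
    """Re-ask the model when its output fails to parse."""
    last_error = None
    for _ in range(max_attempts):
        response = llm.complete(prompt)
        try:
            return parser.parse(response.text)
        except ValueError as exc:  # JSONDecodeError subclasses ValueError
            last_error = exc
            # Feed the failure back so the model can self-correct
            prompt += f"\n\nYour last output failed to parse ({exc}). Return ONLY valid JSON."
    raise last_error

class FakeLLM:
    """Stand-in client: first reply is fenced (bad), second is clean JSON."""
    def __init__(self):
        self.replies = ['```json\n{"total": 42.5}\n```', '{"total": 42.5}']
    def complete(self, prompt):
        return SimpleNamespace(text=self.replies.pop(0))

class FakeParser:
    def parse(self, text):
        return json.loads(text)

result = parse_with_retry(FakeLLM(), FakeParser(), "Extract the invoice total.")
print(result)
```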


By Cyprian Aarons, AI Consultant at Topiax.