How to Fix 'JSON parsing error in production' in LlamaIndex (TypeScript)
If you’re seeing a “JSON parsing error” in production with LlamaIndex TypeScript, it usually means the model returned text that was supposed to be strict JSON but wasn’t. In practice, this shows up when you use structured extraction, function calling, or any parser that expects machine-readable output and the LLM drifts into prose, markdown, or malformed JSON.
The error often appears only in production because prompts change slightly, context gets longer, temperature is higher than your local test, or the model you deployed behaves differently under load.
The Most Common Cause
The #1 cause is simple: you asked LlamaIndex to parse JSON, but your prompt did not force strict JSON output strongly enough.
In TypeScript, this usually happens with structured output helpers like StructuredOutputParser, PydanticProgram-style flows, or custom response parsing around OpenAI / LLM calls. The model returns something like:
```
Here is the result:
{
  "status": "approved",
  "amount": 1200
}
```
That extra text breaks parsing and triggers errors such as:
- `SyntaxError: Unexpected token H in JSON at position 0`
- `Error: Failed to parse response as JSON`
- `JSON parsing error in production`
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Loose prompt, no strict format instruction | Explicit “return only valid JSON” instruction |
| Free-form completion settings | Lower temperature and structured output enforcement |
| Parsing raw assistant text directly | Validate and sanitize before parsing |
```typescript
// BROKEN
import { OpenAI } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  temperature: 0.7,
});

const prompt = `
Extract invoice data:
customer: Acme Corp
amount: 1200
status: approved
`;

const response = await Settings.llm.complete(prompt);

// This will fail if the model adds markdown or an explanation.
const data = JSON.parse(response.text);
console.log(data);
```
```typescript
// FIXED
import { OpenAI } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const prompt = `
Extract invoice data and return ONLY valid JSON.

Rules:
- No markdown
- No explanation
- No code fences
- Output must be a single JSON object

Schema:
{
  "customer": string,
  "amount": number,
  "status": string
}

Input:
customer: Acme Corp
amount: 1200
status: approved
`;

const response = await Settings.llm.complete(prompt);

// Still validate before trusting it.
const data = JSON.parse(response.text.trim());
console.log(data);
```
If you’re using LlamaIndex’s structured output helpers, prefer them over hand-rolled JSON.parse() on raw completions. Raw parsing is where most production failures start.
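When you do have to parse a raw completion, it helps to strip common wrappers before calling `JSON.parse`. Here is a minimal sketch of such a sanitizer; `extractJson` is a hypothetical helper written for this article, not part of LlamaIndex:

```typescript
// Hypothetical helper: strip code fences and surrounding prose,
// then parse the first {...} span found in the model output.
function extractJson(raw: string): any {
  // Remove markdown code fences like ```json ... ```
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : raw;

  // Fall back to the outermost {...} span if prose surrounds the object.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end === -1) {
    throw new Error("No JSON object found in model output");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

This is a fallback, not a substitute for a strict prompt: it recovers from fences and leading prose, but not from genuinely malformed JSON.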
Other Possible Causes
1) You are using a model that does not reliably follow JSON formatting
Some smaller models or non-tool-aware endpoints are inconsistent with strict formatting.
```typescript
import { OpenAI } from "@llamaindex/openai";

// Swap in a stronger model here if formatting keeps breaking.
const llm = new OpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
```
If the issue persists on one provider but not another, test with a stronger model first. This is especially common when migrating from local dev models to hosted inference.
2) Your context window is too full
When the prompt gets long, models often degrade formatting discipline first. That means your extraction task starts returning commentary instead of clean JSON.
```typescript
const chunks = await splitter.splitText(longDocument);
// Too much context stuffed into one request can break structure.
```
Fix by chunking more aggressively and extracting per chunk instead of asking for a giant object in one shot.
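A sketch of the per-chunk pattern, assuming an `extract` callback that wraps your LLM call; the callback signature and the shallow-merge strategy are illustrative, not a LlamaIndex API:

```typescript
type PartialResult = Record<string, unknown>;

// Run extraction once per chunk, then merge the partial objects.
async function extractPerChunk(
  chunks: string[],
  extract: (chunk: string) => Promise<PartialResult>
): Promise<PartialResult> {
  const parts = await Promise.all(chunks.map((c) => extract(c)));
  // Later chunks overwrite earlier keys; adjust the merge to your domain.
  return parts.reduce((acc, part) => ({ ...acc, ...part }), {});
}
```

Each request stays short, so the model keeps its formatting discipline, and a single malformed chunk fails in isolation instead of poisoning the whole extraction.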
3) You are mixing markdown instructions with raw JSON output
This is a classic failure mode:
```typescript
const prompt = `
Respond in markdown.
Then include this object:
{ "name": "...", "risk": "..." }
`;
```
That instruction conflicts with strict parsing. If you need JSON, do not ask for markdown at the same time.
4) Your parser expects a different shape than the model returns
Sometimes the output is valid JSON, just not the shape your code expects.
```typescript
// Model returns:
// { "result": { "status": "approved" } }

// But your code expects:
type Approval = {
  status: string;
};
```
This often surfaces as downstream runtime errors after a successful parse. Check your schema and your access path carefully.
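One defensive fix is to unwrap and check the shape explicitly before handing the object to business logic. A minimal sketch, using the `result` wrapper key from the example above:

```typescript
type Approval = { status: string };

// The model nests the payload under "result"; unwrap it explicitly
// instead of assuming the top-level shape.
function toApproval(parsed: unknown): Approval {
  const status = (parsed as { result?: { status?: unknown } })?.result?.status;
  if (typeof status !== "string") {
    throw new Error("Unexpected response shape: missing result.status");
  }
  return { status };
}
```

Failing loudly at the boundary is far easier to debug than an `undefined` surfacing three layers deeper.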
How to Debug It
1. Log the raw LLM output before parsing.
   - Don’t inspect only the parsed object.
   - Print the exact string returned by `response.text`.
   - Look for code fences, prefixes like “Sure”, trailing commas, or nested markdown.
2. Reproduce with temperature set to zero.
   - Use deterministic settings first.
   - If the error disappears at `temperature: 0`, you’re dealing with formatting drift rather than bad logic.
3. Compare production prompts to local prompts.
   - Log the final rendered prompt in prod.
   - Look for hidden template variables, extra system messages, or truncated context.
   - A missing “return only JSON” line is enough to break parsing.
4. Validate against a schema before business logic.
   - Parse first, validate second, process third.
```typescript
import { z } from "zod";

const InvoiceSchema = z.object({
  customer: z.string(),
  amount: z.number(),
  status: z.string(),
});

const parsed = JSON.parse(response.text);
const validated = InvoiceSchema.safeParse(parsed);

if (!validated.success) {
  console.error("Invalid schema:", validated.error.flatten());
}
```
If raw parsing succeeds but validation fails, your issue is schema mismatch. If raw parsing fails immediately, your issue is malformed output from the model.
Prevention
- Use structured output helpers instead of manual string parsing wherever possible.
- Set `temperature: 0` for extraction tasks that must return valid JSON.
- Keep prompts short and explicit:
  - no markdown
  - no prose
  - no extra keys unless required by schema
A good rule in production LlamaIndex TypeScript apps: treat every LLM response as untrusted input until it passes both syntax and schema validation. That one habit eliminates most "JSON parsing error" incidents before they reach users.
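That habit can be packaged as a small wrapper: call the model, parse, validate, and retry on failure. A sketch, where `callLlm` and `validate` are placeholders for your own completion call and schema check (for example, a Zod `parse`):

```typescript
// Re-ask the model when syntax or schema validation fails.
async function completeJson<T>(
  callLlm: () => Promise<string>,
  validate: (raw: unknown) => T,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Both JSON.parse and validate may throw; either triggers a retry.
      return validate(JSON.parse((await callLlm()).trim()));
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Keep `maxAttempts` low and log every failed attempt: retries hide the symptom, and the logs are what tell you which prompt to fix.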
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.