LlamaIndex Tutorial (TypeScript): parsing structured output for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to take free-form LLM text and turn it into typed, validated JSON using LlamaIndex TypeScript. You need this when your app can’t afford loose strings from the model and must reliably extract fields like dates, amounts, entities, or classifications.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or a build step
  • llamaindex installed
  • An OpenAI API key set in OPENAI_API_KEY
  • Basic familiarity with async/await and TypeScript types

Install the package:

npm install llamaindex

Step-by-Step

  1. Start by defining the shape of the output you want. In production, this is the contract between your prompt and your downstream code, so keep it strict and minimal.
type InvoiceExtraction = {
  vendor: string;
  invoiceNumber: string;
  totalAmount: number;
  dueDate: string;
};
  2. Create a structured extraction prompt and an output parser. LlamaIndex will use the schema to constrain the model response, which is much safer than parsing raw prose with regex.
import { OpenAI } from "llamaindex";
import { StructuredOutputParser } from "llamaindex/output_parsers";
import { z } from "zod";

// Define the schema once and infer the TypeScript type from it,
// so the runtime validation and the static type can never drift apart.
const invoiceSchema = z.object({
  vendor: z.string(),
  invoiceNumber: z.string(),
  totalAmount: z.number(),
  dueDate: z.string(),
});

type InvoiceExtraction = z.infer<typeof invoiceSchema>;

const parser = StructuredOutputParser.fromZodSchema(invoiceSchema);
  3. Build a prompt that includes format instructions from the parser. This is the part most people skip; without explicit formatting instructions, you’ll get inconsistent output under load.
const llm = new OpenAI({ model: "gpt-4o-mini" });

const text = `
Invoice from Acme Supplies.
Invoice #INV-1042.
Total due is $1834.50.
Payment due by 2026-02-15.
`;

const prompt = `
Extract invoice data from the text below.

Text:
${text}

${parser.getFormatInstructions()}
`;
  4. Call the model and parse the response into typed data. If the model returns malformed JSON, fail fast instead of silently accepting bad data.
const response = await llm.complete({
  prompt,
});

const parsed = parser.parse(response.text);

console.log(parsed.vendor);
console.log(parsed.invoiceNumber);
console.log(parsed.totalAmount);
console.log(parsed.dueDate);
  5. Wrap it in a reusable function so your app can call it anywhere. This is the version you want in a service layer, not scattered across route handlers.
import { OpenAI } from "llamaindex";
import { StructuredOutputParser } from "llamaindex/output_parsers";
import { z } from "zod";

const schema = z.object({
  vendor: z.string(),
  invoiceNumber: z.string(),
  totalAmount: z.number(),
  dueDate: z.string(),
});

type InvoiceExtraction = z.infer<typeof schema>;

const llm = new OpenAI({ model: "gpt-4o-mini" });
const parser = StructuredOutputParser.fromZodSchema(schema);

export async function extractInvoice(text: string): Promise<InvoiceExtraction> {
  const prompt = `
Extract invoice data from the text below.

Text:
${text}

${parser.getFormatInstructions()}
`;

  const response = await llm.complete({ prompt });
  return parser.parse(response.text);
}
  6. Add basic error handling around parsing failures. In real systems, you want to log the raw model output and either retry with tighter instructions or route to manual review.
async function run() {
  try {
    const result = await extractInvoice(
      "Invoice from Acme Supplies. Invoice #INV-1042. Total due is $1834.50. Payment due by 2026-02-15."
    );

    console.log("Parsed invoice:", result);
  } catch (error) {
    console.error("Structured parsing failed:");
    console.error(error);
  }
}

run();

Testing It

Run the script against a few different invoice texts and check that every field comes back with the right type and value. Then test edge cases like missing totals, alternate date formats, or extra noise in the input.
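You can exercise those edge cases without burning API calls by replaying fixture completions through a validator. The sketch below uses a hypothetical `isValidInvoice` helper (plain `JSON.parse` plus type checks, standing in for the tutorial's parser) so good and bad model outputs can be checked offline:

```typescript
// Hypothetical stand-in for the tutorial's parser: JSON.parse plus
// field-by-field type checks, so fixtures can be tested without an API call.
function isValidInvoice(raw: string): boolean {
  try {
    const data = JSON.parse(raw);
    return (
      typeof data.vendor === "string" &&
      typeof data.invoiceNumber === "string" &&
      typeof data.totalAmount === "number" &&
      typeof data.dueDate === "string"
    );
  } catch {
    return false; // malformed JSON fails fast, same as parser.parse throwing
  }
}

// Fixtures that mimic good and bad model completions.
const good = `{"vendor":"Acme Supplies","invoiceNumber":"INV-1042","totalAmount":1834.5,"dueDate":"2026-02-15"}`;
const missingTotal = `{"vendor":"Acme Supplies","invoiceNumber":"INV-1042","dueDate":"2026-02-15"}`;
const prose = `The total due is $1834.50.`;

console.log(isValidInvoice(good));         // true
console.log(isValidInvoice(missingTotal)); // false: totalAmount is absent
console.log(isValidInvoice(prose));        // false: not JSON at all
```

Keeping a small corpus of fixtures like this lets you tighten the schema later and immediately see which previously accepted outputs now fail.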

If parsing fails, inspect the raw completion text before changing your schema. Most issues come from weak prompts or schemas that are too permissive.

For bank or insurance workflows, add assertions on required fields before persisting anything downstream. Typed parsing is only useful if bad outputs are rejected early.
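One way to sketch that guard, assuming the `InvoiceExtraction` shape from the tutorial: a function that throws on any business-rule violation, so a partial or nonsensical record never reaches the database.

```typescript
type InvoiceExtraction = {
  vendor: string;
  invoiceNumber: string;
  totalAmount: number;
  dueDate: string;
};

// Pre-persistence guard: throws instead of letting a bad record through.
// The specific rules here are illustrative; tailor them to your domain.
function assertPersistable(invoice: InvoiceExtraction): void {
  if (invoice.vendor.trim() === "") {
    throw new Error("vendor must be non-empty");
  }
  if (!Number.isFinite(invoice.totalAmount) || invoice.totalAmount <= 0) {
    throw new Error(`totalAmount must be positive, got ${invoice.totalAmount}`);
  }
  if (!/^\d{4}-\d{2}-\d{2}$/.test(invoice.dueDate)) {
    throw new Error(`dueDate must be ISO (YYYY-MM-DD), got ${invoice.dueDate}`);
  }
}
```

Call `assertPersistable(parsed)` between `parser.parse` and your database write; the thrown error carries the offending value, which is exactly what you want in the audit log.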

Next Steps

  • Add enum fields for controlled classifications like riskLevel, documentType, or claimStatus
  • Use a retry loop that reprompts with the validation error when parsing fails
  • Store both raw text and parsed JSON for auditability and debugging

By Cyprian Aarons, AI Consultant at Topiax.