How to Build a Document Extraction Agent Using LangChain in TypeScript for Payments

By Cyprian Aarons · Updated 2026-04-21
Tags: document-extraction, langchain, typescript, payments

A document extraction agent for payments reads invoices, bank statements, remittance advice, or payment instructions and turns them into structured fields your system can trust. That matters because every bad extraction becomes a reconciliation error, a failed payout, or a compliance issue that ops has to clean up later.

Architecture

  • Document ingestion layer

    • Accept PDFs, images, and scanned statements from S3, Blob Storage, or an internal upload service.
    • Normalize files before extraction so OCR sees consistent inputs.
  • OCR / text extraction layer

    • Use a deterministic OCR step for scanned documents.
    • Keep the raw text alongside the original file for audit and dispute handling.
  • LangChain extraction chain

    • Use ChatOpenAI with structured output to map text into a strict payment schema.
    • Keep the schema small and explicit: invoice number, amount, currency, payer, payee, due date, reference IDs.
  • Validation and rules engine

    • Validate extracted fields against payment rules: currency format, IBAN/SWIFT patterns, amount thresholds, duplicate references.
    • Reject or flag low-confidence outputs before they hit downstream payment rails.
  • Persistence and audit trail

    • Store raw document hash, extracted JSON, model version, prompt version, timestamps, and human override history.
    • This is non-negotiable for compliance and post-incident review.
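The persistence layer above can be sketched as a small record builder. The ExtractionAuditRecord shape and buildAuditRecord helper are illustrative names for this article, not a fixed schema; adapt the fields to your own audit store.

```typescript
import crypto from "node:crypto";

// Illustrative audit record; field names are examples, not a standard.
interface ExtractionAuditRecord {
  documentHash: string; // SHA-256 of the raw text, for tamper-evident lookup
  modelVersion: string;
  promptVersion: string;
  extractedJson: string; // the structured output, serialized
  createdAt: string; // ISO-8601 timestamp
  humanOverrides: string[]; // append-only history of manual corrections
}

export function buildAuditRecord(
  rawText: string,
  extracted: unknown,
  modelVersion = "gpt-4o-mini",
  promptVersion = "v1",
): ExtractionAuditRecord {
  return {
    documentHash: crypto.createHash("sha256").update(rawText).digest("hex"),
    modelVersion,
    promptVersion,
    extractedJson: JSON.stringify(extracted),
    createdAt: new Date().toISOString(),
    humanOverrides: [],
  };
}
```

Storing the hash rather than (or alongside) the raw document lets you prove which bytes produced which extraction without widening your PII footprint.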

Implementation

1) Define the extraction schema

Keep the schema strict. Payments systems need predictable output, not free-form summaries.

import { z } from "zod";

export const PaymentDocumentSchema = z.object({
  documentType: z.enum(["invoice", "remittance_advice", "bank_statement", "payment_instruction"]),
  invoiceNumber: z.string().optional(),
  amount: z.number(),
  currency: z.string().length(3),
  payerName: z.string(),
  payeeName: z.string().optional(),
  dueDate: z.string().optional(), // ISO-8601 string
  reference: z.string().optional(),
  iban: z.string().optional(),
  swiftBic: z.string().optional(),
});

export type PaymentDocument = z.infer<typeof PaymentDocumentSchema>;

2) Build the LangChain extractor with structured output

Use ChatOpenAI plus withStructuredOutput. This is the cleanest pattern in TypeScript when you need typed JSON back from an LLM.

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";
import { PaymentDocumentSchema } from "./schema.js";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const extractor = model.withStructuredOutput(PaymentDocumentSchema);

export async function extractPaymentFields(rawText: string) {
  const prompt = `
Extract payment-related fields from the document text below.

Rules:
- Return only fields supported by the schema.
- If a field is missing, omit it.
- Amount must be numeric.
- Currency must be ISO-4217 uppercase.
- Dates must be ISO-8601 if present.

Document:
${rawText}
`;

  const result = await extractor.invoke([new HumanMessage(prompt)]);
  return result;
}

This pattern gives you typed output without writing brittle regex chains. It also makes validation explicit at the boundary instead of buried in downstream code.

3) Add validation and payment-specific guards

Do not let the model decide whether an extraction is acceptable. Validate it after generation and reject anything that fails policy checks.

import { PaymentDocumentSchema } from "./schema.js";

export function validatePaymentExtraction(input: unknown) {
  const parsed = PaymentDocumentSchema.safeParse(input);

  if (!parsed.success) {
    return {
      ok: false as const,
      reason: "schema_validation_failed",
      errors: parsed.error.flatten(),
    };
  }

  const doc = parsed.data;

  if (doc.amount <= 0) {
    return { ok: false as const, reason: "invalid_amount" };
  }

  if (!/^[A-Z]{3}$/.test(doc.currency)) {
    return { ok: false as const, reason: "invalid_currency" };
  }

  if (doc.iban && !/^[A-Z]{2}[0-9A-Z]{13,32}$/.test(doc.iban.replace(/\s+/g, ""))) {
    return { ok: false as const, reason: "invalid_iban_format" };
  }

  return { ok: true as const, data: doc };
}

For payments workflows, this step is where you enforce operational policy. For example:

  • block cross-border payments if required fields are missing
  • route suspicious amounts to manual review
  • compare extracted references against existing transaction IDs to detect duplicates
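The duplicate-reference check in that list can be sketched with a normalized in-memory guard. isDuplicateReference is a hypothetical helper; a real deployment would query the transactions store instead of a Set.

```typescript
// Minimal in-memory duplicate-reference guard; in production, replace the
// Set with a lookup against your existing transaction records.
const seenReferences = new Set<string>();

export function isDuplicateReference(reference: string): boolean {
  // Normalize so "INV-001" and "inv 001" collide.
  const normalized = reference.replace(/[\s-]+/g, "").toUpperCase();
  if (seenReferences.has(normalized)) {
    return true;
  }
  seenReferences.add(normalized);
  return false;
}
```

The normalization step matters more than the lookup: vendors format the same reference inconsistently, and a byte-exact comparison will miss most real duplicates.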

4) Wire it into an ingestion pipeline

In production you usually want OCR first, then extraction. The agent should operate on normalized text and emit a traceable record.

import crypto from "node:crypto";
import fs from "node:fs/promises";
import { extractPaymentFields } from "./extractor.js";
// Adjust this path to wherever validatePaymentExtraction lives in your project.
import { validatePaymentExtraction } from "./validate.js";

async function readTextFromFile(path: string): Promise<string> {
  // Replace this with OCR output for scanned PDFs/images.
  return fs.readFile(path, "utf8");
}

export async function processPaymentDocument(filePath: string) {
  const rawText = await readTextFromFile(filePath);
  const documentHash = crypto.createHash("sha256").update(rawText).digest("hex");

  const extracted = await extractPaymentFields(rawText);
  const validated = validatePaymentExtraction(extracted);

  return {
    documentHash,
    extracted,
    validated,
    model: "gpt-4o-mini",
    timestamp: new Date().toISOString(),
  };
}

That response object is what you persist to your audit store. Include prompt version and source file metadata so finance ops can reconstruct how a value was produced months later.

Production Considerations

  • Deploy close to your data boundary

    • If documents contain PII or banking details, keep processing inside approved regions.
    • For regulated environments, pin storage and inference to the same residency zone where possible.
  • Log for audit without leaking sensitive data

    • Store hashes, field-level diffs, confidence flags, and model versions.
    • Avoid logging full raw documents unless your retention policy explicitly allows it.
  • Add human-in-the-loop thresholds

    • Route low-confidence extractions or mismatched totals to operations review.
    • Typical triggers include missing currency codes, invalid IBANs, or invoice totals that do not match line-item sums.
  • Monitor drift and failure modes

    • Track extraction accuracy by document type and vendor template.
    • Watch for OCR regressions after scanner changes or vendor layout changes.
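The human-in-the-loop thresholds above can be expressed as a small deterministic router. routeExtraction, ReviewSignals, and the 0.01 tolerance are illustrative assumptions for this sketch, not fixed policy.

```typescript
// Hypothetical routing decision for human-in-the-loop review.
// Trigger names and thresholds are examples; tune them to your risk policy.
type Route = "auto_approve" | "manual_review";

interface ReviewSignals {
  hasCurrency: boolean;
  ibanValid: boolean;
  lineItemSum?: number; // sum of extracted line items, if available
  invoiceTotal?: number; // extracted invoice total, if available
}

export function routeExtraction(signals: ReviewSignals): Route {
  if (!signals.hasCurrency) return "manual_review";
  if (!signals.ibanValid) return "manual_review";

  // Route totals that do not reconcile with line items to ops review.
  if (
    signals.lineItemSum !== undefined &&
    signals.invoiceTotal !== undefined &&
    Math.abs(signals.lineItemSum - signals.invoiceTotal) > 0.01
  ) {
    return "manual_review";
  }

  return "auto_approve";
}
```

Keeping the router deterministic means the same document always takes the same path, which is what auditors and ops teams expect.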
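Tracking accuracy by document type, as suggested under drift monitoring, can start as simply as a counter per type. recordOutcome and accuracy are hypothetical helpers you would replace with your metrics system.

```typescript
// Sketch of per-document-type accuracy tracking; swap the Map for your
// metrics backend in production.
const stats = new Map<string, { total: number; correct: number }>();

export function recordOutcome(documentType: string, correct: boolean): void {
  const entry = stats.get(documentType) ?? { total: 0, correct: 0 };
  entry.total += 1;
  if (correct) entry.correct += 1;
  stats.set(documentType, entry);
}

export function accuracy(documentType: string): number | undefined {
  const entry = stats.get(documentType);
  return entry ? entry.correct / entry.total : undefined;
}
```

Even this crude counter will surface a regression after a scanner change or a vendor template update, because the per-type ratio drops sharply rather than averaging out across all documents.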

Common Pitfalls

  1. Using free-form text output instead of strict schemas

    • This creates parsing bugs and inconsistent downstream behavior.
    • Fix it by using withStructuredOutput plus Zod validation on every request.
  2. Skipping payment-specific validation

    • An LLM can extract a number that looks right but is wrong for settlement.
    • Fix it with deterministic checks for amount positivity, currency format, IBAN/SWIFT structure, and duplicate reference detection.
  3. Treating auditability as optional

    • In payments you need traceability for disputes, compliance reviews, and internal controls.
    • Fix it by storing raw document hashes, model versions, prompt versions, validation results, and human overrides in an immutable audit log.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
