How to Build a Document Extraction Agent Using LangGraph in TypeScript for Payments

By Cyprian Aarons · Updated 2026-04-21
Tags: document-extraction, langgraph, typescript, payments

A document extraction agent for payments reads invoices, bank statements, remittance advice, and payment instructions, then turns them into structured fields your downstream systems can trust. It matters because payments workflows fail hard when extraction is wrong: duplicate payouts, misapplied remittances, compliance gaps, and manual review bottlenecks all cost money.

Architecture

  • Document intake node

    • Accepts PDFs, images, or text from S3, blob storage, email ingestion, or API uploads.
    • Normalizes the file into a canonical document payload with metadata like tenant, jurisdiction, and source.
  • OCR / text extraction node

    • Converts scans into text when the source is image-based.
    • Keeps page-level references so extracted fields can be traced back to evidence.
  • LLM extraction node

    • Uses a structured output schema to extract payment-relevant fields:
      • invoice number
      • supplier name
      • amount
      • currency
      • due date
      • bank account / IBAN
      • remittance reference
    • Produces machine-readable JSON only.
  • Validation and policy node

    • Checks schema validity, field confidence, duplicate invoice detection, amount sanity checks, and jurisdiction rules.
    • Flags risky records for human review instead of auto-posting them.
  • Human review / exception node

    • Routes low-confidence or policy-violating documents to an ops queue.
    • Stores reviewer decisions for audit and model improvement.
  • Persistence / audit node

    • Writes extraction results, evidence pointers, and decision logs to an immutable store.
    • Supports traceability for SOX-style controls, payment audits, and dispute handling.
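The intake node's canonical payload can be sketched as a plain TypeScript type. The field names below are illustrative assumptions, not a fixed contract; the point is that page-level text stays separate so extracted fields can be traced back to evidence.

```typescript
// Illustrative canonical payload produced by the intake node.
interface CanonicalDocument {
  documentId: string;
  tenantId: string;
  jurisdiction: string;
  source: string; // e.g. an object-store URI, "email", or "api"
  mimeType: string;
  receivedAt: string; // ISO-8601 timestamp
  pages: { pageNumber: number; text: string }[]; // page-level evidence
}

function toCanonicalDocument(
  documentId: string,
  tenantId: string,
  jurisdiction: string,
  source: string,
  mimeType: string,
  pageTexts: string[]
): CanonicalDocument {
  return {
    documentId,
    tenantId,
    jurisdiction,
    source,
    mimeType,
    receivedAt: new Date().toISOString(),
    // Preserve page boundaries instead of concatenating all text upfront.
    pages: pageTexts.map((text, i) => ({ pageNumber: i + 1, text })),
  };
}
```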

Implementation

1) Define the state and extraction schema

Start with a typed state object and a strict schema. For payments, don’t let the model invent fields; keep it constrained to what your AP or treasury system actually needs.

import { z } from "zod";
import { Annotation } from "@langchain/langgraph";

const PaymentExtractionSchema = z.object({
  documentType: z.enum(["invoice", "bank_statement", "remittance", "payment_instruction"]),
  supplierName: z.string().optional(),
  invoiceNumber: z.string().optional(),
  amount: z.number().optional(),
  currency: z.string().length(3).optional(),
  dueDate: z.string().optional(),
  iban: z.string().optional(),
  accountNumber: z.string().optional(),
  routingNumber: z.string().optional(),
  remittanceReference: z.string().optional(),
  confidence: z.number().min(0).max(1),
});

export type PaymentExtraction = z.infer<typeof PaymentExtractionSchema>;

export const GraphState = Annotation.Root({
  documentId: Annotation<string>(),
  tenantId: Annotation<string>(),
  jurisdiction: Annotation<string>(),
  rawText: Annotation<string>(),
  extraction: Annotation<PaymentExtraction | null>(),
  needsReview: Annotation<boolean>(),
});

2) Build LangGraph nodes with real StateGraph APIs

Use StateGraph to wire document parsing, extraction, validation, and routing. The pattern below is production-friendly because each step is isolated and testable.

import { StateGraph, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

async function extractFields(state: typeof GraphState.State) {
  const prompt = `
Extract payment fields from this document text.
Return only JSON matching the schema.
Document text:
${state.rawText}
`;

  const response = await llm.invoke(prompt);

  try {
    const parsed = JSON.parse(response.content as string);
    const extraction = PaymentExtractionSchema.parse(parsed);
    return {
      extraction,
      needsReview: extraction.confidence < 0.85,
    };
  } catch {
    // Malformed or non-JSON model output: never guess on payment data;
    // send the document to human review instead.
    return { extraction: null, needsReview: true };
  }
}

function validatePaymentData(state: typeof GraphState.State) {
  const e = state.extraction;
  if (!e) return { needsReview: true };

  const invalidAmount = typeof e.amount === "number" && e.amount <= 0;
  const missingCritical =
    e.documentType === "invoice" && (!e.invoiceNumber || !e.supplierName || !e.amount);

  return {
    needsReview:
      state.needsReview ||
      invalidAmount ||
      missingCritical ||
      (state.jurisdiction === "EU" && !e.currency),
  };
}

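The duplicate invoice detection promised in the architecture is not shown in `validatePaymentData` above. A minimal sketch, assuming a normalized supplier/invoice/amount key and an in-memory store standing in for an AP database lookup:

```typescript
// Normalize the fields that identify an invoice so trivial formatting
// differences (case, whitespace, "1250" vs "1250.00") don't hide duplicates.
function invoiceKey(supplierName: string, invoiceNumber: string, amount: number): string {
  return [
    supplierName.trim().toLowerCase(),
    invoiceNumber.trim().toUpperCase(),
    amount.toFixed(2),
  ].join("|");
}

// Illustrative in-memory detector; production systems would query the
// AP database or a durable dedupe store instead.
class DuplicateInvoiceDetector {
  private seen = new Set<string>();

  isDuplicate(supplier: string, invoice: string, amount: number): boolean {
    const key = invoiceKey(supplier, invoice, amount);
    if (this.seen.has(key)) return true;
    this.seen.add(key);
    return false;
  }
}
```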

Complete the graph with conditional routing:

function routeAfterValidation(state: typeof GraphState.State) {
  return state.needsReview ? "humanReview" : "persist";
}

async function humanReviewNode(state: typeof GraphState.State) {
  // Enqueue the document for an ops reviewer and record the routing
  // decision; the actual review happens asynchronously outside the graph.
  return {
    needsReview: true,
  };
}

async function persistNode(state: typeof GraphState.State) {
  // Write extraction results, evidence pointers, and the decision log
  // to your immutable store / audit log / downstream queue here.
  return {};
}

const graph = new StateGraph(GraphState)
  .addNode("extractFields", extractFields)
  .addNode("validatePaymentData", validatePaymentData)
  .addNode("humanReview", humanReviewNode)
  .addNode("persist", persistNode)
  .addEdge(START, "extractFields")
  .addEdge("extractFields", "validatePaymentData")
  .addConditionalEdges("validatePaymentData", routeAfterValidation, {
    humanReview: "humanReview",
    persist: "persist",
  })
  .addEdge("humanReview", END)
  .addEdge("persist", END);

export const paymentExtractionApp = graph.compile();

Step-by-step execution

Run the compiled graph with tenant-aware metadata. In payments systems this matters because residency rules and audit trails are usually tenant-scoped.

const result = await paymentExtractionApp.invoke(
  {
    documentId: "doc_123",
    tenantId: "tenant_eu_01",
    jurisdiction: "EU",
    rawText: "Invoice #A-1029 Supplier ACME Ltd Amount EUR 1250 Due date ...",
    extraction: null,
    needsReview: false,
  },
  {
    metadata: {
      source: "s3://payments-inbox/tenant_eu_01/doc_123.pdf",
      correlationId: "payreq-8891",
    },
  }
);

console.log(result.extraction);
console.log(result.needsReview);

Error handling and observability

If you need deeper observability or retries around OCR/extraction failures, wrap nodes with explicit error handling and store evidence spans. Keep the graph deterministic where possible; use the LLM only for the parts that need semantic understanding.
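A generic retry wrapper for flaky OCR or LLM calls can be sketched as below; the attempt count and backoff values are illustrative defaults, not recommendations:

```typescript
// Retry an async operation with exponential backoff. Rethrows the last
// error once attempts are exhausted so the caller can route to review.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}
```

Wrapping only the nondeterministic calls (OCR, `llm.invoke`) keeps the rest of the graph deterministic and easy to test.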

Production Considerations

  • Data residency

Keep EU documents in EU-hosted storage and run inference in-region. If you process cardholder-adjacent or banking data across borders without controls, you will create legal and contractual problems fast.

  • Auditability

Persist raw input hashes, extracted output hashes, model version, prompt version, reviewer identity, and timestamp. Payments teams need to answer “who approved this payout” months later.

  • Guardrails

Enforce strict schema validation with Zod before anything hits ERP/AP systems. Add amount thresholds and beneficiary verification rules so the agent cannot auto-release high-risk payments.

  • Monitoring

Track extraction accuracy by document type, manual review rate, false positive routing rate, and field-level confidence drift. A spike in review volume usually means OCR quality dropped or a vendor changed invoice format.
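Two of the guardrails above can be sketched concretely: beneficiary verification via the IBAN mod-97 checksum (ISO 13616) and an auto-release amount threshold. The threshold value is an illustrative assumption.

```typescript
// Mod-97 check from ISO 13616: move the first four characters to the end,
// map letters A-Z to 10-35, and require the resulting number mod 97 == 1.
function isValidIban(iban: string): boolean {
  const s = iban.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(s)) return false;
  const digits = (s.slice(4) + s.slice(0, 4)).replace(/[A-Z]/g, (c) =>
    String(c.charCodeAt(0) - 55)
  );
  // Streaming modulo avoids overflowing Number with the long digit string.
  let remainder = 0;
  for (const ch of digits) {
    remainder = (remainder * 10 + Number(ch)) % 97;
  }
  return remainder === 1;
}

// Illustrative policy: block auto-release above a threshold or when the
// beneficiary account fails the checksum.
function canAutoRelease(amount: number, iban: string, thresholdEur = 10_000): boolean {
  return amount > 0 && amount < thresholdEur && isValidIban(iban);
}
```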

Common Pitfalls

  • Letting the model free-form parse payment data

Avoid plain-text outputs without schema enforcement. Use Zod validation on every extraction result so malformed payloads never reach downstream systems.

  • Skipping evidence mapping

If you don’t store page/line references or source text spans, reviewers cannot verify why a field was extracted. That kills auditability and slows exception handling.

  • Treating all documents the same

Invoices, remittance advice, bank statements, and payment instructions have different risk profiles. Route them through different validation rules instead of one generic extractor.
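The evidence-mapping pitfall can be avoided with even a simple span locator. Exact-string matching is an illustrative simplification of real OCR bounding-box evidence:

```typescript
interface EvidenceSpan {
  pageNumber: number;
  start: number; // character offset within the page text
  end: number;
}

// Locate an extracted value in page-level text so a reviewer can verify
// where the field came from. Returns null when no evidence is found,
// which is itself a signal to route the record to review.
function findEvidence(pages: string[], value: string): EvidenceSpan | null {
  for (let i = 0; i < pages.length; i++) {
    const start = pages[i].indexOf(value);
    if (start !== -1) {
      return { pageNumber: i + 1, start, end: start + value.length };
    }
  }
  return null;
}
```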


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
