How to Build a Document Extraction Agent Using LangGraph in Python for Payments

By Cyprian Aarons · Updated 2026-04-21

Tags: document-extraction, langgraph, python, payments

A document extraction agent for payments takes invoices, remittance advice, bank statements, and payment instructions, then turns them into structured fields your downstream systems can trust. That matters because payment ops is mostly exception handling: if you misread an amount, beneficiary name, invoice number, or due date, you create reconciliation breaks, failed settlements, and audit pain.

Architecture

  • Document ingestion layer

    • Accepts PDFs, images, email attachments, or object storage references.
    • Normalizes file metadata like source system, tenant, jurisdiction, and document type.
  • OCR / text extraction node

    • Converts scanned documents into text and layout-aware blocks.
    • For payments, preserve line items and reference fields; flat text alone is usually not enough.
  • Extraction node

    • Uses an LLM to map raw text into a strict schema such as invoice_number, amount, currency, beneficiary_name, iban, due_date.
    • This is where you enforce field-level validation.
  • Validation and policy node

    • Checks totals, currency formats, country-specific bank identifiers, duplicate references, and missing mandatory fields.
    • Adds compliance rules like PII redaction and retention tags.
  • Human review / escalation node

    • Routes low-confidence or policy-failing documents to an ops queue.
    • Keeps the agent from auto-posting bad payment instructions.
  • Persistence and audit layer

    • Stores extracted results, confidence scores, prompts used, model version, and decision trace.
    • You need this for SOX-style controls, dispute handling, and regulator questions.
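The ingestion layer above can be sketched as a small envelope type that travels with every document. This is a minimal illustration, not a fixed contract; the field names (source_system, tenant, jurisdiction, doc_type) are assumptions for the example:

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class DocumentEnvelope:
    """Normalized metadata attached to every inbound document."""
    raw_bytes: bytes
    source_system: str   # e.g. "email-gateway", "sftp-drop"
    tenant: str
    jurisdiction: str    # drives data-residency routing later
    doc_type: str        # "invoice", "remittance_advice", ...

    @property
    def content_hash(self) -> str:
        # Stable identifier for audit replay and duplicate detection.
        return hashlib.sha256(self.raw_bytes).hexdigest()

env = DocumentEnvelope(b"%PDF-1.7 ...", "email-gateway", "acme", "EU", "invoice")
print(env.content_hash[:12])
```

Hashing the raw bytes at ingestion gives you a stable key before any OCR or model touches the document, which the audit layer can reuse later.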

Implementation

1) Define the state and schema

Use a typed state so every node in the graph passes the same contract. For payments, keep both the raw text and the structured output so you can audit what was extracted.

from typing import TypedDict, Optional
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage
from pydantic import BaseModel, Field

class PaymentExtraction(BaseModel):
    invoice_number: Optional[str] = None
    amount: Optional[float] = None
    currency: Optional[str] = None
    beneficiary_name: Optional[str] = None
    iban: Optional[str] = None
    due_date: Optional[str] = None
    confidence: float = Field(ge=0.0, le=1.0)

class AgentState(TypedDict):
    document_text: str
    extraction: dict
    needs_review: bool
    reason: str

2) Build the extraction and validation nodes

This example uses LangChain-compatible chat models inside LangGraph nodes. The pattern is simple: extract first, validate second.

import json
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def extract_fields(state: AgentState) -> AgentState:
    prompt = f"""
You are extracting payment document fields.
Return valid JSON with keys:
invoice_number, amount, currency, beneficiary_name, iban, due_date, confidence

Document:
{state["document_text"]}
"""
    response = llm.invoke([HumanMessage(content=prompt)])
    raw = response.content.strip()
    # Models sometimes wrap JSON in markdown fences; strip them before parsing.
    if raw.startswith("```"):
        raw = raw.strip("`").removeprefix("json").strip()
    # Validate against the Pydantic schema so type errors surface here,
    # not downstream; on failure, hand an empty extraction to validation,
    # which will flag every mandatory field as missing.
    try:
        data = PaymentExtraction(**json.loads(raw)).model_dump()
    except Exception:
        data = {}
    return {
        **state,
        "extraction": data,
        "needs_review": False,
        "reason": ""
    }

def validate_payment_data(state: AgentState) -> AgentState:
    e = state["extraction"]
    issues = []

    if not e.get("invoice_number"):
        issues.append("missing_invoice_number")
    if not e.get("amount") or e["amount"] <= 0:
        issues.append("invalid_amount")
    if not e.get("currency"):
        issues.append("missing_currency")
    if not e.get("beneficiary_name"):
        issues.append("missing_beneficiary_name")

    if e.get("confidence", 0) < 0.85:
        issues.append("low_confidence")

    return {
        **state,
        "needs_review": len(issues) > 0,
        "reason": ",".join(issues)
    }

3) Add routing with add_conditional_edges

This is where LangGraph earns its keep. You branch on validation results instead of burying control flow in application code.

def route_after_validation(state: AgentState):
    return "review" if state["needs_review"] else "accept"

def human_review(state: AgentState) -> AgentState:
    # In production this would create a work item in your ops system.
    return {**state}

def accept_payment_record(state: AgentState) -> AgentState:
    # Persist to your database / queue here.
    return {**state}

graph = StateGraph(AgentState)
graph.add_node("extract", extract_fields)
graph.add_node("validate", validate_payment_data)
graph.add_node("review", human_review)
graph.add_node("accept", accept_payment_record)

graph.add_edge(START, "extract")
graph.add_edge("extract", "validate")
graph.add_conditional_edges(
    "validate",
    route_after_validation,
    {
        "review": "review",
        "accept": "accept",
    },
)
graph.add_edge("review", END)
graph.add_edge("accept", END)

app = graph.compile()

4) Run it against a payment document

Pass the normalized text (for example, OCR output from a PDF or a scanned-invoice service) into the graph. Keep the original file in object storage for audit replay.

sample_state = {
    "document_text": """
Invoice No: INV-10422
Amount Due: EUR 12,450.00
Beneficiary: Acme Supplies GmbH
IBAN: DE89370400440532013000
Due Date: 2026-05-01
""",
    "extraction": {},
    "needs_review": False,
    "reason": ""
}

result = app.invoke(sample_state)
print(result["extraction"])
print(result["needs_review"], result["reason"])

Production Considerations

  • Keep data residency explicit

    • Payments documents often contain bank details and personal data.
    • Route EU documents to EU-hosted infrastructure; don’t let OCR or LLM calls cross regions without a legal basis.
  • Log every decision

    • Store document hash, model name/version, prompt template version, extracted fields, validation result, and reviewer override.
    • This is non-negotiable for auditability and dispute resolution.
  • Add guardrails before posting downstream

    • Never auto-initiate a payment from raw extraction alone.
    • Require deterministic checks for IBAN format, currency whitelist, duplicate invoice detection, and threshold-based human approval.
  • Monitor extraction quality by document type

    • Track precision/recall for invoice number, amount, IBAN, and beneficiary name.
    • A model that is “good overall” can still be bad on credit notes or multi-page invoices.
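The deterministic IBAN format check mentioned above does not need an LLM; the ISO 13616 mod-97 checksum is a few lines of code:

```python
def iban_is_valid(iban: str) -> bool:
    """ISO 13616 mod-97 check: rearrange, map letters to numbers, test mod 97 == 1."""
    s = iban.replace(" ", "").upper()
    if not (15 <= len(s) <= 34) or not s.isalnum():
        return False
    rearranged = s[4:] + s[:4]  # move country code + check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # 'A'->10 ... 'Z'->35
    return int(digits) % 97 == 1

print(iban_is_valid("DE89370400440532013000"))  # True
```

This catches transcription and OCR errors in the check digits; it does not prove the account exists, so it complements rather than replaces beneficiary verification.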

Common Pitfalls

  • Treating OCR text as trusted input

    • OCR output is noisy. Always validate amounts against totals and verify bank identifiers with deterministic rules before using them in payment workflows.
  • Skipping an audit trail

    • If you only store the final JSON payload, you cannot explain why a payment was blocked or approved.
    • Persist raw text hash plus model output plus routing decisions.
  • Letting the LLM decide business policy

    • The model should extract fields; your code should decide whether a record passes compliance checks.
    • Keep sanctions screening flags, approval thresholds, and residency rules outside the prompt.
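The audit-trail pitfall above comes down to persisting one small, append-only record per decision. A stdlib sketch; the field names and version strings are illustrative:

```python
import hashlib
import json
from datetime import datetime, timezone

def build_audit_record(document_text: str, extraction: dict,
                       needs_review: bool, reason: str,
                       model_version: str, prompt_version: str) -> dict:
    """Everything needed to explain later why a record was routed."""
    return {
        "document_sha256": hashlib.sha256(document_text.encode()).hexdigest(),
        "model_version": model_version,
        "prompt_version": prompt_version,
        "extraction": extraction,
        "needs_review": needs_review,
        "reason": reason,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

record = build_audit_record("Invoice No: INV-10422 ...",
                            {"invoice_number": "INV-10422"},
                            False, "", "gpt-4o-mini", "extract-v3")
print(json.dumps(record, indent=2))
```

Storing the document hash rather than the raw text keeps the audit log small and avoids duplicating PII, while still letting you link back to the original file in object storage.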

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
