AI Agents for Payments: How to Automate Document Extraction (Multi-Agent with LangGraph)
Payments teams still burn hours on document-heavy workflows: merchant onboarding packs, chargeback evidence, KYC/KYB files, bank statements, invoices, payout instructions, and dispute attachments. The problem is not just extraction; it’s routing the right fields to the right controls, with enough confidence to automate decisions without creating compliance noise. Multi-agent systems built with LangGraph fit here because they can split extraction, validation, policy checks, and exception handling into separate steps instead of forcing one model to do everything.
The Business Case
- Onboarding throughput: A manual merchant onboarding review often takes 30–90 minutes per file set across ops and compliance. A well-scoped extraction pipeline can cut that to 5–10 minutes, mostly for exception handling.
- Cost reduction: For a payments ops team processing 10,000 documents/month, reducing 20 minutes of human review per packet saves roughly 3,300 labor hours/year, which implies about 830 packets/month (around 12 documents per packet). At a loaded cost of $45–$70/hour, that’s $150k–$230k annually in direct labor alone.
- Error rate reduction: Manual keying errors in settlement instructions, tax IDs, or beneficiary details commonly sit around 1–3% in high-volume back offices. With field-level validation and cross-check agents, you can push that below 0.5% on structured documents.
- Faster time to revenue: If merchant underwriting or KYB approval drops from 2 days to same-day for standard cases, you shorten the time to go live for new merchants and reduce abandonment. In payments, that is real revenue, not just operational efficiency.
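
The cost-reduction figure above is simple arithmetic, and it is worth keeping as a small script so you can plug in your own volumes. This is a rough sketch, not a model of your cost base: the 20 minutes and the $45–$70/hour range come from the text, while the 830 packets/month is an assumption back-solved from the 3,300-hour figure.

```python
# Back-of-envelope labor savings for the cost-reduction bullet.
MINUTES_SAVED_PER_PACKET = 20    # from the text
PACKETS_PER_MONTH = 830          # assumption: back-solved, not stated in the article
LOADED_HOURLY_COST = (45, 70)    # USD/hour range from the text


def annual_hours_saved(packets_per_month: int, minutes_saved: int) -> float:
    """Hours of human review removed per year."""
    return packets_per_month * 12 * minutes_saved / 60


hours = annual_hours_saved(PACKETS_PER_MONTH, MINUTES_SAVED_PER_PACKET)
low, high = (hours * rate for rate in LOADED_HOURLY_COST)
print(f"{hours:,.0f} hours/year, ${low:,.0f} to ${high:,.0f} in direct labor")
```

With these inputs the result is about 3,320 hours and roughly $149k to $232k, in line with the rounded figures above. Change `PACKETS_PER_MONTH` to your real packet count before quoting a number internally.
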
Architecture
A production setup should be boring in the right ways: deterministic where it matters, flexible where documents vary.
- Ingestion layer
  - Accept PDFs, scans, images, email attachments, and portal uploads.
  - Use OCR via AWS Textract, Google Document AI, or Azure Form Recognizer (now Azure AI Document Intelligence) for low-quality scans.
  - Normalize output into a canonical JSON schema before any agent sees it.
- Agent orchestration layer
  - Use LangGraph to model the workflow as a state machine:
    - classify_document
    - extract_fields
    - validate_against_rules
    - resolve_exceptions
    - route_to_human_review
  - Use LangChain tools for retrieval and structured output parsing.
  - Keep each agent narrow: one for document classification, one for field extraction, one for policy validation.
- Knowledge and retrieval layer
  - Store product rules, onboarding playbooks, chargeback reason-code mappings, and jurisdiction-specific requirements in a vector store like pgvector.
  - Retrieve only the relevant policy snippets for the document type and country.
  - This matters when a UK merchant pack has different requirements than a US card-not-present merchant or an EU payout beneficiary file under GDPR constraints.
- Control and observability layer
  - Log every extracted field with source span references and confidence scores.
  - Add human approval thresholds by field type: e.g., tax ID and bank account number require higher confidence than company name.
  - Track precision/recall by document class, plus exception rates by region and payment rail.
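
The control layer's per-field thresholds can be sketched in a few lines. Treat this as an illustration only: the `ExtractedField` shape, the field names, and every threshold value here are hypothetical and would need tuning against your own precision data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float              # 0.0 to 1.0, as reported by the extraction step
    source_span: tuple[int, int]   # character offsets into the OCR text


# Hypothetical thresholds: identity and account fields need more confidence
# than descriptive fields such as the company name.
THRESHOLDS = {"tax_id": 0.98, "bank_account_number": 0.98, "company_name": 0.90}
DEFAULT_THRESHOLD = 0.95           # assumption: applied to any unlisted field


def needs_human_review(field: ExtractedField) -> bool:
    """True when the field's confidence is below the threshold for its type."""
    return field.confidence < THRESHOLDS.get(field.name, DEFAULT_THRESHOLD)


def flagged_fields(fields: list[ExtractedField]) -> list[str]:
    """Names of the fields that should be shown to a reviewer."""
    return [f.name for f in fields if needs_human_review(f)]


def route(fields: list[ExtractedField]) -> str:
    """Send the whole document to review if any single field is flagged."""
    return "human_review" if flagged_fields(fields) else "auto_approve"
```

Logging `source_span` alongside the confidence score is what lets a reviewer jump straight to the text a value came from instead of rereading the whole document.
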
| Component | Recommended stack | Why it matters |
|---|---|---|
| OCR + normalization | Textract / Document AI / Form Recognizer | Handles messy scans before LLM reasoning |
| Orchestration | LangGraph | Multi-step control flow with retries and branching |
| Retrieval | pgvector + Postgres | Keeps policy context close to transaction data |
| Validation | Rules engine + Python services | Deterministic checks for IBANs, ABA routing numbers, VAT IDs |
| Audit trail | Postgres + object storage | Needed for SOC 2 evidence and internal audit |
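
To make the orchestration row concrete, here is a dependency-free sketch of the state machine listed under the agent orchestration layer. It is plain Python, not LangGraph code: in LangGraph each function would typically become a node on a `StateGraph` and `next_node` would be expressed as conditional edges. The node bodies are placeholders and the state keys are assumptions made for illustration.

```python
from typing import Callable, Optional, TypedDict


class DocState(TypedDict, total=False):
    text: str
    doc_type: str
    fields: dict
    errors: list
    disposition: str


def classify_document(state: DocState) -> DocState:
    # Placeholder: a real node would call a classifier model.
    doc_type = "invoice" if "invoice" in state["text"].lower() else "unknown"
    return {**state, "doc_type": doc_type}


def extract_fields(state: DocState) -> DocState:
    # Placeholder: a real node would call an LLM with structured output.
    return {**state, "fields": {"has_total": "total" in state["text"].lower()}}


def validate_against_rules(state: DocState) -> DocState:
    errors = []
    if state["doc_type"] == "unknown":
        errors.append("unrecognized document type")
    if not state["fields"].get("has_total"):
        errors.append("missing total")
    return {**state, "errors": errors}


def resolve_exceptions(state: DocState) -> DocState:
    # Placeholder: a retry or second-pass extraction could clear errors here.
    return state


def route_to_human_review(state: DocState) -> DocState:
    return {**state, "disposition": "human_review"}


NODES: dict[str, Callable[[DocState], DocState]] = {
    fn.__name__: fn
    for fn in (classify_document, extract_fields, validate_against_rules,
               resolve_exceptions, route_to_human_review)
}


def next_node(current: str, state: DocState) -> Optional[str]:
    """Branching logic: clean documents stop early, the rest escalate."""
    if current == "classify_document":
        return "extract_fields"
    if current == "extract_fields":
        return "validate_against_rules"
    if current == "validate_against_rules":
        return "resolve_exceptions" if state["errors"] else None
    if current == "resolve_exceptions":
        return "route_to_human_review" if state["errors"] else None
    return None


def run(text: str) -> DocState:
    state: DocState = {"text": text}
    node: Optional[str] = "classify_document"
    while node is not None:
        state = NODES[node](state)
        node = next_node(node, state)
    # Documents that never reached human review are treated as auto-approved.
    state.setdefault("disposition", "auto_approved")
    return state
```

Keeping the routing in one function makes the branching easy to review and test before you move it into a graph framework.
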
A practical pattern is to combine deterministic validators with agentic extraction. For example: the extraction agent proposes an IBAN; the validation service checks checksum and country format; a second agent compares the extracted beneficiary name against the bank statement or invoice header; then the workflow either auto-approves or escalates.
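
The deterministic half of that pattern could look like the sketch below. The mod-97 check is the standard IBAN checksum; everything else is an assumption for illustration: the country-length table is deliberately partial, `names_match` is a trivial stand-in for the second agent, and `decide` is a hypothetical name for the final routing step.

```python
import re

# Partial, illustrative table of IBAN lengths by country code.
IBAN_LENGTHS = {"GB": 22, "DE": 22, "FR": 27, "NL": 18, "ES": 24}


def iban_is_valid(raw: str) -> bool:
    """Check country length and the mod-97 checksum of an IBAN."""
    iban = re.sub(r"\s+", "", raw).upper()
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", iban):
        return False
    # Countries missing from the table fail here, so they end up escalated.
    if IBAN_LENGTHS.get(iban[:2]) != len(iban):
        return False
    # Move the first four characters to the end, map A..Z to 10..35,
    # then the resulting integer must leave remainder 1 when divided by 97.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def names_match(extracted: str, reference: str) -> bool:
    """Stand-in for the comparison agent: case and whitespace insensitive."""
    def normalize(s: str) -> str:
        return " ".join(s.casefold().split())
    return normalize(extracted) == normalize(reference)


def decide(extracted_iban: str, extracted_name: str, reference_name: str) -> str:
    """Auto-approve only when both the checksum and the name check pass."""
    if not iban_is_valid(extracted_iban):
        return "escalate"
    if not names_match(extracted_name, reference_name):
        return "escalate"
    return "auto_approve"
```

The point of the split is that the checksum never depends on a model: if the extraction agent misreads a single digit, the mod-97 test rejects the value regardless of how confident the agent was.
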
What Can Go Wrong
- Regulatory risk
  - Payments data often touches PII, financial account data, and sometimes health-related payment data if you serve healthcare merchants. That means GDPR obligations in the EU, SOC 2 control expectations everywhere you sell enterprise software, and potentially HIPAA exposure in niche flows.
  - Mitigation: minimize stored raw documents, encrypt at rest and in transit, redact unnecessary fields early, define retention windows by document class, and keep an immutable audit trail of every automated decision.
- Reputation risk
  - A bad extraction on settlement instructions or chargeback evidence can create failed payouts or lost disputes. In payments, one visible error can become a partner escalation fast.
  - Mitigation: start with low-risk document types first. Use confidence thresholds so anything affecting money movement or regulatory identity checks goes to human review until precision is proven.
- Operational risk
  - Agent workflows can drift if prompts change without versioning or if OCR quality varies by region. That creates inconsistent outputs across merchants or geographies.
  - Mitigation: version prompts like code, pin model versions where possible, maintain golden test sets per document type, and run regression tests before every release.
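
A golden-set regression check for the operational-risk mitigation can be very small. The sketch below assumes each document is represented as a flat dict of field name to expected value and that you keep a per-field baseline from the last release; the function names and the one-point tolerance are hypothetical.

```python
def field_accuracy(predicted: list[dict], golden: list[dict]) -> dict[str, float]:
    """Share of golden documents where each field was extracted exactly."""
    if len(predicted) != len(golden):
        raise ValueError("predicted and golden sets must be the same length")
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for pred, gold in zip(predicted, golden):
        for name, expected in gold.items():
            total[name] = total.get(name, 0) + 1
            correct[name] = correct.get(name, 0) + (pred.get(name) == expected)
    return {name: correct[name] / total[name] for name in total}


def regressions(current: dict[str, float],
                baseline: dict[str, float],
                tolerance: float = 0.01) -> list[str]:
    """Fields whose accuracy dropped more than `tolerance` below the baseline."""
    return sorted(
        name for name, base in baseline.items()
        if current.get(name, 0.0) < base - tolerance
    )
```

Run it in CI against the pinned prompt and model versions, and block the release when `regressions(...)` returns a non-empty list.
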
Getting Started
- Pick one narrow use case
  - Start with merchant onboarding packs or invoice extraction for reconciliation.
  - Avoid mixing KYC/KYB, disputes, and payouts in the first pilot.
  - Target a process with at least 500 documents/month so you have enough signal within 4–6 weeks.
- Build a small pilot squad
  - You need a small squad: 1 product owner, 1 backend engineer, 1 ML/AI engineer, 1 compliance partner, and optionally 1 ops SME.
  - Keep the pilot under one payments domain team so approvals are fast.
  - Define success metrics up front: straight-through processing rate, average handling time, field-level accuracy.
- Implement guardrails before scale
  - Add schema validation for critical fields like legal entity name, routing number/IBAN/SWIFT/BIC, VAT/GST IDs, and MCCs where relevant.
  - Build explicit fallback paths for exceptions and low-confidence outputs.
  - Store lineage: source file hash → OCR text → extracted fields → validation result → final disposition.
- Run a controlled rollout
  - Pilot for 6–8 weeks on one region or one merchant segment.
  - Compare against manual baselines using precision/recall by field type and false positive rates on compliance flags.
  - If you hit stable metrics (typically above 95% field accuracy on structured docs with low exception volume), expand to adjacent workflows like chargeback packet assembly or bank statement verification.
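
The lineage chain from the guardrails step maps naturally onto one immutable record per document. This is a sketch under assumptions: the `LineageRecord` fields are hypothetical names, and it stores a hash of the OCR text on the assumption that the text itself lives in object storage.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LineageRecord:
    source_sha256: str        # hash of the uploaded file bytes
    ocr_text_sha256: str      # assumption: OCR text itself is stored elsewhere
    extracted_fields: dict
    validation_result: str
    final_disposition: str


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_lineage(raw_file: bytes, ocr_text: str, fields: dict,
                  validation_result: str, final_disposition: str) -> LineageRecord:
    """Tie every stage of processing back to the exact source file."""
    return LineageRecord(
        source_sha256=sha256_hex(raw_file),
        ocr_text_sha256=sha256_hex(ocr_text.encode("utf-8")),
        extracted_fields=dict(fields),   # copy so later edits do not leak in
        validation_result=validation_result,
        final_disposition=final_disposition,
    )


record = build_lineage(
    raw_file=b"example file bytes",
    ocr_text="Beneficiary: Acme Ltd",
    fields={"beneficiary_name": "Acme Ltd"},
    validation_result="passed",
    final_disposition="auto_approved",
)
print(json.dumps(asdict(record), indent=2, sort_keys=True))
```

Because the record is keyed on content hashes, two uploads of the same file produce the same `source_sha256`, which also gives you a cheap duplicate check during the pilot.
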
The right goal is not full autonomy on day one. It’s reducing manual review on repeatable payment documents while keeping compliance-grade traceability intact. That’s where multi-agent orchestration earns its keep: not by replacing ops teams wholesale but by removing the repetitive work that slows them down.
By Cyprian Aarons, AI Consultant at Topiax.