How to Build a Compliance-Checking Agent for Payments Using LangGraph in Python
A compliance checking agent for payments inspects a transaction before it moves forward and decides whether it can proceed, needs review, or must be blocked. In payments, that matters because one bad decision can mean sanctions exposure, fraud losses, failed audits, or a regulator asking why your system let a prohibited transfer through.
Architecture
- Input normalizer
  - Takes raw payment data from your API, checkout flow, or internal ledger.
  - Converts it into a stable schema: payer, payee, amount, currency, country, channel, and metadata.
- Policy engine node
  - Applies deterministic checks first: sanctions lists, country restrictions, amount thresholds, KYC status, and product-specific rules.
  - This should be explicit code, not model-only reasoning.
- LLM review node
  - Handles ambiguous cases like free-text payment purpose or inconsistent merchant descriptions.
  - Produces a structured compliance assessment with reasons.
- Decision router
  - Chooses the next step based on the current state: approve, reject, or escalate to manual review.
  - In LangGraph this is usually done with add_conditional_edges.
- Audit logger
  - Persists every decision path, rule hit, and model output.
  - Required for traceability in regulated payment flows.
- Human review handoff
  - Sends risky transactions to an ops queue with enough context for fast review.
  - Keeps the agent from making irreversible decisions on edge cases.
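The input normalizer can be a plain function that maps whatever your payment API emits onto the stable schema the rest of the graph consumes. A minimal sketch; the source field names (`id`, `payer`, `description`, and so on) are illustrative, not from any specific gateway:

```python
from typing import Any


def normalize_payment(raw: dict[str, Any]) -> dict[str, Any]:
    """Map a raw gateway payload onto the stable schema the graph expects.

    The source field names here are hypothetical; adapt them to your API.
    Missing optional fields default to safe values (KYC not passed, etc.).
    """
    return {
        "transaction_id": raw["id"],
        "amount": float(raw["amount"]),
        "currency": raw.get("currency", "USD").upper(),
        "payer_country": raw.get("payer", {}).get("country", ""),
        "payee_country": raw.get("payee", {}).get("country", ""),
        "purpose": raw.get("description", ""),
        "kyc_passed": bool(raw.get("kyc_passed", False)),
        "sanctions_hit": bool(raw.get("sanctions_hit", False)),
    }
```

Defaulting to the restrictive value when a field is absent (for example, `kyc_passed=False`) keeps missing data from silently loosening the checks downstream.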
Implementation
1) Define the state and deterministic checks
Start with a typed state object. Keep compliance fields explicit so you can audit them later and avoid passing around unstructured blobs.
```python
from typing import TypedDict, Literal

from langgraph.graph import StateGraph, START, END


class PaymentState(TypedDict):
    transaction_id: str
    amount: float
    currency: str
    payer_country: str
    payee_country: str
    purpose: str
    kyc_passed: bool
    sanctions_hit: bool
    risk_score: int
    decision: Literal["approve", "reject", "review"]
    reason: str


def policy_check(state: PaymentState) -> PaymentState:
    if not state["kyc_passed"]:
        return {**state, "decision": "reject", "reason": "KYC failed"}
    if state["sanctions_hit"]:
        return {**state, "decision": "reject", "reason": "Sanctions hit"}
    if state["amount"] > 10000:
        # High-value payments are routed to the LLM review node.
        return {**state, "risk_score": 80}
    # Low risk: set the decision here so the final state always carries one.
    return {
        **state,
        "risk_score": 10,
        "decision": "approve",
        "reason": "Passed deterministic checks",
    }
```
This node should stay deterministic. For payments compliance, you want hard rules to fire before any model is asked to interpret anything.
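The `sanctions_hit` flag in the state should come from a screening step you control. A deliberately naive sketch of that lookup, matching a normalized name against an illustrative denylist; real screening needs fuzzy and alias matching against official, regularly refreshed lists (OFAC SDN, EU consolidated list, UN):

```python
# Illustrative denylist. In production this comes from official sources
# and every screen should record which list version was used, for audits.
SANCTIONED_PARTIES = frozenset({"acme shell co", "bad actor ltd"})


def screen_party(name: str, denylist: frozenset = SANCTIONED_PARTIES) -> bool:
    """Return True on a hit. Exact match on a whitespace-normalized,
    lowercased name only; real systems need fuzzy/alias matching."""
    normalized = " ".join(name.lower().split())
    return normalized in denylist
```

Because this is plain code, it is easy to unit-test against your written policy, which is exactly the property you want for the deterministic layer.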
2) Add an LLM-based review node for ambiguous cases
Use the model only when rules do not already decide the outcome. The key is to force structured output so you can store and inspect it later.
```python
from typing import Literal

from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class ComplianceReview(BaseModel):
    decision: Literal["approve", "reject", "review"] = Field(...)
    reason: str = Field(...)


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def llm_review(state: PaymentState) -> PaymentState:
    prompt = (
        f"Review this payment for compliance risk.\n"
        f"Amount: {state['amount']} {state['currency']}\n"
        f"Payer country: {state['payer_country']}\n"
        f"Payee country: {state['payee_country']}\n"
        f"Purpose: {state['purpose']}\n"
        f"Risk score: {state['risk_score']}\n"
        f"Return approve/reject/review with a short reason."
    )
    result = llm.with_structured_output(ComplianceReview).invoke(prompt)
    return {
        **state,
        "decision": result.decision,
        "reason": result.reason,
    }
```
For payments use cases, keep the prompt narrow. Don’t ask the model to “think broadly” about compliance; ask it to classify against known policy inputs.
3) Wire the workflow with LangGraph routing
This is where LangGraph fits well. You define nodes with StateGraph, then route based on the current state using add_conditional_edges.
```python
def route_after_policy(state: PaymentState):
    if state.get("decision") == "reject":
        return END
    if state.get("risk_score", 0) >= 50:
        return "llm_review"
    return END


graph = StateGraph(PaymentState)
graph.add_node("policy_check", policy_check)
graph.add_node("llm_review", llm_review)

graph.add_edge(START, "policy_check")
graph.add_conditional_edges("policy_check", route_after_policy)
graph.add_edge("llm_review", END)

app = graph.compile()

sample = {
    "transaction_id": "tx_123",
    "amount": 12500.0,
    "currency": "USD",
    "payer_country": "US",
    "payee_country": "AE",
    "purpose": "invoice payment for consulting services",
    "kyc_passed": True,
    "sanctions_hit": False,
}

result = app.invoke(sample)
print(result["decision"], result["reason"])
```
That pattern gives you a clean split:
- deterministic controls first
- LLM only on higher-risk cases
- final state returned as a single auditable object
4) Add checkpoints when you need human review loops
If your operations team needs to pause and resume reviews, use LangGraph’s checkpointing via MemorySaver. That gives you persistence across turns and makes manual escalation practical.
```python
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "payment-review-001"}}
result = app.invoke(sample, config=config)
```
For production systems, back this with durable storage (for example, a database-backed checkpointer) rather than in-memory state, since MemorySaver loses everything on restart. The pattern stays the same even if your checkpointer changes.
Production Considerations
- Keep policy decisions deterministic
  - Sanctions screening, country blocks, velocity limits, and KYC status should never depend on model output.
  - Use the LLM for classification or explanation only after hard rules run.
- Log every state transition
  - Store input payload hash, rule hits, model response, final decision, and reviewer overrides.
  - This is essential for audit trails in card payments, cross-border transfers, and AML investigations.
- Respect data residency
  - Do not send full PANs, bank account numbers, or personal identifiers to external model endpoints unless your legal and security teams have approved it.
  - Redact or tokenize sensitive fields before they reach the graph nodes.
- Add operational guardrails
  - Put timeouts on model calls.
  - Fail closed for high-risk payment rails.
  - Use allowlists for countries and merchant categories where possible.
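Redaction can happen in the normalizer, before anything reaches the graph. A minimal sketch that masks a card number down to its last four digits and tokenizes an account identifier; the hash-based token here is a stand-in for a real tokenization vault, not a substitute for one:

```python
import hashlib


def mask_pan(pan: str) -> str:
    """Keep only the last four digits of a card number, masking the rest."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    return "*" * max(len(digits) - 4, 0) + digits[-4:]


def tokenize(value: str, salt: str = "replace-with-vault-secret") -> str:
    """Stand-in for a tokenization vault: deterministic and non-reversible.

    Production systems should use a dedicated vault service; a bare salted
    hash is shown here only to illustrate the shape of the interface.
    """
    return "tok_" + hashlib.sha256((salt + value).encode()).hexdigest()[:16]
```

Applied at the boundary, this means prompts and logs only ever see `************1111` and `tok_…` values, never the raw identifiers.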
Common Pitfalls
- Letting the LLM make final compliance decisions
  - Mistake: using the model as the primary policy engine.
  - Fix: run deterministic checks first and reserve the model for ambiguous cases only.
- Passing raw payment data into prompts
  - Mistake: sending full customer records or card data to the LLM.
  - Fix: redact sensitive fields and pass only what is needed for compliance reasoning.
- Not preserving decision provenance
  - Mistake: storing only “approved” or “rejected” without reasons.
  - Fix: persist rule outcomes, risk scores, model output, and routing path so auditors can reconstruct every decision.
- Ignoring jurisdiction-specific rules
  - Mistake: applying one global policy set to all payments.
  - Fix: branch by corridor, currency zone, entity type, and local regulatory requirements before invoking any review logic.
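Corridor-specific policy can be a plain lookup that runs before the generic rules. A sketch using illustrative corridors and thresholds; the actual blocked corridors and amounts must come from your compliance and legal teams, not from code defaults:

```python
# Illustrative corridor policies keyed by (payer_country, payee_country).
# Real values are jurisdiction-specific and owned by compliance/legal.
CORRIDOR_POLICY = {
    ("US", "AE"): {"review_threshold": 10_000.0, "blocked": False},
    ("US", "IR"): {"review_threshold": 0.0, "blocked": True},
}
DEFAULT_POLICY = {"review_threshold": 5_000.0, "blocked": False}


def corridor_rule(payer_country: str, payee_country: str, amount: float) -> str:
    """Return 'reject', 'review', or 'approve' for a corridor and amount."""
    policy = CORRIDOR_POLICY.get((payer_country, payee_country), DEFAULT_POLICY)
    if policy["blocked"]:
        return "reject"
    if amount >= policy["review_threshold"]:
        return "review"
    return "approve"
```

A check like this slots naturally into the deterministic `policy_check` node, before any risk scoring or model review runs.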
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.