How to Build a Compliance-Checking Agent for Payments Using LlamaIndex in Python

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, llamaindex, python, payments

A compliance-checking agent for payments reads a transaction, a customer profile, and a policy set, then tells you whether the payment can proceed, needs review, or must be blocked. It matters because payments teams need fast decisions without losing auditability, and the agent has to explain every flag in terms both a reviewer and a regulator can trace.

Architecture

  • Policy corpus

    • Source documents for AML, sanctions, KYC, PCI-DSS controls, internal risk rules, and country-specific payment restrictions.
    • Store them as versioned documents so every decision can be traced to the exact policy snapshot.
  • LlamaIndex retrieval layer

    • Use VectorStoreIndex to retrieve relevant policy chunks for a given payment scenario.
    • Keep chunking strict so the model sees precise clauses instead of broad legal noise.
  • Decision engine

    • A structured output prompt that classifies the payment as approve, review, or block.
    • The output should include reasons, cited policy references, and missing data fields.
  • Audit log store

    • Persist input payloads, retrieved policy nodes, model output, and final human decision.
    • This is non-negotiable for payments compliance and post-incident review.
  • Guardrails layer

    • Deterministic checks for hard rules like sanctioned countries, blocked MCCs, velocity limits, or missing KYC.
    • Do not let the LLM override fixed compliance rules.
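These layers compose into a single decision path: deterministic guardrails run first, the retrieval-backed LLM assessment only handles what the guardrails pass through, and every outcome is audited. A minimal orchestration sketch, with the callables standing in for the components built in the Implementation section below:

```python
def decide(payment, hard_rules, llm_assess, write_audit):
    """Guardrails before any model call; every outcome is audited."""
    result = hard_rules(payment)      # deterministic checks; None means "no hard hit"
    if result is None:
        result = llm_assess(payment)  # retrieval + structured LLM decision
    write_audit(payment, result)      # audit trail is non-negotiable
    return result
```

Keeping the guardrails as a separate callable makes them independently testable and guarantees the LLM can never override them.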

Implementation

1) Load policy documents into a LlamaIndex index

Use SimpleDirectoryReader for local policy files and build a VectorStoreIndex. In production you would point this at an approved document repository with versioning and access control.

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Load policies from disk
documents = SimpleDirectoryReader("./policies").load_data()

# Split into smaller nodes for better retrieval precision
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=50)
nodes = splitter.get_nodes_from_documents(documents)

# Build the index
index = VectorStoreIndex(nodes)

# Optionally expose a retriever for standalone evidence lookups
# (the query engine built in step 3 wraps this same index)
retriever = index.as_retriever(similarity_top_k=4)

2) Define the payment compliance input and hard-rule checks

Payments need deterministic checks before any LLM call. If a transaction hits a hard block like sanctioned geography or missing KYC status, return immediately.

from dataclasses import dataclass

SANCTIONED_COUNTRIES = {"IR", "KP", "SY", "CU"}
BLOCKED_MCCS = {"4829", "6012"}  # example: wire transfer / financial institutions

@dataclass
class PaymentRequest:
    transaction_id: str
    amount: float
    currency: str
    country: str
    merchant_category_code: str
    customer_kyc_status: str
    description: str

def hard_rule_check(req: PaymentRequest):
    if req.country in SANCTIONED_COUNTRIES:
        return {"decision": "block", "reason": "Sanctioned country"}
    if req.merchant_category_code in BLOCKED_MCCS:
        return {"decision": "review", "reason": "High-risk MCC"}
    if req.customer_kyc_status != "verified":
        return {"decision": "review", "reason": "KYC not verified"}
    return None

3) Retrieve relevant policies and ask the LLM for a structured decision

Build a query engine over the index with a custom prompt (under the hood this constructs a RetrieverQueryEngine). For production compliance workflows, keep the response format strict so downstream systems can parse it reliably.

import json
from llama_index.core import PromptTemplate
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini", temperature=0)

COMPLIANCE_PROMPT = PromptTemplate(
    """
You are a payments compliance analyst.
Use only the provided policy context to assess this payment request.

Return JSON with:
- decision: approve | review | block
- reasons: array of strings
- cited_policies: array of strings
- missing_fields: array of strings

Policy context:
{context_str}

Payment request:
{query_str}
"""
)

def assess_payment(req: PaymentRequest):
    hard_rule_result = hard_rule_check(req)
    if hard_rule_result:
        return {
            "transaction_id": req.transaction_id,
            **hard_rule_result,
            "source": "hard_rule"
        }

    query = f"""
transaction_id={req.transaction_id}
amount={req.amount}
currency={req.currency}
country={req.country}
mcc={req.merchant_category_code}
kyc_status={req.customer_kyc_status}
description={req.description}
"""

    query_engine = index.as_query_engine(
        llm=llm,
        text_qa_template=COMPLIANCE_PROMPT,
        similarity_top_k=4,
    )

    response = query_engine.query(query)
    return {
        "transaction_id": req.transaction_id,
        "decision_source": "llm_policy_check",
        "raw_response": str(response),
    }

4) Parse output and persist an audit record

For payments you need traceability. Store the input payload, decision output, retrieved evidence, model version, and timestamp in your audit system.
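The prompt in step 3 asks for JSON, but model output still needs defensive parsing before it can be stored or acted on. A minimal sketch that tolerates a markdown code fence and fails closed to review (the fallback shape mirrors the fields the prompt requests):

```python
import json

def parse_llm_decision(raw: str) -> dict:
    """Extract the JSON decision from a model response string."""
    text = raw.strip()
    if text.startswith("```"):
        # Strip a markdown fence such as ```json ... ```
        text = text.split("```")[1]
        if text.startswith("json"):
            text = text[len("json"):]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Unparseable output must never auto-approve a payment
        return {
            "decision": "review",
            "reasons": ["unparseable model output"],
            "cited_policies": [],
            "missing_fields": [],
        }
```

The fail-closed default matters: a parsing bug should create analyst work, not an approved payment.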

from datetime import datetime, timezone

def audit_record(req: PaymentRequest, result: dict):
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "transaction_id": req.transaction_id,
        "request": req.__dict__,
        "result": result,
        "model": "gpt-4o-mini",
        "index_type": "VectorStoreIndex",
    }

payment = PaymentRequest(
    transaction_id="tx_123",
    amount=1250.00,
    currency="USD",
    country="GB",
    merchant_category_code="5812",
    customer_kyc_status="verified",
    description="Online retail purchase"
)

result = assess_payment(payment)
record = audit_record(payment, result)
print(json.dumps(record, indent=2))

Production Considerations

  • Enforce deterministic blocks first

    • Sanctions screening, country restrictions, AML thresholds, and KYC status checks should happen before retrieval.
    • The LLM should explain ambiguous cases, not decide on hard regulatory failures.
  • Log everything needed for audit

    • Persist policy document versions, retrieved node IDs from LlamaIndex responses, model name, prompt template version, and final outcome.
    • Regulators care about reproducibility more than cleverness.
  • Control data residency

    • Payment data often contains PII and sensitive financial metadata.
    • Keep embeddings, vector stores, and model inference inside approved regions; do not ship transaction payloads across borders casually.
  • Add human-in-the-loop routing

    • Route review decisions to analysts with the evidence bundle attached.
    • For high-value transactions or cross-border transfers, require manual approval even if the agent returns approve.
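To make the audit bullet concrete: a LlamaIndex `Response` exposes the retrieved evidence through its `source_nodes` attribute, each entry carrying a node ID and similarity score. A minimal sketch of collecting that evidence bundle (the stand-in object below only simulates the response shape for illustration):

```python
from types import SimpleNamespace

def extract_evidence(response) -> list[dict]:
    """Collect retrieved node IDs and scores for the audit record."""
    return [
        {"node_id": sn.node.node_id, "score": sn.score}
        for sn in getattr(response, "source_nodes", [])
    ]

# Stand-in with the same attribute shape as a llama_index Response
fake_response = SimpleNamespace(source_nodes=[
    SimpleNamespace(node=SimpleNamespace(node_id="policy-aml-003"), score=0.82),
])
evidence = extract_evidence(fake_response)
```

Attaching this list to the audit record from step 4 lets a reviewer reload the exact policy chunks the model saw.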

Common Pitfalls

  1. Letting the LLM make hard compliance decisions

    • Bad pattern: asking the model whether a sanctioned country is allowed.
    • Fix: enforce static rules outside the model and use the agent only for interpretation of policy text.
  2. Using broad chunks from legal docs

    • Bad pattern: large chunks that mix unrelated rules across multiple jurisdictions.
    • Fix: split policies by clause or control area using SentenceSplitter, then retrieve narrowly relevant context.
  3. Skipping versioning on policy sources

    • Bad pattern: updating policies in place with no snapshot history.
    • Fix: version your documents and record which version influenced each decision so audits can reproduce outcomes later.
  4. Ignoring output structure

    • Bad pattern: free-form prose that downstream systems cannot parse.
    • Fix: require JSON-like fields such as decision, reasons, cited_policies, and validate them before actioning any payment.
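The last fix can be enforced with a small schema check before any payment is actioned. A minimal sketch (field names follow the prompt in step 3; the function name is illustrative):

```python
VALID_DECISIONS = {"approve", "review", "block"}
LIST_FIELDS = ("reasons", "cited_policies", "missing_fields")

def validate_decision(payload) -> tuple[bool, str]:
    """Return (ok, message); reject anything that cannot be actioned safely."""
    if not isinstance(payload, dict):
        return False, "payload is not an object"
    if payload.get("decision") not in VALID_DECISIONS:
        return False, f"invalid decision: {payload.get('decision')!r}"
    for field in LIST_FIELDS:
        if not isinstance(payload.get(field), list):
            return False, f"{field} must be a list"
    return True, "ok"
```

Rejected payloads should route to human review, the same fail-closed path as unparseable output.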

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
