How to Build a Transaction Monitoring Agent Using LlamaIndex in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: transaction-monitoring · llamaindex · python · banking

A transaction monitoring agent scans payment events, customer profiles, and policy rules to flag suspicious activity before it becomes a compliance problem. In banking, that matters because you need faster alert triage, consistent investigations, and a defensible audit trail for AML, sanctions, fraud, and internal policy checks.

Architecture

  • Event ingestion layer

    • Pulls transactions from Kafka, S3, a database table, or an API.
    • Normalizes fields like customer_id, amount, country, merchant_category, and timestamp.
  • Policy and controls corpus

    • Stores AML typologies, internal escalation rules, sanctions guidance, and investigation playbooks.
    • Indexed as documents so the agent can retrieve the exact rule behind each decision.
  • LlamaIndex retrieval layer

    • Uses VectorStoreIndex to retrieve relevant policies and historical cases.
    • Keeps the agent grounded in bank-approved source material instead of free-form reasoning.
  • Decision orchestration

    • A query engine or tool-based agent turns transaction context into a structured risk assessment.
    • Produces an alert summary with rationale, referenced policy snippets, and next action.
  • Audit and case logging

    • Persists input payloads, retrieved sources, model outputs, timestamps, and reviewer actions.
    • Required for model governance, compliance review, and regulator-facing evidence.
  • Human review workflow

    • Routes high-risk cases to investigators.
    • Lets analysts override or confirm alerts so you can measure precision over time.
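The normalization step in the ingestion layer can be sketched as a small pure function. The raw payload shape below is an illustrative assumption; real feeds from Kafka, S3, or an API will each need their own mapping onto the canonical fields.

```python
from datetime import datetime, timezone

# Hypothetical raw payload shape; real upstream feeds will differ.
def normalize_event(raw: dict) -> dict:
    """Map a raw payment event onto the canonical fields the agent expects."""
    return {
        "transaction_id": raw.get("id") or raw.get("txn_id"),
        "customer_id": raw.get("customer_id"),
        "amount_usd": float(raw.get("amount", 0)),
        "country": (raw.get("country") or "").upper(),
        "merchant_category": raw.get("mcc_description", "unknown"),
        "timestamp": raw.get("timestamp") or datetime.now(timezone.utc).isoformat(),
    }

event = normalize_event({
    "id": "txn_10001",
    "customer_id": "cust_7788",
    "amount": "12500",
    "country": "ae",
    "mcc_description": "money_transfer",
    "timestamp": "2026-04-21T10:15:00Z",
})
```

Keeping this step pure and deterministic makes it trivial to unit-test, which matters when field mapping bugs can silently suppress alerts downstream.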

Implementation

1) Install dependencies and prepare policy documents

Use LlamaIndex plus a local embedding model if your bank requires data residency. The example below uses OpenAI embeddings for simplicity; swap this for an on-prem or self-hosted embedding setup if needed.

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai

Create a small policy corpus from approved banking guidance.

from llama_index.core import Document

policy_docs = [
    Document(
        text=(
            "Escalate transactions over $10,000 when velocity increases "
            "or when counterparties are in high-risk jurisdictions."
        ),
        metadata={"source": "aml_policy_v1", "section": "thresholds"}
    ),
    Document(
        text=(
            "Review payments involving sanctioned countries or entities "
            "before settlement."
        ),
        metadata={"source": "sanctions_playbook", "section": "screening"}
    ),
    Document(
        text=(
            "Flag round-dollar transfers repeated multiple times within "
            "a short window for possible structuring."
        ),
        metadata={"source": "fraud_typology_guide", "section": "structuring"}
    ),
]

2) Build the index over bank-approved context

This is the retrieval core. The agent should not “invent” policy; it should retrieve relevant guidance first.

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(policy_docs)
query_engine = index.as_query_engine(similarity_top_k=2)

3) Create a transaction monitor function with structured output

For production banking workflows, keep the output deterministic enough for downstream case management. The pattern below takes one transaction event, retrieves policy context, and returns a concise alert summary.

import json
from datetime import datetime, timezone
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini", temperature=0)

def monitor_transaction(txn: dict) -> dict:
    prompt = f"""
You are a banking transaction monitoring analyst.
Assess this transaction against AML/sanctions/fraud policy context.

Transaction:
{json.dumps(txn, indent=2)}

Return:
- risk_level: low|medium|high
- reasons: short bullet-style list
- recommended_action: one sentence
- evidence_needed: list of missing data points if any
"""
    retrieved = query_engine.query(prompt)

    # Keep the final record audit-friendly: capture the policy sources
    # that grounded the assessment, not just the model's text.
    return {
        "transaction_id": txn["transaction_id"],
        "customer_id": txn["customer_id"],
        "assessed_at": datetime.now(timezone.utc).isoformat(),
        "analysis": str(retrieved),
        "policy_sources": [
            node.metadata.get("source")
            for node in getattr(retrieved, "source_nodes", [])
        ],
    }

sample_txn = {
    "transaction_id": "txn_10001",
    "customer_id": "cust_7788",
    "amount_usd": 12500,
    "country": "AE",
    "merchant_category": "money_transfer",
    "timestamp": "2026-04-21T10:15:00Z",
}

result = monitor_transaction(sample_txn)
print(result["analysis"])
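Before paying for an LLM call at all, a cheap deterministic pre-screen can guarantee that hard policy thresholds always fire regardless of model behavior. The thresholds and the high-risk country set below are illustrative placeholders echoing the sample policy corpus, not real AML parameters.

```python
# Deterministic pre-screen: hard rules run before any LLM call.
# Thresholds and the jurisdiction set are illustrative placeholders.
HIGH_RISK_COUNTRIES = {"IR", "KP", "SY"}

def prescreen(txn: dict) -> list[str]:
    """Return rule-based flags; an empty list means no hard rule fired."""
    flags = []
    if txn["amount_usd"] > 10_000:
        flags.append("over_reporting_threshold")
    if txn["country"] in HIGH_RISK_COUNTRIES:
        flags.append("high_risk_jurisdiction")
    if txn["amount_usd"] % 1_000 == 0:
        flags.append("round_amount")  # possible structuring signal
    return flags

flags = prescreen({"amount_usd": 12500, "country": "AE"})
```

One workable routing pattern: transactions with any flags go through monitor_transaction for a full assessment, while clean transactions take a lighter sampling path.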

If you want the agent to reason over live tools instead of only retrieved policies, wrap functions as LlamaIndex tools. That lets you fetch customer history or sanctions hits on demand.

from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

def get_customer_velocity(customer_id: str) -> str:
    return f"Customer {customer_id} made 8 transfers in the last 24 hours."

velocity_tool = FunctionTool.from_defaults(fn=get_customer_velocity)

agent = ReActAgent.from_tools(
    tools=[velocity_tool],
    llm=llm,
    verbose=True,
)

response = agent.chat(
    f"Review customer cust_7788 for suspicious activity. "
    f"Transaction amount is {sample_txn['amount_usd']} USD."
)
print(response)
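A second tool in the same style might stub a sanctions lookup. The watchlist below is a fabricated stand-in for a real screening service (for example an internal API over OFAC/EU/UN lists); you would wrap the function with FunctionTool.from_defaults(fn=check_sanctions) exactly as with the velocity tool.

```python
# Hypothetical sanctions lookup tool body; the watchlist is a stand-in
# for a real screening service over official sanctions lists.
WATCHLIST = {"acme holdings ltd", "globex trading fze"}

def check_sanctions(counterparty_name: str) -> str:
    """Return a screening verdict string the agent can reason over."""
    if counterparty_name.strip().lower() in WATCHLIST:
        return f"HIT: '{counterparty_name}' matches a watchlist entry; hold before settlement."
    return f"No watchlist match for '{counterparty_name}'."

verdict = check_sanctions("Globex Trading FZE")
```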

4) Persist results for audit and case management

Banking teams need traceability. Store the raw event, retrieved sources, model response, and reviewer outcome in your case system or warehouse.

audit_record = {
    "transaction": sample_txn,
    "output": result,
    "model": "gpt-4o-mini",
    "index_version": "policy_docs_v1",
}

with open("audit_log.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(audit_record) + "\n")
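The audit record above logs identifiers verbatim; before it reaches shared storage you will likely want to mask them, per your data-handling policy. A minimal masking sketch follows; the sensitive-field list is an assumption to adapt to your own schema.

```python
import copy

# Illustrative field list; extend to match your actual payload schema.
SENSITIVE_FIELDS = {"customer_id", "account_number", "card_pan"}

def redact(record: dict) -> dict:
    """Return a deep copy with sensitive string values masked to last 4 chars."""
    masked = copy.deepcopy(record)
    for key, value in masked.items():
        if isinstance(value, dict):
            masked[key] = redact(value)
        elif key in SENSITIVE_FIELDS and isinstance(value, str):
            masked[key] = "****" + value[-4:]
    return masked

safe = redact({"transaction": {"customer_id": "cust_7788", "amount_usd": 12500}})
```

Run records through this before any write to shared logs, keeping the unmasked original only in the access-controlled case system.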

Production Considerations

  • Data residency

    • Keep embeddings, vector stores, and logs inside approved regions.
    • If regulations require it, use self-hosted models or private endpoints instead of public APIs.
  • Auditability

    • Log every prompt, retrieved chunk, model version, tool call, and final disposition.
    • Make sure investigators can reconstruct why an alert was raised months later.
  • Guardrails

    • Never let the agent auto-freeze accounts or file regulatory reports without human approval.
    • Use strict thresholds for escalation and route borderline cases to analysts.
  • Monitoring

    • Track false positives, missed alerts, latency per transaction batch, and drift in alert distribution.
    • Revalidate prompts and policies whenever AML rules or sanctions lists change.
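The monitoring bullet above can be made concrete with a tiny metrics helper over investigator dispositions. The disposition labels here are assumptions; use whatever statuses your case system records.

```python
# Alert precision from reviewer dispositions; labels are illustrative.
def alert_precision(dispositions: list[str]) -> float:
    """Fraction of raised alerts that investigators confirmed."""
    if not dispositions:
        return 0.0
    confirmed = sum(1 for d in dispositions if d == "confirmed")
    return confirmed / len(dispositions)

precision = alert_precision(["confirmed", "false_positive", "confirmed", "confirmed"])
```

Tracking this weekly, segmented by rule and by model version, is what turns the human review loop into measurable improvement rather than anecdote.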

Common Pitfalls

  1. Using the model without retrieval grounding

    • Problem: The agent hallucinates policy interpretations.
    • Fix: Always retrieve from approved bank documents with VectorStoreIndex before generating conclusions.
  2. Storing sensitive data in uncontrolled logs

    • Problem: PII leaks into observability tools or developer consoles.
    • Fix: Redact account numbers, names, and full PANs before logging. Keep audit logs access-controlled.
  3. Treating alerts as final decisions

    • Problem: The agent becomes an automated enforcement engine with no review path.
    • Fix: Use it as a triage layer. Human investigators should confirm escalations before any customer-impacting action.
  4. Ignoring versioning

    • Problem: You cannot explain why yesterday’s alert differs from today’s.
    • Fix: Version policy documents, prompts, embeddings model choice, and LLM configuration together.
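The versioning fix in pitfall 4 can be implemented by snapshotting every component that can change an alert outcome into one record and stamping its hash onto each alert. The field names and values below are illustrative.

```python
import hashlib
import json

# Snapshot every component that can change an alert outcome; values illustrative.
config = {
    "policy_corpus": "policy_docs_v1",
    "prompt": "monitor_transaction_v3",
    "embedding_model": "text-embedding-3-small",
    "llm": "gpt-4o-mini",
    "similarity_top_k": 2,
}

def config_fingerprint(cfg: dict) -> str:
    """Stable short hash of the full configuration, for stamping onto alerts."""
    canonical = json.dumps(cfg, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

fingerprint = config_fingerprint(config)
```

Storing this fingerprint in each audit record lets you answer "what exact system produced this alert?" months later, even after several upgrades.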

A good transaction monitoring agent is not just an LLM wrapper. It is retrieval plus controls plus auditability plus human review. In banking that combination is what makes the system usable by compliance teams instead of becoming another risky prototype.


By Cyprian Aarons, AI Consultant at Topiax.
