How to Build a Transaction Monitoring Agent Using LangChain in Python for Retail Banking

By Cyprian Aarons · Updated 2026-04-21
transaction-monitoring · langchain · python · retail-banking

A transaction monitoring agent watches payment activity, scores it for risk, and decides whether to flag, enrich, or escalate it for review. In retail banking, that matters because you need to catch fraud, mule activity, structuring, and unusual behavior without drowning analysts in false positives.

Architecture

  • Transaction ingestion layer

    • Pulls events from core banking, card processing, or message queues.
    • Normalizes fields like customer_id, amount, merchant_category, channel, and country.
  • Risk rules and feature builder

    • Computes deterministic signals before the LLM sees anything (sketched after this list).
    • Examples: velocity over 24 hours, first-time beneficiary, cross-border transfer, cash-equivalent merchant.
  • LangChain agent

    • Uses an LLM to reason over structured transaction context.
    • Produces a decision like clear, escalate, or request_more_info with a short rationale.
  • Tool layer

    • Exposes bank-approved functions such as customer profile lookup, sanctions screening, and case creation.
    • Keeps the model from inventing facts.
  • Audit and evidence store

    • Persists every input, tool call, model output, and final decision.
    • Required for compliance review and internal model governance.
  • Case management integration

    • Sends high-risk alerts into the bank’s AML/fraud workflow.
    • Lets investigators review the evidence trail without re-querying source systems.
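
To make the feature builder concrete, here is a minimal sketch. The in-memory history store, the domestic-country set, and the high-risk list are illustrative assumptions; a real implementation would read from your feature store or core banking history.

from datetime import datetime, timedelta, timezone

# Illustrative in-memory history: customer_id -> list of (timestamp, payee, country).
# Assumption: timestamps are UTC-aware; a real version reads a feature store.
HISTORY: dict[str, list[tuple[datetime, str, str]]] = {}

DOMESTIC = {"US", "GB"}             # assumption: your booking jurisdictions
HIGH_RISK_COUNTRIES = {"NG", "RU"}  # assumption: use your bank's approved list


def build_features(customer_id: str, payee: str, country: str) -> dict:
    """Compute deterministic risk signals before the LLM sees anything."""
    now = datetime.now(timezone.utc)
    events = HISTORY.get(customer_id, [])

    # Velocity: transactions in the trailing 24 hours.
    velocity_24h = sum(1 for ts, _, _ in events if now - ts <= timedelta(hours=24))

    # First-time beneficiary: payee never seen for this customer.
    is_new_payee = payee not in {p for _, p, _ in events}

    return {
        "velocity_24h": velocity_24h,
        "is_new_payee": is_new_payee,
        "cross_border": country not in DOMESTIC,
        "high_risk_country": country in HIGH_RISK_COUNTRIES,
    }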

Implementation

  1. Install dependencies and define your transaction schema

Install the langchain and langchain-openai packages, then use a standard chat model wrapper. For production, keep transaction payloads structured and explicit; do not feed raw logs into the model.

from typing import Literal
from pydantic import BaseModel, Field

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor


class Transaction(BaseModel):
    transaction_id: str
    customer_id: str
    amount: float
    currency: str = "USD"
    channel: Literal["card", "ach", "wire", "cash", "mobile"]
    country: str
    merchant_category: str
    is_new_payee: bool = False
    velocity_24h: int = Field(ge=0)
    pep_match: bool = False
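
A quick sanity check that the schema rejects malformed payloads before they ever reach the model (the invalid channel value is just for illustration):

from pydantic import ValidationError

try:
    Transaction(
        transaction_id="TXN-0",
        customer_id="CUST-0",
        amount=100.0,
        channel="carrier_pigeon",  # not an allowed Literal value
        country="GB",
        merchant_category="retail",
        velocity_24h=0,
    )
except ValidationError as exc:
    print(exc)  # names the offending field; useful for ingestion-layer rejects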

  2. Expose bank-approved tools

Keep tools narrow. Each tool should do one thing and return deterministic data that can be audited later.

@tool
def get_customer_risk_profile(customer_id: str) -> dict:
    """Fetch static risk attributes for a customer."""
    return {
        "customer_id": customer_id,
        "segment": "mass_affluent",
        "account_age_days": 420,
        "historical_alert_rate": 0.03,
        "resident_country": "GB",
    }


@tool
def create_case(transaction_id: str, reason: str) -> dict:
    """Create a compliance or fraud review case."""
    return {
        "case_id": f"CASE-{transaction_id}",
        "status": "open",
        "reason": reason,
    }
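
The architecture above also lists sanctions screening in the tool layer. A hypothetical stub in the same style is shown below; the response shape is an assumption, and a real version would call your bank's approved watchlist service rather than return a canned result. If you wire it in, add it to the agent's tools list alongside the others.

@tool
def screen_against_sanctions(name: str, country: str) -> dict:
    """Screen a counterparty against sanctions and watchlists (stubbed)."""
    # Assumption: a production version calls the approved screening service.
    return {
        "name": name,
        "country": country,
        "hit": False,
        "lists_checked": ["OFAC", "UN", "EU"],
    }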

  3. Build the LangChain agent

This pattern uses ChatPromptTemplate, create_tool_calling_agent, and AgentExecutor. The prompt must include an agent_scratchpad placeholder so the agent can track its tool calls, and it should force the model to stay within policy and output a decision grounded in the provided fields.

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        """
You are a retail banking transaction monitoring analyst.
Classify each transaction as clear, escalate, or request_more_info.

Rules:
- Prefer escalation when there is cross-border activity + new payee + high velocity.
- Never invent facts. Use only the transaction payload and tool outputs.
- Keep rationale short and audit-friendly.
- If sanctions/PEP indicators are present, escalate immediately.
"""
    ),
    (
        "human",
        """
Transaction:
{transaction}

Return:
- decision
- rationale
- next_action
"""
    ),
    # Required by create_tool_calling_agent: it holds intermediate tool calls.
    ("placeholder", "{agent_scratchpad}"),
])

tools = [get_customer_risk_profile, create_case]
agent = create_tool_calling_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=False)

  4. Run an evaluation-friendly monitoring decision

Convert the Pydantic object to JSON so the prompt receives a stable structure. In production you would also persist this request/response pair for audit (a sketch follows below).

tx = Transaction(
    transaction_id="TXN-10001",
    customer_id="CUST-7781",
    amount=9850.00,
    currency="USD",
    channel="wire",
    country="NG",
    merchant_category="money_transfer",
    is_new_payee=True,
    velocity_24h=6,
    pep_match=False,
)

result = executor.invoke({
    "transaction": tx.model_dump_json(indent=2)
})

print(result["output"])

That gives you a working baseline where the LLM reasons over structured data and can call approved tools. In a real bank, you would usually wrap this with policy checks before any case is opened.
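
As noted above, the request/response pair should be persisted for audit. A minimal sketch using the standard library; the table name and columns are assumptions, and most banks would write to an approved evidence store instead:

import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("audit.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS decisions (
        transaction_id TEXT,
        prompt_version TEXT,
        model_version TEXT,
        input_payload TEXT,
        output TEXT,
        decided_at TEXT
    )
""")

conn.execute(
    "INSERT INTO decisions VALUES (?, ?, ?, ?, ?, ?)",
    (
        tx.transaction_id,
        "v1",          # version prompts explicitly so decisions are reproducible
        "gpt-4o-mini",
        tx.model_dump_json(),
        json.dumps(result["output"]),
        datetime.now(timezone.utc).isoformat(),
    ),
)
conn.commit()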

Production Considerations

  • Deploy in-region

    • Keep PII and transaction data inside approved data residency boundaries.
    • If your bank operates in multiple jurisdictions, route EU customer traffic to EU-hosted infrastructure only.
  • Log everything needed for audit

    • Persist prompt version, model version, tool inputs/outputs, final decision, timestamps, and reviewer overrides.
    • Regulators will ask why a case was escalated; your logs need to answer that without reconstructing state from memory.
  • Add deterministic guardrails before the LLM

    • Hard-block known bad patterns such as sanctions hits or threshold breaches before calling the agent (sketched after this list).
    • The model should assist triage, not override mandatory policy rules.
  • Monitor drift by segment

    • Track alert rates by channel, geography, product type, and customer segment (sketched after this list).
    • A spike in one corridor can mean fraud adaptation or bad prompt behavior.
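
A minimal sketch of the pre-LLM guardrail, reusing the Transaction schema and executor from the implementation; the cash threshold is an illustrative placeholder, not policy advice:

def pre_screen(tx: Transaction) -> str | None:
    """Return a mandatory decision, or None to hand off to the agent."""
    if tx.pep_match:
        return "escalate"  # sanctions/PEP indicators never reach the LLM
    if tx.channel == "cash" and tx.amount >= 10_000:
        return "escalate"  # illustrative threshold breach
    return None


decision = pre_screen(tx)
if decision is None:
    result = executor.invoke({"transaction": tx.model_dump_json(indent=2)})
    decision = result["output"]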
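
Segment-level drift tracking can start as a simple rolling tally per corridor; a minimal sketch, where the (channel, country) corridor definition and any alerting hook are assumptions:

from collections import Counter, defaultdict

# Decision counts per (channel, country) corridor.
corridor_totals: dict[tuple[str, str], Counter] = defaultdict(Counter)


def record_decision(tx: Transaction, decision: str) -> None:
    corridor_totals[(tx.channel, tx.country)][decision] += 1


def escalation_rate(channel: str, country: str) -> float:
    counts = corridor_totals[(channel, country)]
    total = sum(counts.values())
    return counts["escalate"] / total if total else 0.0

A sustained jump in escalation_rate for one corridor is exactly the signal described above: fraud adaptation or a prompt regression.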

Common Pitfalls

  1. Letting the model decide from raw, unstructured text

    • This causes inconsistent decisions and weak audit trails.
    • Fix it by passing normalized fields like velocity, payee novelty, geography risk, and prior alert history.
  2. Using the LLM as the source of truth

    • The model should not infer sanctions status or customer residency from vague hints.
    • Fix it by routing those checks through tools such as sanctioned-party screening or KYC profile lookup.
  3. Skipping governance around thresholds and overrides

    • If analysts cannot see why something was escalated or cleared, your workflow will fail internal review fast.
    • Fix it by versioning prompts, storing every decision artifact, and keeping human override paths in the case management system.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
