How to Build a Transaction Monitoring Agent Using LangChain in Python for Banking
A transaction monitoring agent watches payment activity, scores risk, flags suspicious patterns, and routes cases for review. In banking, that matters because you need faster detection of fraud and AML typologies without drowning analysts in false positives, while still keeping every decision auditable.
Architecture
- Transaction intake layer
  - Pulls events from Kafka, a database, or an API.
  - Normalizes fields like amount, merchant, counterparty, country, device ID, and timestamp.
- Risk rules + feature extraction
  - Applies deterministic checks first: velocity spikes, high-risk geographies, structuring patterns.
  - Produces a compact feature payload for the LLM (a small pre-scoring sketch follows this list).
- LangChain agent
  - Uses ChatOpenAI or another chat model through LangChain.
  - Interprets the transaction context and decides whether to flag, escalate, or ignore.
- Tooling layer
  - Exposes tools for customer history lookup, sanctions screening, case creation, and policy retrieval.
  - Keeps the model grounded in bank-approved data.
- Decision store + audit log
  - Persists every prompt, tool call, model output, and final disposition.
  - Supports compliance review and model governance.
- Case management integration
  - Sends suspicious transactions to an investigator queue.
  - Writes structured evidence into the case record.
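The rules layer itself is plain Python, not LangChain. A minimal pre-scoring sketch, assuming illustrative thresholds, country codes, and field names you would replace with your own policy:

HIGH_RISK_COUNTRIES = {"NG", "IR", "KP"}  # placeholder list, not a real policy


def pre_score(txn: dict, recent_count_1h: int, recent_total_1h: float) -> dict:
    """Run deterministic checks and build the compact feature payload for the LLM."""
    signals = []
    if txn["country"] in HIGH_RISK_COUNTRIES:
        signals.append("high_risk_geography")
    if recent_count_1h >= 10:
        signals.append("velocity_spike")
    if 8500 <= txn["amount"] < 10000:
        signals.append("possible_structuring")  # amount just under a common reporting threshold
    return {
        "transaction_id": txn["transaction_id"],
        "signals": signals,
        "recent_count_1h": recent_count_1h,
        "recent_total_1h": recent_total_1h,
        "needs_agent_review": bool(signals),
    }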
Implementation
1) Define a strict schema for transaction decisions
For banking workflows, do not let the model return free-form text. Use a Pydantic schema so your downstream system gets structured output every time.
from typing import Literal

from pydantic import BaseModel, Field


class TransactionDecision(BaseModel):
    decision: Literal["clear", "review", "escalate"] = Field(
        description="Final disposition for the transaction"
    )
    risk_score: int = Field(ge=0, le=100)
    reason: str = Field(description="Short explanation for audit")
    typology: str = Field(description="Detected AML/fraud pattern if any")
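A quick sanity check: the schema rejects unknown decisions and out-of-range scores. The values below are made up for illustration.

# A valid payload passes validation.
ok = TransactionDecision(
    decision="review",
    risk_score=72,
    reason="Velocity spike versus customer baseline",
    typology="possible structuring",
)
print(ok.model_dump())

# This would raise a ValidationError: "block" is not an allowed decision.
# TransactionDecision(decision="block", risk_score=72, reason="x", typology="none")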
2) Build a small toolset for bank-approved context
The agent should inspect internal data through tools instead of guessing. In production you would connect these to core banking systems, KYC stores, and screening engines.
from langchain_core.tools import tool


@tool
def get_customer_profile(customer_id: str) -> str:
    """Fetch customer profile summary from internal banking systems."""
    profiles = {
        "CUST-1001": "Retail customer. Tenure: 4 years. Normal monthly volume: low. Country: GB.",
        "CUST-2007": "SME customer. Tenure: 9 months. Normal monthly volume: medium. Country: NG.",
    }
    return profiles.get(customer_id, "Profile not found")


@tool
def get_recent_transactions(customer_id: str) -> str:
    """Fetch recent transaction summary for velocity and pattern checks."""
    history = {
        "CUST-1001": "3 card payments today totaling 120 GBP. No cross-border activity.",
        "CUST-2007": "12 transfers in last hour totaling 48,000 GBP. Two counterparties in high-risk corridor.",
    }
    return history.get(customer_id, "No recent transactions found")
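The architecture also calls for sanctions screening and case creation tools. A stub for the former, with a hypothetical in-memory result you would swap for your real screening engine:

@tool
def screen_counterparty(name: str) -> str:
    """Screen a counterparty name against sanctions and watchlists (stubbed)."""
    hits = {
        "ACME TRADING FZE": "Potential match: watchlist entry, score 0.87. Requires review.",
    }
    return hits.get(name.upper(), "No watchlist match")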
3) Create the LangChain agent with structured output
This pattern uses ChatOpenAI, binds tools through create_openai_tools_agent, and enforces structured output with PydanticOutputParser. If your model supports native structured output well enough, you can also use with_structured_output, but this version keeps the flow explicit.
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = PydanticOutputParser(pydantic_object=TransactionDecision)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a transaction monitoring analyst for a bank. "
     "Use only the provided tools and transaction facts. "
     "Follow AML/fraud policy strictly. "
     "Return a structured decision."),
    ("human",
     "Transaction:\n{transaction}\n\n"
     "Customer ID: {customer_id}\n\n"
     "{format_instructions}"),
    # create_openai_tools_agent requires this placeholder for the tool-calling loop.
    MessagesPlaceholder("agent_scratchpad"),
]).partial(format_instructions=parser.get_format_instructions())

tools = [get_customer_profile, get_recent_transactions]
agent = create_openai_tools_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=False)


def monitor_transaction(transaction: dict) -> TransactionDecision:
    result = executor.invoke({
        "transaction": str(transaction),
        "customer_id": transaction["customer_id"],
    })
    return parser.parse(result["output"])
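If you take the with_structured_output route instead, one minimal sketch (assuming you fetch the tool context yourself rather than inside the agent loop) looks like this:

# Sketch only: the model returns a TransactionDecision instance directly, so no parser is needed.
structured_llm = llm.with_structured_output(TransactionDecision)


def decide_with_structured_output(transaction: dict, context: str) -> TransactionDecision:
    """Make the final decision call with pre-fetched, bank-approved context."""
    return structured_llm.invoke(
        "You are a transaction monitoring analyst for a bank.\n"
        f"Transaction: {transaction}\n"
        f"Context from approved tools:\n{context}\n"
        "Return your decision."
    )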
4) Run it on a real transaction payload
Keep the input small and normalized. The LLM should see only what it needs to make a defensible decision.
transaction_event = {
    "transaction_id": "TXN-90001",
    "customer_id": "CUST-2007",
    "amount": 12000,
    "currency": "GBP",
    "country": "NG",
    "channel": "wire",
    "timestamp": "2026-04-21T10:15:00Z",
}

decision = monitor_transaction(transaction_event)
print(decision.model_dump())
A typical production flow is:
- Pre-score with deterministic rules.
- Call the LangChain agent only when rules trigger ambiguity or elevated risk (sketched below).
- Persist the raw prompt/response pair to an immutable audit store.
- Send review or escalate cases into your case management system.
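A minimal sketch of that gating and audit flow, reusing the feature payload from the pre-scoring sketch in the Architecture section and a stand-in audit sink:

import json
from datetime import datetime, timezone


def handle_transaction(txn: dict, features: dict) -> None:
    """features is the payload produced by the deterministic pre-scoring step."""
    if not features["needs_agent_review"]:
        return  # cleared by rules alone; no LLM call

    decision = monitor_transaction(txn)

    # Hypothetical append-only audit record; in production write it to an immutable store.
    audit_record = {
        "transaction_id": txn["transaction_id"],
        "features": features,
        "decision": decision.model_dump(),
        "model": "gpt-4o-mini",
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    print(json.dumps(audit_record))  # stand-in for the audit store

    if decision.decision in ("review", "escalate"):
        # Hand off to case management here, e.g. a create_case(audit_record) call.
        pass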
Production Considerations
- Compliance and auditability
  - Log prompts, tool outputs, model version, timestamps, and final disposition.
  - Store immutable records so investigators can reconstruct why a case was flagged.
- Data residency
  - Keep customer data in-region if your jurisdiction requires it.
  - If you use hosted models, verify where inference happens and whether prompts are retained.
- Guardrails
  - Restrict tools to read-only banking context unless a human approves actioning.
  - Add allowlists for countries, products, and case actions; never let the model invent policies (a small allowlist sketch follows this list).
- Monitoring
  - Track false positive rate, escalation rate by segment, tool failure rate, and latency.
  - Reconcile model decisions against investigator outcomes to detect drift.
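For the guardrails point, a small allowlist check might look like the sketch below; the allowed values are placeholders for whatever your policy actually defines:

# Placeholder allowlists; real values come from policy, never from the model.
ALLOWED_CASE_ACTIONS = {"open_case", "request_documents", "no_action"}
ALLOWED_PRODUCTS = {"card", "wire", "ach"}


def enforce_allowlists(proposed_action: str, product: str) -> str:
    """Reject any model-proposed action or product outside approved values."""
    if proposed_action not in ALLOWED_CASE_ACTIONS:
        raise ValueError(f"Disallowed case action: {proposed_action}")
    if product not in ALLOWED_PRODUCTS:
        raise ValueError(f"Unknown product channel: {product}")
    return proposed_action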
Common Pitfalls
- Letting the model decide without deterministic checks
  - Bad pattern: sending every transaction straight to the LLM.
  - Fix: run rule-based thresholds first; use the agent for interpretation and triage.
- Using free-form text outputs in downstream systems
  - Bad pattern: parsing “looks suspicious” from an unstructured response.
  - Fix: enforce PydanticOutputParser or native structured output with strict enums.
- Exposing sensitive bank data directly to the prompt
  - Bad pattern: dumping full account history or PII into context.
  - Fix: minimize fields sent to the model; mask identifiers; keep sensitive joins inside approved tools (see the masking sketch after this list).
- Skipping human review on high-risk cases
  - Bad pattern: auto-freezing accounts based on one model call.
  - Fix: route escalations into analyst review unless policy explicitly allows automation with controls.
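For the PII pitfall, a short sketch of field minimization and masking before anything reaches the prompt; the field names are illustrative, and the raw identifier stays available only to approved tools:

import hashlib


def minimal_payload(txn: dict) -> dict:
    """Keep only the fields the model needs and pseudonymize the customer identifier."""
    masked = hashlib.sha256(txn["customer_id"].encode()).hexdigest()[:12]
    return {
        "customer_ref": masked,  # pseudonymous reference, not the raw ID
        "amount": txn["amount"],
        "currency": txn["currency"],
        "country": txn["country"],
        "channel": txn["channel"],
        "timestamp": txn["timestamp"],
    }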
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.