How to Build a Transaction Monitoring Agent Using LangChain in Python for Fintech

By Cyprian Aarons · Updated 2026-04-21

Tags: transaction-monitoring · langchain · python · fintech

A transaction monitoring agent watches payment and transfer activity, scores each event for risk, explains why it flagged something, and routes suspicious cases to the right workflow. For fintech, this matters because you need fast detection, consistent decisions, and an audit trail that can survive compliance review.

Architecture

  • Transaction ingestion layer

    • Pulls events from Kafka, SQS, a webhook, or a database stream.
    • Normalizes fields like amount, currency, merchant category, country, account age, and velocity metrics.
  • Risk feature builder

    • Computes deterministic signals before the LLM sees anything.
    • Examples: daily spend delta, geo mismatch, first-time beneficiary, rapid retries.
  • LangChain decision agent

    • Uses a ChatOpenAI model wrapped in a ChatPromptTemplate.
    • Produces structured outputs like risk score, reason codes, and next action.
  • Policy and compliance guardrails

    • Enforces hard rules outside the model.
    • Example: block sanctioned countries immediately; never let the model override AML policy.
  • Case management sink

    • Writes results to your alert queue, case system, or SIEM.
    • Stores input features, model output, prompt version, and timestamp for auditability.
  • Human review path

    • Escalates medium/high-risk cases to an analyst.
    • Keeps low-confidence decisions out of automatic enforcement.
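
The ingestion and feature stages above can be sketched as a pair of plain functions. The field names and thresholds here are illustrative, not a fixed schema:

```python
from dataclasses import dataclass

@dataclass
class NormalizedTx:
    amount: float
    currency: str
    country: str
    velocity_1h: int

def normalize(raw: dict) -> NormalizedTx:
    # Ingestion layer: coerce types, uppercase codes, default missing metrics.
    return NormalizedTx(
        amount=float(raw["amount"]),
        currency=raw.get("currency", "USD").upper(),
        country=raw["country"].upper(),
        velocity_1h=int(raw.get("velocity_1h", 0)),
    )

def featurize(tx: NormalizedTx) -> list[str]:
    # Risk feature builder: deterministic signals computed before any model call.
    signals = []
    if tx.velocity_1h >= 10:
        signals.append("high_velocity")
    if tx.amount > 5000:
        signals.append("high_amount")
    return signals
```

Keeping these stages as pure functions makes them trivial to unit-test, which matters when a regulator asks why a signal fired.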

Implementation

1) Install the right packages

Use LangChain’s current split packages. For a production service you want explicit dependencies rather than a single monolith import path.

pip install langchain langchain-openai pydantic

Set your API key in the environment:

export OPENAI_API_KEY="your-key"

2) Define the transaction schema and risk rules

Keep deterministic checks outside the LLM. The model should explain and classify; your code should enforce policy.

from typing import Literal
from pydantic import BaseModel, Field

class Transaction(BaseModel):
    transaction_id: str
    account_id: str
    amount: float
    currency: str
    country: str
    merchant_category: str
    account_age_days: int
    velocity_1h: int = Field(description="Number of transactions in the last hour")
    is_new_beneficiary: bool

SANCTIONED_COUNTRIES = {"IR", "KP", "SY"}

def hard_block(tx: Transaction) -> bool:
    return tx.country in SANCTIONED_COUNTRIES or tx.amount > 1000000
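
If you want each blocked transaction to carry a reason code for the audit trail, the hard rules can be expressed as named predicates instead of a single boolean. This is a sketch; the rule names and thresholds are examples:

```python
SANCTIONED_COUNTRIES = {"IR", "KP", "SY"}

# Each rule is a (reason_code, predicate) pair over a transaction dict.
POLICY_RULES = [
    ("sanctioned_country", lambda tx: tx["country"] in SANCTIONED_COUNTRIES),
    ("amount_over_limit", lambda tx: tx["amount"] > 1_000_000),
]

def policy_violations(tx: dict) -> list[str]:
    # Return every rule that fires, not just the first, for richer case notes.
    return [name for name, check in POLICY_RULES if check(tx)]
```

A non-empty result means the event is blocked before the model is ever called, and the codes go straight into the decision record.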

3) Build a structured LangChain agent for classification

Use ChatPromptTemplate, ChatOpenAI, and PydanticOutputParser so you get machine-readable output. This is better than free-form text because downstream systems need stable fields.

from typing import Literal
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser

class MonitoringDecision(BaseModel):
    risk_level: Literal["low", "medium", "high"]
    reason_codes: list[str] = Field(default_factory=list)
    recommended_action: Literal["allow", "review", "block"]
    summary: str

parser = PydanticOutputParser(pydantic_object=MonitoringDecision)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a transaction monitoring analyst for a fintech company. "
     "Apply AML/fraud reasoning. Never override hard policy rules. "
     "Return only valid structured output."),
    ("human",
     "Transaction:\n{transaction_json}\n\n"
     "Known risk signals:\n{signals}\n\n"
     "{format_instructions}")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

agent_chain = prompt | llm | parser

def build_signals(tx: Transaction) -> list[str]:
    signals = []
    if tx.velocity_1h >= 10:
        signals.append("high_velocity")
    if tx.is_new_beneficiary:
        signals.append("new_beneficiary")
    if tx.account_age_days < 30:
        signals.append("new_account")
    if tx.amount > 5000:
        signals.append("high_amount")
    return signals

def monitor_transaction(tx_dict: dict):
    tx = Transaction(**tx_dict)

    if hard_block(tx):
        return {
            "transaction_id": tx.transaction_id,
            "risk_level": "high",
            "recommended_action": "block",
            "reason_codes": ["policy_block"],
            "summary": "Blocked by deterministic policy rule."
        }

    result = agent_chain.invoke({
        "transaction_json": tx.model_dump_json(),
        # Join signals into readable text instead of a Python list repr.
        "signals": ", ".join(build_signals(tx)) or "none",
        "format_instructions": parser.get_format_instructions()
    })

    return {
        "transaction_id": tx.transaction_id,
        **result.model_dump()
    }

4) Run it on a sample transaction

This pattern fits a worker consuming events from your queue. The important part is that the LLM only handles interpretation after policy checks have already run.

sample_tx = {
    "transaction_id": "tx_12345",
    "account_id": "acct_999",
    "amount": 7800.50,
    "currency": "USD",
    "country": "GB",
    "merchant_category": "digital_goods",
    "account_age_days": 12,
    "velocity_1h": 14,
    "is_new_beneficiary": True,
}

decision = monitor_transaction(sample_tx)
print(decision)
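
To make the worker shape concrete, here is a minimal loop over an in-process queue. In production the queue would be Kafka or SQS and `decide` would be `monitor_transaction`; both stand-ins here are illustrative:

```python
import queue

def run_worker(events: "queue.Queue[dict]", decide, sink) -> int:
    """Drain the queue, score each event, and write results to the sink."""
    processed = 0
    while True:
        try:
            tx = events.get_nowait()
        except queue.Empty:
            break  # a real worker would block with a timeout instead of exiting
        sink(decide(tx))
        processed += 1
    return processed
```

Because `decide` and `sink` are injected, the loop can be tested with stubs and later wired to the real chain and case-management sink without changes.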

Production Considerations

  • Keep compliance logic deterministic

    • Sanctions screening, threshold blocks, and residency checks should live in code or dedicated rules engines.
    • The LLM should never be the final authority for AML or fraud controls.
  • Log everything needed for audit

    • Persist input payload hashes, extracted features, prompt version, model name, output JSON, and human reviewer actions.
    • Regulators will ask why a case was flagged; you need reproducible evidence.
  • Respect data residency

    • If customer data must stay in-region, route prompts through region-bound infrastructure or use an approved private deployment.
    • Avoid sending raw PII unless you have explicit legal basis and masking controls.
  • Add confidence-based routing

    • Low-risk cases can auto-close.
    • Medium-risk cases go to analysts.
    • High-risk or policy-triggered cases should block immediately and create an alert.
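
An audit record capturing the points above might look like this sketch, assuming a JSON-lines sink. Field names and the version constant are illustrative; adapt them to your case system's schema:

```python
import hashlib
import json
from datetime import datetime, timezone

PROMPT_VERSION = "tx-monitor-v1"  # bump whenever the prompt template changes

def audit_record(tx_payload: dict, decision: dict, model: str) -> dict:
    # Hash the payload rather than storing raw PII in the audit log.
    return {
        "payload_sha256": hashlib.sha256(
            json.dumps(tx_payload, sort_keys=True).encode()
        ).hexdigest(),
        "decision": decision,
        "model": model,
        "prompt_version": PROMPT_VERSION,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Sorting keys before hashing keeps the hash stable across dict orderings, so the same payload always yields the same evidence fingerprint.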

Common Pitfalls

  1. Letting the model make policy decisions

    • Mistake: asking the LLM whether to ignore sanctions or threshold rules.
    • Fix: enforce hard blocks before any model call.
  2. Using free-form outputs

    • Mistake: parsing plain English summaries with regex.
    • Fix: use PydanticOutputParser or another structured output approach so downstream systems get stable fields.
  3. Ignoring prompt/version traceability

    • Mistake: changing prompts without tracking which version produced which alert.
    • Fix: store prompt templates, model name, temperature, and chain version with every decision record.
  4. Sending raw sensitive data everywhere

    • Mistake: dumping full customer profiles into prompts.
    • Fix: minimize payloads. Mask account numbers, remove unnecessary PII, and keep only features needed for the monitoring decision.
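
A minimization step along these lines can run before any prompt is built. The allowed-field list and masking scheme are illustrative:

```python
# Only the features the monitoring decision actually needs.
ALLOWED_FIELDS = {
    "amount", "currency", "country", "merchant_category",
    "account_age_days", "velocity_1h", "is_new_beneficiary",
}

def mask_account(account_id: str) -> str:
    # Keep only the last 4 characters so analysts can cross-reference.
    return "****" + account_id[-4:]

def minimize_payload(tx: dict) -> dict:
    # Drop everything outside the allow-list; never forward raw identifiers.
    out = {k: v for k, v in tx.items() if k in ALLOWED_FIELDS}
    out["account_ref"] = mask_account(tx["account_id"])
    return out
```

An allow-list beats a deny-list here: new upstream fields are excluded by default instead of leaking into prompts until someone notices.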

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
