How to Build a Transaction Monitoring Agent Using AutoGen in Python for Banking
A transaction monitoring agent watches payment events, flags suspicious patterns, and routes cases for review before they become losses or compliance failures. In banking, that matters because you need fast detection, traceable decisions, and controls that satisfy AML, fraud, and audit requirements.
Architecture
- •Transaction ingestion layer
  - •Pulls card, ACH, wire, and internal transfer events from a queue or stream.
  - •Normalizes fields like amount, merchant, counterparty, timestamp, country, and customer profile.
- •Risk rules engine
  - •Applies deterministic checks first: velocity thresholds, structuring patterns, high-risk geographies, sanctions hits.
  - •Keeps obvious cases out of the LLM path.
- •AutoGen analyst agent
  - •Uses AssistantAgent to explain why a transaction is suspicious or benign.
  - •Produces structured outputs for case management.
- •Tooling layer
  - •Exposes functions for customer lookup, historical activity retrieval, sanctions screening status, and alert creation.
  - •Implemented with FunctionTool or plain Python callables registered as tools.
- •Review / escalation agent
  - •Uses a second agent to validate the first agent’s reasoning and reduce false positives.
  - •Useful when you want a “second pair of eyes” before filing an alert.
- •Audit and persistence layer
  - •Stores input data hashes, model outputs, tool calls, and final disposition.
  - •Required for model governance and regulator review.
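To make the ingestion layer's normalization step concrete, here is a minimal sketch using only the standard library. The raw field names ("id", "cust", "amt", "ccy", "ctry", "ts") and the NormalizedEvent class are hypothetical; each real source system will need its own mapping, and the epoch-seconds timestamp is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NormalizedEvent:
    """Shared shape for events from every channel (hypothetical)."""
    transaction_id: str
    customer_id: str
    amount: float
    currency: str
    country: str
    channel: str
    timestamp: str  # ISO 8601 string in UTC


def normalize_event(raw: dict, channel: str) -> NormalizedEvent:
    """Map one raw payment event onto the shared shape.

    The raw keys used here are illustrative assumptions, not a real feed format.
    """
    # Assumes the source sends epoch seconds; convert to an ISO 8601 UTC string.
    ts = datetime.fromtimestamp(raw["ts"], tz=timezone.utc).isoformat()
    return NormalizedEvent(
        transaction_id=str(raw["id"]),
        customer_id=str(raw["cust"]),
        amount=round(float(raw["amt"]), 2),
        currency=raw["ccy"].strip().upper(),
        country=raw["ctry"].strip().upper(),
        channel=channel,
        timestamp=ts,
    )


# Usage: a wire event with messy casing and a string amount.
event = normalize_event(
    {"id": 981, "cust": "C123", "amt": "1250.5", "ccy": "usd ", "ctry": "us", "ts": 0},
    channel="wire",
)
```

Normalizing once at the edge means the rules engine and the agent never have to guess at casing, types, or time zones.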
Implementation
1) Install AutoGen and define the transaction schema
Use a minimal schema so every alert has the same shape. In banking workflows, consistency matters more than free-form analysis.
pip install pyautogen pydantic
from pydantic import BaseModel
from typing import List


class Transaction(BaseModel):
    transaction_id: str
    customer_id: str
    amount: float
    currency: str
    country: str
    channel: str
    timestamp: str
    merchant_category: str
    prior_transactions_24h: int
    prior_amount_24h: float
    is_sanctioned_country: bool = False
    is_pep: bool = False


class AlertResult(BaseModel):
    risk_level: str
    reasons: List[str]
    action: str
2) Build deterministic checks before calling the model
Do not send every transaction straight to the LLM. Use rules to filter obvious low-risk or high-risk cases first.
def rule_score(txn: Transaction) -> dict:
    reasons = []
    score = 0

    if txn.amount >= 10000:
        score += 2
        reasons.append("Amount exceeds $10k threshold")

    if txn.prior_transactions_24h >= 8:
        score += 2
        reasons.append("High velocity in last 24h")

    if txn.prior_amount_24h >= txn.amount * 5:
        score += 1
        reasons.append("Material cumulative spend in last 24h")

    if txn.is_sanctioned_country:
        score += 5
        reasons.append("Counterparty country flagged as sanctioned/high risk")

    if txn.is_pep:
        score += 1
        reasons.append("Customer marked PEP")

    return {"score": score, "reasons": reasons}
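The architecture lists structuring patterns among the deterministic checks, but rule_score does not cover them. A minimal sketch of one possible check follows. It assumes you can retrieve the individual amounts from the last 24 hours, which the Transaction schema above does not carry (it only has a count and a total). The function name and the min_count parameter are hypothetical, and real structuring rules are usually more nuanced.

```python
from typing import List


def looks_like_structuring(
    amounts_24h: List[float],
    threshold: float = 10_000.0,  # same $10k figure used in rule_score
    min_count: int = 2,           # assumption: at least two sub-threshold payments
) -> bool:
    """Flag several sub-threshold payments that together reach the threshold."""
    # Only payments that individually stay under the threshold are relevant.
    below = [a for a in amounts_24h if 0 < a < threshold]
    return len(below) >= min_count and sum(below) >= threshold


# Three payments under $10k that sum to $10,500 are flagged.
split_payments = looks_like_structuring([4000.0, 3500.0, 3000.0])
# A single large payment is not structuring; the amount rule handles it.
single_large = looks_like_structuring([12000.0])
```

A hit from a check like this could add to the rule score in the same way as the other deterministic findings.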
3) Create an AutoGen agent with tools for case analysis
This pattern uses AssistantAgent plus registered tools. The tool returns facts; the model turns them into a banking-friendly narrative.
import os
import json
from autogen import AssistantAgent


def get_customer_profile(customer_id: str) -> dict:
    # Replace with real DB lookup in production.
    profiles = {
        "C123": {"segment": "retail", "tenure_months": 18, "risk_rating": "medium"},
        "C456": {"segment": "business", "tenure_months": 4, "risk_rating": "high"},
    }
    return profiles.get(customer_id, {"segment": "unknown", "tenure_months": 0, "risk_rating": "unknown"})


def create_alert(payload: str) -> str:
    # Replace with case management API call.
    return json.dumps({"status": "queued", "case_id": "CASE-2026-00127", "payload": payload})


llm_config = {
    "model": os.environ["OPENAI_MODEL"],
    "api_key": os.environ["OPENAI_API_KEY"],
}

agent = AssistantAgent(
    name="transaction_monitor",
    llm_config=llm_config,
)

# register_for_llm only advertises each tool's schema to the model. For tool calls
# to actually run, the function must also be registered for execution on the agent
# that executes them (register_for_execution, for example on a UserProxyAgent).
agent.register_for_llm(name="get_customer_profile", description="Fetch customer profile")(get_customer_profile)
agent.register_for_llm(name="create_alert", description="Create a compliance alert")(create_alert)
4) Run analysis and force structured output
Keep the prompt narrow. Ask for risk level, reasons, and action only. That makes downstream case handling predictable.
def analyze_transaction(txn: Transaction) -> AlertResult:
    rules = rule_score(txn)

    # Low rule scores never reach the LLM.
    if rules["score"] < 2:
        return AlertResult(
            risk_level="low",
            reasons=rules["reasons"] or ["No material rule hits"],
            action="no_action",
        )

    prompt = f"""
Analyze this banking transaction for AML/fraud risk.

Transaction:
{txn.model_dump_json(indent=2)}

Deterministic rule findings:
{json.dumps(rules)}

Return only a JSON object with these keys:
- risk_level: low|medium|high
- reasons: list of concise reasons grounded in facts
- action: no_action|queue_for_review|escalate_immediately

Use customer profile if needed via tool calls.
"""

    response = agent.generate_reply(messages=[{"role": "user", "content": prompt}])
    # generate_reply can return a str, a dict, or None; a tool-call reply carries no text content.
    content = response if isinstance(response, str) else ((response or {}).get("content") or "")

    # In production parse JSON strictly; here we assume the model returns valid JSON.
    data = json.loads(content)

    if data["action"] != "no_action":
        alert_payload = json.dumps({"transaction_id": txn.transaction_id, **data})
        create_alert(alert_payload)

    return AlertResult(**data)
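The comment in the code above says to parse JSON strictly in production. One way to do that, sketched with the standard library only, is to validate the reply against the allowed values and fall back to human review when anything is off. The parse_model_output name and the fallback policy are assumptions for illustration, not part of AutoGen.

```python
import json
from typing import Any, Dict

ALLOWED_RISK_LEVELS = {"low", "medium", "high"}
ALLOWED_ACTIONS = {"no_action", "queue_for_review", "escalate_immediately"}


def parse_model_output(content: str) -> Dict[str, Any]:
    """Return a validated result dict, or a review fallback if validation fails."""
    # Assumption: an unreadable reply is safer in a human queue than dropped.
    fallback = {
        "risk_level": "medium",
        "reasons": ["Model output could not be validated"],
        "action": "queue_for_review",
    }
    try:
        data = json.loads(content)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError
        return fallback
    if not isinstance(data, dict):
        return fallback

    risk = data.get("risk_level")
    action = data.get("action")
    reasons = data.get("reasons")
    valid = (
        isinstance(risk, str) and risk in ALLOWED_RISK_LEVELS
        and isinstance(action, str) and action in ALLOWED_ACTIONS
        and isinstance(reasons, list)
        and all(isinstance(r, str) for r in reasons)
    )
    if not valid:
        return fallback
    # Keep only the expected keys so extra model output never leaks downstream.
    return {"risk_level": risk, "reasons": reasons, "action": action}


good = parse_model_output(
    '{"risk_level": "high", "reasons": ["Sanctioned country"], "action": "escalate_immediately"}'
)
bad = parse_model_output("Sure! Here is my analysis...")
```

The returned dict has the same keys as AlertResult, so it could replace the bare json.loads call in analyze_transaction.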
Production Considerations
- •Deploy behind a controlled service boundary
  - •Put the agent behind an internal API with mTLS and service-to-service auth.
  - •Keep raw PII out of prompts unless you have a documented legal basis and retention policy.
- •Log everything needed for audit
  - •Persist transaction input hashes, rule scores, tool invocations, model version, prompt version, and final disposition.
  - •Regulators care about reproducibility more than cleverness.
- •Add hard guardrails
  - •Block unsupported actions like account closure or SAR filing without human approval.
  - •Use allowlisted tools only; never give the model unrestricted database access.
- •Respect residency and privacy constraints
  - •If data must stay in-region, run inference in-region too.
  - •Mask account numbers, names, and addresses when they are not required for decisioning.
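The audit and masking points above can be sketched with the standard library. This is an illustrative shape only: the function names, the record fields, and the choice of SHA-256 over canonical JSON are assumptions, and your governance team may require a different format.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List


def mask_account(account_number: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of an account number."""
    if len(account_number) <= visible:
        return "*" * len(account_number)
    return "*" * (len(account_number) - visible) + account_number[-visible:]


def input_hash(txn: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(txn, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(
    txn: dict,
    rule_result: dict,
    tool_calls: List[str],
    model_version: str,
    prompt_version: str,
    disposition: str,
) -> dict:
    """Collect the fields a reviewer needs to trace one decision."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "transaction_id": txn["transaction_id"],
        "input_sha256": input_hash(txn),
        "rule_score": rule_result["score"],
        "rule_reasons": rule_result["reasons"],
        "tool_calls": tool_calls,
        "model_version": model_version,
        "prompt_version": prompt_version,
        "disposition": disposition,
    }


# Usage with placeholder values (version labels are hypothetical).
sample_txn = {"transaction_id": "T1", "customer_id": "C123", "amount": 12000.0}
record = build_audit_record(
    sample_txn,
    {"score": 2, "reasons": ["Amount exceeds $10k threshold"]},
    tool_calls=["get_customer_profile"],
    model_version="model-v1",
    prompt_version="prompt-v3",
    disposition="queue_for_review",
)
masked = mask_account("1234567890")
```

Storing a hash instead of the raw input keeps PII out of the audit log while still letting you prove which input produced a decision.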
Common Pitfalls
- •Sending every transaction to the LLM
  - •This increases cost and latency while making results less stable.
  - •Fix it by using deterministic thresholds first and only escalating ambiguous cases.
- •Letting the model free-write decisions
  - •Free-text outputs are hard to audit and hard to integrate with case systems.
  - •Fix it by requiring structured JSON output with a strict schema like AlertResult.
- •Skipping human review on high-impact alerts
  - •Banking workflows need escalation controls for SAR candidates and severe fraud cases.
  - •Fix it by routing high-risk cases to a reviewer agent or analyst queue before any external filing or customer action.
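As a rough illustration of the human-review pitfall, a small routing gate can decide where each decision goes before anything reaches a customer or a regulator. The queue names and the route_decision function are hypothetical; the three allowed actions are the ones used in the prompt earlier.

```python
ALLOWED_AGENT_ACTIONS = {"no_action", "queue_for_review", "escalate_immediately"}


def route_decision(risk_level: str, action: str) -> str:
    """Return the (hypothetical) queue a decision should be sent to."""
    # Anything outside the allowlist, such as account closure or SAR filing,
    # is never executed automatically; a human analyst has to look at it.
    if action not in ALLOWED_AGENT_ACTIONS:
        return "analyst_queue"
    # High-risk or urgent cases always get a human reviewer.
    if risk_level == "high" or action == "escalate_immediately":
        return "analyst_queue"
    if action == "queue_for_review":
        return "review_queue"
    return "no_further_action"


high_risk_route = route_decision("high", "queue_for_review")
unsupported_route = route_decision("low", "file_sar")
```

Keeping this gate in plain code, outside the model, means a bad model output can at worst send a case to a person.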
By Cyprian Aarons, AI Consultant at Topiax.