How to Build a Transaction Monitoring Agent Using AutoGen in Python for Investment Banking
A transaction monitoring agent in investment banking watches client and market activity, flags suspicious patterns, and routes cases for review before they become compliance problems. It matters because false negatives create regulatory exposure, while false positives waste analyst time and slow down investigations.
Architecture
- Ingestion layer
  - Pulls trades, payments, order events, and reference data from internal systems.
  - Normalizes timestamps, client IDs, instrument identifiers, and venue codes.
- Rules and feature extraction
  - Computes AML/market abuse features like velocity, structuring patterns, round-tripping, wash trading indicators, and threshold breaches.
  - Keeps deterministic logic outside the LLM so results are auditable (a feature-extraction sketch follows this list).
- AutoGen agent team
  - A triage agent summarizes alerts.
  - A compliance agent applies policy language.
  - A reviewer agent checks for missing evidence and escalates edge cases.
- Case store and audit log
  - Persists every alert, prompt, tool call, decision, and reviewer comment.
  - Supports model governance and regulator requests.
- Human-in-the-loop review
  - Routes high-risk or low-confidence cases to a compliance analyst.
  - Prevents the model from making the final adjudication on suspicious activity.
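To make the "deterministic logic outside the LLM" point concrete, here is a minimal sketch of the rules layer. The function name compute_risk_features, the trade dict shape, and both thresholds are assumptions for illustration; real thresholds come from your policy, not from this code.

from datetime import datetime, timedelta
from typing import Dict, List

def compute_risk_features(trades: List[Dict], threshold: float = 1_000_000.0) -> List[str]:
    """Deterministic feature extraction for the rules layer (illustrative thresholds, not policy values)."""
    features = []
    # Threshold breach: any single trade at or above the illustrative amount threshold.
    if any(t["amount"] >= threshold for t in trades):
        features.append("threshold_breach")
    # Rapid repeat trades: more than five trades inside any rolling 60-second window.
    times = sorted(datetime.fromisoformat(t["timestamp"].replace("Z", "+00:00")) for t in trades)
    for i, start in enumerate(times):
        if sum(1 for t in times[i:] if t - start <= timedelta(seconds=60)) > 5:
            features.append("rapid_repeat_trades")
            break
    return features

Because this logic is plain Python, every flag that later reaches an agent prompt can be reproduced and explained without the model in the loop.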
Implementation
1) Install AutoGen and define your data contract
Use pyautogen and keep the transaction payload small enough to audit. In banking systems, do not pass raw PII unless your environment explicitly allows it.
pip install pyautogen
from dataclasses import dataclass
from typing import List

@dataclass
class TransactionAlert:
    alert_id: str
    client_id: str
    account_id: str
    instrument: str
    amount: float
    currency: str
    timestamp: str
    venue: str
    risk_features: List[str]
    notes: str = ""
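If raw identifiers must not reach the model, one option is to pseudonymize them before any prompt is built. The sketch below assumes a salted-hash approach; pseudonymize_alert and the PSEUDONYM_SALT environment variable are illustrative names, and this is not a substitute for your bank's PII controls.

import hashlib
import os
from dataclasses import replace

def pseudonymize_alert(alert: TransactionAlert) -> TransactionAlert:
    """Swap direct identifiers for salted hashes before prompting (illustrative, not a full PII control)."""
    salt = os.environ.get("PSEUDONYM_SALT", "")

    def mask(value: str) -> str:
        return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]

    return replace(alert, client_id=mask(alert.client_id), account_id=mask(alert.account_id))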
2) Create agents with explicit roles
Use AssistantAgent for analysis and UserProxyAgent for orchestration. Set llm_config with a real model configuration in your environment.
import os

import autogen

llm_config = {
    "config_list": [
        {
            "model": os.environ["OPENAI_MODEL"],
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

triage_agent = autogen.AssistantAgent(
    name="triage_agent",
    llm_config=llm_config,
    system_message=(
        "You triage investment banking transaction alerts. "
        "Return concise findings, cite risk features, and never claim certainty. "
        "If evidence is insufficient, recommend escalation."
    ),
)

compliance_agent = autogen.AssistantAgent(
    name="compliance_agent",
    llm_config=llm_config,
    system_message=(
        "You assess alerts against AML and market abuse policy. "
        "Focus on compliance language, auditability, and escalation criteria."
    ),
)

reviewer_agent = autogen.AssistantAgent(
    name="reviewer_agent",
    llm_config=llm_config,
    system_message=(
        "You validate whether the prior analysis is complete. "
        "Check for missing red flags, unsupported claims, and required escalation."
    ),
)
user_proxy = autogen.UserProxyAgent(
    name="case_orchestrator",
    human_input_mode="NEVER",
    # Keep the orchestrator from executing code or looping on auto-replies;
    # each initiate_chat call in step 3 stays a single request/response exchange.
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)
3) Run a controlled multi-agent review loop
This pattern keeps the workflow deterministic enough for production. The LLM produces triage text; your application decides whether to escalate based on policy thresholds.
def build_prompt(alert: TransactionAlert) -> str:
    return f"""
Alert ID: {alert.alert_id}
Client ID: {alert.client_id}
Account ID: {alert.account_id}
Instrument: {alert.instrument}
Amount: {alert.amount} {alert.currency}
Timestamp: {alert.timestamp}
Venue: {alert.venue}
Risk features: {', '.join(alert.risk_features)}
Notes: {alert.notes}

Task:
1. Summarize the alert in one paragraph.
2. Identify the top compliance concerns.
3. State whether to escalate to human review.
4. Keep the response suitable for an audit record.
"""

def analyze_alert(alert: TransactionAlert):
    prompt = build_prompt(alert)

    triage_result = user_proxy.initiate_chat(
        triage_agent,
        message=prompt,
        clear_history=True,
        silent=True,
    )
    compliance_result = user_proxy.initiate_chat(
        compliance_agent,
        message=f"Review this triage output for policy alignment:\n{triage_result.summary}",
        clear_history=True,
        silent=True,
    )
    reviewer_result = user_proxy.initiate_chat(
        reviewer_agent,
        message=f"Validate completeness of this case:\n{compliance_result.summary}",
        clear_history=True,
        silent=True,
    )
    return {
        "triage": triage_result.summary,
        "compliance": compliance_result.summary,
        "review": reviewer_result.summary,
    }

if __name__ == "__main__":
    alert = TransactionAlert(
        alert_id="ALRT-10021",
        client_id="C12345",
        account_id="A99881",
        instrument="EUR/USD SPOT",
        amount=2500000.0,
        currency="USD",
        timestamp="2026-04-21T10:15:00Z",
        venue="ECN-7",
        risk_features=["rapid_repeat_trades", "threshold_breach", "unusual_venue_pattern"],
    )
    result = analyze_alert(alert)
    print(result["triage"])
4) Add a decision gate outside the model
Do not let the agent decide final disposition without policy checks. Use deterministic rules for escalation.
def should_escalate(alert: TransactionAlert) -> bool:
    high_risk_flags = {
        "threshold_breach",
        "rapid_repeat_trades",
        "structuring_pattern",
        "sanctions_match_candidate",
        "unusual_venue_pattern",
        "layering_indicator",
    }
    return any(flag in high_risk_flags for flag in alert.risk_features)

if should_escalate(alert):
    print("Route to human compliance review")
else:
    print("Store as low-risk monitored case")
Production Considerations
- Auditability
  - Persist prompts, responses, model version, timestamps, input features, and final disposition (a minimal logging sketch follows this list).
  - Regulators will ask why a case was escalated or closed; you need an immutable trail.
- Data residency
  - Keep client data inside approved regions.
  - If your bank has regional processing constraints, pin model endpoints and logs to that jurisdiction.
- Guardrails
  - Redact PII before sending content to the model when possible.
  - Block free-form advice on legal conclusions; force outputs into fixed schemas or short summaries.
- Monitoring
  - Track false positive rate, analyst override rate, latency per case, and escalation volume by desk or product.
  - Sudden drift in these metrics usually means a rule change upstream or a broken prompt.
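As a starting point for the audit trail, the sketch below appends one JSON line per case. The append_audit_record name, the field set, and the local-file destination are assumptions; in production you would write to an append-only store pinned to the approved jurisdiction.

import json
from datetime import datetime, timezone

def append_audit_record(path: str, alert: TransactionAlert, prompt: str, outputs: dict, model: str) -> None:
    """Append one audit line per alert as JSON Lines (illustrative; use immutable/WORM storage in production)."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "alert_id": alert.alert_id,
        "model": model,
        "prompt": prompt,
        "risk_features": alert.risk_features,
        "outputs": outputs,
        "escalated": should_escalate(alert),
    }
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")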
Common Pitfalls
- Using the LLM as the decision engine
  - Don’t ask AutoGen to decide whether a trade is suspicious on its own.
  - Use it for triage and explanation; keep final disposition in deterministic policy code plus human review (a schema-validation sketch follows this list).
- Passing too much raw data into prompts
  - Dumping full trade histories increases cost and leaks sensitive information.
  - Precompute features in your pipeline and send only what the agent needs to explain the alert.
- Skipping replayable audit logs
  - If you cannot reproduce a case exactly later, you do not have a compliant workflow.
  - Store the exact prompt text, retrieved evidence set, model config, and agent outputs for every alert.
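One way to keep the model out of the decision seat is to constrain its output to a fixed schema and treat anything else as a failure. The sketch below assumes you adjust the triage system message to return JSON; parse_triage_output and its required keys are illustrative choices, not an AutoGen feature.

import json

REQUIRED_KEYS = {"summary", "concerns", "recommend_escalation"}

def parse_triage_output(raw: str) -> dict:
    """Accept triage output only if it is valid JSON with the expected keys; otherwise route to a human (illustrative guardrail)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("Triage output is not valid JSON; route to human review") from exc
    if not isinstance(payload, dict):
        raise ValueError("Triage output is not a JSON object; route to human review")
    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"Triage output missing fields {sorted(missing)}; route to human review")
    return payload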
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.