How to Build a Fraud Detection Agent Using AutoGen in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
fraud-detection · autogen · python · investment-banking

A fraud detection agent in investment banking watches transaction streams, client activity, and exception signals, then decides when to escalate a case, request more evidence, or block an action. It matters because the cost of a missed alert is not just financial loss; it is regulatory exposure, failed controls, and audit findings that can hit the desk, the business line, and the bank’s license.

Architecture

Build this agent as a small multi-agent system, not a single monolith.

  • Transaction Analyst Agent

    • Ingests trade events, wire transfers, account changes, and login metadata.
    • Extracts suspicious patterns like velocity spikes, round-tripping, unusual counterparties, and sanction-adjacent behavior.
  • Policy/Compliance Agent

    • Checks decisions against AML/KYC rules, internal surveillance policy, and escalation thresholds.
    • Ensures the agent does not recommend actions outside approved operating procedures.
  • Evidence Retriever

    • Pulls customer profile data, historical alerts, case notes, and watchlist hits from internal systems.
    • Keeps responses grounded in bank-owned data instead of free-form reasoning.
  • Escalation Agent

    • Converts a suspicious event into a structured case summary.
    • Routes to human investigators with the right severity and supporting evidence.
  • Supervisor Orchestrator

    • Coordinates the conversation between agents.
    • Decides whether the output is “clear,” “needs review,” or “escalate immediately.”
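
Before any prompts exist, it helps to pin the supervisor’s verdict down as a type rather than free text. A minimal sketch of the structured case summary the Escalation Agent would produce (names are illustrative, not AutoGen API):

from dataclasses import dataclass, field
from enum import Enum

class Verdict(str, Enum):
    CLEAR = "clear"
    NEEDS_REVIEW = "needs_review"
    ESCALATE = "escalate_immediately"

@dataclass
class CaseSummary:
    event_id: str
    verdict: Verdict
    indicators: list[str] = field(default_factory=list)    # e.g. "velocity spike"
    evidence_refs: list[str] = field(default_factory=list) # IDs from the Evidence Retriever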

Implementation

1. Install AutoGen and define your model client

For production banking systems, use a controlled model endpoint with logging turned on at the application layer. The example below uses AutoGen’s AssistantAgent and UserProxyAgent, which the top-level autogen package re-exports from autogen.agentchat.
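
The classic AgentChat API used throughout this guide ships in the pyautogen package (assuming the AutoGen 0.2-style imports shown below):

pip install pyautogen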

import os
from autogen import AssistantAgent, UserProxyAgent

# Temperature 0 keeps assessments as reproducible as possible, which matters for audit.
llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

fraud_analyst = AssistantAgent(
    name="fraud_analyst",
    llm_config=llm_config,
    system_message=(
        "You are a fraud detection analyst for investment banking. "
        "Focus on AML red flags, market abuse indicators, unusual transfer behavior, "
        "and compliance-safe escalation. Return concise findings."
    ),
)

compliance_reviewer = AssistantAgent(
    name="compliance_reviewer",
    llm_config=llm_config,
    system_message=(
        "You are a compliance reviewer. Validate that any recommendation follows "
        "bank policy, auditability requirements, and escalation rules."
    ),
)

operator = UserProxyAgent(
    name="operator",
    human_input_mode="NEVER",       # fully automated; never prompt a console user
    max_consecutive_auto_reply=0,   # end each chat after the assistant responds
    code_execution_config=False,    # this proxy must never execute generated code
)
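
The steps below call each agent in sequence, which is the easiest flow to audit. If you instead want the Supervisor Orchestrator pattern from the architecture section, AutoGen’s GroupChat and GroupChatManager can coordinate the same agents; a minimal sketch:

from autogen import GroupChat, GroupChatManager

group_chat = GroupChat(
    agents=[operator, fraud_analyst, compliance_reviewer],
    messages=[],
    max_round=6,  # bound the conversation so runs stay predictable
)
supervisor = GroupChatManager(groupchat=group_chat, llm_config=llm_config)
# operator.initiate_chat(supervisor, message="<structured event here>")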

2. Feed the agents a structured event

Do not send raw chatty text. Investment banking controls work better when you pass normalized facts: client type, amount, geography, timestamps, counterparties, and prior risk flags.
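
One way to enforce that discipline is to build the event text from a typed record instead of writing it by hand. A minimal sketch (the field names simply mirror the event below):

from dataclasses import dataclass, asdict

@dataclass
class TransactionEvent:
    client_id: str
    desk: str
    amount_usd: int
    currency: str
    timestamp_utc: str
    counterparty_country: str
    beneficiary_name: str
    prior_alerts_90d: int
    login_ip_country: str
    device_change_last_24h: bool
    notes: str

    def to_prompt_block(self) -> str:
        # Render the record as the same bullet-list block shown below.
        lines = [f"- {key}: {value}" for key, value in asdict(self).items()]
        return "Event:\n" + "\n".join(lines)

The hand-written equivalent looks like this: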

transaction_event = """
Event:
- client_id: CUST-88421
- desk: Equity Derivatives
- amount_usd: 4850000
- currency: USD
- timestamp_utc: 2026-04-21T09:14:00Z
- counterparty_country: KY
- beneficiary_name: Orion Trading Ltd
- prior_alerts_90d: 3
- login_ip_country: SG
- device_change_last_24h: true
- notes: wire requested shortly after margin call; beneficiary added today
"""

analysis_prompt = f"""
Review this transaction for fraud/AML risk and provide:
1) risk level,
2) key indicators,
3) recommended action,
4) whether escalation is required.

{transaction_event}
"""

result = operator.initiate_chat(
    fraud_analyst,
    message=analysis_prompt,
)

3. Add a second-pass compliance review

The first agent should detect risk. The second agent should validate that the recommendation is safe to execute under bank policy. This keeps your workflow aligned with audit expectations.

fraud_summary = result.chat_history[-1]["content"]

review_prompt = f"""
Validate this fraud assessment for compliance:

{fraud_summary}

Check for:
- unsupported claims,
- missing evidence,
- inappropriate automated action,
- need for human investigator review.
"""

review = operator.initiate_chat(
    compliance_reviewer,
    message=review_prompt,
)
print(review.chat_history[-1]["content"])

4. Use a simple orchestration pattern for escalation

In production you want deterministic routing around the LLM output. A common pattern is to parse severity and send only high-risk cases to investigators.

def route_case(assessment_text: str) -> str:
    text = assessment_text.lower()
    if "high risk" in text or "escalation required" in text:
        return "ESCALATE"
    if "medium risk" in text or "manual review" in text:
        return "REVIEW"
    return "CLOSE"

final_assessment = review.chat_history[-1]["content"]
decision = route_case(final_assessment)

if decision == "ESCALATE":
    print("Open case in investigation queue")
elif decision == "REVIEW":
    print("Send to analyst queue")
else:
    print("No action required")

Production Considerations

  • Audit trail

    • Persist every prompt, model response, routing decision, and source event ID.
    • Regulators will ask why a case was escalated or closed; you need reproducible evidence.
  • Data residency

    • Keep customer data inside approved regions and approved model endpoints.
    • If your bank has jurisdictional constraints, do not send PII or trade data to an unapproved external service.
  • Guardrails

    • Hard-code a list of disallowed actions, such as “freeze account,” that cannot run unless a human approves them.
    • Use schema validation on outputs so the agent returns structured severity fields instead of free text only (see the sketch after this list).
  • Monitoring

    • Track false positives by desk, region, product type, and counterparty class.
    • Watch for drift after product launches or market events; fraud patterns change fast in investment banking flows.
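
For the schema-validation guardrail above, a minimal sketch using Pydantic v2 (the model and its fields are assumptions, not bank policy):

from typing import Literal
from pydantic import BaseModel, ValidationError

class FraudAssessment(BaseModel):
    risk_level: Literal["low", "medium", "high"]
    indicators: list[str]
    recommended_action: str
    escalation_required: bool

def parse_assessment(raw_json: str) -> FraudAssessment | None:
    # Reject anything that does not match the schema instead of guessing.
    try:
        return FraudAssessment.model_validate_json(raw_json)
    except ValidationError:
        return None  # malformed output goes to manual review, never auto-close

You would instruct the agents to answer in this JSON shape and route anything that fails validation to a human queue.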

Common Pitfalls

  1. Letting the model make final decisions

    • Don’t let AutoGen directly trigger account blocks or SAR filing.
    • Use it to recommend; keep execution behind deterministic policy checks and human approval gates (see the sketch after this list).
  2. Passing unstructured context

    • Free-text dumps make it hard to audit and harder to compare cases.
    • Normalize inputs into fields like amount, country code, desk name, alert history, and timestamp.
  3. Ignoring compliance boundaries

    • A good fraud score is not enough if the workflow violates retention rules or residency rules.
    • Define what data can be sent to the model before you ship anything to production.
  4. No feedback loop from investigators

    • If analysts override the agent repeatedly and you do nothing with that signal, accuracy will stall.
    • Feed closure reasons back into your rules engine and prompt templates so the system improves with actual case outcomes.
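
For pitfall 1, the gate can be as simple as a function that refuses privileged actions without a recorded human approver; a minimal sketch (the action names are assumptions):

PRIVILEGED_ACTIONS = {"freeze_account", "file_sar", "block_wire"}

def execute_action(action: str, approver_id: str | None = None) -> bool:
    # Never run a privileged action on model output alone.
    if action in PRIVILEGED_ACTIONS and approver_id is None:
        print(f"{action} queued for human approval")
        return False
    print(f"{action} executed (approved_by={approver_id or 'auto'})")
    return True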

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

