How to Build a Fraud Detection Agent Using AutoGen in Python for Wealth Management

By Cyprian Aarons · Updated 2026-04-21
fraud-detection · autogen · python · wealth-management

A fraud detection agent for wealth management watches client activity, flags suspicious behavior, and routes high-risk cases for human review before money moves or accounts are compromised. It matters because wealth platforms handle large transfers, trusted relationships, and strict compliance requirements, so false negatives are expensive and false positives can quickly disrupt clients and advisors.

Architecture

  • Client activity ingestion
    • Pulls transactions, login events, beneficiary changes, address updates, and advisor notes from internal systems.
  • Risk scoring layer
    • Applies deterministic rules first: velocity spikes, unusual geolocation, new payee setup, device mismatch, and transfer size anomalies.
  • AutoGen orchestration
    • Uses a UserProxyAgent to run the workflow and one or more AssistantAgent instances to analyze cases and produce structured findings.
  • Policy and compliance guardrail
    • Enforces wealth management rules like suitability checks, AML escalation thresholds, audit logging, and data residency constraints.
  • Case management output
    • Produces a JSON decision payload for downstream systems: approve, hold, escalate, or reject.
  • Human review handoff
    • Sends borderline cases to an operations or compliance analyst with evidence attached.
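The case management output above is easiest to see as a concrete payload. This is an illustrative sketch only; the field names mirror the decision schema defined in the Implementation section, and the values are made up.

```python
import json

# Hypothetical example of the decision payload the pipeline emits downstream.
# Field names mirror the FraudDecision schema defined later in this guide.
decision_payload = {
    "decision": "escalate",  # approve | hold | escalate | reject
    "risk_score": 82,        # 0-100
    "reasons": [
        "High-value wire to a newly added payee",
        "Login from an unrecognized device",
    ],
    "compliance_notes": [
        "Meets AML escalation threshold; route to compliance analyst",
    ],
}

print(json.dumps(decision_payload, indent=2))
```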

Implementation

1) Install AutoGen and define the case schema

Install the pyautogen package and keep the output contract strict. In fraud workflows, free-form prose is a liability; you want a machine-readable decision object every time.

pip install pyautogen pydantic

from pydantic import BaseModel, Field
from typing import Literal, List

class FraudCase(BaseModel):
    client_id: str
    account_id: str
    event_type: str
    amount_usd: float
    country: str
    device_trust_score: int = Field(ge=0, le=100)
    prior_alerts_30d: int = Field(ge=0)
    advisor_override: bool = False

class FraudDecision(BaseModel):
    decision: Literal["approve", "hold", "escalate", "reject"]
    risk_score: int = Field(ge=0, le=100)
    reasons: List[str]
    compliance_notes: List[str]
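To show why the strict contract pays off, here is a minimal sketch of validating decision JSON with Pydantic v2. The schema is re-declared so the snippet runs standalone; the sample JSON strings are assumptions.

```python
from typing import List, Literal

from pydantic import BaseModel, Field, ValidationError

# Re-declared from above so this snippet runs on its own.
class FraudDecision(BaseModel):
    decision: Literal["approve", "hold", "escalate", "reject"]
    risk_score: int = Field(ge=0, le=100)
    reasons: List[str]
    compliance_notes: List[str]

# A well-formed decision parses cleanly.
good = FraudDecision.model_validate_json(
    '{"decision": "hold", "risk_score": 71, "reasons": ["velocity spike"], "compliance_notes": []}'
)
print(good.decision, good.risk_score)

# An unknown decision value and an out-of-range score both fail fast,
# instead of silently flowing into case management.
try:
    FraudDecision.model_validate_json(
        '{"decision": "maybe", "risk_score": 150, "reasons": [], "compliance_notes": []}'
    )
except ValidationError as exc:
    print(f"rejected malformed decision: {exc.error_count()} errors")
```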

2) Create the AutoGen agents

The pattern here is simple: one assistant evaluates fraud risk, another enforces compliance language and auditability, and the user proxy runs the conversation. For wealth management, I keep these roles separate so policy checks don’t get buried inside analysis.

import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
}

fraud_analyst = AssistantAgent(
    name="fraud_analyst",
    llm_config=llm_config,
    system_message=(
        "You are a fraud detection analyst for a wealth management firm. "
        "Assess suspicious activity using transaction patterns, identity signals, "
        "and behavioral anomalies. Return concise findings."
    ),
)

compliance_reviewer = AssistantAgent(
    name="compliance_reviewer",
    llm_config=llm_config,
    system_message=(
        "You are a compliance reviewer for wealth management. "
        "Check that decisions respect AML/KYC expectations, auditability, "
        "and do not expose unnecessary personal data."
    ),
)

user_proxy = UserProxyAgent(
    name="orchestrator",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=2,
    code_execution_config=False,  # this proxy only routes messages; never execute model output
)

3) Run the case through a two-agent review flow

This example uses initiate_chat() to send the case to the fraud analyst first, then passes the result to compliance for a second pass. That gives you an auditable chain of reasoning without letting one model own both detection and policy.

import json

case = FraudCase(
    client_id="C12345",
    account_id="A99881",
    event_type="wire_transfer",
    amount_usd=250000.00,
    country="GB",
    device_trust_score=18,
    prior_alerts_30d=3,
    advisor_override=False,
)

fraud_prompt = f"""
Analyze this wealth management event for fraud risk.

Return:
- risk score from 0 to 100
- decision recommendation
- bullet reasons
- any evidence gaps

Case:
{case.model_dump_json()}
"""

fraud_result = user_proxy.initiate_chat(
    fraud_analyst,
    message=fraud_prompt,
)

compliance_prompt = f"""
Review the fraud analysis below for compliance concerns in wealth management.
Check for auditability, data minimization, escalation needs, and residency concerns.

Fraud analysis:
{fraud_result.summary}
"""

compliance_result = user_proxy.initiate_chat(
    compliance_reviewer,
    message=compliance_prompt,
)

final_payload = {
    "case": case.model_dump(),
    "fraud_analysis": fraud_result.summary,
    "compliance_review": compliance_result.summary,
}

print(json.dumps(final_payload, indent=2))
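The summaries above are still free text, so something has to turn them into the strict decision object. A minimal stdlib-only sketch (parse_decision is a helper name I'm introducing, not part of AutoGen) that extracts the first JSON object from a model reply and falls back to a safe hold:

```python
import json
import re

REQUIRED_KEYS = {"decision", "risk_score", "reasons", "compliance_notes"}
VALID_DECISIONS = {"approve", "hold", "escalate", "reject"}

def parse_decision(reply: str) -> dict:
    """Extract the first JSON object from a model reply and validate it.

    Returns a safe 'hold' decision when the reply is missing or malformed,
    so a bad model response never auto-approves anything.
    """
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match:
        try:
            data = json.loads(match.group(0))
            if (
                REQUIRED_KEYS <= data.keys()
                and data["decision"] in VALID_DECISIONS
                and 0 <= data["risk_score"] <= 100
            ):
                return data
        except (json.JSONDecodeError, TypeError):
            pass
    return {
        "decision": "hold",
        "risk_score": 100,
        "reasons": ["Unparseable model output"],
        "compliance_notes": ["Manual review required"],
    }

print(parse_decision(
    'Findings: {"decision": "escalate", "risk_score": 88, '
    '"reasons": ["low device trust"], "compliance_notes": []}'
))
print(parse_decision("I think this looks risky."))
```

Failing closed to "hold" here is deliberate: in a fraud workflow, a parsing bug should freeze a transfer, not release it.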

4) Add deterministic guardrails before calling the model

Do not let the LLM be your first line of defense. In production wealth systems, hard rules should catch obvious abuse before any model call happens.

def precheck(case: FraudCase) -> list[str]:
    reasons = []

    if case.amount_usd >= 100000:
        reasons.append("High-value transfer above internal threshold")
    if case.device_trust_score < 30:
        reasons.append("Low device trust score")
    if case.prior_alerts_30d >= 2:
        reasons.append("Repeated alerts in last 30 days")
    if case.country not in {"US", "CA", "GB", "DE", "FR"}:
        reasons.append("Cross-border destination requires enhanced review")

    return reasons

hard_flags = precheck(case)
if hard_flags:
    print({"decision": "hold", "hard_flags": hard_flags})
else:
    print("Send to AutoGen workflow")
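The precheck can go one step further and route cases deterministically. A sketch of that routing, with illustrative weights (the numbers and thresholds are assumptions, not real policy values; real ones would come from your firm's fraud policy):

```python
# Illustrative weights for turning hard flags into an initial risk score.
FLAG_WEIGHTS = {
    "High-value transfer above internal threshold": 35,
    "Low device trust score": 25,
    "Repeated alerts in last 30 days": 20,
    "Cross-border destination requires enhanced review": 15,
}

def route_case(hard_flags: list[str]) -> dict:
    """Convert deterministic flags into a score and a routing decision."""
    score = min(100, sum(FLAG_WEIGHTS.get(flag, 10) for flag in hard_flags))
    if score >= 60:
        decision = "escalate"  # straight to a human; no model call needed
    elif score >= 20:
        decision = "hold"      # hold funds, then run the AutoGen review
    else:
        decision = "analyze"   # ambiguous case: worth a model pass
    return {"decision": decision, "risk_score": score, "reasons": hard_flags}

print(route_case(["High-value transfer above internal threshold", "Low device trust score"]))
print(route_case([]))
```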

Production Considerations

  • Deploy behind a policy gate

    • Put deterministic rules in front of AutoGen so only ambiguous cases reach the model.
    • In wealth management, this reduces noise on large but legitimate client transfers.
  • Log every decision path

    • Store input features, model outputs, prompt versions, timestamps, and reviewer actions.
    • Auditors will ask why a wire was held or escalated; you need traceability.
  • Respect data residency

    • Keep client PII and transaction data in-region if your jurisdiction requires it.
    • If you use hosted LLM APIs, confirm where prompts are processed and whether logs are retained.
  • Add human approval thresholds

    • Automatically escalate high-value transfers with low trust scores or repeated alerts.
    • Advisors should not be able to override controls without leaving an immutable audit trail.
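One way to make the audit trail tamper-evident is hash chaining: each log entry embeds the hash of the previous one, so a silent edit breaks verification. This is a minimal stdlib sketch (AuditLog is a name I'm introducing; a production system would persist entries to append-only storage), not a full WORM solution:

```python
import hashlib
import json
import time

class AuditLog:
    """Hash-chained audit log: editing any past entry breaks verification."""

    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64

    def record(self, actor: str, action: str, details: dict) -> dict:
        entry = {
            "ts": time.time(),
            "actor": actor,
            "action": action,
            "details": details,
            "prev_hash": self._prev_hash,
        }
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self._prev_hash = entry["hash"]
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        prev = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

log = AuditLog()
log.record("advisor_17", "override_requested", {"case": "A99881", "reason": "client verified by phone"})
log.record("compliance_3", "override_approved", {"case": "A99881"})
print(log.verify())  # True: chain intact

log.entries[0]["details"]["reason"] = "edited after the fact"
print(log.verify())  # False: tampering detected
```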

Common Pitfalls

  1. Letting the model make final decisions on its own

    • Fix it by using LLM output as advisory only.
    • Final action should come from rules plus a human review step for sensitive cases.
  2. Sending raw sensitive data into prompts

    • Fix it by redacting account numbers, tax IDs, addresses, and full beneficiary details unless strictly needed.
    • Use tokenized identifiers so you can still correlate events in logs.
  3. Using vague outputs instead of structured decisions

    • Fix it by forcing JSON-like responses with explicit fields such as decision, risk_score, reasons, and compliance_notes.
    • Wealth operations teams need something they can route into case management systems without manual cleanup.
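For pitfall 2, tokenization can be as simple as replacing account-number-like strings with salted hashes before anything reaches a prompt or log. A rough sketch (the regex, salt handling, and helper names are assumptions; a real system would use a secrets manager and a vetted redaction library):

```python
import hashlib
import re

SALT = "rotate-me-per-environment"  # assumption: loaded from a secrets manager in production

def tokenize(value: str) -> str:
    """Replace an identifier with a stable, non-reversible token.

    The same input always maps to the same token, so events stay
    correlatable in logs without exposing the raw identifier.
    """
    digest = hashlib.sha256((SALT + value).encode()).hexdigest()[:10]
    return f"tok_{digest}"

def redact_case_text(text: str) -> str:
    """Mask account-number-like digit runs (8+ digits) before prompting."""
    return re.sub(r"\b\d{8,}\b", lambda m: tokenize(m.group(0)), text)

note = "Client moved 250000 from account 0099881234 to beneficiary acct 7712345678."
print(redact_case_text(note))
```

Note the amount (250000) survives because it is under eight digits, while both account numbers are tokenized; tune the pattern to your own identifier formats.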


By Cyprian Aarons, AI Consultant at Topiax.
