How to Build a Transaction Monitoring Agent Using AutoGen in Python for Fintech

By Cyprian Aarons | Updated 2026-04-21

Tags: transaction-monitoring, autogen, python, fintech

A transaction monitoring agent watches payment events, scores them against risk rules and model outputs, and escalates suspicious activity for human review. In fintech, that matters because you need to catch fraud, AML patterns, and policy breaches early without drowning analysts in false positives.

Architecture

• Event ingest layer
  - Pulls transactions from Kafka, SQS, webhooks, or a database change stream.
  - Normalizes fields like customer_id, merchant_id, amount, country, and timestamp.
• Risk context service
  - Enriches each transaction with customer history, device signals, velocity counts, sanctions hits, and account metadata.
  - Keeps sensitive lookups out of the LLM prompt when possible.
• AutoGen agent group
  - A monitoring agent triages the event.
  - A policy/compliance agent checks AML/KYC rules.
  - An optional investigator agent writes a concise case summary for analysts.
• Decision engine
  - Converts agent output into actions: approve, hold, escalate, or file case.
  - Enforces deterministic thresholds outside the model.
• Audit store
  - Persists prompts, outputs, rule hits, timestamps, and final decisions.
  - Required for compliance reviews and model governance.
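
As a rough illustration of what the audit store might persist, here is a minimal sketch that assumes an append-only JSON Lines file. The AuditRecord fields, the append_audit_record helper, and the content hash are illustrative assumptions rather than anything AutoGen provides; a production system would more likely write to a database or immutable object storage.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class AuditRecord:
    """One reviewed transaction (hypothetical shape, not an AutoGen type)."""
    transaction_id: str
    prompt: str
    model_output: str
    rule_hits: list
    final_decision: str
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_audit_record(path: Path, record: AuditRecord) -> str:
    """Append the record as one JSON line and return its SHA-256 digest."""
    body = asdict(record)
    # Hash a canonical (sorted-key) serialization so the digest is reproducible.
    canonical = json.dumps(body, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"sha256": digest, "record": body}) + "\n")
    return digest
```

Storing a digest next to each record gives reviewers a cheap way to detect later edits, though it is not a substitute for write-once storage.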

Implementation

1) Install AutoGen and define the transaction schema

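Install AutoGen and keep your transaction payload structured. The examples here use the classic autogen API (AssistantAgent configured with llm_config) that ships in the pyautogen 0.2 line; the newer autogen-agentchat 0.4+ packages expose a different API, so pin the version. Pydantic v2 is needed for model_dump_json. For fintech work, structured input is non-negotiable because a typed, validated payload is far easier to audit and replay than free-form prompt text.

pip install "pyautogen~=0.2" "pydantic>=2"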

from datetime import datetime
from decimal import Decimal

from pydantic import BaseModel

class Transaction(BaseModel):
    transaction_id: str
    customer_id: str
    merchant_id: str
    amount: Decimal  # Decimal avoids binary floating-point rounding on money
    currency: str
    country: str
    timestamp: datetime
    channel: str
    # Derived risk features, filled in by the risk context service
    is_high_risk_merchant: bool = False
    prior_txn_count_1h: int = 0
    prior_txn_amount_24h: Decimal = Decimal("0")
    sanctions_match: bool = False
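
Upstream sources rarely agree on field names, so the ingest layer has to map raw events onto this schema before validation. The sketch below shows one way that could look using only the standard library. The alias table, the normalize_event name, and the example payload keys (txn_id, amt, ts, and so on) are hypothetical, and the resulting dict is intended to feed Transaction(**fields).

```python
from datetime import datetime, timezone
from decimal import Decimal

# Hypothetical aliases seen across sources, keyed by canonical schema field.
FIELD_ALIASES = {
    "transaction_id": ("transaction_id", "txn_id", "id"),
    "customer_id": ("customer_id", "cust_id"),
    "merchant_id": ("merchant_id", "mid"),
    "amount": ("amount", "amt"),
    "currency": ("currency", "ccy"),
    "country": ("country", "country_code"),
    "timestamp": ("timestamp", "ts"),
    "channel": ("channel",),
}


def normalize_event(raw: dict) -> dict:
    """Map a raw event onto canonical field names and types."""
    fields = {}
    for canonical, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in raw:
                fields[canonical] = raw[alias]
                break
        else:
            # No alias matched: refuse to guess a required field.
            raise KeyError(f"missing required field: {canonical}")

    # Go through str so a float like 12.5 does not carry binary noise.
    fields["amount"] = Decimal(str(fields["amount"]))
    fields["currency"] = str(fields["currency"]).upper()
    fields["country"] = str(fields["country"]).upper()

    ts = fields["timestamp"]
    if isinstance(ts, (int, float)):
        ts = datetime.fromtimestamp(ts, tz=timezone.utc)  # epoch seconds
    elif isinstance(ts, str):
        ts = datetime.fromisoformat(ts)
    if ts.tzinfo is None:
        # Assumption: naive timestamps are UTC. Adjust per source if not.
        ts = ts.replace(tzinfo=timezone.utc)
    fields["timestamp"] = ts
    return fields
```

Raising on a missing field, rather than defaulting it, keeps incomplete events from being scored as if they were clean.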

2) Create AutoGen agents with explicit roles

For this pattern, use AssistantAgent instances with narrow responsibilities. Keep the model temperature low so decisions stay consistent across repeated reviews.

import os
from autogen import AssistantAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

monitor_agent = AssistantAgent(
    name="monitor_agent",
    llm_config=llm_config,
    system_message=(
        "You review financial transactions for fraud/AML risk. "
        "Return only JSON with keys: risk_level, reasons, action."
    ),
)

compliance_agent = AssistantAgent(
    name="compliance_agent",
    llm_config=llm_config,
    system_message=(
        "You assess transactions against AML/KYC/compliance policy. "
        "Return only JSON with keys: policy_flags, escalation_required."
    ),
)

3) Run a single transaction through both agents and merge the result

This is the core pattern. One agent handles risk triage; the other checks policy. Your application merges both outputs and applies deterministic business rules before any alert is created.

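import json
from datetime import datetime, timezone

def _reply_to_dict(reply) -> dict:
    # generate_reply may return a str, a dict with a "content" key, or None.
    text = reply.get("content") if isinstance(reply, dict) else reply
    if not text:
        raise ValueError("agent returned an empty reply")
    # Models sometimes wrap JSON in extra text; keep only the outermost object.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("agent reply contains no JSON object")
    # json.JSONDecodeError is a ValueError, so callers can catch one type.
    return json.loads(text[start:end + 1])

def review_transaction(txn: Transaction) -> dict:
    txn_json = txn.model_dump_json()

    monitor_msg = f"""
Transaction:
{txn_json}

Assess fraud/behavioral risk.
"""
    compliance_msg = f"""
Transaction:
{txn_json}

Assess AML/KYC/compliance concerns.
"""

    monitor_result = monitor_agent.generate_reply(messages=[{"role": "user", "content": monitor_msg}])
    compliance_result = compliance_agent.generate_reply(messages=[{"role": "user", "content": compliance_msg}])

    # Raises ValueError on malformed output so the caller can route to manual review.
    monitor_data = _reply_to_dict(monitor_result)
    compliance_data = _reply_to_dict(compliance_result)

    # Deterministic rules decide the action; the model only supplies inputs.
    risk_level = str(monitor_data.get("risk_level", "")).lower()
    final_action = "approve"
    if txn.sanctions_match or compliance_data.get("escalation_required"):
        final_action = "escalate"
    elif risk_level in {"high", "critical"}:
        final_action = "hold"

    return {
        "transaction_id": txn.transaction_id,
        "monitoring": monitor_data,
        "compliance": compliance_data,
        "final_action": final_action,
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
        "reviewed_at": datetime.now(timezone.utc).isoformat(),
    }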
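
Because the final action comes from plain code rather than the model, the merge rule can be isolated into a pure function and unit-tested without any LLM call. The sketch below restates the same precedence (sanctions match or compliance escalation first, then high or critical risk). The decide_action name and its signature are illustrative assumptions, not part of the code above.

```python
def decide_action(sanctions_match: bool, monitor_data: dict, compliance_data: dict) -> str:
    """Map agent outputs plus hard signals to a final action."""
    # Hard policy signals win over anything the risk triage says.
    if sanctions_match or compliance_data.get("escalation_required"):
        return "escalate"
    if monitor_data.get("risk_level") in {"high", "critical"}:
        return "hold"
    return "approve"


# Example checks that could live in a test suite.
assert decide_action(True, {"risk_level": "low"}, {}) == "escalate"
assert decide_action(False, {"risk_level": "critical"}, {}) == "hold"
assert decide_action(False, {"risk_level": "low"}, {"escalation_required": False}) == "approve"
```

Keeping the rule in one small function also makes it straightforward to replay historical cases when a threshold changes.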

4) Add an analyst-facing case summary agent

When a transaction is escalated, generate a short explanation that investigators can read quickly. This reduces time-to-triage and keeps the LLM out of the final decision path.

case_writer = AssistantAgent(
    name="case_writer",
    llm_config=llm_config,
    system_message=(
        "Write concise investigation summaries for analysts. "
        "Do not invent facts. Use only provided inputs."
    ),
)

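def build_case_summary(review_output: dict) -> str:
    # default=str guards against values json cannot encode natively (e.g. Decimal).
    prompt = f"""
Create an analyst summary from this review output:
{json.dumps(review_output, default=str)}
"""
    reply = case_writer.generate_reply(messages=[{"role": "user", "content": prompt}])
    # generate_reply may return a str, a dict with a "content" key, or None.
    if isinstance(reply, dict):
        reply = reply.get("content")
    return reply or ""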
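
One way to wire the summary step in is to call it only for transactions that were not approved, after the action has already been decided. The sketch below takes the review and summary functions as parameters so it can be exercised with stubs. The process_transaction name and the choice to summarize both hold and escalate outcomes are assumptions for illustration.

```python
from typing import Callable, Optional


def process_transaction(
    txn,
    review: Callable[[object], dict],
    summarize: Callable[[dict], str],
) -> dict:
    """Review a transaction, then attach a summary only when analysts need one."""
    result = review(txn)
    summary: Optional[str] = None
    if result["final_action"] in {"hold", "escalate"}:
        # The action is already fixed at this point; the summary cannot change it.
        summary = summarize(result)
    return {**result, "case_summary": summary}
```

In this guide's terms the call would look like process_transaction(txn, review_transaction, build_case_summary), which also avoids spending tokens on approved payments.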

Production Considerations

• Keep decisioning deterministic
  - Use the agent to classify and explain.
  - Use code to enforce thresholds like sanctions hits, velocity caps, country blocks, and merchant blacklists.
• Log everything for audit
  - Store raw transaction input, prompt text, model response, rule outcomes, and final action.
  - Fintech auditors will ask why a payment was held or escalated; you need a replayable trail.
• Respect data residency
  - Route EU customer data to EU-hosted infrastructure if required by policy.
  - Redact PII before sending prompts when the full identifier is not needed.
• Add guardrails around model output
  - Reject malformed JSON.
  - Cap maximum alert volume per hour to avoid alert storms during outages or adversarial bursts.
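
The guardrails in the last item can be sketched with the standard library alone. The example below validates a monitoring reply against the keys requested in the system message and caps alerts with a rolling one-hour window. The allowed risk levels and actions, the parse_monitor_output and AlertLimiter names, and the default cap of 100 are all assumptions for illustration.

```python
import json
import time
from collections import deque
from typing import Optional

ALLOWED_RISK_LEVELS = {"low", "medium", "high", "critical"}  # assumed vocabulary
ALLOWED_ACTIONS = {"approve", "hold", "escalate"}            # assumed vocabulary


def parse_monitor_output(text: str) -> dict:
    """Return the parsed reply, or raise ValueError if it is malformed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    risk = data.get("risk_level")
    if not isinstance(risk, str) or risk not in ALLOWED_RISK_LEVELS:
        raise ValueError("unexpected risk_level")
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("unexpected action")
    if not isinstance(data.get("reasons"), list):
        raise ValueError("reasons must be a list")
    return data


class AlertLimiter:
    """Allow at most max_alerts within any rolling window_seconds period."""

    def __init__(self, max_alerts: int = 100, window_seconds: float = 3600.0):
        self.max_alerts = max_alerts
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of alerts that were allowed

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_alerts:
            return False
        self._sent.append(now)
        return True
```

What happens when the cap trips is a policy choice; queueing suppressed alerts for later review is probably safer than dropping them.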

Common Pitfalls

• Letting the LLM make the final compliance decision
  - Don’t do this.
  - The model should recommend; your rules engine should decide based on hard policy like sanctions matches or threshold breaches.
• Stuffing too much raw customer data into prompts
  - This increases privacy risk and token cost.
  - Pass only the fields needed for triage plus derived features like velocity counts or prior risk flags.
• Skipping output validation
  - AutoGen agents can return text that looks structured but isn’t valid JSON.
  - Always parse and validate responses before using them in downstream workflows.
• Ignoring jurisdiction-specific controls
  - A monitoring workflow that works in one region may violate another region’s retention or residency rules.
  - Partition storage and model access by tenant and geography from day one.
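
To make the prompt-minimization advice concrete, the sketch below builds the prompt payload from an allow-list and replaces the customer identifier with a keyed hash. The PROMPT_FIELDS list, the build_prompt_payload name, and the use of HMAC-SHA256 as the pseudonymization scheme are assumptions; which fields count as necessary is a policy decision for your compliance team.

```python
import hashlib
import hmac

# Assumed allow-list: only these fields are ever placed in a prompt.
PROMPT_FIELDS = (
    "transaction_id",
    "amount",
    "currency",
    "country",
    "channel",
    "is_high_risk_merchant",
    "prior_txn_count_1h",
    "prior_txn_amount_24h",
)


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable keyed hash so the model never sees the raw identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def build_prompt_payload(txn: dict, key: bytes) -> dict:
    """Copy allow-listed fields and swap customer_id for a pseudonym."""
    payload = {name: txn[name] for name in PROMPT_FIELDS if name in txn}
    payload["customer_ref"] = pseudonymize(str(txn["customer_id"]), key)
    return payload
```

An allow-list fails safe: a field added to the transaction schema later stays out of prompts until someone adds it deliberately.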

By Cyprian Aarons, AI Consultant at Topiax.
