How to Build a Compliance-Checking Agent Using AutoGen in Python for Payments

By Cyprian Aarons | Updated 2026-04-21


A compliance checking agent for payments reviews transaction requests, customer context, and policy rules before money moves. Its job is to catch violations like sanctioned counterparties, suspicious amounts, missing KYC fields, or cross-border routing issues early, then return a decision with an audit trail that a risk team can defend.

Architecture

• Transaction intake layer
  - Accepts payment requests from an API, queue, or workflow engine.
  - Normalizes fields like sender, receiver, amount, currency, country, and purpose.
• Policy retrieval layer
  - Loads payment compliance rules from a controlled source.
  - Pulls in sanctions logic, AML thresholds, KYC completeness checks, and jurisdiction-specific restrictions.
• AutoGen agent pair
  - A compliance analyst agent evaluates the request against policy.
  - A reviewer/approver agent validates the reasoning and flags ambiguous cases.
• Decision engine
  - Converts the agent output into approve, reject, or escalate.
  - Enforces deterministic rules for hard stops like sanctioned entities.
• Audit logger
  - Stores the input payload, policy version, model output, and final decision.
  - Required for traceability in regulated payments flows.
• Human escalation path
  - Routes uncertain cases to operations or compliance staff.
  - Keeps the system useful without letting the model make unsupported calls.
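
The layers above can be sketched as one function that wires them together. This is a rough outline under stated assumptions, not the implementation built later in this article: `run_compliance_check`, `normalize`, and the in-memory `AUDIT_LOG` are hypothetical names, and the agent call is stubbed so the flow runs without a model.

```python
from typing import Callable

AUDIT_LOG: list[dict] = []  # hypothetical stand-in for a durable audit store
ALLOWED_DECISIONS = {"approve", "reject", "escalate"}

def normalize(raw: dict) -> dict:
    """Intake layer: trim strings and upper-case code-like fields."""
    out = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
    for key in ("sender_country", "receiver_country", "currency"):
        out[key] = out[key].upper()
    return out

def run_compliance_check(raw: dict, policy_version: str,
                         agent_review: Callable[[dict], dict]) -> dict:
    payment = normalize(raw)
    # Decision engine: deterministic hard stops run before any model call.
    if not payment["kyc_complete"]:
        decision = {"decision": "reject", "reasons": ["KYC incomplete"]}
    else:
        decision = agent_review(payment)  # the AutoGen agent pair would go here
        # Human escalation path: unrecognized output is never applied as-is.
        if decision.get("decision") not in ALLOWED_DECISIONS:
            decision = {"decision": "escalate",
                        "reasons": ["Unrecognized agent decision"]}
    # Audit logger: keep input, policy version, and final decision together.
    AUDIT_LOG.append({"input": payment, "policy_version": policy_version,
                      "final": decision})
    return decision

# Stubbed agent so the sketch runs offline.
stub_agent = lambda p: {"decision": "approve", "reasons": ["No material risk found"]}
raw = {"sender_country": " gb ", "receiver_country": "ae", "currency": "usd",
       "amount": 25000.0, "kyc_complete": True}
print(run_compliance_check(raw, "policy-v1", stub_agent))
```

Passing the agent in as a callable keeps the deterministic parts testable without network access.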

Implementation

1) Install AutoGen and define the payment schema

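This walkthrough uses the classic AutoGen API (`from autogen import ...`), which ships in the 0.2 series of the `pyautogen` package. Later AutoGen releases restructured the packages and imports, so pin the version. Keep your transaction payload explicit. For payments, schema discipline matters more than clever prompting.

pip install "pyautogen>=0.2,<0.3"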

from dataclasses import dataclass
from typing import Literal

@dataclass
class PaymentRequest:
    payment_id: str
    sender_country: str
    receiver_country: str
    amount: float
    currency: str
    beneficiary_name: str
    purpose_code: str
    kyc_complete: bool
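
One way to enforce that schema discipline at the intake boundary is to validate raw payloads before building a `PaymentRequest`. The sketch below is illustrative only: `payment_from_dict` is a hypothetical helper, and the checks (two-letter country codes, three-letter currency codes, positive finite amounts) are assumed rules rather than a complete policy. The article's schema stores `amount` as a `float`; many payment systems prefer `Decimal` or integer minor units to avoid rounding surprises.

```python
import math
from dataclasses import dataclass

@dataclass
class PaymentRequest:  # same fields as the schema above
    payment_id: str
    sender_country: str
    receiver_country: str
    amount: float
    currency: str
    beneficiary_name: str
    purpose_code: str
    kyc_complete: bool

def payment_from_dict(raw: dict) -> PaymentRequest:
    """Build a PaymentRequest.

    Raises KeyError for missing fields and ValueError for malformed ones.
    """
    def code(key: str, length: int) -> str:
        value = str(raw[key]).strip().upper()
        if len(value) != length or not value.isalpha():
            raise ValueError(f"{key} must be a {length}-letter code, got {value!r}")
        return value

    amount = float(raw["amount"])
    if not (math.isfinite(amount) and amount > 0):
        raise ValueError("amount must be a positive, finite number")
    if not isinstance(raw["kyc_complete"], bool):
        raise ValueError("kyc_complete must be a boolean")
    return PaymentRequest(
        payment_id=str(raw["payment_id"]).strip(),
        sender_country=code("sender_country", 2),
        receiver_country=code("receiver_country", 2),
        amount=amount,
        currency=code("currency", 3),
        # Collapse repeated whitespace in the name.
        beneficiary_name=" ".join(str(raw["beneficiary_name"]).split()),
        purpose_code=str(raw["purpose_code"]).strip().upper(),
        kyc_complete=raw["kyc_complete"],
    )

sample = {
    "payment_id": "pay_10021", "sender_country": "gb ", "receiver_country": "AE",
    "amount": "25000.00", "currency": "usd",
    "beneficiary_name": "  Acme   Trading LLC ", "purpose_code": "supplier_payment",
    "kyc_complete": True,
}
print(payment_from_dict(sample))
```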

2) Create agents with `AssistantAgent` and a strict system prompt

The main pattern is to make one agent reason over policy and another agent verify the result. In production you would also add deterministic checks before any LLM call.

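import os
from autogen import AssistantAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            # Read the key from the environment instead of hardcoding it.
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,  # reduce run-to-run variation in the output
}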

compliance_agent = AssistantAgent(
    name="compliance_analyst",
    llm_config=llm_config,
    system_message=(
        "You are a payments compliance analyst. "
        "Check transactions for sanctions risk, KYC completeness, AML red flags, "
        "cross-border restrictions, and suspicious patterns. "
        "Return only JSON with fields: decision, reasons, risk_level, escalation_required."
    ),
)

reviewer_agent = AssistantAgent(
    name="compliance_reviewer",
    llm_config=llm_config,
    system_message=(
        "You review compliance decisions for payments. "
        "Reject unsupported claims. Ensure the output matches policy facts only. "
        "Return only JSON with fields: approved_by_reviewer, issues_found."
    ),
)

3) Run a two-agent compliance check using `GroupChat` and `GroupChatManager`

This is the core AutoGen orchestration pattern. A non-interactive `UserProxyAgent` submits the payment to the `GroupChatManager`, so both agents see the original request. Round-robin speaker selection fixes the order: the first agent evaluates the payment, then the second checks whether the reasoning is defensible.

import json
from autogen import GroupChat, GroupChatManager, UserProxyAgent

def build_prompt(payment: PaymentRequest) -> str:
    return f"""
Review this payment request against standard payments compliance controls:

payment_id: {payment.payment_id}
sender_country: {payment.sender_country}
receiver_country: {payment.receiver_country}
amount: {payment.amount}
currency: {payment.currency}
beneficiary_name: {payment.beneficiary_name}
purpose_code: {payment.purpose_code}
kyc_complete: {payment.kyc_complete}

Rules:
- Reject if KYC is incomplete.
- Escalate if cross-border risk is high or purpose is unclear.
- Reject if there is any sanctions concern.
- Approve only if no material risk exists.
"""

payment = PaymentRequest(
    payment_id="pay_10021",
    sender_country="GB",
    receiver_country="AE",
    amount=25000.00,
    currency="USD",
    beneficiary_name="Acme Trading LLC",
    purpose_code="SUPPLIER_PAYMENT",
    kyc_complete=True,
)

# Non-interactive initiator: it only submits the request and never runs code.
submitter = UserProxyAgent(
    name="payment_submitter",
    human_input_mode="NEVER",
    code_execution_config=False,
)

groupchat = GroupChat(
    agents=[submitter, compliance_agent, reviewer_agent],
    messages=[],
    max_round=3,  # the request, the analyst reply, the reviewer reply
    speaker_selection_method="round_robin",  # analyst first, then reviewer
)

manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# Send the request to the manager, which then runs the group chat.
result = submitter.initiate_chat(
    recipient=manager,
    message=build_prompt(payment),
)

# The last message in the history is the reviewer's verdict.
print(result.chat_history[-1]["content"])
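
Printing the last message is enough for a demo, but downstream code usually needs each agent's JSON separately, and models sometimes wrap JSON in a Markdown code fence despite a "Return only JSON" instruction. The sketch below shows one tolerant way to extract it. It assumes each chat message is a dict with `name` and `content` keys, which is worth verifying against your AutoGen version. `extract_json` and `latest_from` are hypothetical helpers, not AutoGen functions.

```python
import json
from typing import Optional

FENCE = "`" * 3  # Markdown code-fence marker

def extract_json(content: Optional[str]) -> Optional[dict]:
    """Parse a JSON object from a reply, tolerating a surrounding code fence."""
    if not isinstance(content, str):
        return None
    text = content.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)].strip()
        if text.lower().startswith("json"):  # drop the language tag
            text = text[4:].strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

def latest_from(messages: list, agent_name: str) -> Optional[dict]:
    """Return the most recent parseable JSON reply from the named agent."""
    for message in reversed(messages):
        if message.get("name") == agent_name:
            parsed = extract_json(message.get("content"))
            if parsed is not None:
                return parsed
    return None

# Hand-written messages standing in for a real chat history.
messages = [
    {"name": "compliance_analyst",
     "content": FENCE + 'json\n{"decision": "approve", "reasons": []}\n' + FENCE},
    {"name": "compliance_reviewer",
     "content": '{"approved_by_reviewer": true, "issues_found": []}'},
]
analyst = latest_from(messages, "compliance_analyst")
reviewer = latest_from(messages, "compliance_reviewer")
print(analyst, reviewer)
```

Returning `None` instead of raising lets the caller decide what an unparseable reply means; for payments, escalating to a human is the cautious default.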

4) Parse the decision and enforce deterministic guardrails

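Do not let the model be the final authority on hard-stop rules. In payments, deterministic checks must override model output whenever policy says so, and agent output that cannot be parsed should fail closed to escalation. The checks appear after the chat here for readability; in production, run them first so blocked payments never reach the model.

def hard_stop_checks(payment: PaymentRequest) -> tuple[str, str]:
    """Return (decision, reason); decision is "" when no hard stop fires."""
    if not payment.kyc_complete:
        return "reject", "KYC incomplete"
    if payment.amount > 100000:
        return "escalate", "Amount exceeds manual review threshold"
    return "", ""

def parse_agent_decision(chat_history: list) -> dict:
    """Return the latest message that is JSON with an allowed decision."""
    for message in reversed(chat_history):
        try:
            parsed = json.loads(message.get("content") or "")
        except (json.JSONDecodeError, TypeError):
            continue
        if isinstance(parsed, dict) and parsed.get("decision") in {"approve", "reject", "escalate"}:
            return parsed
    # Fail closed: unparseable output goes to a human.
    return {"decision": "escalate", "reason": "Agent output was not valid decision JSON"}

decision, reason = hard_stop_checks(payment)

if decision:
    final_decision = {"decision": decision, "reason": reason}
else:
    final_decision = parse_agent_decision(result.chat_history)

print(json.dumps(final_decision, indent=2))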
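
The reviewer's verdict can also feed the final outcome. The sketch below is one possible way to combine the two agents' JSON, assuming both replies have already been parsed into dicts. `combine_verdicts` is a hypothetical function; the field names follow the system prompts from step 2. The design choice is to fail closed: a missing, invalid, or disputed verdict is escalated rather than applied.

```python
from typing import Optional

ALLOWED = {"approve", "reject", "escalate"}

def as_list(value) -> list:
    """Coerce a JSON field to a list so reasons can be concatenated safely."""
    if isinstance(value, list):
        return value
    return [] if value in (None, "") else [value]

def combine_verdicts(analyst: Optional[dict], reviewer: Optional[dict]) -> dict:
    """Map analyst and reviewer JSON to approve, reject, or escalate."""
    if not analyst or analyst.get("decision") not in ALLOWED:
        return {"decision": "escalate",
                "reasons": ["Analyst output missing or invalid"]}
    reasons = as_list(analyst.get("reasons"))
    # A verdict the reviewer did not explicitly approve goes to a human.
    if not reviewer or reviewer.get("approved_by_reviewer") is not True:
        issues = as_list((reviewer or {}).get("issues_found"))
        return {"decision": "escalate",
                "reasons": reasons + (issues or ["Reviewer did not approve"])}
    if analyst.get("escalation_required") is True:
        return {"decision": "escalate", "reasons": reasons}
    return {"decision": analyst["decision"], "reasons": reasons}

print(combine_verdicts(
    {"decision": "approve", "reasons": ["No red flags"], "escalation_required": False},
    {"approved_by_reviewer": True, "issues_found": []},
))
```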

Production Considerations

• Keep sanctions logic outside the LLM
  - Use a deterministic screening service or internal watchlist lookup before calling AutoGen.
  - The agent should explain results; it should not be your source of truth for blocked entities.
• Log everything needed for audit
  - Store request payloads, policy version IDs, model responses, timestamps, and final actions.
  - Payments teams need reproducibility when regulators ask why a transfer was approved or rejected.
• Respect data residency
  - Route EU payment data to approved regions only.
  - If you use hosted models or tools, confirm where prompts and logs are stored and whether they cross borders.
• Add escalation thresholds
  - High-value transfers, unusual corridors, shell-company indicators, or missing purpose codes should trigger human review.
  - That keeps false positives manageable without weakening controls.
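
The screening and escalation points above could look roughly like this in code. Everything here is a placeholder for illustration: `WATCHLIST`, `HIGH_RISK_CORRIDORS`, and the threshold are made-up values, not real sanctions data or policy. Exact matching on a normalized name is the simplest possible lookup; production screening typically relies on a dedicated service with fuzzy matching.

```python
import unicodedata

# Placeholder data for illustration only; not real sanctions or policy values.
WATCHLIST = {"blocked trading co"}
HIGH_RISK_CORRIDORS = {("GB", "XX")}
HIGH_VALUE_THRESHOLD = 100_000  # same figure as the manual review threshold in step 4

def normalize_name(name: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_only.lower())
    return " ".join(cleaned.split())

def is_watchlisted(beneficiary_name: str) -> bool:
    """Deterministic lookup; the LLM is never consulted for this answer."""
    return normalize_name(beneficiary_name) in WATCHLIST

def escalation_triggers(amount: float, sender_country: str,
                        receiver_country: str, purpose_code: str) -> list[str]:
    """Return the names of every rule that requires human review."""
    triggers = []
    if amount > HIGH_VALUE_THRESHOLD:
        triggers.append("high_value")
    if (sender_country, receiver_country) in HIGH_RISK_CORRIDORS:
        triggers.append("unusual_corridor")
    if not purpose_code.strip():
        triggers.append("missing_purpose_code")
    return triggers

print(is_watchlisted("Blocked  Trading, Co."))
print(escalation_triggers(250000.0, "GB", "XX", ""))
```

Returning the list of trigger names, rather than a single boolean, gives the audit log a record of which rules fired.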

Common Pitfalls

• Using free-form prompts as policy
  - Mistake: asking the model to “decide if this looks suspicious.”
  - Fix: encode explicit rules and thresholds in code; use the model for explanation and triage.
• Skipping audit metadata
  - Mistake: logging only the final decision.
  - Fix: persist input fields, rule hits, agent outputs, reviewer outputs, and policy version hashes.
• Letting the LLM handle restricted data carelessly
  - Mistake: sending full PII or bank account details to every tool call.
  - Fix: redact unnecessary fields before prompting and minimize what leaves your boundary.
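
The audit and redaction fixes above can be sketched with the standard library alone. The names `PROMPT_SAFE_FIELDS`, `redact_for_prompt`, `policy_hash`, and `build_audit_record` are hypothetical, and the allowlist is an assumption: it drops the beneficiary name on the premise that sanctions screening happens deterministically outside the model. Adjust it to whatever your policy actually permits.

```python
import hashlib
import json
from datetime import datetime, timezone

# Assumed allowlist: only these fields may appear in a prompt.
PROMPT_SAFE_FIELDS = {"payment_id", "sender_country", "receiver_country",
                      "amount", "currency", "purpose_code", "kyc_complete"}

def redact_for_prompt(payload: dict) -> dict:
    """Keep only allowlisted fields; anything unlisted never reaches the model."""
    return {k: v for k, v in payload.items() if k in PROMPT_SAFE_FIELDS}

def policy_hash(policy_text: str) -> str:
    """Stable fingerprint of the exact policy text that was applied."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()

def build_audit_record(payload: dict, policy_text: str, rule_hits: list,
                       analyst_output: str, reviewer_output: str,
                       final_decision: dict) -> dict:
    """Collect everything needed to explain the decision later."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input": payload,
        "policy_sha256": policy_hash(policy_text),
        "rule_hits": rule_hits,
        "analyst_output": analyst_output,
        "reviewer_output": reviewer_output,
        "final_decision": final_decision,
    }

payload = {"payment_id": "pay_10021", "amount": 25000.0,
           "beneficiary_name": "Acme Trading LLC",
           "account_number": "hypothetical-value"}
record = build_audit_record(
    payload, "Reject if KYC is incomplete.", [],
    '{"decision": "approve"}', '{"approved_by_reviewer": true}',
    {"decision": "approve"},
)
print(redact_for_prompt(payload))
print(json.dumps(record, indent=2))
```

An allowlist is used instead of a blocklist so that a newly added sensitive field stays out of prompts by default.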

Build payments compliance agents with AutoGen in Python as decision-support systems first. Deterministic controls do the blocking and routing. The agents do interpretation, explanation, and escalation.


By Cyprian Aarons, AI Consultant at Topiax.
