How to Build a Compliance Checking Agent Using AutoGen in Python for Retail Banking

By Cyprian Aarons. Updated 2026-04-21

A compliance checking agent for retail banking reviews customer-facing content, transaction narratives, KYC notes, and internal case text against policy rules before anything is sent to a customer, analyst, or downstream system. It matters because a missed disclosure, a prohibited phrase, or a data residency violation can turn into regulatory exposure, customer harm, or an audit finding.

Architecture

  • UserProxyAgent

    • Receives the request from your application and triggers the workflow.
    • In production, this is usually wrapped behind your API layer, not exposed directly to end users.
  • ComplianceAssistantAgent

    • The main reasoning agent that inspects text against bank policy.
    • Uses the policy pack you provide: AML/KYC rules, complaint handling rules, marketing disclosures, fair lending constraints, and data handling requirements.
  • Policy retrieval layer

    • Pulls the current compliance policy snippets from a controlled source of truth.
    • For retail banking, this should be versioned and region-aware so UK/EU/US policies do not get mixed.
  • Audit logger

    • Stores every prompt, response, policy version, and decision.
    • Needed for exam readiness and internal model risk reviews.
  • Human review queue

    • Handles low-confidence or high-risk cases.
    • Required when the agent detects ambiguous language, missing disclosures, or possible PII leakage.
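
As a rough illustration of the audit logger component, the sketch below builds one record per decision and serializes it as a JSON line. The AuditRecord class, its field names, and the version label are assumptions made for this example, not a regulatory schema or part of AutoGen.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    # Field names are illustrative; align them with your own audit schema.
    input_sha256: str
    prompt: str
    response: str
    policy_version: str
    decision: str
    created_at: str


def build_audit_record(text: str, prompt: str, response: str,
                       policy_version: str, decision: str) -> AuditRecord:
    """Hash the reviewed text and capture what is needed to trace the decision."""
    return AuditRecord(
        input_sha256=hashlib.sha256(text.encode("utf-8")).hexdigest(),
        prompt=prompt,
        response=response,
        policy_version=policy_version,
        decision=decision,
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize as a single JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = build_audit_record(
    text="Open a premium savings account today.",
    prompt="Review this retail banking text: <text>",
    response="risk_level: low",
    policy_version="uk-deposits-v1",  # hypothetical version label
    decision="approve",
)
print(to_log_line(record))
```

Writing one immutable line per decision keeps the log easy to replay during a review; where the lines are stored (database, object store, SIEM) is left open here.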

Implementation

1. Install AutoGen and define the policy scope

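Use pyautogen and keep the scope narrow. The code below uses the classic autogen import path from the pyautogen 0.2-style API; newer AutoGen releases expose a different API, so pin the version you test against. This agent should only check content; it should not rewrite customer communications unless your control framework allows it.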

pip install pyautogen

from autogen import AssistantAgent, UserProxyAgent

COMPLIANCE_SYSTEM_MESSAGE = """
You are a retail banking compliance checker.
Your job is to review text for:
- prohibited claims
- missing disclosures
- possible PII leakage
- AML/KYC red flags
- region-specific compliance issues

Return:
1. risk_level: low|medium|high
2. findings: bullet list
3. recommended_action: approve|revise|escalate
4. rationale: concise explanation grounded in policy
Do not invent policy. If uncertain, escalate.
"""

2. Create the agents with deterministic settings

For compliance workflows you want stable outputs. Setting temperature to zero reduces run-to-run variation but does not guarantee identical responses, so validate the response structure in your own parsing layer.

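# No api_key is set here, so the underlying OpenAI client is expected to
# fall back to the OPENAI_API_KEY environment variable.
llm_config = {
    "model": "gpt-4o-mini",
    "temperature": 0,  # minimize run-to-run variation
}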

compliance_agent = AssistantAgent(
    name="compliance_checker",
    system_message=COMPLIANCE_SYSTEM_MESSAGE,
    llm_config=llm_config,
)

user_proxy = UserProxyAgent(
    name="bank_app",
    human_input_mode="NEVER",
    code_execution_config=False,
)
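
The parsing layer mentioned above could look something like the following sketch. It assumes the model follows the "field: value" layout requested in the system message; the ComplianceResult class and parse_result function are hypothetical names introduced for this example. Anything that does not parse returns None so the caller can escalate.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ComplianceResult:
    risk_level: str
    recommended_action: str
    findings: List[str] = field(default_factory=list)


# The trailing $ rejects an echoed template such as "risk_level: low|medium|high".
_RISK = re.compile(r"risk_level\s*:\s*(low|medium|high)\s*$",
                   re.IGNORECASE | re.MULTILINE)
_ACTION = re.compile(r"recommended_action\s*:\s*(approve|revise|escalate)\s*$",
                     re.IGNORECASE | re.MULTILINE)


def parse_result(raw: str) -> Optional[ComplianceResult]:
    """Return a parsed result, or None when a required field is missing or malformed."""
    risk = _RISK.search(raw)
    action = _ACTION.search(raw)
    if risk is None or action is None:
        return None
    # Simplification: every bulleted line in the reply is treated as a finding.
    findings = [
        line.strip().lstrip("-• ").strip()
        for line in raw.splitlines()
        if line.strip().startswith(("-", "•"))
    ]
    return ComplianceResult(risk.group(1).lower(), action.group(1).lower(), findings)


example = (
    "1. risk_level: high\n"
    "2. findings:\n"
    "- SSN present\n"
    "- guaranteed returns claim\n"
    "3. recommended_action: escalate\n"
    "4. rationale: PII in note"
)
print(parse_result(example))
```

Returning None instead of raising keeps the calling code simple: a missing result is handled the same way as an explicit escalation.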

3. Send a real banking example through the agent

This pattern works for marketing copy checks, case-note checks, and outbound message reviews.

def check_compliance(text: str) -> str:
    message = f"""
Review this retail banking text:

{text}

Check for:
- misleading product claims
- missing APR / fee disclosures where relevant
- PII exposure
- suspicious transaction language
- any escalation requirement

Respond in the required format only.
"""
    result = user_proxy.initiate_chat(
        compliance_agent,
        message=message,
        max_turns=1,
    )
    return result.chat_history[-1]["content"]


sample_text = """
Open a premium savings account today and get guaranteed returns with no risk.
Customer SSN: 123-45-6789 was mentioned in the note.
"""

print(check_compliance(sample_text))

4. Add a policy gate before approval

In production you do not trust the model output blindly. Parse the result and block anything high risk or ambiguous.

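import re

def should_approve(compliance_result: str) -> bool:
    """Fail closed: approve only an explicit 'approve' verdict that is not high risk."""
    lowered = compliance_result.lower()
    # Trailing $ rejects an echoed template such as "risk_level: low|medium|high".
    risk = re.search(r"risk_level\s*:\s*(low|medium|high)\s*$", lowered, re.MULTILINE)
    action = re.search(r"recommended_action\s*:\s*(approve|revise|escalate)\s*$", lowered, re.MULTILINE)
    if risk is None or action is None:
        return False  # missing or ambiguous fields go to human review
    if "pii" in lowered or "ssn" in lowered:
        return False  # conservative: any PII mention blocks auto-approval
    # "revise" and "escalate" must never be auto-approved.
    return risk.group(1) != "high" and action.group(1) == "approve"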


result = check_compliance(sample_text)
if should_approve(result):
    print("APPROVED")
else:
    print("ESCALATE TO HUMAN REVIEW")

Production Considerations

  • Deploy behind a regional boundary

    • Keep EU customer data in EU-hosted infrastructure and US customer data in approved US regions.
    • Do not route sensitive retail banking prompts to generic third-party endpoints without legal review.
  • Log everything for auditability

    • Store input text hash, full prompt template version, model version, output, decision outcome, and reviewer override.
    • Regulators care about traceability more than clever prompting.
  • Add hard guardrails before model invocation

    • Redact account numbers, SSNs, card PANs, passport numbers, and internal case IDs where they are not needed for review.
    • Use deterministic regex filters before AutoGen sees the payload.
  • Treat low-confidence results as failures

    • If policy context is missing or the model returns vague language like “may be non-compliant,” escalate to a human queue.
    • Banking compliance systems should prefer false positives over false negatives.
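
The deterministic regex filter described above can be sketched as follows. The two patterns are illustrative assumptions (a US-style SSN and a 13 to 19 digit card number); real identifier formats vary by country and issuer, and the redact function is a hypothetical helper, not part of AutoGen.

```python
import re

# Illustrative patterns only; extend and validate them for your own data.
_PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    ("PAN", re.compile(r"\b\d(?:[ -]?\d){12,18}\b")),
]


def redact(text: str) -> str:
    """Replace each matched identifier with a labeled placeholder."""
    for label, pattern in _PATTERNS:
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


sample = "Customer SSN: 123-45-6789 paid with card 4111 1111 1111 1111."
print(redact(sample))
# Customer SSN: [REDACTED_SSN] paid with card [REDACTED_PAN].
```

Labeled placeholders let the agent still flag that an identifier was present without ever seeing its value.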

Common Pitfalls

  1. Letting the agent free-write policy interpretations

    • Fix this by providing exact policy excerpts and requiring grounded findings only.
    • If the agent cannot cite a rule from the policy excerpts in your prompt pack, it should escalate.
  2. Sending raw customer data into the chat

    • Fix this with preprocessing redaction and minimum necessary data access.
    • A compliance checker does not need full PANs or full SSNs to flag violations.
  3. Using one global policy set for all regions

    • Fix this by partitioning policies by jurisdiction and product type.
    • Mortgage disclosures, deposit account rules, complaints handling, and sanctions screening all have different controls depending on market and entity.
  4. Skipping human review on edge cases

    • Fix this with explicit escalation thresholds tied to severity classes.
    • High-risk items like potential fraud indicators, discriminatory language, or privacy leaks should never auto-pass on model output alone.
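
To make the partitioning in pitfall 3 concrete, here is a minimal sketch of a policy lookup keyed by jurisdiction and product type. The in-memory POLICY_STORE, the version labels, and the placeholder excerpts are all assumptions for illustration; a real deployment would read from a controlled, versioned source of truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicySnippet:
    version: str
    text: str


# Hypothetical store keyed by (jurisdiction, product type).
POLICY_STORE = {
    ("UK", "deposits"): PolicySnippet("uk-deposits-v1", "Placeholder excerpt for UK deposit accounts."),
    ("US", "deposits"): PolicySnippet("us-deposits-v1", "Placeholder excerpt for US deposit accounts."),
}


class PolicyNotFound(LookupError):
    """Raised when no policy exists for the requested partition."""


def get_policy(jurisdiction: str, product: str) -> PolicySnippet:
    """Look up exactly one partition; never fall back to another region."""
    key = (jurisdiction.upper(), product.lower())
    try:
        return POLICY_STORE[key]
    except KeyError:
        raise PolicyNotFound(f"No policy for {key}") from None


print(get_policy("uk", "Deposits").version)
```

Raising on a missing partition, rather than defaulting to a global policy, lets the caller route the case to human review instead of checking text against the wrong jurisdiction's rules.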

By Cyprian Aarons, AI Consultant at Topiax.
