How to Build a KYC Verification Agent Using AutoGen in Python for Payments

By Cyprian Aarons · Updated 2026-04-21
Tags: kyc-verification, autogen, python, payments

A KYC verification agent for payments takes customer identity data, checks it against policy and external evidence, and returns a decision with an audit trail. In payments, that matters because onboarding speed, fraud exposure, and compliance risk all sit on the same workflow.

Architecture

Build this agent as a small multi-agent system, not a single monolith:

  • User proxy / orchestrator

    • Receives the onboarding payload from your app or backend.
    • Triggers the verification flow and collects the final decision.
  • KYC analyst agent

    • Reviews submitted fields like name, DOB, address, country, and document metadata.
    • Applies your KYC policy rules and decides whether the case is pass, fail, or manual review.
  • Evidence retrieval tool

    • Pulls data from internal systems: customer profile, sanctions screening result, device risk score, document OCR output.
    • Keeps the analyst grounded in actual evidence instead of free-form reasoning.
  • Compliance reviewer agent

    • Checks edge cases against policy: missing fields, high-risk geographies, PEP/sanctions hits, inconsistent identity attributes.
    • Produces a structured escalation note for compliance ops.
  • Audit logger

    • Stores every prompt, tool call, decision, and reason code.
    • Gives you traceability for regulators and internal audits.
  • Policy store

    • Holds versioned KYC rules by market, product type, and risk tier.
    • Lets you change rules without rewriting agent logic.
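One way to sketch the policy store piece is a versioned lookup keyed by market and risk tier, so rule changes never touch agent code. The structure and names below are illustrative assumptions, not part of AutoGen:

```python
# Minimal policy store sketch: rule sets versioned by market, risk tier,
# and policy version. Field names here are illustrative.
POLICY_STORE = {
    ("GB", "standard", "v3"): {
        "max_device_risk_score": 70,
        "require_document_verified": True,
        "require_source_of_funds": False,
    },
}

def get_policy(market: str, risk_tier: str, version: str) -> dict:
    """Look up the active rule set; fail closed if no policy exists."""
    try:
        return POLICY_STORE[(market, risk_tier, version)]
    except KeyError:
        raise LookupError(f"No KYC policy for {market}/{risk_tier}/{version}")
```

Failing closed on an unknown market or version matters here: a missing policy should block onboarding, not silently fall through to defaults.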

Implementation

1) Install AutoGen and define your KYC payload

For this pattern, use AutoGen’s AssistantAgent for analysis and UserProxyAgent to kick off the workflow. The example below assumes you have a local policy function or API that returns structured KYC evidence.

pip install pyautogen

Then, in Python:

from autogen import AssistantAgent, UserProxyAgent

kyc_payload = {
    "customer_id": "cus_12345",
    "full_name": "Jane Doe",
    "dob": "1990-04-21",
    "country": "GB",
    "address": "12 Baker Street, London",
    "document_type": "passport",
    "document_verified": True,
    "sanctions_hit": False,
    "pep_hit": False,
    "device_risk_score": 12,
    "source_of_funds_declared": True,
}
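Before handing this payload to any agent, validate its shape deterministically so malformed submissions never reach the model. A minimal sketch, with the required-field list assumed from the example payload above:

```python
# Structural validation sketch: check required fields before any LLM call.
REQUIRED_FIELDS = {
    "customer_id", "full_name", "dob", "country",
    "document_type", "document_verified", "sanctions_hit", "pep_hit",
}

def validate_payload(payload: dict) -> list:
    """Return sorted missing field names; an empty list means structurally valid."""
    return sorted(REQUIRED_FIELDS - payload.keys())
```

A non-empty result should short-circuit straight to a data-correction loop with the customer, not to the analyst agent.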

2) Create an analyst agent with strict output formatting

Payments teams need structured decisions. Do not let the model return vague prose; force JSON-like output with reason codes so downstream systems can route the case.

import os

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

kyc_analyst = AssistantAgent(
    name="kyc_analyst",
    llm_config=llm_config,
    system_message=(
        "You are a KYC verification analyst for a payments company. "
        "Use only the provided customer payload. "
        "Return one of: APPROVE, REJECT, MANUAL_REVIEW. "
        "Always include reason_codes as a list and a short audit_summary."
    ),
)

3) Run the verification flow through a user proxy

UserProxyAgent.initiate_chat() is the simplest production-friendly entry point when your backend acts as the caller. The message should contain only the minimum necessary data for review; avoid dumping raw documents unless your residency and retention controls are in place.

import os
import json
from autogen import AssistantAgent, UserProxyAgent

def build_prompt(payload: dict) -> str:
    return (
        "Verify this payment customer for KYC.\n"
        f"Customer payload:\n{json.dumps(payload, indent=2)}\n\n"
        "Decision rules:\n"
        "- Reject if sanctions_hit or pep_hit is true\n"
        "- Manual review if document_verified is false\n"
        "- Manual review if device_risk_score > 70\n"
        "- Approve only if identity is consistent and no red flags exist\n\n"
        "Return valid JSON with keys: decision, reason_codes, audit_summary."
    )

user_proxy = UserProxyAgent(
    name="payments_backend",
    human_input_mode="NEVER",
    code_execution_config=False,  # relay-only proxy; never execute model output
)

result = user_proxy.initiate_chat(
    kyc_analyst,
    message=build_prompt(kyc_payload),
)

print(result.chat_history[-1]["content"])
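Downstream routing should never trust that raw string. A hedged parsing sketch, with key names following the analyst's system message above; anything malformed raises so the case falls back to manual review instead of silently passing:

```python
import json

ALLOWED_DECISIONS = {"APPROVE", "REJECT", "MANUAL_REVIEW"}

def parse_decision(raw: str) -> dict:
    """Parse and validate the analyst's reply; raise on anything malformed."""
    # Models sometimes wrap JSON in a markdown fence; strip it defensively.
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    data = json.loads(cleaned)
    if data.get("decision") not in ALLOWED_DECISIONS:
        raise ValueError(f"Unexpected decision: {data.get('decision')!r}")
    if not isinstance(data.get("reason_codes"), list):
        raise ValueError("reason_codes must be a list")
    return data
```

Wiring this between the chat result and your onboarding queue gives you a single choke point where malformed model output gets rejected.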

4) Add a second-pass compliance reviewer for escalations

Use a second AssistantAgent when the first pass returns MANUAL_REVIEW. This keeps your primary flow fast while still capturing compliance nuance for higher-risk cases.

compliance_reviewer = AssistantAgent(
    name="compliance_reviewer",
    llm_config=llm_config,
    system_message=(
        "You are a compliance reviewer for payment onboarding. "
        "Review only escalated KYC cases. "
        "Return JSON with keys: escalation_required, rationale, next_action."
    ),
)

review_prompt = """
This case was flagged for manual review.
Assess whether it needs compliance escalation based on:
- sanctions/PEP status
- device risk
- document verification status
- high-risk jurisdiction exposure

Case data:
{payload}
""".format(payload=json.dumps(kyc_payload))

review_result = user_proxy.initiate_chat(compliance_reviewer, message=review_prompt)
print(review_result.chat_history[-1]["content"])
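In practice the second pass should only fire when the first pass actually returns MANUAL_REVIEW. A small routing sketch, assuming the decision string has already been extracted from the analyst's JSON reply:

```python
def route_case(decision: str) -> str:
    """Map a first-pass decision to a queue; fail closed on anything unexpected."""
    if decision == "MANUAL_REVIEW":
        return "compliance_review"   # only this path triggers the second-pass reviewer
    if decision == "REJECT":
        return "rejected"
    if decision == "APPROVE":
        return "approved"
    return "compliance_review"       # unknown value: send to humans, never auto-approve
```

Gating the reviewer this way keeps the happy path at one model call while still capturing every ambiguous case.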

Production Considerations

  • Keep data residency explicit

    • If you process EU customer data in-region only, pin model endpoints and storage to that region.
    • Do not send full documents or unnecessary PII to external tools unless your legal basis and transfer controls are already approved.
  • Log every decision path

    • Store prompt version, model version, input hash, output JSON, timestamps, and rule set version.
    • Regulators care about why a customer was approved or rejected more than they care about model internals.
  • Put hard guardrails around approvals

    • Never allow the agent to override sanctions or PEP hits.
    • Encode non-negotiable rules in code before the LLM sees the case.
  • Separate low-risk automation from manual review

    • Auto-approve only when all deterministic checks pass.
    • Route anything ambiguous to an ops queue with reason codes already populated.

Common Pitfalls

  1. Letting the model make final compliance decisions without deterministic checks

    • Fix: run sanctions screening, document validation, age checks, and jurisdiction rules in code first.
    • Use the LLM for interpretation and case narration, not as the source of truth.
  2. Returning free-form text instead of structured outcomes

    • Fix: require JSON output with decision, reason_codes, and audit_summary.
    • Your onboarding service should reject malformed responses before they hit production queues.
  3. Ignoring cross-border data handling

    • Fix: classify PII fields by residency requirements before sending them into AutoGen.
    • Mask document numbers where possible and keep raw artifacts in your controlled storage layer.
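A minimal masking-and-hashing sketch for the audit trail, assuming the fields and hashing choices shown here rather than any particular compliance standard:

```python
import hashlib
import json

def mask_document_number(value: str) -> str:
    """Keep only the last four characters for ops visibility."""
    return "*" * max(len(value) - 4, 0) + value[-4:]

def audit_input_hash(payload: dict) -> str:
    """Stable SHA-256 hash of the exact input the agent saw, for the audit log."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Storing the hash alongside prompt and model versions lets you prove later exactly which input produced a given decision without retaining the raw PII in the log itself.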

If you want this to hold up in payments production, treat AutoGen as orchestration around policy—not as policy itself. That separation is what keeps onboarding fast without turning compliance into guesswork.



By Cyprian Aarons, AI Consultant at Topiax.
