How to Build a KYC Verification Agent Using AutoGen in Python for Wealth Management

By Cyprian Aarons | Updated 2026-04-21
Tags: kyc-verification, autogen, python, wealth-management

A KYC verification agent for wealth management ingests client identity documents, checks them against policy and external sources, and produces an auditable decision package for compliance teams. It matters because onboarding high-net-worth clients is slow, expensive, and heavily regulated; if you automate the repetitive checks without losing control of evidence and approvals, you reduce cycle time while keeping the firm defensible under audit.

Architecture

  • Client intake layer

    • Accepts passport, utility bill, tax ID, source-of-funds notes, and beneficial ownership data.
    • Normalizes inputs into a structured case object.
  • Document extraction agent

    • Reads OCR text from PDFs/images.
    • Pulls out names, dates of birth, addresses, document numbers, and expiry dates.
  • Policy validation agent

    • Compares extracted fields against KYC rules.
    • Flags missing fields, stale documents, PEP/sanctions hits, and inconsistent addresses.
  • Risk review agent

    • Scores the case using wealth-management-specific rules.
    • Escalates complex cases: trusts, offshore entities, multiple jurisdictions, or source-of-wealth ambiguity.
  • Supervisor / compliance approver

    • Reviews the final recommendation before any account opening action.
    • Ensures human sign-off for high-risk or ambiguous cases.
  • Audit logger

    • Stores prompts, model outputs, timestamps, rule hits, and final decisions.
    • Supports regulator review and internal model governance.
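The risk review step in the list above can be sketched as a small deterministic rule table. This is a minimal sketch under stated assumptions, not a firm's actual policy: the flag names, weights, threshold, and the `score_case` helper are all hypothetical placeholders that a compliance team would replace.

```python
from dataclasses import dataclass, field

# Hypothetical weights for illustration only; real values come from firm policy.
RISK_WEIGHTS = {
    "trust_structure": 3,
    "offshore_entity": 3,
    "multiple_jurisdictions": 2,
    "source_of_wealth_unclear": 4,
}
ESCALATION_THRESHOLD = 3  # assumed cut-off, not a regulatory number


@dataclass
class RiskResult:
    score: int
    escalate: bool
    reasons: list = field(default_factory=list)


def score_case(flags):
    """Sum the weights of recognized flags and decide whether to escalate."""
    # Flags without a configured weight are ignored in this sketch.
    reasons = [flag for flag in flags if flag in RISK_WEIGHTS]
    score = sum(RISK_WEIGHTS[flag] for flag in reasons)
    return RiskResult(score=score, escalate=score >= ESCALATION_THRESHOLD, reasons=reasons)


risk = score_case(["multiple_jurisdictions", "trust_structure"])
print(risk.score, risk.escalate)
```

Keeping the weights in a plain table rather than in a prompt means the scoring logic can be reviewed and versioned like any other policy artifact.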

Implementation

1) Install AutoGen and define your case schema

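For this pattern, install pyautogen, which provides the autogen module imported in the steps below, and keep your case data explicit. The code in this guide targets the classic AutoGen API (AssistantAgent, GroupChat, initiate_chat); newer AutoGen releases restructure these imports, so pin the version you test against. Wealth management KYC fails when inputs are loosely shaped JSON blobs with no traceability.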

pip install pyautogen pydantic

from typing import List, Optional

from pydantic import BaseModel, Field

class KYCCase(BaseModel):
    client_id: str
    full_name: str
    country_of_residence: str
    document_type: str
    document_number: str
    document_expiry: str  # ISO date string, e.g. "2028-09-30"
    address: Optional[str] = None
    pep_match: bool = False
    sanctions_match: bool = False
    source_of_wealth_notes: Optional[str] = None
    # default_factory makes the "fresh list per case" intent explicit
    risk_flags: List[str] = Field(default_factory=list)
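Before constructing a case object, it can help to normalize raw intake values so that stray whitespace and malformed dates are caught early. The following is a minimal stdlib-only sketch; `normalize_intake` and the raw dictionary layout are assumptions for illustration and are not part of AutoGen or pydantic.

```python
from datetime import datetime


def normalize_intake(raw: dict) -> dict:
    """Trim string values and check that the expiry date parses as YYYY-MM-DD."""
    cleaned = {
        key: value.strip() if isinstance(value, str) else value
        for key, value in raw.items()
    }
    # strptime raises ValueError when the value does not match the format,
    # for example "30/09/2028".
    datetime.strptime(cleaned["document_expiry"], "%Y-%m-%d")
    return cleaned


fields = normalize_intake({
    "client_id": " C-10091 ",
    "full_name": "Amelia Grant",
    "country_of_residence": "Singapore",
    "document_type": "Passport",
    "document_number": "X1234567",
    "document_expiry": "2028-09-30",
})
print(fields["client_id"])
```

The cleaned dictionary can then be passed on as `KYCCase(**fields)`, so schema validation and intake cleanup stay in separate, testable steps.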

2) Create specialized agents with AutoGen

AutoGen’s AssistantAgent works well for focused roles. Keep each agent narrow so you can audit behavior by function instead of trying to explain one giant prompt later.

import os
from autogen import AssistantAgent

llm_config = {
    # config_list is the documented shape: one entry per model endpoint
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
}

extractor = AssistantAgent(
    name="extractor",
    llm_config=llm_config,
    system_message=(
        "You extract KYC fields from provided text. "
        "Return only structured findings and note missing fields."
    ),
)

validator = AssistantAgent(
    name="validator",
    llm_config=llm_config,
    system_message=(
        "You validate KYC cases for wealth management. "
        "Check completeness, expiry dates, PEP/sanctions indicators, "
        "and jurisdictional risk. Be strict."
    ),
)

compliance = AssistantAgent(
    name="compliance",
    llm_config=llm_config,
    system_message=(
        "You are a compliance reviewer. "
        "Recommend APPROVE / REJECT / ESCALATE with concise rationale."
    ),
)

3) Orchestrate the workflow with GroupChat and GroupChatManager

This is the core pattern. The extractor proposes findings first, the validator checks policy risk next, and compliance makes the final recommendation. In production you would feed OCR text or parsed case notes into the first message.

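from autogen import GroupChat, GroupChatManager, UserProxyAgent

case_text = """
Client: Amelia Grant
Passport: X1234567 expiring 2028-09-30
Residence: Singapore
Address proof dated last month at Marina Bay address
Source of wealth: proceeds from sale of family business
No PEP match found in screening summary
No sanctions match found in screening summary
"""

# Non-LLM initiator that submits the case, so each specialist gets its own turn.
intake = UserProxyAgent(
    name="intake",
    human_input_mode="NEVER",
    code_execution_config=False,
)

groupchat = GroupChat(
    agents=[intake, extractor, validator, compliance],
    messages=[],
    max_round=4,  # the intake message plus one turn per specialist
    # Fixed order (extractor, validator, compliance) instead of LLM-chosen speakers.
    speaker_selection_method="round_robin",
)

manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

result = intake.initiate_chat(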
    manager,
    message=(
        f"Analyze this KYC case for wealth management onboarding:\n\n{case_text}\n\n"
        "1) Extract key identity facts.\n"
        "2) Validate against KYC policy.\n"
        "3) Provide final compliance recommendation."
    ),
)

print(result.chat_history[-1]["content"])
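The final message is free text, so downstream code needs a conservative way to turn it into one of the three outcomes. Here is a minimal sketch, assuming the compliance agent includes one of the keywords APPROVE, REJECT, or ESCALATE as instructed; `parse_recommendation` is a hypothetical helper, not an AutoGen function.

```python
import re

DECISIONS = ("APPROVE", "REJECT", "ESCALATE")
DECISION_PATTERN = re.compile(r"\b(" + "|".join(DECISIONS) + r")\b", re.IGNORECASE)


def parse_recommendation(text: str) -> str:
    """Return the single decision keyword found in the text, else ESCALATE."""
    found = {match.upper() for match in DECISION_PATTERN.findall(text)}
    # Missing or conflicting keywords go to a human rather than being guessed.
    if len(found) == 1:
        return found.pop()
    return "ESCALATE"


print(parse_recommendation("Recommendation: APPROVE. Documents are current."))
print(parse_recommendation("Could approve, but reject if the address is unverified."))
```

Defaulting to ESCALATE on ambiguity keeps a noisy model response from ever being read as an approval.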

4) Add deterministic guardrails before any downstream action

Do not let the LLM be the only decision maker. Use hard rules for obvious failures like expired IDs or sanctions matches. That keeps your agent aligned with policy even when the model output is noisy.

from datetime import datetime, timezone

def hard_fail(case: KYCCase) -> list[str]:
    """Return deterministic policy failures that are never left to the model."""
    issues = []
    expiry = datetime.strptime(case.document_expiry, "%Y-%m-%d").date()
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
    if expiry < datetime.now(timezone.utc).date():
        issues.append("Document expired")
    if case.pep_match:
        issues.append("PEP match requires enhanced due diligence")
    if case.sanctions_match:
        issues.append("Sanctions match requires immediate escalation")
    if not case.address:
        issues.append("Missing residential address")
    return issues

sample_case = KYCCase(
    client_id="C-10091",
    full_name="Amelia Grant",
    country_of_residence="Singapore",
    document_type="Passport",
    document_number="X1234567",
    document_expiry="2028-09-30",
)

issues = hard_fail(sample_case)
print(issues)
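One way to combine the hard-rule output with the model's recommendation is to let the rules only tighten the outcome, never loosen it. The sketch below is an assumption about how a firm might wire this together; `final_decision` and its precedence order are hypothetical and would need to reflect actual policy.

```python
def final_decision(hard_issues: list, llm_recommendation: str) -> str:
    """Combine deterministic issues with the model's APPROVE/REJECT/ESCALATE output."""
    # A sanctions hit always goes to a human, whatever the model said.
    if any("sanctions" in issue.lower() for issue in hard_issues):
        return "ESCALATE"
    # Any other hard issue vetoes an automatic approval.
    if hard_issues and llm_recommendation == "APPROVE":
        return "ESCALATE"
    return llm_recommendation


print(final_decision(["Missing residential address"], "APPROVE"))
print(final_decision([], "APPROVE"))
```

Even an APPROVE result here is a recommendation for the compliance approver described in the architecture, not an account-opening action.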

Production Considerations

  • Deployment boundaries

    • Keep KYC processing in-region if your firm has data residency requirements.
    • For cross-border wealth platforms, pin storage and inference to approved jurisdictions only.
  • Monitoring

    • Log every agent turn with client ID, rule hits, model version, and reviewer outcome.
    • Track false positives on PEP/sanctions flags because they create onboarding friction fast.
  • Guardrails

    • Use deterministic checks for expiry dates, mandatory fields, jurisdiction blocks, and sanctions hits.
    • Require human approval for trusts, shell companies, politically exposed persons, and complex beneficial ownership structures.
  • Compliance evidence

    • Store extracted text snippets alongside decisions so auditors can see why a case passed or failed.
    • Keep immutable logs for retention periods defined by your AML/KYC policy.
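The logging and evidence points above can be illustrated with a hash-chained audit log, where each record commits to the one before it so that edits are detectable. This is a minimal in-memory sketch under stated assumptions: `append_audit_record`, `verify_chain`, and the record fields are hypothetical, and a production system would persist records to write-once storage rather than a Python list.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder hash for the first record


def append_audit_record(log: list, client_id: str, event: str, payload: dict) -> dict:
    """Append a record whose hash covers its content and the previous record's hash."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "client_id": client_id,
        "event": event,
        "payload": payload,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    record["hash"] = hashlib.sha256(body).hexdigest()
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or reordered record breaks the chain."""
    prev_hash = GENESIS_HASH
    for record in log:
        content = {key: value for key, value in record.items() if key != "hash"}
        body = json.dumps(content, sort_keys=True).encode("utf-8")
        if record["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(body).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


audit_log = []
append_audit_record(audit_log, "C-10091", "validator_turn", {"rule_hits": ["missing_address"]})
append_audit_record(audit_log, "C-10091", "reviewer_decision", {"outcome": "ESCALATE"})
print(verify_chain(audit_log))
```

Hash chaining gives tamper evidence, not tamper prevention, so it complements rather than replaces access controls and retention policy.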

Common Pitfalls

  1. Letting the model decide everything

    • Mistake: using one assistant to both extract facts and approve onboarding.
    • Fix: split extraction, validation, and compliance review; add hard rules outside the model.
  2. Ignoring wealth-management edge cases

    • Mistake: treating all clients like retail applicants.
    • Fix: add branches for trusts, family offices, offshore entities, source-of-wealth verification, and beneficial ownership chains.
  3. Weak audit trails

    • Mistake: storing only the final answer.
    • Fix: persist prompts, intermediate outputs, screening results, timestamps, and reviewer identity so every decision is defensible under audit.

By Cyprian Aarons, AI Consultant at Topiax.
