How to Build a KYC Verification Agent Using AutoGen in Python for Wealth Management
A KYC verification agent for wealth management ingests client identity documents, checks them against policy and external sources, and produces an auditable decision package for compliance teams. It matters because onboarding high-net-worth clients is slow, expensive, and heavily regulated; if you automate the repetitive checks without losing control of evidence and approvals, you reduce cycle time while keeping the firm defensible under audit.
Architecture
- Client intake layer
  - Accepts passport, utility bill, tax ID, source-of-funds notes, and beneficial ownership data.
  - Normalizes inputs into a structured case object.
- Document extraction agent
  - Reads OCR text from PDFs/images.
  - Pulls out names, dates of birth, addresses, document numbers, and expiry dates.
- Policy validation agent
  - Compares extracted fields against KYC rules.
  - Flags missing fields, stale documents, PEP/sanctions hits, and inconsistent addresses.
- Risk review agent
  - Scores the case using wealth-management-specific rules.
  - Escalates complex cases: trusts, offshore entities, multiple jurisdictions, or source-of-wealth ambiguity.
- Supervisor / compliance approver
  - Reviews the final recommendation before any account opening action.
  - Ensures human sign-off for high-risk or ambiguous cases.
- Audit logger
  - Stores prompts, model outputs, timestamps, rule hits, and final decisions.
  - Supports regulator review and internal model governance.
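To make the risk review step concrete, here is a minimal sketch of rule-based scoring with an escalation threshold. The indicator names, weights, and threshold are hypothetical placeholders, not values from any real policy; a firm would take them from its written KYC procedures.

```python
from dataclasses import dataclass, field

# Hypothetical weights for illustration only.
RISK_WEIGHTS = {
    "trust_structure": 3,
    "offshore_entity": 3,
    "multiple_jurisdictions": 2,
    "source_of_wealth_unclear": 4,
}
ESCALATION_THRESHOLD = 4  # assumed cut-off for mandatory human review


@dataclass
class RiskResult:
    score: int
    triggers: list = field(default_factory=list)
    escalate: bool = False


def score_case(indicators: dict) -> RiskResult:
    """Sum the weights of every known indicator that is present on the case."""
    triggers = sorted(
        name for name, present in indicators.items()
        if present and name in RISK_WEIGHTS
    )
    score = sum(RISK_WEIGHTS[name] for name in triggers)
    return RiskResult(
        score=score,
        triggers=triggers,
        escalate=score >= ESCALATION_THRESHOLD,
    )


result = score_case({"trust_structure": True, "multiple_jurisdictions": True})
print(result.score, result.triggers, result.escalate)
```

Keeping the score deterministic means the same case always produces the same result, which is easier to defend in an audit than a score produced by a model.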
Implementation
1) Install AutoGen and define your case schema
For this pattern, use pyautogen and keep your case data explicit. Wealth management KYC fails when inputs are loosely shaped JSON blobs with no traceability.
pip install pyautogen pydantic
from typing import List, Optional

from pydantic import BaseModel, Field


class KYCCase(BaseModel):
    """Structured KYC case passed between pipeline stages."""

    client_id: str
    full_name: str
    country_of_residence: str
    document_type: str
    document_number: str
    document_expiry: str  # ISO date string, e.g. "2028-09-30"
    address: Optional[str] = None
    pep_match: bool = False
    sanctions_match: bool = False
    source_of_wealth_notes: Optional[str] = None
    risk_flags: List[str] = Field(default_factory=list)
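Raw intake data rarely arrives in the shape the schema expects. The sketch below shows one way to clean it into keyword arguments for `KYCCase` using only the standard library. The function names `to_iso_date` and `normalize_intake` and the list of accepted date formats are assumptions made for illustration; adjust them to whatever your OCR or intake layer actually emits.

```python
from datetime import datetime

# Assumed input date formats; extend to match your intake sources.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def to_iso_date(value: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def normalize_intake(raw: dict) -> dict:
    """Clean raw intake values into keyword arguments for KYCCase."""
    return {
        "client_id": raw["client_id"].strip(),
        # Collapse repeated whitespace inside the name.
        "full_name": " ".join(raw["full_name"].split()),
        "country_of_residence": raw["country_of_residence"].strip(),
        "document_type": raw["document_type"].strip().title(),
        "document_number": raw["document_number"].replace(" ", "").upper(),
        "document_expiry": to_iso_date(raw["document_expiry"]),
        # Treat an empty or missing address as None so later checks can flag it.
        "address": (raw.get("address") or "").strip() or None,
    }


fields = normalize_intake({
    "client_id": " C-10091 ",
    "full_name": "Amelia   Grant",
    "country_of_residence": "Singapore",
    "document_type": "passport",
    "document_number": "x123 4567",
    "document_expiry": "30/09/2028",
})
print(fields)
```

The resulting dictionary can be passed as `KYCCase(**fields)`, so malformed dates fail loudly at intake instead of surfacing later in a model prompt.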
2) Create specialized agents with AutoGen
AutoGen’s AssistantAgent works well for focused roles. Keep each agent narrow so you can audit behavior by function instead of trying to explain one giant prompt later.
import os

from autogen import AssistantAgent

# Shared model configuration; the API key is read from the environment.
llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
}

# Pulls KYC fields out of raw case text.
extractor = AssistantAgent(
    name="extractor",
    llm_config=llm_config,
    system_message=(
        "You extract KYC fields from provided text. "
        "Return only structured findings and note missing fields."
    ),
)

# Checks the extracted fields against KYC policy.
validator = AssistantAgent(
    name="validator",
    llm_config=llm_config,
    system_message=(
        "You validate KYC cases for wealth management. "
        "Check completeness, expiry dates, PEP/sanctions indicators, "
        "and jurisdictional risk. Be strict."
    ),
)

# Produces the final recommendation for human review.
compliance = AssistantAgent(
    name="compliance",
    llm_config=llm_config,
    system_message=(
        "You are a compliance reviewer. "
        "Recommend APPROVE / REJECT / ESCALATE with concise rationale."
    ),
)
3) Orchestrate the workflow with GroupChat and GroupChatManager
This is the core pattern. The extractor proposes findings first, the validator checks policy risk next, and compliance makes the final recommendation. In production you would feed OCR text or parsed case notes into the first message.
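from autogen import GroupChat, GroupChatManager, UserProxyAgent

case_text = """
Client: Amelia Grant
Passport: X1234567 expiring 2028-09-30
Residence: Singapore
Address proof dated last month at Marina Bay address
Source of wealth: proceeds from sale of family business
No PEP match found in screening summary
No sanctions match found in screening summary
"""

# Non-LLM proxy that submits the case, so the extractor takes the first agent turn.
intake = UserProxyAgent(
    name="intake",
    human_input_mode="NEVER",
    code_execution_config=False,
)

groupchat = GroupChat(
    agents=[intake, extractor, validator, compliance],
    messages=[],
    # The opening message counts as a round: intake plus three agent replies.
    max_round=4,
    # Fixed speaking order instead of letting the LLM choose the next speaker.
    speaker_selection_method="round_robin",
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

result = intake.initiate_chat(
    manager,
    message=(
        f"Analyze this KYC case for wealth management onboarding:\n\n{case_text}\n\n"
        "1) Extract key identity facts.\n"
        "2) Validate against KYC policy.\n"
        "3) Provide final compliance recommendation."
    ),
)

# The last message in the history is the compliance recommendation.
print(result.chat_history[-1]["content"])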
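The final message is free text, so downstream code needs a reliable way to turn it into one of the three decision keywords. The following is a minimal sketch, assuming the compliance agent follows its system message and writes the keyword in uppercase. The helper name `parse_recommendation` is hypothetical, and the fallback to ESCALATE on ambiguous output is a design assumption, not a requirement of AutoGen.

```python
import re

DECISION_PATTERN = re.compile(r"\b(APPROVE|REJECT|ESCALATE)\b")


def parse_recommendation(text: str) -> str:
    """Return the single decision keyword in the text, or ESCALATE if unclear."""
    found = set(DECISION_PATTERN.findall(text))
    if len(found) == 1:
        return found.pop()
    # Zero keywords or conflicting keywords: send the case to a human.
    return "ESCALATE"


print(parse_recommendation("Recommendation: APPROVE. Documents are current."))
print(parse_recommendation("Could APPROVE, but ESCALATE given the trust structure."))
print(parse_recommendation("Everything looks fine to me."))
```

Defaulting to ESCALATE means a malformed or hedged model answer can never open an account on its own.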
4) Add deterministic guardrails before any downstream action
Do not let the LLM be the only decision maker. Use hard rules for obvious failures like expired IDs or sanctions matches. That keeps your agent aligned with policy even when the model output is noisy.
from datetime import datetime, timezone


def hard_fail(case: KYCCase) -> list[str]:
    """Return deterministic rule violations for a case (empty list if none)."""
    issues = []

    # document_expiry is expected as an ISO date string (YYYY-MM-DD).
    expiry = datetime.strptime(case.document_expiry, "%Y-%m-%d").date()
    if expiry < datetime.now(timezone.utc).date():
        issues.append("Document expired")

    if case.pep_match:
        issues.append("PEP match requires enhanced due diligence")

    if case.sanctions_match:
        issues.append("Sanctions match requires immediate escalation")

    if not case.address:
        issues.append("Missing residential address")

    return issues


# No address is supplied here, so the missing-address rule fires.
sample_case = KYCCase(
    client_id="C-10091",
    full_name="Amelia Grant",
    country_of_residence="Singapore",
    document_type="Passport",
    document_number="X1234567",
    document_expiry="2028-09-30",
)

issues = hard_fail(sample_case)
print(issues)
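The rule hits still need to be combined with the model's recommendation before anything happens downstream. Here is one possible gate, sketched under an assumed policy: any rule hit downgrades an APPROVE to ESCALATE, and an unrecognized recommendation is also escalated. The function name `final_action` and the policy itself are illustrative assumptions; your compliance team defines the real precedence.

```python
VALID_DECISIONS = {"APPROVE", "REJECT", "ESCALATE"}


def final_action(issues: list, model_recommendation: str) -> str:
    """Combine deterministic rule hits with the model's recommendation."""
    recommendation = model_recommendation.strip().upper()

    # Anything the gate does not recognize goes to a human reviewer.
    if recommendation not in VALID_DECISIONS:
        return "ESCALATE"

    # Hard rules win: the model cannot approve a case with open rule hits.
    if issues and recommendation == "APPROVE":
        return "ESCALATE"

    return recommendation


print(final_action([], "APPROVE"))
print(final_action(["Missing residential address"], "APPROVE"))
print(final_action(["Document expired"], "REJECT"))
```

With this ordering the model can make a decision stricter but never looser than the hard rules allow.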
Production Considerations
- Deployment boundaries
  - Keep KYC processing in-region if your firm has data residency requirements.
  - For cross-border wealth platforms, pin storage and inference to approved jurisdictions only.
- Monitoring
  - Log every agent turn with client ID, rule hits, model version, and reviewer outcome.
  - Track false positives on PEP/sanctions flags because they create onboarding friction fast.
- Guardrails
  - Use deterministic checks for expiry dates, mandatory fields, jurisdiction blocks, and sanctions hits.
  - Require human approval for trusts, shell companies, politically exposed persons, and complex beneficial ownership structures.
- Compliance evidence
  - Store extracted text snippets alongside decisions so auditors can see why a case passed or failed.
  - Keep immutable logs for retention periods defined by your AML/KYC policy.
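As a rough illustration of the monitoring point, the screening false-positive rate can be computed from reviewer outcomes. The record fields `screening_flagged` and `reviewer_outcome`, and the outcome labels `CLEARED` and `CONFIRMED`, are hypothetical; map them to whatever your case management system records.

```python
from typing import Optional


def false_positive_rate(reviews: list) -> Optional[float]:
    """Share of screening-flagged cases that a reviewer later cleared."""
    flagged = [r for r in reviews if r["screening_flagged"]]
    if not flagged:
        return None  # nothing was flagged, so the rate is undefined
    cleared = sum(1 for r in flagged if r["reviewer_outcome"] == "CLEARED")
    return cleared / len(flagged)


reviews = [
    {"client_id": "C-1", "screening_flagged": True, "reviewer_outcome": "CLEARED"},
    {"client_id": "C-2", "screening_flagged": True, "reviewer_outcome": "CONFIRMED"},
    {"client_id": "C-3", "screening_flagged": False, "reviewer_outcome": "CLEARED"},
    {"client_id": "C-4", "screening_flagged": True, "reviewer_outcome": "CLEARED"},
]
rate = false_positive_rate(reviews)
print(f"Screening false-positive rate: {rate:.0%}")
```

A rising rate suggests the screening match rules are too loose; it does not by itself justify relaxing them without compliance sign-off.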
Common Pitfalls
- Letting the model decide everything
  - Mistake: using one assistant to both extract facts and approve onboarding.
  - Fix: split extraction, validation, and compliance review; add hard rules outside the model.
- Ignoring wealth-management edge cases
  - Mistake: treating all clients like retail applicants.
  - Fix: add branches for trusts, family offices, offshore entities, source-of-wealth verification, and beneficial ownership chains.
- Weak audit trails
  - Mistake: storing only the final answer.
  - Fix: persist prompts, intermediate outputs, screening results, timestamps, and reviewer identity so every decision is defensible under audit.
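One way to strengthen the audit trail is a hash-chained log, where each entry's hash covers the previous entry's hash so later edits become detectable. This is a minimal in-memory sketch using only the standard library; the function names and record layout are assumptions, and it provides tamper evidence rather than true immutability, which typically also needs write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and check that each entry links to the one before."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"agent": "extractor", "client_id": "C-10091", "output": "fields extracted"})
append_entry(audit_log, {"agent": "compliance", "client_id": "C-10091", "output": "ESCALATE"})
print(verify_chain(audit_log))
```

Running `verify_chain` on a schedule, or before handing records to an auditor, shows whether any stored prompt, output, or decision was altered after the fact.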
By Cyprian Aarons, AI Consultant at Topiax.