How to Build a KYC Verification Agent Using CrewAI in Python for Lending
A KYC verification agent for lending takes borrower identity data, checks it against internal and external sources, flags mismatches, and produces an audit-ready decision package. For lenders, this matters because onboarding speed, fraud prevention, and regulatory compliance all depend on getting identity verification right before credit is issued.
Architecture
- •Input normalization layer
  - •Cleans applicant data from loan forms, CRM records, or uploaded documents.
  - •Standardizes names, DOBs, addresses, and government IDs before any checks run.
- •Verification tools
  - •Connect to KYC vendors, sanctions lists, internal watchlists, and document OCR services.
  - •Expose these as Python functions the agent can call through CrewAI tools.
- •KYC analyst agent
  - •Reviews the outputs from each tool.
  - •Applies lending-specific rules like “name mismatch above threshold requires manual review.”
- •Decision aggregator
  - •Consolidates results into approve / reject / refer-for-review outcomes.
  - •Produces a structured JSON summary for downstream loan origination systems.
- •Audit logger
  - •Stores every tool call, input hash, output, and final recommendation.
  - •Keeps a defensible trail for compliance reviews and model governance.
- •Policy guardrails
  - •Enforce what the agent can and cannot decide automatically.
  - •Prevent unsupported claims like “identity verified” when only partial signals are available.
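The input normalization layer above could be sketched with the standard library alone. This is a minimal illustration, not a complete normalizer: the function names, the accepted date formats (day-first for slashed dates), and the record keys are assumptions, and address normalization is left out because it usually needs a dedicated service.

```python
import re
import unicodedata
from datetime import datetime


def normalize_name(raw: str) -> str:
    """Strip accents, collapse whitespace, and upper-case a name for comparison."""
    decomposed = unicodedata.normalize("NFKD", raw)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", without_accents).strip().upper()


def normalize_dob(raw: str) -> str:
    """Return the DOB as YYYY-MM-DD, trying a few assumed input formats."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date of birth format: {raw!r}")


def normalize_id(raw: str) -> str:
    """Remove separators and upper-case a government ID number."""
    return re.sub(r"[^A-Za-z0-9]", "", raw).upper()


def normalize_applicant(record: dict) -> dict:
    """Normalize the fields used for matching; keep the raw record elsewhere for audit."""
    return {
        "full_name": normalize_name(record["full_name"]),
        "date_of_birth": normalize_dob(record["date_of_birth"]),
        "id_number": normalize_id(record["id_number"]),
        "country": record["country"].strip().upper(),
    }
```

The normalized values are meant for matching only. Keeping the applicant's original spelling alongside them avoids losing information that a manual reviewer may need.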
Implementation
1) Install CrewAI and define your KYC tools
CrewAI agents work best when you wrap real checks as tools instead of stuffing everything into prompts. For lending, keep each tool narrow: one for sanctions lookup, one for ID validation, one for residency or address consistency.
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
from pydantic import BaseModel, Field


class KYCInput(BaseModel):
    full_name: str = Field(..., description="Applicant legal full name")
    date_of_birth: str = Field(..., description="DOB in YYYY-MM-DD")
    id_number: str = Field(..., description="Government-issued ID number")
    country: str = Field(..., description="Applicant country of residence")


class SanctionsLookupTool(BaseTool):
    name: str = "sanctions_lookup"
    description: str = "Checks whether an applicant matches sanctions or watchlist records."

    def _run(self, full_name: str) -> str:
        # Placeholder: replace with a vendor API call
        blocked_names = {"John Doe", "Jane Black"}
        return "match" if full_name in blocked_names else "no_match"


class IdValidationTool(BaseTool):
    name: str = "id_validation"
    description: str = "Validates government ID format and basic checksum rules."

    def _run(self, id_number: str) -> str:
        # Placeholder: only checks length; real format and checksum rules vary by ID type
        return "valid" if len(id_number) >= 8 else "invalid"


class AddressRiskTool(BaseTool):
    name: str = "address_risk"
    description: str = "Checks whether the applicant country is supported by the lender's policy."

    def _run(self, country: str) -> str:
        # Expects an ISO 3166-1 alpha-2 country code, e.g. "KE"
        restricted = {"IR", "KP", "SY"}
        return "restricted" if country in restricted else "ok"
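The placeholder sanctions check above uses exact set membership, so a near miss such as a dropped letter would return "no_match". One way to surface near matches for manual review is a similarity score. The sketch below uses `difflib.SequenceMatcher` from the standard library; the 0.85 threshold, the `possible_match` outcome, and the function names are illustrative assumptions, and a production screen would normally rely on the vendor's own matching.

```python
from difflib import SequenceMatcher

# Assumed threshold for illustration; tune it against labelled screening data.
MANUAL_REVIEW_THRESHOLD = 0.85


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def screen_name(full_name: str, watchlist: set[str]) -> dict:
    """Return the closest watchlist entry and an outcome based on the score."""
    best_entry, best_score = None, 0.0
    for entry in watchlist:
        score = name_similarity(full_name, entry)
        if score > best_score:
            best_entry, best_score = entry, score

    if best_score == 1.0:
        outcome = "match"
    elif best_score >= MANUAL_REVIEW_THRESHOLD:
        outcome = "possible_match"  # route to manual review rather than auto-decide
    else:
        outcome = "no_match"
    return {"outcome": outcome, "closest_entry": best_entry, "score": round(best_score, 3)}
```

Returning the closest entry and score, not just a label, gives the analyst agent and the audit trail something concrete to cite.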
2) Create a KYC analyst agent with explicit lending constraints
Do not let the agent make up policy. State the rules explicitly in the agent definition (here, in the backstory) so the output stays usable by compliance teams and loan ops.
kyc_agent = Agent(
    role="KYC Verification Analyst",
    goal=(
        "Verify borrower identity signals for lending onboarding and produce "
        "a structured recommendation with audit-friendly reasoning."
    ),
    backstory=(
        "You review KYC evidence for consumer and SME lending. "
        "You never approve cases with sanctions hits or restricted-country issues. "
        "You refer ambiguous cases to manual review."
    ),
    tools=[SanctionsLookupTool(), IdValidationTool(), AddressRiskTool()],
    verbose=True,
)
3) Define a task that forces structured output
For production lending flows, you want a fixed set of fields that downstream systems can parse. Ask for a status, the reasons behind it, any risk flags, and a next action for escalation.
# Build the applicant first; an f-string expression cannot span two string literals
applicant = KYCInput(
    full_name="Amina Patel",
    date_of_birth="1991-04-18",
    id_number="AB1234567",
    country="KE",
)

kyc_task = Task(
    description=(
        "Verify this applicant using available tools. "
        "Return a JSON-like summary with keys: status, reasons, risk_flags, next_action.\n\n"
        f"Applicant data:\n{applicant.model_dump()}"
    ),
    expected_output=(
        "{status: 'approve'|'reject'|'manual_review', reasons: [...], "
        "risk_flags: [...], next_action: ...}"
    ),
    agent=kyc_agent,
)
4) Run the crew and consume the result
CrewAI’s Crew class orchestrates execution. In a lending pipeline, treat the result as an input to policy enforcement rather than a final source of truth.
crew = Crew(
    agents=[kyc_agent],
    tasks=[kyc_task],
)

result = crew.kickoff()
print(result)
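Treating the result as an input to policy enforcement could look like the sketch below. It assumes the agent's recommendation has already been parsed into a dict and that the raw tool outputs were captured separately under the tool names used earlier. The `enforce_policy` function, the severity ordering, and the specific rules (for example, an unvalidated ID leading to manual review) are illustrative assumptions, not a prescribed policy.

```python
# Higher number means a stricter outcome; the strictest finding wins.
SEVERITY = {"approve": 0, "manual_review": 1, "reject": 2}


def enforce_policy(recommendation: dict, tool_outputs: dict) -> dict:
    """Combine hard rules on raw tool outputs with the agent's advisory status."""
    findings = []  # (status, reason) pairs

    sanctions = tool_outputs.get("sanctions_lookup")
    if sanctions == "match":
        findings.append(("reject", "sanctions_match"))
    elif sanctions != "no_match":
        findings.append(("manual_review", "sanctions_result_missing"))

    address = tool_outputs.get("address_risk")
    if address == "restricted":
        findings.append(("reject", "restricted_country"))
    elif address != "ok":
        findings.append(("manual_review", "address_result_missing"))

    if tool_outputs.get("id_validation") != "valid":
        findings.append(("manual_review", "id_not_validated"))

    agent_status = recommendation.get("status")
    if isinstance(agent_status, str) and agent_status in SEVERITY:
        findings.append((agent_status, "agent_recommendation"))
    else:
        findings.append(("manual_review", "unrecognized_agent_status"))

    final_status = max((status for status, _ in findings), key=SEVERITY.__getitem__)
    return {
        "final_status": final_status,
        "agent_status": agent_status,
        "policy_reasons": [r for _, r in findings if r != "agent_recommendation"],
    }
```

With this shape the agent can make an outcome stricter but never looser: an "approve" recommendation cannot override a sanctions match or a missing tool result.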
If you want to harden this further:
- •Parse result into a Pydantic model before persisting it.
- •Reject any response that does not contain all required fields.
- •Store tool outputs separately from final recommendations for auditability.
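The first two hardening steps can be sketched as a strict parser. To keep the example dependency-free it uses the standard library instead of Pydantic; the same checks could be expressed as a Pydantic model. The function and exception names are hypothetical, and the sketch assumes the agent's response is available as text (for example via `str(result)`) and that the agent was asked for strict JSON.

```python
import json

REQUIRED_KEYS = {"status", "reasons", "risk_flags", "next_action"}
ALLOWED_STATUSES = {"approve", "reject", "manual_review"}


class InvalidKYCOutput(ValueError):
    """Raised when the agent's response cannot be used as structured output."""


def parse_kyc_output(raw_text: str) -> dict:
    """Extract and validate the JSON object in the agent's response."""
    # Responses may wrap the JSON in prose, so take the outermost braces.
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        raise InvalidKYCOutput("no JSON object found")
    try:
        payload = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise InvalidKYCOutput(f"malformed JSON: {exc}") from exc

    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        raise InvalidKYCOutput(f"missing keys: {sorted(missing)}")

    status = payload["status"]
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise InvalidKYCOutput(f"unexpected status: {status!r}")

    for key in ("reasons", "risk_flags"):
        if not isinstance(payload[key], list):
            raise InvalidKYCOutput(f"{key} must be a list")
    return payload
```

A caller would typically catch `InvalidKYCOutput` and route the case to manual review instead of retrying silently.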
Production Considerations
- •Compliance controls
  - •Keep KYC decisions explainable and bounded.
  - •The agent should recommend outcomes; your policy engine should enforce final approval rules.
  - •For regulated lending flows, retain evidence of sanctions screening timing and source version.
- •Data residency
  - •Route applicant PII through region-specific infrastructure.
  - •If your lender operates across jurisdictions, make sure vendor calls do not move data outside approved regions.
  - •Mask sensitive fields in logs; store only hashes where possible.
- •Monitoring
  - •Track false positives on watchlist hits and manual-review rates.
  - •Alert when tool latency spikes or a vendor starts returning inconsistent responses.
  - •Log every run with applicant ID, policy version, tool versions, and final disposition.
- •Guardrails
  - •Block the agent from inventing missing documents or inferring identity from weak signals.
  - •Require human review for edge cases like transliterated names or cross-border applicants.
  - •Separate “KYC complete” from “credit eligible”; those are not the same decision.
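Masking sensitive fields and storing hashes in logs could be sketched as follows. The example uses a keyed hash (HMAC-SHA256) because an unkeyed hash of a short ID number can be brute-forced. The helper names, the record fields, and the idea that the key comes from a secrets manager are assumptions for illustration.

```python
import hashlib
import hmac
import json


def hash_pii(value: str, key: bytes) -> str:
    """Keyed hash so the same input maps to the same token without exposing it."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_id(id_number: str, visible: int = 4) -> str:
    """Show only the last few characters of an ID number."""
    if len(id_number) <= visible:
        return "*" * len(id_number)
    return "*" * (len(id_number) - visible) + id_number[-visible:]


def build_log_record(applicant: dict, disposition: str, policy_version: str, key: bytes) -> str:
    """Serialize a log line that carries no raw name or full ID number."""
    record = {
        "applicant_ref": hash_pii(applicant["id_number"], key),
        "name_hash": hash_pii(applicant["full_name"], key),
        "id_masked": mask_id(applicant["id_number"]),
        "country": applicant["country"],
        "policy_version": policy_version,
        "disposition": disposition,
    }
    return json.dumps(record, sort_keys=True)
```

Because the hash is deterministic for a given key, repeat applications from the same person can still be correlated in monitoring dashboards without storing the underlying value.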
Common Pitfalls
- •Letting the agent decide too much
  - •Mistake: asking CrewAI to both verify identity and approve credit risk.
  - •Fix: keep KYC limited to identity/compliance signals. Credit decisioning belongs to a separate policy layer.
- •Using free-form text outputs
  - •Mistake: accepting narrative responses that are hard to parse or audit.
  - •Fix: require structured keys like status, reasons, risk_flags, and next_action, and validate them before use.
- •Ignoring jurisdiction-specific rules
  - •Mistake: applying one global KYC flow across all borrowers.
  - •Fix: parameterize policies by country or region so sanctions handling, retention rules, and residency constraints match local requirements.
- •Not preserving evidence
  - •Mistake: storing only the final verdict.
  - •Fix: persist tool inputs/outputs, timestamps, model version information where a model is involved, and the exact policy snapshot that produced the decision.
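The evidence-preservation fix above could be sketched as a small record type that collects tool calls, timestamps, and a fingerprint of the policy in force. The class and field names are hypothetical, and the storage backend is out of scope. Inputs are stored as unkeyed SHA-256 fingerprints here for brevity; for PII inputs a keyed hash may be preferable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def utc_now() -> str:
    """Timezone-aware UTC timestamp in ISO 8601 format."""
    return datetime.now(timezone.utc).isoformat()


def fingerprint(obj) -> str:
    """SHA-256 over canonical JSON, so key order does not change the hash."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class ToolCall:
    tool: str
    input_hash: str
    output: str
    called_at: str


@dataclass
class EvidenceRecord:
    case_id: str
    policy_snapshot_hash: str
    tool_calls: list = field(default_factory=list)
    final_status: str = ""
    closed_at: str = ""

    def record_tool_call(self, tool: str, tool_input: dict, output: str) -> None:
        """Append one tool invocation with a fingerprint of its input."""
        self.tool_calls.append(ToolCall(tool, fingerprint(tool_input), output, utc_now()))

    def close(self, final_status: str) -> dict:
        """Stamp the final status and return a JSON-serializable dict."""
        self.final_status = final_status
        self.closed_at = utc_now()
        return asdict(self)
```

A run would create one `EvidenceRecord` per case, call `record_tool_call` after each check, and persist the dict returned by `close` alongside the policy document whose fingerprint it carries.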
By Cyprian Aarons, AI Consultant at Topiax.