How to Build a KYC Verification Agent Using CrewAI in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: kyc-verification, crewai, python, banking

A KYC verification agent automates the first pass of customer due diligence: it collects identity data, checks document consistency, flags sanctions/PEP risk, and produces an auditable decision packet for a human reviewer. For banking, this matters because onboarding speed is tied directly to conversion, but the control surface has to stay tight enough for compliance, auditability, and jurisdiction-specific data handling.

Architecture

A production KYC agent in banking usually needs these components:

  • Intake layer

    • Accepts customer-submitted data: name, DOB, address, ID number, document images, and source metadata.
    • Normalizes inputs before any LLM call.
  • Document extraction tool

    • Pulls structured fields from passports, national IDs, utility bills, and incorporation documents.
    • Can be backed by OCR or a document AI service.
  • Verification tools

    • Cross-checks identity fields against internal CRM/KYC records.
    • Calls sanctions/PEP/adverse media APIs and validates address/document consistency.
  • Risk scoring and decisioning

    • Produces a risk summary: low, medium, high.
    • Distinguishes between auto-approve, manual review, and reject.
  • Audit trail store

    • Persists every input, tool result, model output, and final recommendation.
    • Required for regulator review and internal model governance.
  • Human review handoff

    • Escalates edge cases to compliance analysts with a concise evidence packet.
    • Keeps the agent advisory-only for regulated decisions.
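The intake layer's normalization step can be a small pure function that canonicalizes fields before anything touches an LLM or tool. A minimal sketch, assuming the field names used later in this guide (the accepted date formats are illustrative):

```python
from datetime import datetime

def normalize_intake(raw: dict) -> dict:
    """Canonicalize customer-submitted fields before any LLM or tool call."""
    normalized = {
        "customer_name": " ".join(raw.get("customer_name", "").split()).title(),
        "country": raw.get("country", "").strip().upper(),
        "id_number": raw.get("id_number", "").replace(" ", "").upper(),
        "address": " ".join(raw.get("address", "").split()),
    }
    # Parse DOB into ISO-8601 so downstream checks compare dates, not strings.
    dob = raw.get("dob", "").strip()
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y"):
        try:
            normalized["dob"] = datetime.strptime(dob, fmt).date().isoformat()
            break
        except ValueError:
            continue
    else:
        normalized["dob"] = None  # flag for correction rather than guessing
    return normalized
```

Doing this in plain code, not in a prompt, means the same input always normalizes the same way, which is exactly what an auditor will want to see.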

Implementation

1) Install CrewAI and define your tools

For banking workflows, keep the LLM away from raw policy logic. Use CrewAI agents to orchestrate reasoning and use tools for deterministic checks.

pip install crewai crewai-tools pydantic python-dotenv

Here’s a minimal but realistic setup with tools that represent common KYC checks:

from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
from pydantic import BaseModel

# Intake schema -- validate the customer payload before any agent sees it.
class KYCInput(BaseModel):
    customer_name: str
    dob: str
    country: str
    id_number: str
    address: str

class SanctionsCheckTool(BaseTool):
    name: str = "sanctions_check"
    description: str = "Check customer against sanctions/PEP screening service"

    def _run(self, customer_name: str) -> str:
        # Replace with real API call
        if customer_name.lower() in ["john doe", "test user"]:
            return "MATCH_FOUND: possible sanctions/pep hit"
        return "NO_MATCH"

class AddressConsistencyTool(BaseTool):
    name: str = "address_consistency"
    description: str = "Validate address format and consistency against submitted documents"

    def _run(self, address: str) -> str:
        if len(address) < 10:
            return "INVALID_ADDRESS"
        return "ADDRESS_OK"

class IdFormatTool(BaseTool):
    name: str = "id_format_check"
    description: str = "Validate ID number format by country"

    def _run(self, country: str, id_number: str) -> str:
        if country.upper() == "GB" and not id_number.startswith("GB"):
            return "INVALID_ID_FORMAT"
        return "ID_FORMAT_OK"
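In practice the ID-format logic belongs in a per-country regex table rather than hard-coded branches inside the tool; unknown countries should escalate, not silently pass. A sketch of that table (the patterns are illustrative placeholders, not real national ID specifications):

```python
import re

# Illustrative patterns only -- real formats come from each jurisdiction's spec.
ID_PATTERNS = {
    "GB": re.compile(r"^GB\d{8}$"),
    "DE": re.compile(r"^[A-Z0-9]{9}$"),
}

def check_id_format(country: str, id_number: str) -> str:
    pattern = ID_PATTERNS.get(country.upper())
    if pattern is None:
        # Escalate unsupported markets instead of defaulting to OK.
        return "UNSUPPORTED_COUNTRY"
    if pattern.fullmatch(id_number):
        return "ID_FORMAT_OK"
    return "INVALID_ID_FORMAT"
```

`IdFormatTool._run` can then delegate to `check_id_format`, keeping the rule table in one reviewable place.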

2) Create specialized agents for compliance-friendly separation of duties

Don’t make one agent do everything. Split verification from decisioning so your audit trail is easier to defend.

verification_agent = Agent(
    role="KYC Verification Analyst",
    goal="Verify identity attributes and produce structured findings",
    backstory="You validate customer identity data using approved banking controls.",
    tools=[SanctionsCheckTool(), AddressConsistencyTool(), IdFormatTool()],
    verbose=True,
)

compliance_agent = Agent(
    role="KYC Compliance Reviewer",
    goal="Assess verification findings and recommend approve, review, or reject",
    backstory="You apply bank policy conservatively and escalate uncertain cases.",
    verbose=True,
)

3) Define tasks that produce evidence-first outputs

The key pattern is to ask the verification agent for structured findings, then have the compliance agent interpret them. In production you’d persist both outputs as part of your case record.

kyc_verification_task = Task(
    description=(
        "Review this customer profile for KYC checks:\n"
        "{customer_name}, {dob}, {country}, {id_number}, {address}\n\n"
        "Run sanctions screening logic, check ID format consistency, "
        "and validate address quality. Return concise findings with risk flags."
    ),
    expected_output="A structured summary of findings with explicit risk flags.",
    agent=verification_agent,
)

kyc_decision_task = Task(
    description=(
        "Using the verification findings from the previous task, recommend one of:\n"
        "- APPROVE\n- MANUAL_REVIEW\n- REJECT\n\n"
        "Explain the reason in banking/compliance terms."
    ),
    expected_output="A final KYC recommendation with rationale.",
    agent=compliance_agent,
    # Feed the verification task's output in as context rather than a
    # hand-wired template placeholder, so the evidence chain stays explicit.
    context=[kyc_verification_task],
)

4) Run the crew and capture the result for audit

This is where CrewAI’s Crew orchestration comes in. In banking you want deterministic logging around inputs and outputs before anything reaches downstream systems.

def run_kyc_case(customer_profile: dict):
    crew = Crew(
        agents=[verification_agent, compliance_agent],
        tasks=[kyc_verification_task, kyc_decision_task],
        verbose=True,
    )

    result = crew.kickoff(inputs={
        "customer_name": customer_profile["customer_name"],
        "dob": customer_profile["dob"],
        "country": customer_profile["country"],
        "id_number": customer_profile["id_number"],
        "address": customer_profile["address"],
    })

    return result


if __name__ == "__main__":
    profile = {
        "customer_name": "Jane Smith",
        "dob": "1990-03-14",
        "country": "GB",
        "id_number": "GB12345678",
        "address": "12 Baker Street London NW1 6XE",
    }

    output = run_kyc_case(profile)
    print(output)
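Before the crew's output reaches any downstream system, freeze it into a case record alongside its inputs. A minimal sketch of that record builder, with hypothetical field names and an assumed JSON-serializable inputs dict:

```python
import hashlib
import json
from datetime import datetime, timezone

def build_case_record(case_id: str, inputs: dict, findings: str,
                      recommendation: str) -> dict:
    """Assemble an evidence-first case record for persistence."""
    record = {
        "case_id": case_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "inputs": inputs,
        "verification_findings": findings,
        "recommendation": recommendation,
    }
    # Content hash lets a reviewer later prove the record was not altered.
    payload = json.dumps(record, sort_keys=True).encode()
    record["sha256"] = hashlib.sha256(payload).hexdigest()
    return record
```

In `run_kyc_case` you would call this with the kickoff inputs plus both task outputs and write the result to your case store before returning.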

Production Considerations

  • Keep regulated decisions human-in-the-loop

    • The agent should recommend APPROVE, MANUAL_REVIEW, or REJECT, but final approval should sit with policy-controlled systems or analysts.
    • This is important when regulators ask who made the decision and why.
  • Log everything needed for audit

    • Store input payloads, tool responses, timestamps, model version, prompt version, and final recommendation.
    • Make logs immutable or append-only so they can support investigations later.
  • Respect data residency

    • If your bank operates across regions, route EU customer data to EU-hosted infrastructure only.
    • Don’t send PII to third-party services unless legal review has cleared transfer terms and retention rules.
  • Add hard guardrails around PII

    • Redact unnecessary fields before LLM calls.
    • Use schema validation on all tool inputs/outputs so malformed data doesn’t become a compliance incident.
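One way to make the audit trail tamper-evident, not just append-only by convention, is to hash-chain each entry to its predecessor so any retroactive edit breaks verification. A minimal in-memory sketch (the storage backend and event fields are assumptions; production would write to append-only storage):

```python
import hashlib
import json

GENESIS = "0" * 64

class AuditLog:
    """Hash-chained log: each entry commits to everything before it."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        self.entries.append({"event": event,
                             "prev_hash": prev_hash,
                             "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        prev = GENESIS
        for e in self.entries:
            body = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if e["prev_hash"] != prev or e["entry_hash"] != expected:
                return False
            prev = e["entry_hash"]
        return True
```

Every tool response, model output, and recommendation gets appended as an event, and `verify()` can run as part of any investigation.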

Common Pitfalls

  • Letting the LLM decide without deterministic checks

    • Mistake: asking the model to “figure out if this is valid” with no tool-backed verification.
    • Avoid it by moving sanctions screening, ID format validation, and address rules into tools.
  • No separation between verification and approval

    • Mistake: one agent both checks facts and approves onboarding.
    • Avoid it by splitting responsibilities into a verification agent and a compliance reviewer so your control flow mirrors banking policy.
  • Ignoring jurisdiction-specific requirements

    • Mistake: using one global KYC prompt for every market.
    • Avoid it by parameterizing rules per country or business line; residency rules in one region may differ from recordkeeping obligations in another.
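Parameterizing per-market rules can start as a simple lookup the tasks and tools consult, with a conservative default for unlisted markets. The values here are placeholders for illustration, not actual regulatory figures:

```python
# Illustrative per-market rule table -- real values come from compliance.
COUNTRY_RULES = {
    "GB": {"min_age": 18, "retention_years": 5, "residency_region": "UK"},
    "DE": {"min_age": 18, "retention_years": 10, "residency_region": "EU"},
}
DEFAULT_RULES = {"min_age": 18, "retention_years": 7, "residency_region": "GLOBAL"}

def rules_for(country: str) -> dict:
    """Return the rule set for a market, falling back to conservative defaults."""
    return COUNTRY_RULES.get(country.upper(), DEFAULT_RULES)
```

Prompts then interpolate from this table instead of baking jurisdiction-specific numbers into one global KYC prompt.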

If you want this to hold up in a real bank environment, treat CrewAI as orchestration only. Put policy enforcement in code, keep evidence structured from day one, and make every recommendation traceable back to a source check.


By Cyprian Aarons, AI Consultant at Topiax.
