How to Build a KYC Verification Agent Using AutoGen in TypeScript for Investment Banking

By Cyprian Aarons. Updated 2026-04-21.

A KYC verification agent for investment banking automates the first pass on client onboarding: it extracts identity data, checks completeness, flags inconsistencies, and routes risky cases to human compliance analysts. That matters because onboarding delays kill deal velocity, while missing a sanctions hit, a beneficial ownership issue, or a source-of-funds gap creates regulatory exposure.

Architecture

A production KYC agent for investment banking needs these components:

• Document intake layer
  - Receives passports, corporate registries, proof of address, UBO charts, and source-of-funds docs.
  - Normalizes PDFs, images, and extracted text into a single case payload.
• Agent orchestration layer
  - Uses AutoGen to coordinate specialist agents:
    - document parser
    - sanctions/PEP checker
    - risk assessor
    - compliance reviewer
  - Keeps the workflow deterministic enough for audit.
• Policy and rules engine
  - Encodes bank-specific KYC thresholds.
  - Applies jurisdiction rules, entity-type rules, and escalation criteria.
• Evidence store
  - Persists extracted fields, model outputs, timestamps, and source references.
  - Supports audit replay and regulator review.
• Human-in-the-loop review queue
  - Routes ambiguous or high-risk cases to a compliance officer.
  - Captures reviewer decisions as structured feedback.
• Monitoring and controls
  - Tracks false positives, turnaround time, refusal reasons, and model drift.
  - Enforces data residency and access boundaries.
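
The completeness check handled by the policy and rules engine can stay entirely outside the model. Here is a minimal sketch, assuming a hypothetical `REQUIRED_DOCS` table and `missingDocuments` helper; the per-client-type requirements are illustrative placeholders, not a regulatory standard.

```typescript
// Document and client types mirror the case schema defined in the Implementation section.
type DocType = "passport" | "incorporation" | "ubo" | "address" | "sow";
type ClientType = "individual" | "corporate";

// Hypothetical requirements table: replace with your bank's actual policy.
const REQUIRED_DOCS: Record<ClientType, DocType[]> = {
  individual: ["passport", "address", "sow"],
  corporate: ["incorporation", "ubo", "address", "sow"],
};

// Returns the required document types that are absent from the case,
// in the order the policy table lists them.
function missingDocuments(clientType: ClientType, provided: DocType[]): DocType[] {
  const have = new Set<DocType>(provided);
  return REQUIRED_DOCS[clientType].filter((doc) => !have.has(doc));
}

const gaps = missingDocuments("corporate", ["incorporation", "address"]);
console.log(gaps); // this corporate case lacks "ubo" and "sow"
```

Because the table is plain data, compliance can review and version it without touching prompts.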

Implementation

1. Install AutoGen and define the case schema

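Install an AutoGen package for TypeScript and keep your KYC payload typed. AutoGen is developed primarily for Python and .NET, so confirm that the package version you install exports the classes used in the following steps. For investment banking, your schema should also preserve source references so every extracted field can be traced back to a document page or OCR chunk; the minimal schema here carries only each document's name, type, and text.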

npm install @autogenai/autogen openai zod

import { z } from "zod";

export const KycCaseSchema = z.object({
  caseId: z.string(),
  clientType: z.enum(["individual", "corporate"]),
  jurisdiction: z.string(),
  documents: z.array(
    z.object({
      name: z.string(),
      type: z.enum(["passport", "incorporation", "ubo", "address", "sow"]),
      text: z.string()
    })
  ),
});

export type KycCase = z.infer<typeof KycCaseSchema>;
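
The schema above holds document text but no citations. As a rough sketch of how extracted fields could carry source references, assuming hypothetical `SourceRef` and `ExtractedField` shapes and an `untracedFields` helper (none of these come from AutoGen or Zod):

```typescript
// Hypothetical citation back to the originating document.
interface SourceRef {
  documentName: string; // matches `name` in the case payload
  page?: number;        // 1-based page number, when known
  chunkId?: string;     // OCR chunk identifier, when known
}

interface ExtractedField {
  field: string;        // e.g. "fullName"
  value: string;
  sources: SourceRef[]; // empty means the value cannot be traced
}

// A field counts as traced only if at least one citation names a document
// and pins the value to a page or an OCR chunk.
function untracedFields(fields: ExtractedField[]): string[] {
  return fields
    .filter(
      (f) =>
        !f.sources.some(
          (s) =>
            s.documentName.length > 0 &&
            (s.page !== undefined || s.chunkId !== undefined)
        )
    )
    .map((f) => f.field);
}

const extracted: ExtractedField[] = [
  { field: "fullName", value: "Jane Example", sources: [{ documentName: "passport.pdf", page: 1 }] },
  { field: "address", value: "1 Example Street", sources: [] },
];
console.log(untracedFields(extracted)); // only "address" lacks a usable citation
```

A case with any untraced field is a reasonable candidate for escalation rather than approval.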

2. Create specialist agents with `AssistantAgent`

The pattern here is simple: one agent extracts facts, another evaluates risk, and a third produces the compliance memo. In banking workflows you want narrow prompts and explicit outputs; do not let one generic agent “decide everything.”

import { AssistantAgent } from "@autogenai/autogen";

// Shared model settings for all three specialist agents.
const llmConfig = {
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
};

export const documentAgent = new AssistantAgent({
  name: "document_agent",
  llmConfig,
  systemMessage:
    "Extract KYC facts only. Return JSON with fields found, missing items, and source citations."
});

export const riskAgent = new AssistantAgent({
  name: "risk_agent",
  llmConfig,
  systemMessage:
    "Assess KYC risk for investment banking onboarding. Focus on sanctions exposure, PEP indicators, UBO opacity, adverse media signals, jurisdiction risk, and source-of-funds gaps."
});

export const complianceAgent = new AssistantAgent({
  name: "compliance_agent",
  llmConfig,
  systemMessage:
    "Write a concise compliance recommendation. If evidence is insufficient or risk is high, recommend escalation to human review."
});
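
The document agent is asked for JSON, but model output is still text and can arrive malformed or wrapped in commentary. A minimal sketch of defensive parsing follows; the `Extraction` shape and its `fields` and `missing` keys are assumptions about how you word the prompt, not something AutoGen guarantees.

```typescript
// Hypothetical shape of the document agent's JSON reply.
interface Extraction {
  fields: Record<string, string>;
  missing: string[];
}

type ParseResult =
  | { ok: true; value: Extraction }
  | { ok: false; error: string };

function isStringRecord(x: unknown): x is Record<string, string> {
  return (
    typeof x === "object" && x !== null && !Array.isArray(x) &&
    Object.values(x).every((v) => typeof v === "string")
  );
}

function parseExtraction(raw: string): ParseResult {
  // Take the outermost {...} span so surrounding prose or code fences are ignored.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return { ok: false, error: "no JSON object found" };

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "expected a JSON object" };
  }

  const obj = data as Record<string, unknown>;
  const fields = obj["fields"];
  const missing = obj["missing"];
  if (!isStringRecord(fields)) {
    return { ok: false, error: "`fields` must map names to strings" };
  }
  if (!Array.isArray(missing) || !missing.every((m) => typeof m === "string")) {
    return { ok: false, error: "`missing` must be an array of strings" };
  }
  return { ok: true, value: { fields, missing: missing as string[] } };
}
```

Treat any `ok: false` result as a reason to escalate the case instead of retrying silently.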

3. Orchestrate the workflow with `GroupChat` and `GroupChatManager`

AutoGen’s group chat pattern works well when you want each agent to contribute in sequence without building brittle handoffs yourself. The manager keeps the conversation state while you enforce a fixed decision path.

import { GroupChat, GroupChatManager } from "@autogenai/autogen";
import { documentAgent, riskAgent, complianceAgent } from "./agents";
import { KycCaseSchema } from "./schema";

async function runKyc(caseData: unknown) {
  const parsed = KycCaseSchema.parse(caseData);

  const groupChat = new GroupChat({
    agents: [documentAgent, riskAgent, complianceAgent],
    messages: [],
    maxRounds: 6,
    speakerSelectionMethod: "round_robin"
  });

  const manager = new GroupChatManager({
    groupchat: groupChat,
    llmConfig: {
      model: "gpt-4o-mini",
      apiKey: process.env.OPENAI_API_KEY!
    }
  });

  const prompt = `
KYC case ${parsed.caseId}
Client type: ${parsed.clientType}
Jurisdiction: ${parsed.jurisdiction}

Documents:
${parsed.documents.map(d => `- ${d.type}: ${d.name}\n${d.text}`).join("\n\n")}

Task:
1) Extract verified identity fields
2) Identify missing documents or inconsistencies
3) Assess AML/KYC risk
4) Recommend approve / escalate / reject with reason
`;

  const result = await manager.run(prompt);
  return result;
}
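
If you need a stricter ordering guarantee than a chat manager gives you, a plain sequential runner is one option. This is a framework-agnostic sketch: `Step` and `runFixedPath` are hypothetical names, each step stands in for one agent call, and nothing here uses the AutoGen API.

```typescript
// A step receives the transcript so far and returns its contribution.
type Step = { name: string; run: (transcript: string[]) => Promise<string> };

interface PipelineResult {
  transcript: string[];
  failedStep?: string; // set when a step threw; the case should then go to a human
}

async function runFixedPath(prompt: string, steps: Step[]): Promise<PipelineResult> {
  const transcript: string[] = [`user: ${prompt}`];
  for (const step of steps) {
    try {
      // Pass a copy so a step cannot rewrite earlier messages.
      const output = await step.run([...transcript]);
      transcript.push(`${step.name}: ${output}`);
    } catch {
      // Stop at the first failure instead of letting later steps reason over a gap.
      return { transcript, failedStep: step.name };
    }
  }
  return { transcript };
}

// Stub steps for illustration; real ones would call the agents defined earlier.
const demoSteps: Step[] = [
  { name: "document_agent", run: async () => "extracted 2 fields" },
  { name: "risk_agent", run: async (t) => `reviewed ${t.length} prior messages` },
];
runFixedPath("KYC case 123", demoSteps).then((r) => console.log(r.transcript));
```

The order of `steps` is the decision path, so it can be reviewed and tested like any other code.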

4. Add deterministic guardrails before any analyst sees the output

For investment banking you should not rely on the model alone. Apply hard business rules after the agent output so your control framework is auditable.

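type Decision = "approve" | "escalate" | "reject";

// Keyword rules over the agent's free-text output, checked from most to least severe.
// Substring checks cannot see negation ("no sanctions match" still rejects),
// so they err toward rejecting or escalating.
function applyKycPolicy(output: string): Decision {
  const normalized = output.toLowerCase();

  if (normalized.includes("sanctions match")) return "reject";
  // Match "pep" or "peps" as a whole word so it cannot fire inside another word.
  if (/\bpeps?\b/.test(normalized) || normalized.includes("ubo opaque")) return "escalate";
  if (normalized.includes("missing") && normalized.includes("source of funds")) return "escalate";

  // No rule fired: the case falls through to approval.
  return "approve";
}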
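
Keyword matching on free text is brittle, and falling through to "approve" means a garbled response can pass. One alternative is to have the risk agent return structured flags and to fail closed. This is a sketch only: the `RiskFlags` shape and `decide` function are hypothetical, and the rules mirror the ones above rather than any regulatory standard.

```typescript
type Decision = "approve" | "escalate" | "reject";

// Hypothetical structured output the risk agent would be prompted to return.
// null means the agent could not determine the answer.
interface RiskFlags {
  sanctionsMatch: boolean | null;
  pepIndicator: boolean | null;
  uboOpaque: boolean | null;
  sourceOfFundsMissing: boolean | null;
}

function decide(flags: RiskFlags): Decision {
  if (flags.sanctionsMatch === true) return "reject";

  const values = [
    flags.sanctionsMatch,
    flags.pepIndicator,
    flags.uboOpaque,
    flags.sourceOfFundsMissing,
  ];
  // Fail closed: any raised or undetermined flag goes to a human.
  if (values.some((v) => v !== false)) return "escalate";

  // Approval requires every flag to be explicitly false.
  return "approve";
}
```

With this shape, approval has to be earned by explicit negatives, and an undetermined answer can never approve a client.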

Production Considerations

• Auditability
  - Persist every prompt, response, model version, timestamp, and source citation.
  - Store an immutable review trail so internal audit can reconstruct why a case was escalated.
• Data residency
  - Keep client documents in-region if your bank has jurisdictional constraints.
  - Do not send raw PII across regions or into non-approved processing environments.
• Human override
  - Any sanctions ambiguity, UBO complexity, or adverse media hit should route to a compliance analyst.
  - The agent should recommend; it should not auto-onboard high-risk clients.
• Monitoring
  - Track escalation rate by desk, region, entity type, and reviewer outcome.
    - A spike in false positives usually means your prompts or policy thresholds are too broad.
    - A drop in escalations can be worse than noise if it hides missed risk.
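
The monitoring numbers above can be computed from the review queue's records. Here is a rough sketch, assuming a hypothetical `ReviewedCase` record and treating "agent escalated, reviewer approved" as a proxy for a false positive; your own definition may differ.

```typescript
type Outcome = "approve" | "escalate" | "reject";

// Hypothetical record joining the agent's decision with the reviewer's.
interface ReviewedCase {
  region: string;
  agentDecision: Outcome;
  reviewerDecision?: Outcome; // present once a human has reviewed the case
}

interface RegionStats {
  total: number;
  escalated: number;
  overturned: number; // escalations a reviewer later approved
}

function statsByRegion(cases: ReviewedCase[]): Map<string, RegionStats> {
  const stats = new Map<string, RegionStats>();
  for (const c of cases) {
    const s = stats.get(c.region) ?? { total: 0, escalated: 0, overturned: 0 };
    s.total += 1;
    if (c.agentDecision === "escalate") {
      s.escalated += 1;
      if (c.reviewerDecision === "approve") s.overturned += 1;
    }
    stats.set(c.region, s);
  }
  return stats;
}

const sample: ReviewedCase[] = [
  { region: "EU", agentDecision: "escalate", reviewerDecision: "approve" },
  { region: "EU", agentDecision: "approve" },
  { region: "EU", agentDecision: "escalate", reviewerDecision: "escalate" },
  { region: "APAC", agentDecision: "reject" },
];
for (const [region, s] of statsByRegion(sample)) {
  console.log(region, `escalation rate ${s.escalated}/${s.total}`, `overturned ${s.overturned}`);
}
```

The same grouping works for desk or entity type by swapping the key.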

Common Pitfalls

• Treating extraction as approval
  - Extraction is not due diligence.
  - Fix this by separating fact extraction from policy decisioning and requiring deterministic rule checks before approval.
• Letting one agent make every call
  - A single general-purpose agent will blur parsing, risk scoring, and recommendation.
  - Fix this with specialized agents and a fixed orchestration flow using GroupChatManager.
• Ignoring source traceability
  - If you cannot show where a field came from, you cannot defend the onboarding decision during audit.
  - Fix this by storing citations for every extracted value and linking them to original document fragments.
• Skipping jurisdiction controls
  - Banking KYC is not globally uniform; residency rules and regulatory expectations vary by region.
  - Fix this by encoding country-specific policy layers outside the model and applying them after generation.
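
One way to apply a country-specific layer after generation is a floor per jurisdiction that can only tighten the outcome. In this sketch the `JURISDICTION_FLOOR` table, its placeholder codes (`AA`, `XX`, `YY`), and the default for unlisted jurisdictions are all assumptions for illustration.

```typescript
type Decision = "approve" | "escalate" | "reject";

const SEVERITY: Record<Decision, number> = { approve: 0, escalate: 1, reject: 2 };

// Hypothetical policy table: the minimum outcome for any case from that jurisdiction.
const JURISDICTION_FLOOR = new Map<string, Decision>([
  ["AA", "approve"],
  ["XX", "escalate"],
  ["YY", "reject"],
]);

// Returns the more severe of the model-derived decision and the jurisdiction floor.
// Jurisdictions missing from the table escalate by default.
function applyJurisdictionFloor(decision: Decision, jurisdiction: string): Decision {
  const floor = JURISDICTION_FLOOR.get(jurisdiction) ?? "escalate";
  return SEVERITY[floor] > SEVERITY[decision] ? floor : decision;
}

console.log(applyJurisdictionFloor("approve", "XX")); // the floor tightens this to "escalate"
```

Because the layer never loosens a decision, it is safe to run last in the pipeline.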

By Cyprian Aarons, AI Consultant at Topiax.
