How to Build a KYC Verification Agent Using LangChain in TypeScript for Investment Banking
A KYC verification agent for investment banking automates the first pass of client due diligence: it collects identity data, checks it against policy, extracts risk signals from documents, and produces an auditable recommendation for compliance teams. It matters because onboarding speed is useless if you cannot prove why a client was accepted, rejected, or escalated.
Architecture
- Intake layer
  - Accepts structured client data plus uploaded documents like passports, incorporation certificates, proof of address, and beneficial ownership declarations.
- Document extraction layer
  - Uses OCR or text extraction upstream, then LangChain to normalize content into a consistent schema.
- Policy reasoning layer
  - Applies bank-specific KYC rules: jurisdiction restrictions, PEP/sanctions flags, UBO thresholds, source-of-funds requirements.
- LLM orchestration layer
  - Uses LangChain's `ChatOpenAI`, `PromptTemplate`, and `StructuredOutputParser` to turn raw evidence into a deterministic review object.
- Audit and trace layer
  - Stores prompts, model outputs, extracted fields, and decision rationale for compliance review and internal audit.
- Escalation layer
  - Routes cases to human analysts when confidence is low or when policy requires manual approval.
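As a rough sketch, the intake and escalation layers can be modeled with a few TypeScript types. Every name here is illustrative, not a prescribed interface:

```typescript
// Illustrative shapes for the intake layer; field names are assumptions for this sketch.
interface UploadedDocument {
  kind: "passport" | "incorporation_certificate" | "proof_of_address" | "ubo_declaration";
  extractedText: string; // populated by the upstream OCR/extraction layer
}

interface ClientIntake {
  clientId: string;
  structuredData: Record<string, string>;
  documents: UploadedDocument[];
}

// The escalation layer routes low-confidence cases to a human analyst.
// The 0.8 threshold is an arbitrary example, not a recommendation.
function needsAnalyst(confidence: number, policyRequiresManual: boolean): boolean {
  return confidence < 0.8 || policyRequiresManual;
}
```

The important design point is that escalation is a plain boolean function over explicit inputs, so the routing decision is testable without any model in the loop.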
Implementation
1. Define the KYC output schema
For investment banking, do not let the model return free-form prose. Force structured output so downstream systems can make deterministic decisions.
```typescript
import { z } from "zod";

export const KycReviewSchema = z.object({
  customerName: z.string(),
  entityType: z.enum(["individual", "company"]),
  jurisdiction: z.string(),
  pepRisk: z.enum(["low", "medium", "high"]),
  sanctionsRisk: z.enum(["clear", "possible_match", "match"]),
  uboVerified: z.boolean(),
  sourceOfFundsRequired: z.boolean(),
  recommendation: z.enum(["approve", "manual_review", "reject"]),
  rationale: z.array(z.string()),
});

export type KycReview = z.infer<typeof KycReviewSchema>;
```
This schema becomes your contract with compliance systems. If the model cannot fit the result into this shape, you fail closed and escalate.
2. Build the LangChain chain in TypeScript
Use `ChatOpenAI` with structured parsing. This pattern is predictable and easy to audit.
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { StructuredOutputParser } from "@langchain/core/output_parsers";
import { RunnableSequence } from "@langchain/core/runnables";
import { KycReviewSchema } from "./schema";

const parser = StructuredOutputParser.fromZodSchema(KycReviewSchema);

const prompt = PromptTemplate.fromTemplate(`
You are a KYC analyst for an investment bank.
Use only the provided evidence. Do not invent facts.
Apply conservative judgment. If data is incomplete, recommend manual_review.

Evidence:
{evidence}

{format_instructions}
`);

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

export const kycChain = RunnableSequence.from([
  {
    evidence: (input: { evidence: string }) => input.evidence,
    format_instructions: () => parser.getFormatInstructions(),
  },
  prompt,
  model,
]);
```
This is the core pattern:
- `temperature: 0` for stable outputs
- `StructuredOutputParser` to enforce shape
- `RunnableSequence` to keep the flow explicit
3. Add a review function with fail-closed behavior
The agent should never auto-approve on malformed output. In banking, malformed equals escalated.
```typescript
import { KycReview, KycReviewSchema } from "./schema";
import { kycChain } from "./chain";

// Conservative defaults returned whenever the model output cannot be trusted.
const FAIL_CLOSED_REVIEW: KycReview = {
  customerName: "unknown",
  entityType: "individual",
  jurisdiction: "unknown",
  pepRisk: "high",
  sanctionsRisk: "possible_match",
  uboVerified: false,
  sourceOfFundsRequired: true,
  recommendation: "manual_review",
  rationale: ["Output failed schema validation; escalated for analyst review."],
};

export async function runKycReview(evidence: string): Promise<KycReview> {
  const raw = await kycChain.invoke({ evidence });

  // The chain ends at the model, so the result is a chat message; pull out its text
  // rather than stringifying the whole message object.
  const text =
    typeof raw === "string" ? raw : String((raw as { content: unknown }).content);

  // Models sometimes wrap JSON in markdown fences despite the format instructions.
  const jsonText = text.replace(/^```(?:json)?\s*/i, "").replace(/\s*```$/, "").trim();

  try {
    return KycReviewSchema.parse(JSON.parse(jsonText));
  } catch (error) {
    // Fail closed: malformed JSON and off-schema objects both escalate.
    // Log `text` and `error` here so compliance can see why the case escalated.
    return FAIL_CLOSED_REVIEW;
  }
}
```
In practice you would log both the raw model response and the validation error. That gives compliance teams a clear audit trail when they ask why a case was escalated.
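One way to capture that trail is a small audit record built once per run. The field names below are an assumption for this sketch, not a LangChain API:

```typescript
import { createHash } from "node:crypto";

// Illustrative audit record; adapt field names to your compliance system.
interface KycAuditRecord {
  caseId: string;
  timestamp: string;
  modelName: string;
  promptHash: string;
  rawOutput: string;
  validationError?: string;
  recommendation?: string;
}

export function buildAuditRecord(
  caseId: string,
  modelName: string,
  prompt: string,
  rawOutput: string,
  validationError?: string,
  recommendation?: string,
): KycAuditRecord {
  return {
    caseId,
    timestamp: new Date().toISOString(),
    modelName,
    // Hash the rendered prompt so you can prove which template version ran
    // without duplicating PII into the audit store.
    promptHash: createHash("sha256").update(prompt).digest("hex"),
    rawOutput,
    validationError,
    recommendation,
  };
}
```

Persisting one of these per invocation gives you the "why was this escalated?" answer months later.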
4. Feed the agent real KYC evidence
The agent works best when you pre-normalize upstream document text into one evidence blob or a typed object serialized to text.
```typescript
const evidence = `
Customer Name: Acme Capital Ltd
Entity Type: Company
Jurisdiction: Cayman Islands
UBO Disclosure: One UBO at 35%, two nominees listed
PEP Screening Result: No direct match
Sanctions Screening Result: Potential fuzzy match on director name
Source of Funds Document: Not provided
`;

runKycReview(evidence).then(console.log);
```
For production onboarding flows, this evidence usually comes from OCR, entity resolution, sanctions screening APIs, and your internal client master data.
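If you take the typed-object route, a small serializer keeps the blob format consistent across cases. The `KycEvidence` shape below is illustrative, mirroring the example evidence:

```typescript
// Illustrative evidence shape; extend with whatever your upstream systems emit.
interface KycEvidence {
  customerName: string;
  entityType: "Individual" | "Company";
  jurisdiction: string;
  uboDisclosure: string;
  pepScreeningResult: string;
  sanctionsScreeningResult: string;
  sourceOfFundsDocument: string | null;
}

// Serialize to the same line-per-fact format the prompt expects.
export function serializeEvidence(e: KycEvidence): string {
  return [
    `Customer Name: ${e.customerName}`,
    `Entity Type: ${e.entityType}`,
    `Jurisdiction: ${e.jurisdiction}`,
    `UBO Disclosure: ${e.uboDisclosure}`,
    `PEP Screening Result: ${e.pepScreeningResult}`,
    `Sanctions Screening Result: ${e.sanctionsScreeningResult}`,
    `Source of Funds Document: ${e.sourceOfFundsDocument ?? "Not provided"}`,
  ].join("\n");
}
```

A typed input also makes "Not provided" an explicit `null` rather than a magic string scattered through your pipeline.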
Production Considerations
- Auditability
  - Persist prompts, inputs, model version, output JSON, and validation failures.
  - For investment banking audits, you need to show why a case was approved or escalated months later.
- Data residency
  - Keep client PII in-region where required by policy or regulation.
  - If your bank operates across EMEA, APAC, and US regions, route requests to region-specific deployments and storage.
- Guardrails
  - Never let the LLM make final approval decisions on its own.
  - Use hard policy checks for sanctions lists, restricted jurisdictions, and mandatory source-of-funds rules before any recommendation is accepted.
- Monitoring
  - Track schema failure rate, manual-review rate, false positive rates on PEP/sanctions references, and latency per onboarding case.
  - A spike in manual reviews often means your prompts are drifting or upstream OCR quality has degraded.
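The guardrails point can be made concrete: keep hard rules in code and apply them after the model runs, so the LLM recommendation can only ever be tightened, never loosened. A minimal sketch, where the jurisdiction list and rules are examples only:

```typescript
type Recommendation = "approve" | "manual_review" | "reject";

// Illustrative hard rules kept outside the model; populate from your policy system.
const RESTRICTED_JURISDICTIONS = new Set(["Sanctioned Country A", "Sanctioned Country B"]);

interface PolicyInput {
  jurisdiction: string;
  sanctionsRisk: "clear" | "possible_match" | "match";
  sourceOfFundsRequired: boolean;
  sourceOfFundsProvided: boolean;
}

// Hard checks run last and can only downgrade the LLM's recommendation.
export function applyHardPolicy(
  llmRecommendation: Recommendation,
  input: PolicyInput,
): Recommendation {
  if (RESTRICTED_JURISDICTIONS.has(input.jurisdiction)) return "reject";
  if (input.sanctionsRisk === "match") return "reject";
  if (input.sanctionsRisk === "possible_match") return "manual_review";
  if (input.sourceOfFundsRequired && !input.sourceOfFundsProvided) return "manual_review";
  return llmRecommendation;
}
```

Because these rules are plain code, they are unit-testable and auditable independently of any prompt or model version.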
Common Pitfalls
- Using free-form LLM output in production
  - This creates brittle downstream logic and weak auditability.
  - Fix it by enforcing Zod validation and rejecting anything outside your schema.
- Letting the model infer missing compliance facts
  - If source-of-funds is missing or ownership is unclear, the correct answer is escalation.
  - Fix it with explicit instructions to use `manual_review` whenever evidence is incomplete.
- Ignoring jurisdiction-specific policy
  - A rule that works for one booking center may violate another region's onboarding requirements.
  - Fix it by injecting region-specific policy text into the prompt and keeping hard rules outside the model in code.
- Skipping trace storage
  - Without stored inputs and outputs, your compliance team cannot reconstruct decisions.
  - Fix it by logging every run with case ID, timestamp, model name, prompt hash, and final recommendation.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.