How to Build a KYC Verification Agent Using LangChain in TypeScript for Insurance

By Cyprian Aarons, updated 2026-04-21


A KYC verification agent for insurance collects customer identity data, checks it against policy and compliance rules, flags missing or suspicious information, and produces an auditable decision trail. It matters because insurers need to reduce onboarding friction without weakening AML, sanctions, fraud, and regulatory controls.

Architecture

Build this agent as a small workflow, not a single prompt.

• Input normalization layer
  - Accepts applicant data from web forms, broker uploads, or CRM payloads
  - Converts inconsistent fields into a canonical KYC schema
• Document extraction layer
  - Pulls structured fields from passports, driver’s licenses, proof of address, and company registration docs
  - Uses OCR or upstream document services before the LLM sees anything
• KYC reasoning layer
  - Uses LangChain to classify missing fields, detect inconsistencies, and decide whether the case is approved, needs review, or rejected
  - Produces structured output only
• Policy and compliance rules layer
  - Encodes insurer-specific thresholds: age limits, residency constraints, sanctioned country checks, beneficial ownership requirements
  - Keeps deterministic logic outside the model
• Audit trail layer
  - Stores every input snapshot, model output, tool call, and final decision
  - Required for internal audit and regulator review
• Escalation layer
  - Routes ambiguous or high-risk cases to a human underwriter or compliance analyst
  - Prevents the model from making unsupported decisions
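
To make the input normalization layer concrete, here is a minimal, dependency-free sketch of mapping channel-specific field names onto one canonical shape. The alias lists (`full_name`, `dob`, `country`, and so on) and the `normalizeApplicant` helper are assumptions for illustration, not names from any real intake system.

```typescript
// Canonical shape, mirroring a subset of the KYC fields used in this guide.
interface CanonicalApplicant {
  fullName: string;
  dateOfBirth: string;
  countryOfResidence: string;
}

// Hypothetical aliases that different intake channels might send.
const FIELD_ALIASES: Record<keyof CanonicalApplicant, string[]> = {
  fullName: ["fullName", "full_name", "name"],
  dateOfBirth: ["dateOfBirth", "dob", "birth_date"],
  countryOfResidence: ["countryOfResidence", "country", "residence_country"],
};

// Return the first non-empty string found under any of the given keys.
function pickFirst(raw: Record<string, unknown>, keys: string[]): string | undefined {
  for (const key of keys) {
    const value = raw[key];
    if (typeof value === "string" && value.trim() !== "") return value.trim();
  }
  return undefined;
}

// Map a raw payload to the canonical shape and report which fields were absent.
function normalizeApplicant(raw: Record<string, unknown>): {
  applicant: Partial<CanonicalApplicant>;
  missing: string[];
} {
  const applicant: Partial<CanonicalApplicant> = {};
  const missing: string[] = [];
  for (const field of Object.keys(FIELD_ALIASES) as (keyof CanonicalApplicant)[]) {
    const value = pickFirst(raw, FIELD_ALIASES[field]);
    if (value === undefined) {
      missing.push(field);
      continue;
    }
    // Country codes are upper-cased so later set lookups behave consistently.
    applicant[field] = field === "countryOfResidence" ? value.toUpperCase() : value;
  }
  return { applicant, missing };
}
```

Reporting missing fields here, rather than throwing, lets later layers decide whether a gap means manual review or a request for more documents.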

Implementation

1) Define the KYC schema and model output contract

For insurance, you want a strict schema. Don’t let the model return free text when your downstream system expects a decision code.

import { z } from "zod";

export const KycInputSchema = z.object({
  fullName: z.string(),
  dateOfBirth: z.string(),
  countryOfResidence: z.string(),
  governmentIdNumber: z.string().optional(),
  proofOfAddressCountry: z.string().optional(),
  pepFlag: z.boolean().default(false),
});

export const KycDecisionSchema = z.object({
  status: z.enum(["approved", "needs_review", "rejected"]),
  riskScore: z.number().min(0).max(100),
  reasons: z.array(z.string()),
  missingFields: z.array(z.string()),
});

export type KycInput = z.infer<typeof KycInputSchema>;
export type KycDecision = z.infer<typeof KycDecisionSchema>;

2) Build the LangChain chain with structured output

Use ChatOpenAI plus withStructuredOutput() so the agent returns validated JSON. This is the pattern you want in production.

import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { RunnableLambda } from "@langchain/core/runnables";
import { KycInputSchema, KycDecisionSchema } from "./schemas";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const prompt = PromptTemplate.fromTemplate(`
You are a KYC verification assistant for an insurance onboarding workflow.

Rules:
- Only assess the provided data.
- Do not invent missing fields.
- If there is any mismatch between residence country and proof of address country, flag it.
- If pepFlag is true, require manual review.
- Return only structured data matching the schema.

Applicant data:
{input}
`);

const structuredModel = llm.withStructuredOutput(KycDecisionSchema);

export const kycChain = RunnableLambda.from(async (rawInput: unknown) => {
  const input = KycInputSchema.parse(rawInput);
  const formattedPrompt = await prompt.format({
    input: JSON.stringify(input),
  });

  return structuredModel.invoke(formattedPrompt);
});

3) Add deterministic insurance rules before final decisioning

Do not outsource obvious policy checks to the model. Use code for residency restrictions and mandatory-field validation.

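import { RunnableSequence } from "@langchain/core/runnables";
import type { KycDecision, KycInput } from "./schemas";

// Assumes countryOfResidence holds ISO 3166-1 alpha-2 codes; normalize upstream if it does not.
const restrictedCountries = new Set(["IR", "KP", "SY"]);

// Returns a final decision when a deterministic rule applies, or null to defer to the model.
function applyInsuranceRules(input: KycInput): KycDecision | null {
  const reasons: string[] = [];
  const missingFields: string[] = [];

  if (!input.governmentIdNumber) missingFields.push("governmentIdNumber");
  if (!input.proofOfAddressCountry) missingFields.push("proofOfAddressCountry");

  if (restrictedCountries.has(input.countryOfResidence)) {
    reasons.push("Country of residence is restricted by underwriting policy");
    return { status: "rejected" as const, riskScore: 100, reasons, missingFields };
  }

  if (input.pepFlag) {
    reasons.push("PEP flag requires manual review");
    return { status: "needs_review" as const, riskScore: 85, reasons, missingFields };
  }

  // Missing mandatory fields go to review instead of the model; this score is an illustrative placeholder.
  if (missingFields.length > 0) {
    reasons.push("Mandatory KYC fields are missing");
    return { status: "needs_review" as const, riskScore: 50, reasons, missingFields };
  }

  return null;
}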

export const verificationPipeline = RunnableSequence.from([
  RunnableLambda.from((rawInput: unknown) => KycInputSchema.parse(rawInput)),
  RunnableLambda.from(async (input: KycInput) => {
    const ruleResult = applyInsuranceRules(input);
    if (ruleResult) return ruleResult;

    return kycChain.invoke(input);
  }),
]);
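
The same principle applies after the model has answered: a deterministic check can still override an optimistic result. The sketch below is one possible guard, assuming a hypothetical `enforceCountryMatch` helper and local types that mirror the decision shape from step 1. It downgrades an "approved" decision to manual review when the residence country and proof-of-address country disagree.

```typescript
type KycStatus = "approved" | "needs_review" | "rejected";

// Local mirror of the decision shape from step 1, so this sketch has no imports.
interface KycDecisionLike {
  status: KycStatus;
  riskScore: number;
  reasons: string[];
  missingFields: string[];
}

interface AddressFacts {
  countryOfResidence: string;
  proofOfAddressCountry?: string;
}

// Hypothetical guard: never let an "approved" decision stand when the two countries differ.
function enforceCountryMatch(facts: AddressFacts, decision: KycDecisionLike): KycDecisionLike {
  const proof = facts.proofOfAddressCountry;
  const mismatch =
    proof !== undefined &&
    proof.trim().toUpperCase() !== facts.countryOfResidence.trim().toUpperCase();

  // Only approvals are downgraded; review and rejection outcomes pass through untouched.
  if (!mismatch || decision.status !== "approved") return decision;

  return {
    ...decision,
    status: "needs_review",
    reasons: [...decision.reasons, "Residence country does not match proof-of-address country"],
  };
}
```

The guard returns a new object and leaves the model's original output untouched, so both versions can be written to the audit trail.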

4) Execute and persist an audit record

The decision is only useful if you can explain it later. Store inputs, outputs, timestamps, and versioned policy rules.

// Assumes KycInputSchema, KycInput, KycDecision, and verificationPipeline from the earlier steps are in scope.

// Placeholder sink: replace the console.log with a write to your audit store.
async function saveAuditRecord(record: {
  requestId: string;
  input: KycInput;
  decision: KycDecision;
  createdAt: string;
}): Promise<void> {
  console.log("Persisting audit record", record.requestId);
}

async function runKyc(requestId: string, payload: unknown): Promise<KycDecision> {
  const input = KycInputSchema.parse(payload);
  const decision = await verificationPipeline.invoke(input);

  await saveAuditRecord({
    requestId,
    input,
    decision,
    createdAt: new Date().toISOString(),
  });

  return decision;
}

Production Considerations

• Deploy in-region

For insurance workflows with residency constraints, keep processing in approved cloud regions. If customer data leaves the jurisdiction without controls, you create legal exposure fast.
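
One way to enforce this in code is a gate that refuses to process a case when the runtime is not in a region approved for the applicant's jurisdiction. This is a rough sketch: the jurisdiction-to-region mapping and the region names are placeholders, and `isApprovedRegion` and `assertInRegion` are hypothetical helpers.

```typescript
// Placeholder mapping from applicant jurisdiction to approved processing regions.
const APPROVED_REGIONS: Record<string, string[]> = {
  DE: ["eu-central-1"],
  FR: ["eu-west-3", "eu-central-1"],
  ZA: ["af-south-1"],
};

// True only when the jurisdiction is configured and the runtime region is on its list.
function isApprovedRegion(jurisdiction: string, runtimeRegion: string): boolean {
  const allowed = APPROVED_REGIONS[jurisdiction.toUpperCase()];
  if (!allowed) return false; // unknown jurisdictions fail closed
  return allowed.indexOf(runtimeRegion) !== -1;
}

// Call this before any applicant data is sent to a model or stored.
function assertInRegion(jurisdiction: string, runtimeRegion: string): void {
  if (!isApprovedRegion(jurisdiction, runtimeRegion)) {
    throw new Error(
      `Region ${runtimeRegion} is not approved for jurisdiction ${jurisdiction}`
    );
  }
}
```

Failing closed on unknown jurisdictions is a deliberate choice in this sketch: a missing configuration entry blocks processing instead of silently allowing it.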

• Log every decision path

Store the raw input hash, normalized input, rule hits, model version, prompt version, and final status. Auditors will ask why one applicant was approved and another escalated.
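
The fields listed above can be captured in one record per decision. The sketch below assumes a Node.js runtime and uses `createHash` from `node:crypto` for the raw input hash; the `DecisionTrace` shape and the `buildDecisionTrace` helper are illustrative names, not part of LangChain.

```typescript
import { createHash } from "node:crypto";

type FinalStatus = "approved" | "needs_review" | "rejected";

// Hypothetical record shape covering the fields an auditor is likely to ask for.
interface DecisionTrace {
  rawInputHash: string;
  normalizedInput: Record<string, unknown>;
  ruleHits: string[];
  modelVersion: string;
  promptVersion: string;
  finalStatus: FinalStatus;
  createdAt: string;
}

// SHA-256 of the payload exactly as received, so a stored payload can be matched to this record later.
function hashRawInput(rawPayload: string): string {
  return createHash("sha256").update(rawPayload, "utf8").digest("hex");
}

function buildDecisionTrace(args: {
  rawPayload: string;
  normalizedInput: Record<string, unknown>;
  ruleHits: string[];
  modelVersion: string;
  promptVersion: string;
  finalStatus: FinalStatus;
  now?: Date; // injectable clock, mainly for tests
}): DecisionTrace {
  return {
    rawInputHash: hashRawInput(args.rawPayload),
    normalizedInput: args.normalizedInput,
    ruleHits: args.ruleHits.slice(), // copy so later mutation cannot alter the record
    modelVersion: args.modelVersion,
    promptVersion: args.promptVersion,
    finalStatus: args.finalStatus,
    createdAt: (args.now ?? new Date()).toISOString(),
  };
}
```

Hash the payload before normalization, not after, so the record ties back to what the applicant or broker actually submitted.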

• Add guardrails around PII

Redact unnecessary fields before sending data to the model. Passport numbers and addresses should only be included if they are needed for the specific check.
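
A simple way to apply this is a per-check field allowlist, so each model call sees only what that check needs. The check names, the `FIELDS_BY_CHECK` table, and the masking rule below are assumptions for illustration; the record shape mirrors the input schema from step 1.

```typescript
// Local mirror of the KYC input from step 1, so this sketch has no imports.
interface ApplicantRecord {
  fullName: string;
  dateOfBirth: string;
  countryOfResidence: string;
  governmentIdNumber?: string;
  proofOfAddressCountry?: string;
  pepFlag: boolean;
}

// Hypothetical checks and the only fields each one is allowed to see.
type CheckName = "address_consistency" | "identity_completeness";

const FIELDS_BY_CHECK: Record<CheckName, (keyof ApplicantRecord)[]> = {
  address_consistency: ["countryOfResidence", "proofOfAddressCountry"],
  identity_completeness: ["fullName", "dateOfBirth", "governmentIdNumber"],
};

// Keep only the last four characters of an ID number visible.
function maskId(id: string): string {
  if (id.length <= 4) return "****";
  return new Array(id.length - 4 + 1).join("*") + id.slice(-4);
}

type ModelView = Partial<Record<keyof ApplicantRecord, string | boolean>>;

// Build the reduced payload that is safe to send to the model for one check.
function redactForCheck(record: ApplicantRecord, check: CheckName): ModelView {
  const view: ModelView = {};
  for (const field of FIELDS_BY_CHECK[check]) {
    const value = record[field];
    if (value === undefined) continue;
    view[field] =
      field === "governmentIdNumber" && typeof value === "string" ? maskId(value) : value;
  }
  return view;
}
```

Because the allowlist is data, compliance can review it directly without reading prompt templates.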

• Use human review thresholds

Anything with PEP flags, mismatched documents, low confidence extraction quality, or restricted geographies should go to manual review. Do not auto-reject borderline cases without a documented policy.
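
These triggers are easy to express as a small routing function. In the sketch below, the `ReviewSignals` shape, the reason codes, and the 0.8 confidence threshold are all assumptions; take the real values from your documented policy.

```typescript
interface ReviewSignals {
  pepFlag: boolean;
  documentsMismatch: boolean;
  extractionConfidence: number; // 0 to 1, as reported by the upstream extraction step
  restrictedGeography: boolean;
}

// Hypothetical threshold; set it from your own documented policy.
const MIN_EXTRACTION_CONFIDENCE = 0.8;

// Collect every reason a case needs a human, not just the first one found.
function manualReviewReasons(signals: ReviewSignals): string[] {
  const reasons: string[] = [];
  if (signals.pepFlag) reasons.push("pep_flag");
  if (signals.documentsMismatch) reasons.push("documents_mismatch");
  if (signals.extractionConfidence < MIN_EXTRACTION_CONFIDENCE) {
    reasons.push("low_extraction_confidence");
  }
  if (signals.restrictedGeography) reasons.push("restricted_geography");
  return reasons;
}

// Any single reason is enough to route the case to the manual review queue.
function routeCase(signals: ReviewSignals): {
  queue: "manual_review" | "automated";
  reasons: string[];
} {
  const reasons = manualReviewReasons(signals);
  return { queue: reasons.length > 0 ? "manual_review" : "automated", reasons };
}
```

Returning all reasons gives the reviewer the full picture up front and gives the audit trail a complete list of rule hits.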

Common Pitfalls

• Letting the LLM make policy decisions
  - Mistake: asking the model to decide sanctions eligibility or jurisdictional acceptance on its own.
  - Fix: implement those checks in deterministic TypeScript first; use LangChain for interpretation and classification only.
• Returning unstructured text
  - Mistake: accepting “looks good” responses from the model.
  - Fix: use withStructuredOutput() with a Zod schema so your downstream workflow gets validated machine-readable output.
• Ignoring auditability
  - Mistake: logging only the final decision.
  - Fix: persist input snapshots, prompt versions, rule results, model version IDs, and timestamps. Insurance compliance teams need reconstruction capability months later.


By Cyprian Aarons, AI Consultant at Topiax.
