How to Build a KYC Verification Agent Using LangGraph in TypeScript for Retail Banking

By Cyprian Aarons · Updated 2026-04-21

Tags: kyc-verification, langgraph, typescript, retail-banking

A KYC verification agent handles the back-and-forth work of collecting customer identity data, validating documents, checking sanctions/PEP lists, and deciding whether a case can be auto-approved or needs manual review. In retail banking, this matters because onboarding speed directly affects conversion, while weak verification creates compliance risk, audit gaps, and downstream fraud exposure.

Architecture

A production KYC agent for retail banking needs these components:

  • Input normalizer

    • Converts application payloads into a strict internal schema.
    • Rejects incomplete or malformed submissions before any LLM call.
  • Document extraction node

    • Pulls structured fields from passport, driver’s license, utility bill, or bank statement.
    • In practice this is usually OCR + deterministic parsing, not just prompting.
  • Policy/rules engine

    • Applies bank-specific rules like minimum age, address match thresholds, document freshness, and country restrictions.
    • Keeps deterministic decisions out of the model.
  • Risk assessment node

    • Uses an LLM only for classification, explanation drafting, and ambiguity resolution.
    • Produces a risk score and reason codes for audit.
  • External screening node

    • Calls sanctions, PEP, watchlist, and fraud APIs.
    • Must be isolated behind retries, timeouts, and traceable request IDs.
  • Case router

    • Decides whether to auto-approve, request more information, or escalate to a human analyst.
    • This is where LangGraph state transitions are useful.

Implementation

1. Define the graph state and typed outputs

Keep the state explicit. For banking workflows, you want every decision traceable and serializable.

import { Annotation, StateGraph, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

const KycDecisionSchema = z.object({
  decision: z.enum(["approve", "manual_review", "reject"]),
  riskScore: z.number().min(0).max(100),
  reasons: z.array(z.string()),
});

type KycDecision = z.infer<typeof KycDecisionSchema>;

const KycState = Annotation.Root({
  applicantId: Annotation<string>(),
  fullName: Annotation<string>(),
  dob: Annotation<string>(),
  country: Annotation<string>(),
  documentText: Annotation<string>(),
  sanctionsHit: Annotation<boolean>(),
  pepHit: Annotation<boolean>(),
  extractedAddress: Annotation<string | null>(),
  riskScore: Annotation<number | null>(),
  decision: Annotation<KycDecision | null>(),
});

2. Add deterministic checks before model calls

Do not let the model decide obvious policy violations. If the applicant is on a sanctions list or the document is missing critical fields, route immediately.

const policyCheck = async (state: typeof KycState.State) => {
  const missingDoc = !state.documentText || state.documentText.trim().length < 20;

  if (state.sanctionsHit) {
    return {
      ...state,
      decision: {
        decision: "reject",
        riskScore: 100,
        reasons: ["Sanctions match detected"],
      },
    };
  }

  if (missingDoc) {
    return {
      ...state,
      decision: {
        decision: "manual_review",
        riskScore: 85,
        reasons: ["Document text missing or unreadable"],
      },
    };
  }

  return state;
};

3. Use LangGraph nodes for screening and LLM-based classification

This pattern keeps external checks separate from the reasoning step. The LLM gets only the minimum context it needs.

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const assessRisk = async (state: typeof KycState.State) => {
  const prompt = `
You are assessing a retail banking KYC case.
Return JSON with decision, riskScore (0-100), and reasons.

Applicant:
- Name: ${state.fullName}
- DOB: ${state.dob}
- Country: ${state.country}
- PEP hit: ${state.pepHit}
- Address extracted: ${state.extractedAddress ?? "none"}

Rules:
- If PEP hit is true, prefer manual_review unless other evidence is strong.
- If address is missing or inconsistent, increase risk.
- Do not approve if critical identity data is absent.
`;

  const response = await llm.withStructuredOutput(KycDecisionSchema).invoke(prompt);

  return {
    ...state,
    riskScore: response.riskScore,
    decision: response,
  };
};

4. Wire the graph with conditional routing

This is where LangGraph pays off. You can encode business flow without burying it in nested if statements.

const routeByDecision = (state: typeof KycState.State) => {
  // Every branch must return a target; leaving "manual_review" unhandled
  // would return undefined and make LangGraph throw at runtime.
  switch (state.decision?.decision) {
    case "approve":
    case "reject":
      return END;
    default:
      // In production, route "manual_review" to an analyst-escalation node;
      // this minimal graph ends and leaves the decision on state.
      return END;
  }
};

const graph = new StateGraph(KycState)
  .addNode("policyCheck", policyCheck)
  .addNode("assessRisk", assessRisk)
  .addEdge(START, "policyCheck")
  // Skip the LLM entirely when the deterministic checks already decided,
  // so a sanctions reject is never overwritten by the risk assessment.
  .addConditionalEdges("policyCheck", (state) =>
    state.decision ? END : "assessRisk"
  )
  .addConditionalEdges("assessRisk", routeByDecision)
  .compile();

async function runKyc() {
  const result = await graph.invoke({
    applicantId: "app_123",
    fullName: "Jane Doe",
    dob: "1991-04-18",
    country: "ZA",
    documentText: "Passport MRZ ...",
    sanctionsHit: false,
    pepHit: false,
    extractedAddress: "12 Main Street",
    riskScore: null,
    decision: null,
  });

  console.log(result.decision);
}

runKyc();

Production Considerations

  • Auditability

    • Persist every node input/output with timestamps and correlation IDs.
    • Store final decisions plus reason codes so compliance teams can reconstruct why an account was approved or escalated.
  • Data residency

    • Keep PII inside your approved region.
    • If you use hosted models or OCR vendors, verify where prompts, logs, embeddings, and traces are stored.
  • Guardrails

    • Never let the LLM make final sanctions decisions.
    • Never send raw document images to the model unless policy allows it.
    • Never approve on low-confidence extraction alone.
  • Monitoring

    • Track auto_approve_rate, manual_review_rate, false_positive_sanctions_hits, average_time_to_decision, and model_output_schema_failures.
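The auditability requirement above can be met with a wrapper that records every node invocation before the graph sees it. This is a sketch only: `withAudit`, `AuditRecord`, and the in-memory `sink` are hypothetical names, and a real deployment would write to an append-only store rather than a callback.

```typescript
// Hypothetical audit record; production systems would also persist
// model/version metadata and redact PII per policy.
interface AuditRecord {
  node: string;
  correlationId: string;
  startedAt: string;
  finishedAt: string;
  input: unknown;
  output: unknown;
}

// A graph node: takes the current state, returns a partial update.
type GraphNode<S> = (state: S) => Promise<Partial<S>>;

function withAudit<S>(
  name: string,
  node: GraphNode<S>,
  sink: (record: AuditRecord) => void,
  correlationId: string
): GraphNode<S> {
  return async (state: S) => {
    const startedAt = new Date().toISOString();
    const output = await node(state);
    // Record input and output with timestamps so compliance can
    // reconstruct the full decision path for this correlation ID.
    sink({
      node: name,
      correlationId,
      startedAt,
      finishedAt: new Date().toISOString(),
      input: state,
      output,
    });
    return output;
  };
}
```

Wrapping each node at registration time (`.addNode("policyCheck", withAudit("policyCheck", policyCheck, sink, correlationId))`) keeps audit logic out of the business logic entirely.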

Common Pitfalls

  1. Letting the model own compliance logic

    • Mistake: asking the LLM to decide sanctions/PEP outcomes directly.
    • Avoid it by making those checks deterministic and routing around them before any generation step.
  2. Weak state design

    • Mistake: passing free-form objects through the graph with no schema discipline.
    • Avoid it by using Annotation.Root, typed state fields, and structured outputs with Zod.
  3. No separation between onboarding UX and case resolution

    • Mistake: treating “request more documents” as just another prompt response.
    • Avoid it by making explicit graph states for approve, manual_review, and reject, then mapping each to a downstream workflow in your case management system.
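The third pitfall comes down to a small, explicit mapping. As a sketch, each terminal decision can dispatch to a named downstream action; the action names and `dispatch` helper here are illustrative assumptions, not a real case-management API.

```typescript
type Decision = "approve" | "manual_review" | "reject";

// Illustrative mapping from KYC outcome to a downstream workflow.
const downstreamAction: Record<Decision, string> = {
  approve: "provision_account",
  manual_review: "create_analyst_case",
  reject: "send_adverse_action_notice",
};

function dispatch(decision: Decision): string {
  // In production this would enqueue a job in the case management
  // system rather than return a string.
  return downstreamAction[decision];
}
```

Because the `Record<Decision, string>` is exhaustive, adding a fourth decision state later becomes a compile-time error until every downstream workflow is accounted for.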

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
