How to Build a Compliance Checking Agent Using LlamaIndex in TypeScript for Banking

By Cyprian Aarons · Updated 2026-04-21
compliance-checking · llamaindex · typescript · banking

A compliance checking agent in banking reviews customer communications, product drafts, transaction narratives, and internal policies against regulatory rules before anything leaves the building. It matters because one missed disclosure, one prohibited phrase, or one data residency violation can become a regulatory issue, a legal dispute, or a costly remediation exercise.

Architecture

Build this agent as a small pipeline, not a single prompt:

  • Policy corpus loader

    • Ingests bank policies, AML/KYC procedures, product T&Cs, and jurisdiction-specific rules.
    • Stores them as indexed documents with metadata like jurisdiction, policy_type, effective_date, and owner.
  • Retrieval layer

    • Uses LlamaIndex retrieval to pull only the relevant policy chunks for the request.
    • Keeps the agent grounded in approved internal sources instead of free-form model memory.
  • Compliance reasoning layer

    • Compares the user input against retrieved policy snippets.
    • Produces structured findings: pass, needs_review, or block.
  • Audit trail logger

    • Saves the input, retrieved evidence, decision, and model output.
    • This is non-negotiable in banking; auditors need traceability.
  • Guardrail layer

    • Blocks unsafe outputs like legal advice phrased as certainty, PII leakage, or unsupported approvals.
    • Enforces escalation when confidence is low or policy coverage is incomplete.
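The contract between the reasoning and guardrail layers can be sketched in a few lines of TypeScript. The names here are illustrative assumptions, not LlamaIndex APIs; the point is that the guardrail rule is plain code, not another prompt:

```typescript
// Illustrative pipeline types; names are assumptions, not LlamaIndex APIs.
type Decision = "pass" | "needs_review" | "block";

interface Finding {
  decision: Decision;
  rationale: string;
  evidence: string[]; // policy snippets the decision rests on
}

// Guardrail-layer rule: an automated approval with no supporting
// evidence is downgraded to human review.
function applyGuardrails(finding: Finding): Finding {
  if (finding.decision === "pass" && finding.evidence.length === 0) {
    return { ...finding, decision: "needs_review" };
  }
  return finding;
}
```

Keeping this rule outside the model means an auditor can verify it by reading one function instead of reasoning about prompt behavior.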

Implementation

1) Load policy documents into a vector index

Use SimpleDirectoryReader for local policy files and VectorStoreIndex for retrieval. In production, swap the storage backend for something that meets your residency requirements.

import { SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";

async function buildPolicyIndex() {
  const docs = await new SimpleDirectoryReader({
    inputDir: "./policies",
  }).loadData();

  const index = await VectorStoreIndex.fromDocuments(docs);
  return index;
}

Keep your source documents scoped by jurisdiction. If you mix UK FCA guidance with US Reg E language in the same corpus without metadata filtering, your agent will return noisy results.
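One way to keep jurisdictions separated is to tag every policy at ingestion time and filter before the documents ever reach an index. This is a minimal sketch with plain objects; the `TaggedPolicy` shape is an assumption, not a LlamaIndex type:

```typescript
// Sketch: tag policies at ingestion and filter per jurisdiction before
// indexing. The shape is illustrative, not a LlamaIndex API.
interface TaggedPolicy {
  text: string;
  metadata: {
    jurisdiction: string;
    policy_type: string;
    effective_date: string;
  };
}

function forJurisdiction(
  policies: TaggedPolicy[],
  jurisdiction: string
): TaggedPolicy[] {
  return policies.filter((p) => p.metadata.jurisdiction === jurisdiction);
}
```

In LlamaIndex you would attach the same fields to each `Document`'s metadata, or simply maintain one index per region.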

2) Retrieve only the relevant compliance context

Create a query engine from the index and use it to fetch evidence for each request. The key pattern here is retrieval-first: never ask the model to judge compliance without grounding it in policy text.

import { MetadataMode } from "llamaindex";

// buildPolicyIndex comes from step 1.

type ComplianceResult = {
  decision: "pass" | "needs_review" | "block";
  rationale: string;
  evidence: string[];
};

async function checkCompliance(userText: string): Promise<ComplianceResult> {
  // In production, build the index once at startup and reuse it;
  // rebuilding it on every request is slow and expensive.
  const index = await buildPolicyIndex();
  const queryEngine = index.asQueryEngine({
    similarityTopK: 4,
  });

  const response = await queryEngine.query({
    query: `
You are a banking compliance reviewer.
Assess whether the following text violates internal policy or requires human review.

Text:
${userText}

Return:
- decision
- rationale
- evidence snippets
`,
  });

  // Default to the conservative outcome, and surface the retrieved policy
  // snippets so reviewers can see what the decision rests on.
  return {
    decision: "needs_review",
    rationale: response.toString(),
    evidence: (response.sourceNodes ?? []).map((n) =>
      n.node.getContent(MetadataMode.NONE)
    ),
  };
}

This is intentionally conservative. In banking, “needs review” is often better than an overconfident “approved” when the evidence is incomplete.

3) Add structured output and escalation logic

Don’t ship raw prose back to downstream systems. Wrap the LLM output into a schema so your application can enforce routing rules.

import { OpenAI } from "@llamaindex/openai";
import { VectorStoreIndex } from "llamaindex";

const llm = new OpenAI({
  model: "gpt-4o-mini",
});

async function reviewMarketingCopy(copy: string) {
  // Cache the index in production instead of rebuilding it per call.
  const index = await buildPolicyIndex();
  const queryEngine = index.asQueryEngine({ similarityTopK: 5 });

  const context = await queryEngine.query({
    query: `Find relevant banking compliance rules for this text:\n${copy}`,
  });

  const prompt = `
You are reviewing bank marketing copy for compliance.
Use only the provided context.

Context:
${context.toString()}

Copy:
${copy}

Return JSON with keys:
decision (pass|needs_review|block)
rationale
evidence (array of strings)
`;

  const result = await llm.complete({ prompt });
  return result.text;
}

If you need tighter control, parse the JSON before returning it to your app. Reject anything malformed and send it to manual review.
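A minimal version of that routing rule might look like the following. `safeParseDecision` is a hypothetical helper name; anything it cannot validate falls back to manual review rather than being trusted:

```typescript
type Decision = "pass" | "needs_review" | "block";

interface ReviewResult {
  decision: Decision;
  rationale: string;
  evidence: string[];
}

// Parse the model's JSON; any malformed or unexpected payload is routed
// to manual review instead of flowing downstream.
function safeParseDecision(raw: string): ReviewResult {
  const fallback: ReviewResult = {
    decision: "needs_review",
    rationale: "Malformed model output; routed to manual review.",
    evidence: [],
  };
  try {
    const parsed = JSON.parse(raw);
    const valid =
      ["pass", "needs_review", "block"].includes(parsed.decision) &&
      typeof parsed.rationale === "string" &&
      Array.isArray(parsed.evidence) &&
      parsed.evidence.every((e: unknown) => typeof e === "string");
    return valid ? (parsed as ReviewResult) : fallback;
  } catch {
    return fallback;
  }
}
```

The fallback deliberately never produces "pass": a parsing failure is treated as missing evidence, not as approval.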

4) Log every decision for auditability

Banking teams need to answer three questions later: what was checked, what evidence was used, and who approved it. Persist that information alongside timestamps and document versions.

type AuditRecord = {
  requestId: string;
  inputText: string;
  decisionPayload: string;
};

async function writeAudit(record: AuditRecord) {
  // Stub: in production, replace console.log with a write to an
  // append-only store so records cannot be altered after the fact.
  console.log(JSON.stringify({
    ...record,
    timestamp: new Date().toISOString(),
    system: "compliance-agent",
    version: "1.0.0",
  }));
}

In production, send this to an immutable log store or SIEM. Keep policy document hashes too; if a regulation changes, you need to know which version informed the answer.
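Computing those hashes is a one-liner with Node's built-in crypto module; storing the digest alongside each audit record pins the exact policy version a decision was based on:

```typescript
import { createHash } from "node:crypto";

// SHA-256 digest of a policy document's text. Store this with each audit
// record so you can prove which version informed a decision.
function policyHash(documentText: string): string {
  return createHash("sha256").update(documentText, "utf8").digest("hex");
}
```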

Production Considerations

  • Deploy inside your residency boundary

    • Keep policy indexes and inference endpoints in-region if your bank has UK-only, EU-only, or country-specific residency constraints.
    • Don’t send sensitive customer text to unmanaged external services unless legal has signed off.
  • Treat retrieval quality as a control

    • Monitor top-k hits, empty retrievals, and policy coverage by jurisdiction.
    • If retrieval returns weak evidence, force human review instead of generating a confident answer.
  • Instrument decisions

    • Track pass, needs_review, and block rates by product line and rule set.
    • Spikes usually mean either bad prompts or newly introduced business language that isn’t covered by policy docs yet.
  • Add hard guardrails

    • Redact PII before sending text into the agent where possible.
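The retrieval-quality control above can be enforced mechanically. This sketch downgrades any automated decision when the retrieved evidence is thin or low-scoring; the threshold values are assumptions you would tune against your own retrieval evaluation set:

```typescript
interface ScoredChunk {
  text: string;
  score: number; // similarity score from the retriever, 0..1
}

// Assumed thresholds; tune against your own evaluation data.
const MIN_CHUNKS = 2;
const MIN_SCORE = 0.7;

// Weak evidence means the agent must not answer on its own.
function needsHumanReview(chunks: ScoredChunk[]): boolean {
  const strong = chunks.filter((c) => c.score >= MIN_SCORE);
  return strong.length < MIN_CHUNKS;
}
```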

Common Pitfalls

  1. Using one global index for all jurisdictions

    • That creates cross-border contamination between policies.
    • Fix it by partitioning indexes by region or using metadata filters on retrieval.
  2. Letting the model decide without citations

    • A free-form “looks compliant” response is useless during audit review.
    • Require retrieved snippets in every output and store them with the decision record.
  3. Ignoring prompt injection inside user content

    • A customer email can contain instructions like “ignore prior rules.”
    • Strip or isolate user-provided instructions and keep system prompts strict about source-of-truth policies only.
  4. Skipping manual escalation paths

    • Some cases will always be ambiguous: sanctions edge cases, product suitability language, cross-border disclosures.
    • Route those to compliance staff instead of forcing an automated yes/no answer.

By Cyprian Aarons, AI Consultant at Topiax.