How to Build a Policy Q&A Agent Using LlamaIndex in TypeScript for Retail Banking

By Cyprian Aarons
Updated 2026-04-21

A policy Q&A agent for retail banking answers customer-service and internal-ops questions from approved policy documents, not from model memory. That matters because banking teams need consistent answers on fees, card disputes, KYC, overdrafts, and account servicing without exposing staff or customers to hallucinations, outdated policies, or compliance drift.

Architecture

  • Policy document ingestion

    • Pull PDFs, DOCX files, HTML policy pages, and internal SOPs into a controlled corpus.
    • Keep source metadata like document title, version, effective date, jurisdiction, and owner.
  • Chunking and indexing

    • Split policies into retrieval-friendly nodes with SentenceSplitter.
    • Build a vector index with VectorStoreIndex so the agent can retrieve exact policy passages.
  • Retriever layer

    • Use index.asRetriever() with top-k limits.
    • Filter by product line, region, or effective date when the bank has multiple policy variants.
  • Answer synthesis

    • Use a query engine that cites sources and constrains responses to retrieved context.
    • Return concise answers plus references for auditability.
  • Guardrails and escalation

    • Detect low-confidence queries, missing policy coverage, or regulated advice requests.
    • Route those cases to a human queue or case management system.
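The metadata fields above can be captured in a small type, with a helper that picks the policy variant in force for a given region and date. Here is a minimal sketch; the `PolicyMeta` type and `selectActivePolicy` helper are illustrative, not part of LlamaIndex:

```typescript
// Illustrative metadata shape for one policy document.
interface PolicyMeta {
  source: string;
  product: string;
  region: string;
  effectiveDate: string; // ISO date, e.g. "2025-01-01"
  owner: string;
}

// Pick the newest policy already effective on `asOf` for the
// requested region. Returns undefined if none applies.
function selectActivePolicy(
  policies: PolicyMeta[],
  region: string,
  asOf: string,
): PolicyMeta | undefined {
  return policies
    .filter((p) => p.region === region && p.effectiveDate <= asOf)
    .sort((a, b) => b.effectiveDate.localeCompare(a.effectiveDate))[0];
}
```

Comparing ISO dates as strings works here because the format is fixed-width; swap in real date parsing if your corpus mixes formats.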

Implementation

1) Install dependencies and set up your environment

Use the TypeScript LlamaIndex packages and keep the model key in environment variables.

npm install llamaindex @llamaindex/openai dotenv

Set your OpenAI key:

export OPENAI_API_KEY="your-key"

2) Load policy text with metadata

For retail banking, metadata is not optional. You need to know which policy version answered a question when an auditor asks six months later.

import "dotenv/config";
import {
  Document,
  VectorStoreIndex,
  SentenceSplitter,
} from "llamaindex";

const documents = [
  new Document({
    text: `
Retail Banking Fee Waiver Policy
Effective Date: 2025-01-01
Region: UK
Policy: Branch staff may waive monthly account fees only for customers with documented service failure.
`,
    metadata: {
      source: "fee-waiver-policy.md",
      product: "current-account",
      region: "UK",
      effectiveDate: "2025-01-01",
      owner: "Retail Banking Operations",
    },
  }),
  new Document({
    text: `
Card Dispute Policy
Effective Date: 2025-02-15
Region: UK
Policy: Customers must report unauthorized card transactions within 13 months.
Staff must open a dispute case within one business day.
`,
    metadata: {
      source: "card-dispute-policy.md",
      product: "debit-card",
      region: "UK",
      effectiveDate: "2025-02-15",
      owner: "Payments Operations",
    },
  }),
];

3) Build the index and query engine

This is the core pattern. Split into nodes, index them, then query through a retriever-backed engine that can cite sources.

import { Settings } from "llamaindex";
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";

// Configure the model stack once; indexing and querying both reuse it.
Settings.llm = new OpenAI({ model: "gpt-4o-mini" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });
Settings.nodeParser = new SentenceSplitter({
  chunkSize: 256,
  chunkOverlap: 32,
});

async function main() {
  // fromDocuments splits each document with the configured node parser,
  // embeds the chunks, and stores them in an in-memory vector index.
  const index = await VectorStoreIndex.fromDocuments(documents);

  // Retrieve the top 3 most similar policy chunks for each question.
  const queryEngine = index.asQueryEngine({
    retriever: index.asRetriever({ similarityTopK: 3 }),
  });

  const response = await queryEngine.query({
    query:
      "Can branch staff waive monthly fees for a customer who complained about poor service?",
  });

  console.log(String(response));
}

main().catch(console.error);

A few notes on that pattern:

  • VectorStoreIndex.fromDocuments(...) is enough for a first production pilot when your corpus is small to medium.
  • SentenceSplitter keeps chunks aligned to policy language instead of arbitrary token boundaries.
  • index.asQueryEngine() gives you a clean retrieval-to-answer path without hand-wiring every component.
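When policies differ by jurisdiction, one simple pattern is to keep a separate index per region and route each question to the right engine before retrieval. A sketch under that assumption; the `QueryLike` type and `routeQuery` helper are illustrative, and each engine would be built with the same `asQueryEngine` pattern as above:

```typescript
// Minimal structural type matching the query engine's query() call.
interface QueryLike {
  query(params: { query: string }): Promise<unknown>;
}

// One engine per jurisdiction, e.g. built from region-specific corpora.
function routeQuery(
  engines: Map<string, QueryLike>,
  region: string,
  question: string,
): Promise<unknown> {
  const engine = engines.get(region);
  if (!engine) {
    // No corpus for this region: refuse rather than answer from the wrong one.
    return Promise.resolve(
      "No policy corpus is configured for region " + region + ".",
    );
  }
  return engine.query({ query: question });
}
```

Refusing on an unknown region is deliberate: answering a UK question from an EU corpus is exactly the source-mismatch failure the monitoring section below tracks.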

4) Add an explicit compliance filter before answering

In banking, you do not want every question answered. Questions that ask for legal interpretation, credit decisions, or customer-specific outcomes should be escalated.

function shouldEscalate(question: string): boolean {
  const q = question.toLowerCase();
  return [
    "legal advice",
    "should we approve",
    "credit score",
    "exception to policy",
    "guarantee approval",
    "complaint escalation",
  ].some((phrase) => q.includes(phrase));
}

// Pass the query engine in explicitly so this helper is not
// coupled to main()'s scope.
async function answerPolicyQuestion(
  queryEngine: { query(params: { query: string }): Promise<unknown> },
  question: string,
) {
  if (shouldEscalate(question)) {
    return {
      answer:
        "This question needs human review because it may require compliance or discretionary decisioning.",
      escalated: true,
    };
  }

  const result = await queryEngine.query({ query: question });
  return {
    answer: String(result),
    escalated: false,
  };
}

That simple gate catches a lot of bad requests before they reach the model. In a real bank, this should sit behind role-based access control and ticketing integration.
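The role gate can be as simple as an allowlist from role to the product lines that role may retrieve from, applied before documents are indexed or at retrieval time. A sketch; the `roleAccess` mapping and `visibleDocuments` helper are hypothetical placeholders for your bank's real entitlement system:

```typescript
// Hypothetical role -> allowed product lines mapping.
const roleAccess: Record<string, string[]> = {
  "branch-staff": ["current-account", "debit-card"],
  "contact-centre": ["current-account"],
};

interface DocMeta {
  source: string;
  product: string;
}

// Keep only the documents a given role is allowed to retrieve from.
function visibleDocuments(role: string, docs: DocMeta[]): DocMeta[] {
  const allowed = new Set(roleAccess[role] ?? []);
  return docs.filter((d) => allowed.has(d.product));
}
```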

Production Considerations

  • Deployment

    • Run the agent behind an authenticated internal API.
    • Separate corpora by jurisdiction if your policies differ across UK, EU, and APAC entities.
    • Keep embeddings and source documents in approved regions for data residency requirements.
  • Monitoring

| Signal | Why it matters | Action |
| --- | --- | --- |
| Low retrieval scores | The agent may be answering from weak context | Escalate or refuse |
| High fallback rate | Policy coverage is incomplete | Add missing documents |
| Escalation volume by topic | Indicates ambiguous or risky policy areas | Review content with compliance |
| Source mismatch | Wrong region/version cited | Fix metadata filters |
  • Guardrails
| Guardrail | Banking concern | Implementation |
| --- | --- | --- |
| Answer only from retrieved context | Hallucination risk | Refuse if no relevant nodes are found |
| Cite source metadata | Auditability | Include document name and effective date |
| Role-based access control | Internal confidentiality | Filter documents by user role |
| Human escalation path | Regulated advice risk | Route edge cases to operations/compliance |
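The first guardrail, answering only from retrieved context, can be enforced by checking retrieval scores before synthesis and refusing below a threshold. A sketch over node-like objects; the `ScoredNode` shape and the 0.75 threshold are illustrative, so tune the cutoff against your own retrieval score distribution:

```typescript
interface ScoredNode {
  score: number; // similarity score from the retriever
  text: string;
  metadata: { source: string; effectiveDate: string };
}

interface GroundedAnswer {
  refused: boolean;
  context: ScoredNode[];
  citations: string[];
}

// Refuse when no retrieved node clears the confidence threshold;
// otherwise pass the strong nodes on with audit-ready citations.
function groundOrRefuse(
  nodes: ScoredNode[],
  minScore = 0.75,
): GroundedAnswer {
  const strong = nodes.filter((n) => n.score >= minScore);
  if (strong.length === 0) {
    return { refused: true, context: [], citations: [] };
  }
  return {
    refused: false,
    context: strong,
    citations: strong.map(
      (n) => `${n.metadata.source} (effective ${n.metadata.effectiveDate})`,
    ),
  };
}
```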

Common Pitfalls

  1. Ignoring document versioning

    • If you index old fee policies alongside current ones, the agent will return stale answers.
    • Fix it by storing effectiveDate, version, and region in metadata and filtering at query time.
  2. Letting the model answer outside the corpus

    • A policy Q&A agent should not improvise explanations for overdrafts, disputes, or AML rules.
    • Fix it by refusing low-confidence queries and requiring retrieved context before synthesis.
  3. Skipping audit trails

    • In retail banking, “the model said so” is not acceptable evidence.
    • Fix it by logging question text, retrieved node IDs, source metadata, response text, timestamp, and user identity.
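The audit fields listed above can be assembled into one structured record per interaction and written to an append-only log. A minimal sketch; the `AuditEntry` shape is illustrative, and field names should follow your bank's logging standards:

```typescript
interface AuditEntry {
  timestamp: string;
  userId: string;
  question: string;
  retrievedNodeIds: string[];
  sources: string[]; // source metadata, e.g. filename + effective date
  responseText: string;
  escalated: boolean;
}

// Build one immutable audit record for a single Q&A interaction.
function buildAuditEntry(
  userId: string,
  question: string,
  retrievedNodeIds: string[],
  sources: string[],
  responseText: string,
  escalated: boolean,
): AuditEntry {
  return {
    timestamp: new Date().toISOString(),
    userId,
    question,
    retrievedNodeIds,
    sources,
    responseText,
    escalated,
  };
}
```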

If you build this pattern correctly, you get a controlled assistant that helps contact centers and operations teams answer policy questions fast without turning compliance into guesswork.



By Cyprian Aarons, AI Consultant at Topiax.
