How to Build a Policy Q&A Agent Using LlamaIndex in TypeScript for Banking

By Cyprian Aarons. Updated 2026-04-21

A policy Q&A agent answers employee or customer questions against bank policy documents, compliance manuals, and procedure guides. In banking, a hallucinated answer is more than bad UX: it can create compliance risk, audit findings, and inconsistent customer treatment.

Architecture

• Document ingestion layer
  - Pull policy PDFs, DOCX files, and HTML pages from approved internal sources.
  - Normalize them into text chunks with metadata like policy_name, version, owner, and jurisdiction.
• Embedding and indexing layer
  - Use OpenAIEmbedding or your approved embedding provider.
  - Store vectors in a controlled backend such as Pinecone, Weaviate, Qdrant, or a self-hosted option if data residency requires it.
• Retriever
  - Use LlamaIndex's VectorStoreIndex with a retriever configured for top-k semantic search.
  - Add metadata filters for region, business unit, and policy version.
• Answer synthesis layer
  - Use a response synthesizer or query engine to generate grounded answers only from retrieved policy text.
  - Force citations so reviewers can trace every answer back to source material.
• Guardrails layer
  - Add refusal behavior for out-of-scope questions.
  - Block answers when retrieval confidence is low or when the question requests legal advice outside policy scope.
• Audit and observability layer
  - Log query text, retrieved document IDs, policy versions, model output, and latency.
  - Keep immutable audit trails for compliance review.
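The retriever layer's metadata filters can be sketched as a small helper that turns a caller's context into the { key, value, operator } filter objects this article passes to preFilters in step 3. This is a minimal sketch, not LlamaIndex API: PolicyContext, buildPolicyFilters, and the business_unit key are hypothetical names, and the filter type should be checked against your installed llamaindex version.

```typescript
// Hypothetical caller context; field names are illustrative, not part of LlamaIndex.
interface PolicyContext {
  jurisdiction: string;
  businessUnit?: string;
  policyVersion?: string;
}

// Mirrors the { key, value, operator } filter objects used with preFilters in step 3.
interface MetadataFilter {
  key: string;
  value: string;
  operator: "==";
}

// Always filters on jurisdiction; adds business unit and version only when provided.
function buildPolicyFilters(ctx: PolicyContext): { filters: MetadataFilter[] } {
  const filters: MetadataFilter[] = [
    { key: "jurisdiction", value: ctx.jurisdiction, operator: "==" },
  ];
  if (ctx.businessUnit) {
    filters.push({ key: "business_unit", value: ctx.businessUnit, operator: "==" });
  }
  if (ctx.policyVersion) {
    filters.push({ key: "version", value: ctx.policyVersion, operator: "==" });
  }
  return { filters };
}
```

The filter keys only work if the same keys were attached as metadata at ingest time, so keep the two lists in sync.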

Implementation

1) Install dependencies and configure your environment

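Use the TypeScript LlamaIndex package and an approved LLM provider. For banking workloads, keep secrets in your vault and make the model endpoint explicit. Recent llamaindex releases ship provider integrations as separate packages (for example @llamaindex/openai), so adjust the import below to match your installed version.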

npm install llamaindex dotenv

import "dotenv/config";
import { OpenAI } from "llamaindex";

const llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});
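One way to make the configuration explicit is to validate it at startup and fail fast when a secret is missing. The sketch below assumes the secret has already been injected into the environment by your vault tooling. loadLlmConfig and the OPENAI_MODEL variable are hypothetical names; only OPENAI_API_KEY comes from the code above.

```typescript
interface LlmConfig {
  apiKey: string;
  model: string;
}

// Reads settings from an env-like object so the function is easy to test.
// OPENAI_MODEL is a hypothetical variable name; OPENAI_API_KEY matches the article.
function loadLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    // Refuse to start rather than fail later on the first user question.
    throw new Error("OPENAI_API_KEY is not set; refusing to start.");
  }
  // Fall back to the model used throughout this article when none is configured.
  return { apiKey, model: env["OPENAI_MODEL"] ?? "gpt-4o-mini" };
}
```

In the application you would call loadLlmConfig(process.env) once and pass the result to the OpenAI constructor.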

2) Load policy documents with metadata

The key pattern here is: ingest only approved documents, attach metadata up front, and preserve versioning. That lets you filter on jurisdiction or policy revision later.

import {
  Document,
  SimpleDirectoryReader,
} from "llamaindex";

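async function loadPolicyDocs() {
  // Read every supported file from the approved policy directory.
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({
    directoryPath: "./policies",
    fileExts: [".pdf", ".txt", ".md"],
  });

  // Re-wrap each document with the compliance metadata used for filtering later.
  // The values below are hard-coded placeholders; derive them per file in production.
  return docs.map(
    (doc) =>
      new Document({
        text: doc.text,
        metadata: {
          // The reader typically records the file name under file_name;
          // fileName is kept as a fallback in case your version differs.
          source:
            doc.metadata?.file_name ?? doc.metadata?.fileName ?? "unknown",
          policy_type: "banking_policy",
          jurisdiction: "US",
          version: "2025.01",
          owner: "Compliance",
        },
      })
  );
}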
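The loader above stamps the same jurisdiction, version, and owner on every file. A minimal sketch of the "approved documents only, metadata up front" pattern is to look each file up in a manifest and reject anything that is not listed. The manifest contents, the file name, and the metadataForFile helper are all hypothetical; in practice the manifest might come from your document management system.

```typescript
interface PolicyMetadata {
  policy_name: string;
  jurisdiction: string;
  version: string;
  owner: string;
}

// Hypothetical manifest of approved files and their metadata.
const approvedManifest: Record<string, PolicyMetadata> = {
  "overdraft-fees.pdf": {
    policy_name: "Overdraft Fees",
    jurisdiction: "US",
    version: "2025.01",
    owner: "Compliance",
  },
};

// Returns the metadata for an approved file, or throws so unapproved files never get indexed.
function metadataForFile(
  fileName: string,
  manifest: Record<string, PolicyMetadata>,
): PolicyMetadata & { source: string } {
  const entry = manifest[fileName];
  if (!entry) {
    throw new Error(`File "${fileName}" is not in the approved manifest.`);
  }
  return { source: fileName, ...entry };
}
```

The returned object can be passed as the metadata field when constructing each Document.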

3) Build the index and query engine

This is the core retrieval pattern. In production you would swap the storage backend for your vector database of choice; the API shape stays similar. The important part is that answers come from indexed policy content, not free-form generation.

import {
  OpenAI,
  Settings,
  VectorStoreIndex,
} from "llamaindex";

// Set the default LLM once so the query engine picks it up.
Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});

async function buildPolicyAgent() {
  const docs = await loadPolicyDocs();

  const index = await VectorStoreIndex.fromDocuments(docs);

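  const queryEngine = index.asQueryEngine({
    // Number of chunks retrieved per question.
    similarityTopK: 4,
    // Some llamaindex versions may not accept responseMode here; if the compiler
    // rejects it, configure the mode on a response synthesizer instead.
    responseMode: "compact",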
    preFilters: {
      filters: [
        {
          key: "jurisdiction",
          value: "US",
          operator: "==",
        },
      ],
    },
  });

  return queryEngine;
}
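One way to keep answers tied to indexed policy content is a prompt template that contains only the retrieved excerpts, asks for numbered citations, and defines an explicit refusal token. The sketch below is plain string building, not a LlamaIndex API; buildGroundedPrompt, the Excerpt type, and the INSUFFICIENT_SUPPORT token are hypothetical names chosen for illustration.

```typescript
interface Excerpt {
  source: string;
  text: string;
}

// Builds a prompt that restricts the model to the numbered excerpts it is given.
function buildGroundedPrompt(question: string, excerpts: Excerpt[]): string {
  const context = excerpts
    .map((e, i) => `[${i + 1}] (${e.source}) ${e.text}`)
    .join("\n");
  return [
    "Answer using only the policy excerpts below.",
    "Cite excerpt numbers like [1] after each claim.",
    'If the excerpts do not cover the question, reply exactly: "INSUFFICIENT_SUPPORT".',
    "",
    "Policy excerpts:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

A fixed refusal token is easier to detect reliably in a wrapper than free-text phrases such as "I don't know".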

4) Add a banking-safe answer wrapper

Do not return raw model output directly. Wrap it with a simple policy check so the agent refuses questions outside scope or without enough support. This is where you enforce compliance behavior.

async function answerPolicyQuestion(question: string) {
  // Rebuilding the index on every call keeps the demo short;
  // in production, build the query engine once and reuse it.
  const queryEngine = await buildPolicyAgent();

  const response = await queryEngine.query({
    query: question,
  });

  const answerText =
    typeof response.response === "string"
      ? response.response
      : String(response.response);

  // Crude keyword heuristic: "insufficient" also matches legitimate answers
  // about insufficient funds, so treat this check as a placeholder.
  if (
    !answerText ||
    answerText.toLowerCase().includes("i don't know") ||
    answerText.toLowerCase().includes("insufficient")
  ) {
    return {
      answer:
        "I could not find enough support in the current policy set to answer this question.",
      citations: [],
      needsReview: true,
    };
  }

  // Return the answer together with the source and score of each retrieved chunk.
  return {
    answer: answerText,
    citations:
      response.sourceNodes?.map((node) => ({
        source: node.node.metadata?.source ?? "unknown",
        score: node.score,
      })) ?? [],
    needsReview: false,
  };
}

answerPolicyQuestion(
  "Can we waive overdraft fees for premium customers?"
).then(console.log);
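The wrapper above refuses based on the answer text alone. A complementary check is to gate on the retrieval scores themselves, which is one way to implement "block answers when retrieval confidence is low". The sketch below works on the citations array returned by the wrapper; checkRetrievalSupport and its thresholds are hypothetical, and the 0.75 default is illustrative only, since score scales depend on the embedding model and vector store.

```typescript
interface ScoredSource {
  source: string;
  score?: number;
}

interface GateResult {
  allowed: boolean;
  reason: "ok" | "no_sources" | "low_retrieval_score";
}

// minScore and minSources are illustrative defaults, not recommended values.
function checkRetrievalSupport(
  sources: ScoredSource[],
  minScore = 0.75,
  minSources = 1,
): GateResult {
  if (sources.length === 0) {
    return { allowed: false, reason: "no_sources" };
  }
  // Sources without a numeric score are treated as unsupported.
  const strong = sources.filter(
    (s) => typeof s.score === "number" && s.score >= minScore,
  );
  if (strong.length < minSources) {
    return { allowed: false, reason: "low_retrieval_score" };
  }
  return { allowed: true, reason: "ok" };
}
```

When the gate returns allowed: false, the wrapper can return the same refusal object it already uses and set needsReview to true.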

Production Considerations

• Deployment
  - Run the agent behind an internal API gateway with authentication and request logging.
  - Keep vector storage in-region if data residency rules require it.
• Monitoring
  - Track retrieval hit rate, citation coverage, latency, refusal rate, and unanswered queries.
  - Alert when answers are produced without source nodes or when a new policy version causes drift.
• Guardrails
  - Add strict prompt instructions that the agent must only answer from retrieved policies.
  - Refuse legal interpretation requests unless they are explicitly covered by approved bank guidance.
• Auditability
  - Store question text, retrieved chunk IDs, document versions, model version, and final answer hash.
  - Make audit logs immutable and searchable for compliance teams.
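The auditability fields listed above can be captured in a single record per question. The following is a minimal sketch using Node's built-in crypto module for the answer hash; the AuditRecord shape and the buildAuditRecord helper are hypothetical, and where the record is stored (and how immutability is enforced) is left to your logging platform.

```typescript
import { createHash } from "node:crypto";

interface AuditRecord {
  timestamp: string;
  question: string;
  chunkIds: string[];
  documentVersions: string[];
  modelVersion: string;
  answerHash: string;
}

// SHA-256 of the text, hex encoded.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Builds one audit record; the answer itself is stored only as a hash.
function buildAuditRecord(
  input: {
    question: string;
    chunkIds: string[];
    documentVersions: string[];
    modelVersion: string;
    answer: string;
  },
  now: Date = new Date(),
): AuditRecord {
  return {
    timestamp: now.toISOString(),
    question: input.question,
    chunkIds: [...input.chunkIds],
    documentVersions: [...input.documentVersions],
    modelVersion: input.modelVersion,
    answerHash: sha256Hex(input.answer),
  };
}
```

Hashing the final answer lets reviewers verify that a stored answer was not altered later without keeping a second copy of the text in the audit log.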

Common Pitfalls

• Using stale policies
  If you do not version documents at ingest time, the agent may answer from obsolete rules. Fix this by tagging every chunk with version, effective_date, and owner, then filtering on current approved versions only.
• Returning uncited answers
  A banking Q&A agent without citations is hard to defend in audit reviews. Always surface source nodes from response.sourceNodes and reject answers when retrieval confidence is too low.
• Mixing public knowledge with internal policy
  If you let the model rely on general internet knowledge, it will drift away from bank-approved language. Keep the prompt narrow, restrict retrieval to internal sources, and block questions that need legal or regulatory interpretation beyond published policy.
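For the stale-policy pitfall, "current approved versions only" can be sketched as picking, for each policy, the latest version whose effective date is not in the future. The PolicyVersion shape and the currentVersions helper are hypothetical, and the sketch assumes effective_date is stored as an ISO yyyy-mm-dd string, which compares correctly as plain text.

```typescript
interface PolicyVersion {
  policy_name: string;
  version: string;
  effective_date: string; // ISO yyyy-mm-dd, assumed format
}

// For each policy, keeps the most recent version that is already in force on asOf.
function currentVersions(
  all: PolicyVersion[],
  asOf: string,
): Map<string, PolicyVersion> {
  const current = new Map<string, PolicyVersion>();
  for (const v of all) {
    if (v.effective_date > asOf) continue; // not yet in force
    const existing = current.get(v.policy_name);
    if (!existing || v.effective_date > existing.effective_date) {
      current.set(v.policy_name, v);
    }
  }
  return current;
}
```

The resulting version values can then be used as metadata filters at query time, or to decide which chunks to re-index when a new version takes effect.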

By Cyprian Aarons, AI Consultant at Topiax.
