How to Build a Policy Q&A Agent for Retail Banking Using LlamaIndex in TypeScript
A policy Q&A agent for retail banking answers customer-service and internal-ops questions from approved policy documents, not from model memory. That matters because banking teams need consistent answers on fees, card disputes, KYC, overdrafts, and account servicing without exposing staff or customers to hallucinations, outdated policies, or compliance drift.
Architecture
- **Policy document ingestion**
  - Pull PDFs, DOCX files, HTML policy pages, and internal SOPs into a controlled corpus.
  - Keep source metadata such as document title, version, effective date, jurisdiction, and owner.
- **Chunking and indexing**
  - Split policies into retrieval-friendly nodes with `SentenceSplitter`.
  - Build a vector index with `VectorStoreIndex` so the agent can retrieve exact policy passages.
- **Retriever layer**
  - Use `index.asRetriever()` with top-k limits.
  - Filter by product line, region, or effective date when the bank has multiple policy variants.
- **Answer synthesis**
  - Use a query engine that cites sources and constrains responses to retrieved context.
  - Return concise answers plus references for auditability.
- **Guardrails and escalation**
  - Detect low-confidence queries, missing policy coverage, or regulated advice requests.
  - Route those cases to a human queue or case management system.
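The retriever-layer filtering above can be sketched as a plain pre-filter over policy metadata. The field names (`region`, `effectiveDate`) match the metadata used later in this article; the function itself is an illustrative sketch, not the LlamaIndex filter API:

```typescript
interface PolicyMeta {
  source: string;
  region: string;
  effectiveDate: string; // ISO 8601 date, e.g. "2025-01-01"
}

// Keep only policies that apply to the given region and were
// already in effect on the date the question is asked.
function selectApplicablePolicies(
  docs: PolicyMeta[],
  region: string,
  asOf: string
): PolicyMeta[] {
  // ISO 8601 date strings compare correctly as plain strings.
  return docs.filter(
    (d) => d.region === region && d.effectiveDate <= asOf
  );
}
```

In production, the equivalent filter usually runs inside the vector store query itself, so out-of-region or not-yet-effective chunks never reach the model at all.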
Implementation
1) Install dependencies and set up your environment
Use the TypeScript LlamaIndex packages and keep the model key in environment variables. The code below imports from `@llamaindex/openai`, so install that alongside the core package:

```shell
npm install llamaindex @llamaindex/openai dotenv
```

Set your OpenAI key:

```shell
export OPENAI_API_KEY="your-key"
```
2) Load policy text with metadata
For retail banking, metadata is not optional. You need to know which policy version answered the question when audit asks six months later.
```typescript
import "dotenv/config";
import {
  Document,
  VectorStoreIndex,
  SentenceSplitter,
} from "llamaindex";

const documents = [
  new Document({
    text: `
Retail Banking Fee Waiver Policy
Effective Date: 2025-01-01
Region: UK
Policy: Branch staff may waive monthly account fees only for customers with documented service failure.
`,
    metadata: {
      source: "fee-waiver-policy.md",
      product: "current-account",
      region: "UK",
      effectiveDate: "2025-01-01",
      owner: "Retail Banking Operations",
    },
  }),
  new Document({
    text: `
Card Dispute Policy
Effective Date: 2025-02-15
Region: UK
Policy: Customers must report unauthorized card transactions within 13 months.
Staff must open a dispute case within one business day.
`,
    metadata: {
      source: "card-dispute-policy.md",
      product: "debit-card",
      region: "UK",
      effectiveDate: "2025-02-15",
      owner: "Payments Operations",
    },
  }),
];
```
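Because the audit story depends on this metadata, it is worth rejecting documents with incomplete metadata before they ever reach the index. A minimal sketch, where the required field list is simply the set of fields used above:

```typescript
const REQUIRED_FIELDS = [
  "source",
  "product",
  "region",
  "effectiveDate",
  "owner",
] as const;

// Returns the names of required metadata fields that are missing or empty.
function missingMetadata(metadata: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = metadata[field];
    return typeof value !== "string" || value.trim() === "";
  });
}
```

Call it at ingestion time and fail loudly, e.g. `if (missingMetadata(doc.metadata).length > 0) throw new Error(...)`, so a policy with no effective date can never silently answer questions.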
3) Build the index and query engine
This is the core pattern. Split into nodes, index them, then query through a retriever-backed engine that can cite sources.
```typescript
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
import { Settings } from "llamaindex";

// Register the LLM and embedding model globally; the index and
// query engine pick these up from Settings.
Settings.llm = new OpenAI();
Settings.embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
});

async function main() {
  const splitter = new SentenceSplitter({
    chunkSize: 256,
    chunkOverlap: 32,
  });

  // The splitter runs as a transformation while the index is built,
  // so policies are chunked into retrieval-friendly nodes.
  const index = await VectorStoreIndex.fromDocuments(documents, {
    transformations: [splitter],
  });

  // Retrieve the top 3 most similar policy chunks per question.
  const retriever = index.asRetriever({ similarityTopK: 3 });
  const queryEngine = index.asQueryEngine({ retriever });

  const response = await queryEngine.query({
    query:
      "Can branch staff waive monthly fees for a customer who complained about poor service?",
  });

  console.log(String(response));
}

main().catch(console.error);
```
A couple of notes on that pattern:
- `VectorStoreIndex.fromDocuments(...)` is enough for a first production pilot when your corpus is small to medium.
- `SentenceSplitter` keeps chunks aligned to policy language instead of arbitrary token boundaries.
- `index.asQueryEngine()` gives you a clean retrieval-to-answer path without hand-wiring every component.
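To make answers auditable, surface the retrieved sources alongside each response. The structural type below mirrors the shape of `response.sourceNodes` (node metadata plus an optional similarity score); treat it as an illustrative sketch rather than the exact LlamaIndex type:

```typescript
interface RetrievedSource {
  node: { metadata: Record<string, string> };
  score?: number;
}

// Turn retrieved nodes into human-readable citations for the answer footer.
function formatCitations(sources: RetrievedSource[]): string[] {
  return sources.map((s) => {
    const { source, effectiveDate } = s.node.metadata;
    const score = s.score !== undefined ? s.score.toFixed(2) : "n/a";
    return `${source} (effective ${effectiveDate}, score ${score})`;
  });
}
```

Appending these citations to every answer is what lets audit trace a response back to a specific policy document and version.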
4) Add an explicit compliance filter before answering
In banking, you do not want every question answered. Questions that ask for legal interpretation, credit decisions, or customer-specific outcomes should be escalated.
```typescript
function shouldEscalate(question: string): boolean {
  const q = question.toLowerCase();
  return [
    "legal advice",
    "should we approve",
    "credit score",
    "exception to policy",
    "guarantee approval",
    "complaint escalation",
  ].some((phrase) => q.includes(phrase));
}

// Pass the query engine in explicitly so this helper works outside
// the scope where the engine was created.
async function answerPolicyQuestion(
  queryEngine: { query: (params: { query: string }) => Promise<unknown> },
  question: string
) {
  if (shouldEscalate(question)) {
    return {
      answer:
        "This question needs human review because it may require compliance or discretionary decisioning.",
      escalated: true,
    };
  }

  const result = await queryEngine.query({ query: question });
  return {
    answer: String(result),
    escalated: false,
  };
}
```
That simple gate catches a lot of bad requests before they reach the model. In a real bank, this should sit behind role-based access control and ticketing integration.
Production Considerations
- **Deployment**
  - Run the agent behind an authenticated internal API.
  - Separate corpora by jurisdiction if your policies differ across UK, EU, and APAC entities.
  - Keep embeddings and source documents in approved regions for data residency requirements.
- **Monitoring**
| Signal | Why it matters | Action |
|---|---|---|
| Low retrieval scores | The agent may be answering from weak context | Escalate or refuse |
| High fallback rate | Policy coverage is incomplete | Add missing documents |
| Escalation volume by topic | Indicates ambiguous or risky policy areas | Review content with compliance |
| Source mismatch | Wrong region/version cited | Fix metadata filters |
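Most of these signals reduce to simple counters over your request logs. For example, escalation volume by topic, where the `topic` field is an assumed label attached to each escalated request:

```typescript
interface EscalationEvent {
  topic: string;
  timestamp: string;
}

// Count escalations per topic so compliance can review the riskiest
// policy areas first.
function escalationsByTopic(events: EscalationEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    counts.set(e.topic, (counts.get(e.topic) ?? 0) + 1);
  }
  return counts;
}
```

A spike in one topic is usually a sign that the underlying policy document is ambiguous, not that the agent is misbehaving.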
- **Guardrails**
| Guardrail | Banking concern | Implementation |
|---|---|---|
| Answer only from retrieved context | Hallucination risk | Refuse if no relevant nodes are found |
| Cite source metadata | Auditability | Include document name and effective date |
| Role-based access control | Internal confidentiality | Filter documents by user role |
| Human escalation path | Regulated advice risk | Route edge cases to operations/compliance |
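The first guardrail in the table, answering only from retrieved context, can be enforced with a score threshold checked before synthesis. The 0.7 cutoff below is an illustrative default that you should tune against your own retrieval scores:

```typescript
// Decide whether retrieval found enough relevant policy text to answer.
function hasSufficientContext(
  scores: number[],
  minScore = 0.7,
  minNodes = 1
): boolean {
  const relevant = scores.filter((s) => s >= minScore);
  return relevant.length >= minNodes;
}
```

When this returns false, return a fixed refusal such as "No approved policy covers this question" instead of calling the synthesizer at all.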
Common Pitfalls
- **Ignoring document versioning**
  - If you index old fee policies alongside current ones, the agent will return stale answers.
  - Fix it by storing `effectiveDate`, `version`, and `region` in metadata and filtering at query time.
- **Letting the model answer outside the corpus**
  - A policy Q&A agent should not improvise explanations for overdrafts, disputes, or AML rules.
  - Fix it by refusing low-confidence queries and requiring retrieved context before synthesis.
- **Skipping audit trails**
  - In retail banking, “the model said so” is not acceptable evidence.
  - Fix it by logging question text, retrieved node IDs, source metadata, response text, timestamp, and user identity.
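Those audit fields map directly onto a log record. A minimal sketch, assuming you write each record to an append-only store; the injectable `now` parameter keeps the timestamp testable:

```typescript
interface AuditRecord {
  question: string;
  retrievedNodeIds: string[];
  sources: string[];
  answer: string;
  userId: string;
  timestamp: string;
}

// Build one audit record per answered question.
function buildAuditRecord(
  question: string,
  retrievedNodeIds: string[],
  sources: string[],
  answer: string,
  userId: string,
  now: Date = new Date()
): AuditRecord {
  return {
    question,
    retrievedNodeIds,
    sources,
    answer,
    userId,
    timestamp: now.toISOString(),
  };
}
```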
If you build this pattern correctly, you get a controlled assistant that helps contact centers and operations teams answer policy questions fast without turning compliance into guesswork.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.