How to Build a Policy Q&A Agent Using LlamaIndex in TypeScript for Fintech

By Cyprian Aarons · Updated 2026-04-21

A policy Q&A agent answers employee or customer questions against approved policy documents, then returns a grounded response with citations. In fintech, that matters because the difference between “probably right” and “provably sourced” is compliance risk, audit pain, and bad customer outcomes.

Architecture

Build this agent with a small number of explicit components:

  • Document ingestion layer

    • Pull policy PDFs, markdown, or internal wiki exports from a controlled source.
    • Normalize them into text chunks before indexing.
  • Vector index

    • Store embeddings for policy chunks using VectorStoreIndex.
    • Keep the corpus scoped to approved, versioned documents only.
  • Retriever

    • Use similarity search to find the most relevant policy sections for each question.
    • Tune similarityTopK based on how broad your policies are.
  • Response synthesizer / query engine

    • Generate an answer only from retrieved context.
    • Force citations so reviewers can trace every answer back to source text.
  • Guardrails layer

    • Block unsupported questions, sensitive data leakage, and off-policy answers.
    • Add refusal behavior when retrieval confidence is low.
  • Audit logging

    • Persist question, retrieved document IDs, model output, and timestamps.
    • This is non-negotiable in fintech.
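The audit record itself can be a plain typed object. One possible shape — the field names below are this article's suggestion, not a LlamaIndex type:

```typescript
// Illustrative audit record for a single Q&A interaction.
// Field names are a suggestion, not part of the LlamaIndex SDK.
interface AuditLogEntry {
  timestamp: string;          // ISO 8601
  question: string;
  retrievedDocIds: string[];  // IDs of the policy chunks used
  retrievalScores: number[];
  modelOutput: string;
  modelVersion: string;
  policyVersions: string[];   // versions of the source documents
}

function makeAuditEntry(
  question: string,
  retrieved: { id: string; score: number; version: string }[],
  modelOutput: string,
  modelVersion: string,
): AuditLogEntry {
  return {
    timestamp: new Date().toISOString(),
    question,
    retrievedDocIds: retrieved.map((r) => r.id),
    retrievalScores: retrieved.map((r) => r.score),
    modelOutput,
    modelVersion,
    // Deduplicate so one record lists each policy version once.
    policyVersions: [...new Set(retrieved.map((r) => r.version))],
  };
}
```

Persist one of these per question, append-only, and you have the trace an auditor will ask for.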

Implementation

1) Install the TypeScript packages

Use the LlamaIndex TypeScript SDK plus an embedding provider. For production fintech systems, pin versions and keep model/provider choice explicit.

npm install llamaindex dotenv

Set your environment variables:

OPENAI_API_KEY=your_key_here
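It's worth failing fast when that key is missing. A small helper — plain Node, nothing LlamaIndex-specific — keeps misconfiguration from surfacing mid-request instead of at startup:

```typescript
// Fail fast at startup if required configuration is missing,
// rather than erroring on the first live request.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Call `requireEnv("OPENAI_API_KEY")` once before building the index, and deploys with a missing secret die immediately.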

2) Load policy documents and build an index

This example uses SimpleDirectoryReader and VectorStoreIndex. It’s a clean starting point for internal policy corpora.

import "dotenv/config";
import {
  SimpleDirectoryReader,
  VectorStoreIndex,
} from "llamaindex";

async function main() {
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({
    directoryPath: "./policies",
  });

  const index = await VectorStoreIndex.fromDocuments(docs);

  const retriever = index.asRetriever({ similarityTopK: 3 });
  const queryEngine = index.asQueryEngine({ retriever });

  const question =
    "What is the maximum daily transfer limit for retail customers?";
  
  const response = await queryEngine.query({
    query: question,
  });

  console.log(String(response));
}

main().catch(console.error);

This pattern works because the agent is not “chatting from memory.” It retrieves relevant chunks first, then synthesizes an answer from those chunks.

3) Add metadata-aware retrieval and citations

For fintech, you need document provenance. Put versioning metadata on every document and return sources in the final answer path.

import "dotenv/config";
import {
  Document,
  MetadataMode,
  Settings,
  VectorStoreIndex,
} from "llamaindex";

async function buildPolicyAgent() {
  const docs = [
    new Document({
      text: "Retail customers may transfer up to $10,000 per day unless enhanced due diligence applies.",
      metadata: {
        docType: "policy",
        policyName: "Payments Limits",
        version: "2025.01",
        jurisdiction: "US",
      },
    }),
    new Document({
      text: "Enhanced due diligence requires manual review before increasing limits.",
      metadata: {
        docType: "policy",
        policyName: "Risk Controls",
        version: "2025.01",
        jurisdiction: "US",
      },
    }),
  ];

  const index = await VectorStoreIndex.fromDocuments(docs);

  const retriever = index.asRetriever({ similarityTopK: 2 });
  
  const nodes = await retriever.retrieve({
    query: "Can we raise a customer's transfer limit?",
  });

  for (const node of nodes) {
    console.log(node.node.getContent(MetadataMode.ALL));
    console.log(node.score);
  }
}

buildPolicyAgent().catch(console.error);

The important bit here is that you can inspect retrieved nodes before answering. In regulated environments, that inspection step is part of the control surface.
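That inspection step can be codified. Below is a minimal provenance gate over a simplified node shape — real LlamaIndex nodes carry more fields, so treat this as a sketch of the control, not the SDK API:

```typescript
// Simplified shape of a retrieved chunk for this sketch; real
// LlamaIndex retrieved nodes carry more fields than this.
interface RetrievedChunk {
  text: string;
  score: number;
  metadata: { docType?: string; version?: string; jurisdiction?: string };
}

// Keep only chunks that carry full provenance and match the
// jurisdiction the question is being asked in.
function gateByProvenance(
  chunks: RetrievedChunk[],
  requiredJurisdiction: string,
): RetrievedChunk[] {
  return chunks.filter(
    (c) =>
      c.metadata.docType === "policy" &&
      typeof c.metadata.version === "string" &&
      c.metadata.jurisdiction === requiredJurisdiction,
  );
}
```

Anything the gate drops never reaches synthesis, which is exactly the behavior a reviewer wants to see documented.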

4) Wrap retrieval with a refusal rule

Do not answer if retrieval is weak or irrelevant. Fintech agents should fail closed.

import "dotenv/config";
import {
  SimpleDirectoryReader,
  VectorStoreIndex,
} from "llamaindex";

async function askPolicy(question: string) {
  const docs = await new SimpleDirectoryReader().loadData({
    directoryPath: "./policies",
  });

  const index = await VectorStoreIndex.fromDocuments(docs);
  const retriever = index.asRetriever({ similarityTopK: 3 });
  const nodes = await retriever.retrieve({ query: question });

  if (nodes.length === 0 || (nodes[0]?.score ?? 0) < 0.7) {
    return {
      answer:
        "I can't answer this confidently from approved policy documents. Please route to Compliance.",
      sources: [],
    };
  }

  const queryEngine = index.asQueryEngine({ retriever });
  const response = await queryEngine.query({ query: question });

  return {
    answer: String(response),
    sources: nodes.map((n) => ({
      score: n.score,
      textSnippet: n.node.getContent().slice(0, 200),
      metadata: n.node.metadata,
    })),
  };
}

That refusal threshold should be tuned against your own corpus. A payments policy library with short documents will behave differently from a large AML handbook.
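One way to tune it is offline: collect questions labeled answerable-or-not against your corpus, record the top retrieval score for each, and sweep candidate thresholds. A sketch, with illustrative names throughout:

```typescript
// A labeled retrieval result: the top similarity score for a
// question, plus whether the corpus actually contains the answer.
interface LabeledRetrieval {
  topScore: number;
  answerable: boolean;
}

// For a candidate threshold, measure how often the agent would
// wrongly answer (score above threshold but not answerable) and
// wrongly refuse (score below threshold but answerable).
function evaluateThreshold(
  samples: LabeledRetrieval[],
  threshold: number,
): { falseAnswerRate: number; falseRefusalRate: number } {
  const unanswerable = samples.filter((s) => !s.answerable);
  const answerable = samples.filter((s) => s.answerable);
  const falseAnswers = unanswerable.filter((s) => s.topScore >= threshold).length;
  const falseRefusals = answerable.filter((s) => s.topScore < threshold).length;
  return {
    falseAnswerRate: unanswerable.length ? falseAnswers / unanswerable.length : 0,
    falseRefusalRate: answerable.length ? falseRefusals / answerable.length : 0,
  };
}
```

In fintech, bias the choice toward a higher false-refusal rate: a wrong refusal costs an escalation, a wrong answer costs an incident.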

Production Considerations

  • Deploy in-region

    • Keep embeddings, vector storage, and model calls inside approved data residency boundaries.
    • If your policies are EU-only or UK-only, don’t send them to a cross-region service by accident.
  • Log everything needed for audit

    • Store the user question, top-k retrieved chunks, scores, response text, model version, and document versions.
  • Add compliance guardrails

    • Detect requests for account-specific advice, legal interpretation, or confidential customer data.
    • Route those to human review instead of generating an answer.
  • Monitor retrieval quality

    • Track no-answer rate, low-confidence retrievals, citation coverage, and escalation volume.
    • A rising no-answer rate usually means your policies changed or chunking broke.
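The compliance-guardrail bullet above can start as simple pattern-based triage before you invest in a trained classifier. The patterns below are illustrative only and would need tuning against real traffic:

```typescript
type Route = "answer" | "human_review";

// Very simple pattern-based triage. These patterns are
// illustrative; a production system would tune them or replace
// them with a proper classifier.
const HUMAN_REVIEW_PATTERNS: RegExp[] = [
  /\bmy account\b/i,       // account-specific advice
  /\baccount number\b/i,   // potential customer data
  /\blegal(ly)?\b/i,       // legal interpretation
  /\b(sue|lawsuit)\b/i,
];

function routeQuestion(question: string): Route {
  return HUMAN_REVIEW_PATTERNS.some((p) => p.test(question))
    ? "human_review"
    : "answer";
}
```

Run this before retrieval, so flagged questions go straight to a human queue and never generate model output at all.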

Common Pitfalls

  • Using stale policy versions

    If you re-index without version metadata, you’ll serve outdated rules. Fix this by tagging every document with version, effectiveDate, and jurisdiction, then filtering at retrieval time.

  • Letting the model answer without enough context

    If you skip a confidence threshold or refusal rule, the agent will hallucinate around missing policy text. In fintech, that becomes a compliance incident fast.

  • Ignoring document structure

    Dumping entire PDFs into one blob gives poor retrieval quality. Split by section headings or clauses so VectorStoreIndex can retrieve precise rule fragments instead of noisy pages.
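    A minimal section splitter for markdown policies might look like this — plain string handling, not a LlamaIndex node parser:

```typescript
// Split a markdown policy document into one chunk per "## " section
// so retrieval returns precise clauses instead of whole pages.
function splitBySections(markdown: string): { heading: string; text: string }[] {
  const lines = markdown.split("\n");
  const sections: { heading: string; text: string }[] = [];
  let current = { heading: "(preamble)", text: "" };
  for (const line of lines) {
    if (line.startsWith("## ")) {
      // Close out the previous section if it has any content.
      if (current.text.trim()) sections.push(current);
      current = { heading: line.slice(3).trim(), text: "" };
    } else {
      current.text += line + "\n";
    }
  }
  if (current.text.trim()) sections.push(current);
  return sections;
}
```

    Feed each resulting section into its own Document, with the heading carried in metadata, and retrieval quality improves immediately.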

A good fintech policy Q&A agent is not just a search box with an LLM on top. It’s a controlled retrieval system with strict provenance, refusal behavior, and auditability built in from day one.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
