How to Build a Policy Q&A Agent Using LlamaIndex in TypeScript for Pension Funds

By Cyprian Aarons · Updated 2026-04-21

A policy Q&A agent for pension funds answers questions like “What is our hardship withdrawal policy?” or “Can this member access benefits before retirement age?” by retrieving the right policy text and generating a grounded answer. It matters because pension operations are compliance-heavy, slow to search manually, and expensive to get wrong.

Architecture

  • Policy document ingestion

    • Load PDFs, DOCX files, and internal policy pages into a controlled corpus.
    • Keep source metadata like policy name, version, effective date, and jurisdiction.
  • Chunking and indexing

    • Split policies into retrieval-friendly chunks.
    • Build a vector index with VectorStoreIndex so the agent can find relevant clauses fast.
  • Retriever layer

    • Use a retriever configured for high recall on legal/policy language.
    • Return top-k chunks with metadata for auditability.
  • Response synthesis

    • Use QueryEngine to generate an answer only from retrieved context.
    • Force citations so staff can trace every answer back to source policy text.
  • Guardrails

    • Reject questions outside the corpus or outside allowed use cases.
    • Prevent the model from inventing policy interpretations.
  • Audit and logging

    • Store question, retrieved sources, answer, timestamp, and policy version.
    • This is non-negotiable for pension fund compliance reviews.
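The audit layer above boils down to a plain record per interaction. A minimal sketch is below; the type and helper names (`PolicyAuditRecord`, `buildAuditRecord`) are illustrative assumptions for this article, not part of LlamaIndex:

```typescript
// Illustrative audit record for one Q&A interaction.
// Field names are assumptions for this sketch, not a LlamaIndex API.
interface PolicySource {
  policyName: string;
  version: string;
  score: number;
}

interface PolicyAuditRecord {
  question: string;
  answer: string;
  sources: PolicySource[];
  policyVersions: string[];
  timestamp: string; // ISO-8601
}

function buildAuditRecord(
  question: string,
  answer: string,
  sources: PolicySource[],
): PolicyAuditRecord {
  return {
    question,
    answer,
    sources,
    // Deduplicate the policy versions the answer relied on.
    policyVersions: [...new Set(sources.map((s) => s.version))],
    timestamp: new Date().toISOString(),
  };
}
```

Persisting one of these per query gives a compliance team everything listed above in a single row.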

Implementation

1) Install LlamaIndex TypeScript packages

Use the TypeScript SDK and a local or approved model provider. For pension funds, keep the model endpoint in your approved region and avoid sending member PII unless your legal team has cleared it.

npm install llamaindex dotenv

Set your environment variables:

OPENAI_API_KEY=your_key_here
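A missing key tends to fail late with a confusing provider error, so it can help to assert required variables at startup. `requireEnv` is a hypothetical helper for this sketch, not part of either package:

```typescript
// Fail fast at startup if a required environment variable is absent.
// `requireEnv` is an illustrative helper, not a library function.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example: const apiKey = requireEnv("OPENAI_API_KEY");
```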

2) Load policy files with metadata

The important part here is metadata. A pension fund answer without source versioning is not production-safe.

import "dotenv/config";
import { Document } from "llamaindex";

const documents = [
  new Document({
    text: `
      Policy: Early Retirement Access
      Effective Date: 2024-01-01
      Members may apply for early access only under conditions defined in section 4.2.
      Applications require trustee approval and supporting documentation.
    `,
    metadata: {
      policyName: "Early Retirement Access",
      version: "2024.1",
      jurisdiction: "ZA",
      effectiveDate: "2024-01-01",
      source: "policy-manual.pdf",
    },
  }),
  new Document({
    text: `
      Policy: Hardship Withdrawals
      Effective Date: 2024-03-15
      Hardship withdrawals are permitted only for qualifying emergencies.
      The fund administrator must retain evidence of assessment for seven years.
    `,
    metadata: {
      policyName: "Hardship Withdrawals",
      version: "2024.3",
      jurisdiction: "ZA",
      effectiveDate: "2024-03-15",
      source: "hardship-policy.pdf",
    },
  }),
];

3) Build the index and query engine

This is the core pattern. VectorStoreIndex.fromDocuments() builds the retrieval layer, then asQueryEngine() gives you grounded Q&A over that corpus.

import { VectorStoreIndex } from "llamaindex";

async function main() {
  const index = await VectorStoreIndex.fromDocuments(documents);

  const queryEngine = index.asQueryEngine({
    similarityTopK: 3,
    responseMode: "compact",
    preFilters: {
      filters: [
        {
          key: "jurisdiction",
          value: "ZA",
          operator: "==",
        },
      ],
    },
  });

  const question = "Can a member access benefits before retirement age?";
  const response = await queryEngine.query({
    query: question,
  });

  console.log("Answer:", response.toString());
}

main().catch(console.error);

A few things matter here:

  • similarityTopK should be tuned for legal/policy retrieval, not generic chat.
  • preFilters help you avoid cross-jurisdiction contamination.
  • responseMode: "compact" keeps answers tight and less likely to hallucinate extra detail.

4) Add audit output for compliance

Pension funds need traceability. Capture the question, answer, and source nodes so compliance teams can review what the system used.

import { VectorStoreIndex } from "llamaindex";

async function askPolicy(question: string) {
  // Rebuilding the index per question is shown for brevity;
  // in production, build it once at startup and reuse it.
  const index = await VectorStoreIndex.fromDocuments(documents);
  const queryEngine = index.asQueryEngine({ similarityTopK: 3 });

  const response = await queryEngine.query({ query: question });

  const sourceNodes = response.sourceNodes?.map((node) => ({
    score: node.score,
    textSnippet: node.node.getContent().slice(0, 200),
    metadata: node.node.metadata,
  }));

  return {
    question,
    answer: response.toString(),
    sources: sourceNodes,
    timestamp: new Date().toISOString(),
  };
}

If you want stronger control over what gets returned, add a postprocessor like SimilarityPostprocessor to drop weak matches before synthesis. That reduces bad citations when policies overlap.
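If your LlamaIndex version doesn't ship a similarity-cutoff postprocessor, the same idea is easy to apply by hand on the returned source nodes before logging or citing them. `dropWeakMatches` and the 0.75 default cutoff are assumptions for this sketch:

```typescript
// Drop retrieved chunks whose similarity score falls below a cutoff,
// so weak matches never reach the citation list. Cutoff is illustrative.
function dropWeakMatches<T extends { score: number }>(
  chunks: T[],
  cutoff = 0.75,
): T[] {
  return chunks.filter((c) => c.score >= cutoff);
}
```

Tune the cutoff against your own corpus: overlapping policies usually need a higher bar than a generic FAQ index.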

Production Considerations

  • Data residency

    • Keep embeddings, indexes, logs, and model calls inside approved regions.
    • Pension data often has strict residency requirements tied to regulator expectations and internal risk policies.
  • Audit trails

    • Log every query with retrieved chunk IDs, document versions, timestamps, and user identity.
    • If a trustee asks why the agent answered a certain way, you need evidence fast.
  • Guardrails

    • Block questions that request legal advice beyond published policy text.
    • Return “I couldn’t find support in the current policy set” instead of guessing.
  • Access control

    • Separate member-facing FAQs from internal administrator policies.
    • Some documents may contain operational or sensitive trustee material that should never be exposed broadly.
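The guardrail point above, returning a fixed refusal instead of guessing, can be enforced with a simple gate in front of the synthesized answer. The threshold value and refusal wording are illustrative assumptions:

```typescript
// Refuse to answer when retrieval produced no sufficiently strong
// support. Threshold and wording are illustrative assumptions.
const FALLBACK = "I couldn't find support in the current policy set.";

function answerOrRefuse(
  synthesizedAnswer: string,
  topScore: number | undefined,
  minScore = 0.7,
): string {
  if (topScore === undefined || topScore < minScore) {
    return FALLBACK;
  }
  return synthesizedAnswer;
}
```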

Common Pitfalls

  1. Using stale policy versions

    • The agent answers from old rules if you keep multiple versions in one index without filtering.
    • Fix it by storing version, effectiveDate, and status, then filtering to active policies only.
  2. Skipping source citations

    • A plain natural-language answer is useless during audits if nobody can trace it back to policy text.
    • Fix it by always returning source nodes or at least document metadata with each answer.
  3. Letting the model infer missing policy

    • Pension users will ask ambiguous questions like “Can I withdraw now?”
    • Fix it by forcing the agent to ask clarifying questions when age, employment status, jurisdiction, or benefit type is missing.
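The stale-version fix in pitfall 1 can be sketched as a pre-index step that keeps only the latest effective version of each policy. Field names mirror the metadata used earlier; the helper itself is an assumption of this sketch:

```typescript
// Keep only the latest effective version of each policy, so stale
// rules never enter the index. `activeVersions` is an illustrative
// helper, not a LlamaIndex API.
interface PolicyMeta {
  policyName: string;
  version: string;
  effectiveDate: string; // ISO date, e.g. "2024-03-15"
}

function activeVersions(policies: PolicyMeta[]): PolicyMeta[] {
  const latest = new Map<string, PolicyMeta>();
  for (const p of policies) {
    const current = latest.get(p.policyName);
    // ISO dates compare correctly as strings.
    if (!current || p.effectiveDate > current.effectiveDate) {
      latest.set(p.policyName, p);
    }
  }
  return [...latest.values()];
}
```

Run this before indexing, or mirror the same logic as a metadata filter on a `status` field if you must keep historical versions searchable.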

If you build this with strong metadata discipline, retrieval filters, and audit logging, you get a usable pension policy assistant instead of a risky chatbot. For this domain, correctness beats fluency every time.


By Cyprian Aarons, AI Consultant at Topiax.