How to Build a Policy Q&A Agent Using LangChain in TypeScript for Pension Funds

By Cyprian Aarons · Updated 2026-04-21
Tags: policy-qa · langchain · typescript · pension-funds

A policy Q&A agent for pension funds answers staff and member questions against approved policy documents, scheme rules, investment governance notes, and regulatory guidance. It matters because pension operations are full of edge cases: eligibility, contribution limits, transfer rules, retirement options, and disclosure obligations. If the agent can retrieve the right source and cite it cleanly, it reduces manual support load without turning policy interpretation into guesswork.

Architecture

  • Document ingestion pipeline

    • Pull policy PDFs, Word docs, and HTML pages from controlled sources.
    • Chunk them into retrievable passages with metadata like documentType, effectiveDate, jurisdiction, and version.
  • Vector store

    • Store embeddings for semantic retrieval over scheme rules and internal policy.
    • Keep tenant separation if you serve multiple funds or schemes.
  • Retriever

    • Use a retriever tuned for policy lookup, not open-ended chat.
    • Filter by jurisdiction, effective date, and document status so retired policies do not leak into answers.
  • LLM answer chain

    • Combine retrieved context with a constrained prompt.
    • Force citations and refuse answers when the evidence is weak or missing.
  • Audit logging layer

    • Persist question, retrieved document IDs, model version, answer, and timestamp.
    • This is non-negotiable for compliance review and dispute handling.
  • Guardrails

    • Detect advice-like requests, personal financial planning requests, and PII.
    • Route those to human review or a safe fallback response.

Implementation

1) Index pension policy documents with metadata

Use RecursiveCharacterTextSplitter to chunk source documents and MemoryVectorStore for a local example. In production, swap the store for a managed vector DB with residency controls.

import { Document } from "@langchain/core/documents";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function buildIndex() {
  const docs = [
    new Document({
      pageContent: "Members may retire from age 55 subject to scheme rules and tax law.",
      metadata: {
        source: "member_handbook.pdf",
        documentType: "member-handbook",
        jurisdiction: "UK",
        effectiveDate: "2025-01-01",
        version: "12"
      }
    }),
    new Document({
      pageContent: "Transfers out require identity verification and trustee approval where applicable.",
      metadata: {
        source: "admin_policy.docx",
        documentType: "admin-policy",
        jurisdiction: "UK",
        effectiveDate: "2024-11-15",
        version: "4"
      }
    })
  ];

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 500,
    chunkOverlap: 80
  });

  const chunks = await splitter.splitDocuments(docs);
  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });

  return MemoryVectorStore.fromDocuments(chunks, embeddings);
}

2) Build a retriever that respects policy scope

For pension funds, retrieval must be filtered. A general semantic search over old policies is how you get wrong answers that still sound plausible.

const vectorStore = await buildIndex();

const retriever = vectorStore.asRetriever({
  k: 4
});

async function retrievePolicyContext(question: string) {
  const docs = await retriever.invoke(question);

  return docs.filter((doc) => {
    const md = doc.metadata as Record<string, string>;
    return md.jurisdiction === "UK" && md.documentType !== "retired-policy";
  });
}
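The post-hoc filter above works, but the same scoping can be pushed down to query time so out-of-scope chunks never reach the candidate set. Below is a minimal sketch of a reusable scope predicate; the `asRetriever({ k, filter })` usage shown in the trailing comment assumes LangChain's MemoryVectorStore accepts a per-document predicate as its retriever filter, and the `inPolicyScope` name and `PolicyMetadata` shape are illustrative, not part of the library.

```typescript
// Illustrative metadata shape matching the fields indexed in step 1.
type PolicyMetadata = {
  jurisdiction?: string;
  documentType?: string;
  effectiveDate?: string; // ISO date, e.g. "2025-01-01"
  status?: string;
};

// Reusable scope predicate: in-jurisdiction, not retired, not
// effective-dated in the future.
function inPolicyScope(md: PolicyMetadata, asOf: Date = new Date()): boolean {
  if (md.jurisdiction !== "UK") return false;
  if (md.documentType === "retired-policy" || md.status === "retired") return false;
  if (md.effectiveDate && new Date(md.effectiveDate) > asOf) return false;
  return true;
}

// Sketch of query-time use (assumes MemoryVectorStore's predicate filter):
// const retriever = vectorStore.asRetriever({
//   k: 4,
//   filter: (doc) => inPolicyScope(doc.metadata as PolicyMetadata)
// });
```

Applying the filter inside the store means k is spent on in-scope documents, instead of retrieving 4 candidates and then discarding some of them.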

3) Create a constrained answer chain with citations

Use ChatOpenAI, ChatPromptTemplate, and RunnableSequence. The prompt should force the model to answer only from provided context and cite sources by filename or document ID.

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { RunnableLambda, RunnableSequence } from "@langchain/core/runnables";
import type { Document } from "@langchain/core/documents";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", 
   `You answer pension policy questions using only the provided context.
If the context does not contain the answer, say you cannot confirm it.
Do not give financial advice. Cite sources using the metadata.source field.`],
  ["human", 
   `Question: {question}

Context:
{context}

Return:
1. Direct answer
2. Source citations
3. If needed, a note that human review is required`
  ]
]);

const formatContext = (docs: Document[]) =>
  docs.map((d) => `[${d.metadata.source}] ${d.pageContent}`).join("\n\n");

const chain = RunnableSequence.from([
  RunnableLambda.from(async (input: { question: string }) => {
    const docs = await retrievePolicyContext(input.question);
    return {
      question: input.question,
      context: formatContext(docs)
    };
  }),
  prompt,
  llm
]);

const result = await chain.invoke({
  question: "Can a member retire at age 54 under this scheme?"
});

console.log(result.content);

4) Add an audit log around every response

For pension funds, you need traceability. Log what was asked, what documents were used, and which model answered it.

type AuditRecord = {
  question: string;
  answer: string;
  sources: string[];
};

async function logAudit(record: AuditRecord) {
  console.log(JSON.stringify({
    eventType: "policy_qa_response",
    ...record,
    timestamp: new Date().toISOString()
  }));
}

Then wire it into your request handler:

async function answerQuestion(question: string) {
  // Note: retrieval runs twice here (once for the audit trail, once
  // inside the chain). In production, retrieve once and pass the same
  // documents to both the prompt and the audit log so the logged
  // sources match what the model actually saw.
  const docs = await retrievePolicyContext(question);
  const sources = docs.map((d) => String(d.metadata.source));

  const response = await chain.invoke({ question });
  const answer =
    typeof response.content === "string"
      ? response.content
      : JSON.stringify(response.content);

  await logAudit({ question, answer, sources });
  return answer;
}

Production Considerations

  • Data residency

    • Keep embeddings, logs, and raw documents in-region if your fund has UK/EU residency requirements.
    • Do not send member-specific data to external services unless your legal basis and vendor controls are explicit.
  • Compliance controls

    • Add refusal logic for advice-seeking prompts like “Should I transfer my pot?”
    • Route those to regulated human support instead of letting the model improvise.
  • Monitoring

    • Track retrieval hit rate, citation coverage, refusal rate, and escalation rate.
    • Sample low-confidence answers weekly against source documents with compliance reviewers.
  • Versioning

    • Store document version alongside every answer.
    • When a rule changes, reindex immediately and mark prior answers as tied to superseded guidance.
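The refusal routing described under compliance controls can start as a simple heuristic gate ahead of the answer chain. This is a minimal sketch, assuming a keyword pass is an acceptable first line before a proper classifier; the patterns, function names, and route labels are illustrative.

```typescript
// Illustrative advice-seeking patterns; a production system would back
// this heuristic with a trained classifier and a human review queue.
const ADVICE_PATTERNS: RegExp[] = [
  /\bshould i\b/i,
  /\bis it (a good idea|worth it)\b/i,
  /\btransfer my (pot|pension)\b/i,
  /\bwhat would you do\b/i
];

function isAdviceSeeking(question: string): boolean {
  return ADVICE_PATTERNS.some((p) => p.test(question));
}

// Route advice-like questions to regulated human support instead of
// letting the model improvise.
function routeQuestion(question: string): "policy_qa" | "human_review" {
  return isAdviceSeeking(question) ? "human_review" : "policy_qa";
}

routeQuestion("Should I transfer my pot?"); // "human_review"
```

Tune the pattern list against real escalation logs; false negatives here are a compliance risk, so bias the heuristic toward over-escalation.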

Common Pitfalls

  1. Using stale policy documents

    • If retired scheme rules remain in the index, the agent will confidently cite outdated guidance.
    • Fix this by filtering on effectiveDate, status, and explicit document lifecycle metadata during retrieval.
  2. Letting the model answer without evidence

    • A generic chat prompt will produce polished but unsupported responses.
    • Fix this by requiring citations in the prompt and returning “cannot confirm” when context is insufficient.
  3. Mixing member data with general policy Q&A

    • Pension questions often drift into personal circumstances like salary history or retirement projections.
    • Fix this by separating public policy lookup from member-specific workflows and adding PII detection before the LLM call.
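For the third pitfall, a lightweight PII screen can run before any LLM call. The sketch below assumes UK-style identifiers; the patterns and the `detectPII` helper are illustrative and deliberately incomplete, so treat a production screen as a separate, reviewed component.

```typescript
// Illustrative PII patterns for a pre-LLM screen; not exhaustive.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  // UK National Insurance number, e.g. "QQ123456C"
  { label: "ni_number", pattern: /\b[A-Z]{2}\d{6}[A-D]\b/i },
  { label: "email", pattern: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/ },
  // UK phone number: +44 or leading 0 followed by 10 digits
  { label: "phone", pattern: /(?:\+44|\b0)\d{10}\b/ }
];

// Return the labels of every PII category found in the text, so the
// caller can block the request or strip the offending fields.
function detectPII(text: string): string[] {
  return PII_PATTERNS.filter(({ pattern }) => pattern.test(text)).map(
    ({ label }) => label
  );
}
```

If `detectPII` returns anything, route the question to the member-specific workflow (with its own access controls) rather than the general policy index.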

By Cyprian Aarons, AI Consultant at Topiax.