How to Build a Policy Q&A Agent Using LangChain in TypeScript for Fintech
A policy Q&A agent answers questions like “Can I waive this fee?”, “What’s the chargeback window?”, or “Is this customer eligible for KYC re-verification?” by grounding every answer in your internal policy docs, product rules, and compliance guidance. For fintech, this matters because bad answers are not just inaccurate — they create audit risk, regulatory exposure, and inconsistent customer operations.
Architecture
- Policy document ingestion
  - Load PDFs, Markdown, Confluence exports, or internal SOPs into a searchable store.
  - Split content into chunks that preserve section headers and policy context.
- Vector retrieval layer
  - Use embeddings plus a vector store to fetch the most relevant policy passages.
  - Keep retrieval scoped to approved documents by business line, region, or product.
- Answering chain
  - Pass retrieved context into a chat model with a strict system prompt.
  - Force the model to answer only from retrieved policy text and cite sources.
- Guardrails layer
  - Reject unsupported questions, low-confidence retrievals, and requests outside policy scope.
  - Add refusal behavior for legal advice, account-specific decisions, or PII-heavy prompts.
- Audit logging
  - Store the question, retrieved chunks, model output, document IDs, and timestamps.
  - This is what you need when compliance asks why an answer was given.
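The audit-logging layer can be sketched as a plain record builder. This is an illustrative shape under assumed field names, not a prescribed schema; adapt it to what your compliance team actually needs to reconstruct an answer.

```typescript
// Illustrative audit record for one Q&A interaction (field names are
// assumptions; match them to your own compliance requirements).
interface AuditRecord {
  question: string;
  retrievedChunkIds: string[];
  documentIds: string[];
  modelOutput: string;
  timestamp: string; // ISO-8601
}

// Injecting `now` keeps the builder deterministic and easy to test.
function buildAuditRecord(
  question: string,
  retrievedChunkIds: string[],
  documentIds: string[],
  modelOutput: string,
  now: Date = new Date()
): AuditRecord {
  return {
    question,
    retrievedChunkIds,
    documentIds,
    modelOutput,
    timestamp: now.toISOString(),
  };
}
```

Writing one record per request, before returning the answer to the caller, is what lets you answer "why did the agent say this?" months later.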
Implementation
1) Load policy docs and build the retriever
Use LangChain’s loaders and splitters to turn policy files into chunks. For fintech, keep document metadata like region, product line, and version so retrieval can be filtered later.
```typescript
import "dotenv/config";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function buildRetriever() {
  // Load a policy PDF and split it into overlapping chunks.
  const loader = new PDFLoader("./policies/card-policy.pdf");
  const docs = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 150,
  });
  const chunks = await splitter.splitDocuments(docs);

  // Embed the chunks and index them in an in-memory vector store.
  const embeddings = new OpenAIEmbeddings({
    model: "text-embedding-3-small",
  });
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);

  // Return the top 4 most similar chunks per query.
  return vectorStore.asRetriever(4);
}
```
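The snippet above indexes a single PDF with no metadata. To get the scoped retrieval described in the architecture section, tag each chunk before indexing and filter at query time. The shapes below are a minimal sketch with assumed field names; if your vector store supports predicate filters (MemoryVectorStore accepts a per-document predicate in recent LangChain JS versions), a function like `inScope` can back something like `vectorStore.asRetriever({ k: 4, filter: (doc) => ... })`.

```typescript
// Assumed metadata shape attached to every chunk at ingestion time.
interface PolicyMetadata {
  region: string;        // e.g. "EU", "US"
  productLine: string;   // e.g. "cards", "lending"
  policyVersion: string; // e.g. "2024-06"
  source: string;        // originating file
}

interface PolicyChunk {
  pageContent: string;
  metadata: PolicyMetadata;
}

// Scope check usable as a retriever filter so the agent only ever sees
// approved documents for the caller's region and product line.
function inScope(
  chunk: PolicyChunk,
  region: string,
  productLine: string
): boolean {
  return (
    chunk.metadata.region === region &&
    chunk.metadata.productLine === productLine
  );
}
```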
2) Create a grounded QA chain
Use ChatOpenAI with createStuffDocumentsChain. The key pattern is: retrieve first, then answer only from those documents.
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { StringOutputParser } from "@langchain/core/output_parsers";

async function buildQaChain() {
  // Temperature 0 keeps policy answers deterministic.
  const llm = new ChatOpenAI({
    model: "gpt-4o-mini",
    temperature: 0,
  });

  const prompt = ChatPromptTemplate.fromMessages([
    [
      "system",
      `You are a policy Q&A assistant for a fintech company.
Answer only using the provided context.
If the answer is not in the context, say: "I couldn't find that in the current policy set."
Always include short citations using source metadata when available.`,
    ],
    ["human", "Question: {question}\n\nContext:\n{context}"],
  ]);

  return createStuffDocumentsChain({
    llm,
    prompt,
    outputParser: new StringOutputParser(),
  });
}
```
3) Wire retrieval + generation together
This is the actual request path. Retrieve relevant chunks, format them into context, then invoke the chain.
```typescript
import { Document } from "@langchain/core/documents";

// createStuffDocumentsChain expects {context} to be a list of Documents
// (it joins their pageContent itself), not a pre-joined string. So instead
// of formatting the context into one string, prefix each chunk's text with
// its source metadata and hand the documents to the chain.
function withSourceHeader(docs: Document[]): Document[] {
  return docs.map((doc) => {
    const source = doc.metadata?.source ?? "unknown";
    const page = doc.metadata?.loc?.page ?? doc.metadata?.page ?? "n/a";
    return new Document({
      pageContent: `[source=${source}, page=${page}]\n${doc.pageContent}`,
      metadata: doc.metadata,
    });
  });
}

export async function answerPolicyQuestion(question: string) {
  const retriever = await buildRetriever();
  const qaChain = await buildQaChain();

  const docs = await retriever.invoke(question);
  if (docs.length === 0) {
    return {
      answer: "I couldn't find that in the current policy set.",
      sources: [],
    };
  }

  const answer = await qaChain.invoke({
    question,
    context: withSourceHeader(docs),
  });

  return {
    answer,
    sources: docs.map((d) => d.metadata),
  };
}
```
4) Add a refusal gate for risky prompts
In fintech, don’t let the agent guess on regulated or account-specific requests. Put a small pre-check in front of retrieval for topics that require human review.
```typescript
const blockedPatterns = [
  /my account/i,
  /customer's ssn/i,
  /bypass kyc/i,
  /override compliance/i,
];

export function shouldRefuse(question: string) {
  return blockedPatterns.some((pattern) => pattern.test(question));
}

export async function safeAnswer(question: string) {
  const deniedMessage =
    "This request needs human review because it involves sensitive or account-specific information.";

  if (shouldRefuse(question)) {
    return { answer: deniedMessage, sources: [] };
  }

  return answerPolicyQuestion(question);
}
```
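Regex gates like this are brittle, so treat the pattern list as a starting point rather than a complete policy. One cheap extension is to also catch obvious PII shapes before they reach the model; the sketch below assumes a US SSN format and is illustrative, not exhaustive.

```typescript
// Topic patterns plus a simple PII shape check (US SSN format).
// Both lists are illustrative assumptions; expand them for your own
// risk categories and locales.
const topicPatterns = [/my account/i, /bypass kyc/i, /override compliance/i];
const ssnShape = /\b\d{3}-\d{2}-\d{4}\b/;

function needsHumanReview(question: string): boolean {
  return (
    topicPatterns.some((pattern) => pattern.test(question)) ||
    ssnShape.test(question)
  );
}
```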
Production Considerations
- Deploy per region
  - Keep EU policies and EU customer data in-region if your data residency requirements demand it.
- Log everything needed for audit
  - Store prompt input, retrieved chunk IDs, model version, response text, and refusal reason.
- Monitor retrieval quality
  - Track hit rate on approved policies, empty-retrieval rate, and citation coverage.
- Add human escalation paths
  - For anything involving exceptions, disputes, AML/KYC edge cases, or legal interpretation, route to an ops queue instead of forcing an LLM response.
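The retrieval-quality metrics above can be tracked with a few running counters. A minimal sketch, assuming answers embed `[source=...]` citations as in the earlier prompt:

```typescript
// Running counters for retrieval quality; names are illustrative.
interface RetrievalStats {
  total: number;  // all answered queries
  empty: number;  // queries where retrieval returned nothing
  cited: number;  // answers containing at least one [source=...] citation
}

function recordQuery(
  stats: RetrievalStats,
  retrievedCount: number,
  answer: string
): RetrievalStats {
  return {
    total: stats.total + 1,
    empty: stats.empty + (retrievedCount === 0 ? 1 : 0),
    cited: stats.cited + (/\[source=/.test(answer) ? 1 : 0),
  };
}

function emptyRetrievalRate(stats: RetrievalStats): number {
  return stats.total === 0 ? 0 : stats.empty / stats.total;
}
```

A rising empty-retrieval rate usually means your index is missing policies people actually ask about; falling citation coverage usually means the prompt is drifting from the grounding rules.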
Common Pitfalls
- Using raw semantic search without metadata filters
  - If you mix regions or product lines in one index without filters, the agent will surface the wrong policy. Fix it by storing metadata such as region, product, and policyVersion, and filtering on it before generation.
- Letting the model answer without evidence
  - A generic chat prompt will produce confident nonsense when retrieval fails. Force a refusal message when no strong context is found.
- Ignoring version control on policies
  - Fintech policies change often. If you don't track effective dates and versions in your index refresh pipeline, your agent will quote stale rules during audits or customer disputes.
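The "no strong context" check can be made explicit by gating on similarity scores, which most LangChain vector stores expose via methods like `similaritySearchWithScore`. The thresholds below are assumptions to calibrate against your own evaluation set, not recommended values:

```typescript
// Refuse when retrieval support is weak. minScore and minHits are assumed
// starting points; tune them on real queries before relying on this gate.
function hasStrongSupport(
  scores: number[],
  minScore = 0.75,
  minHits = 2
): boolean {
  return scores.filter((s) => s >= minScore).length >= minHits;
}

function gatedAnswer(scores: number[], draftAnswer: string): string {
  return hasStrongSupport(scores)
    ? draftAnswer
    : "I couldn't find that in the current policy set.";
}
```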
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.