How to Build a Policy Q&A Agent Using LangChain in TypeScript for Healthcare

By Cyprian Aarons · Updated 2026-04-21
Tags: policy-q-a, langchain, typescript, healthcare, policy-qanda

A policy Q&A agent for healthcare answers questions like “Is this treatment covered?”, “What’s the prior auth rule?”, or “Does this policy require a referral?” by retrieving the right policy text and generating a grounded answer. It matters because healthcare teams need fast, consistent responses without exposing protected data or letting the model invent coverage rules.

Architecture

  • Policy document store

    • Source of truth for plan documents, benefits summaries, clinical guidelines, and internal SOPs.
    • Store PDFs, HTML, and structured policy text in a controlled repository.
  • Document ingestion and chunking

    • Parse policy files into chunks with metadata like policyId, effectiveDate, jurisdiction, and planType.
    • Keep chunk sizes small enough for retrieval, but large enough to preserve rule context.
  • Vector retrieval layer

    • Use embeddings plus a vector store to fetch the most relevant policy chunks.
    • Filter by plan, state, line of business, or date to avoid cross-policy contamination.
  • LLM answer generation

    • Feed retrieved policy snippets into a chat model with strict instructions to answer only from sources.
    • Force citations so reviewers can trace every response back to policy text.
  • Guardrails and audit logging

    • Detect PHI/PII before sending prompts to the model.
    • Log question, retrieved sources, model output, and confidence signals for compliance review.
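To make the metadata requirements above concrete, here is a minimal sketch of the chunk metadata shape plus a guard that rejects documents missing the fields retrieval filters depend on. The names (PolicyChunkMetadata, hasRequiredMetadata) are illustrative, not part of LangChain:

```typescript
// Illustrative shape for the metadata attached to every policy chunk.
// Field names mirror the architecture notes above; adjust to your schema.
interface PolicyChunkMetadata {
  policyId: string;
  planType: string; // e.g. "PPO", "HMO"
  state: string; // two-letter jurisdiction code
  effectiveDate: string; // ISO date, e.g. "2025-01-01"
  source: string; // originating file, for audit trails
}

// Reject chunks missing any field the retrieval filters depend on.
function hasRequiredMetadata(meta: Record<string, unknown>): boolean {
  return ["policyId", "planType", "state", "effectiveDate", "source"].every(
    (key) => typeof meta[key] === "string" && (meta[key] as string).length > 0
  );
}
```

Running the guard at ingestion time is cheaper than discovering un-filterable chunks after they are embedded.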

Implementation

1) Load policies and split them into retrievable chunks

Use LangChain’s RecursiveCharacterTextSplitter so policy sections stay intact as much as possible. Add metadata early; you will need it later for filtering and audit trails.

import { Document } from "@langchain/core/documents";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const rawDocs: Document[] = [
  new Document({
    pageContent: `
      Prior authorization is required for MRI procedures unless performed in an emergency setting.
      For outpatient imaging in California PPO plans, authorization must be obtained before scheduling.
    `,
    metadata: {
      policyId: "IMAGING-001",
      planType: "PPO",
      state: "CA",
      effectiveDate: "2025-01-01",
      source: "benefits-manual.pdf",
    },
  }),
];

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 80,
});

const chunks = await splitter.splitDocuments(rawDocs);
console.log(chunks[0].metadata);

2) Embed the chunks and store them in a vector database

For production, use a real vector store such as Pinecone, pgvector, or OpenSearch. The pattern below uses MemoryVectorStore so the code is runnable without extra infrastructure, but the API shape stays the same when you swap stores.

import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);

const retriever = vectorStore.asRetriever(4);
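The retriever above searches the whole index, which is exactly the cross-policy contamination the architecture section warns about. One way to scope it is a metadata predicate; MemoryVectorStore accepts a filter of the form `(doc) => boolean` as the second argument to `asRetriever`, though filter shapes differ between vector stores, so check your store's documentation. The factory below is a sketch:

```typescript
// A predicate factory for scoping retrieval to one plan/state combination.
type HasMetadata = { metadata: Record<string, unknown> };

function makePolicyFilter(criteria: { planType: string; state: string }) {
  // Returns true only for chunks matching both the plan type and the state.
  return (doc: HasMetadata): boolean =>
    doc.metadata.planType === criteria.planType &&
    doc.metadata.state === criteria.state;
}

// Usage sketch, assuming the vectorStore built above:
// const filteredRetriever = vectorStore.asRetriever(
//   4,
//   makePolicyFilter({ planType: "PPO", state: "CA" })
// );
```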

3) Build a grounded Q&A chain with citations

Use ChatOpenAI plus RunnableSequence so the prompt stays explicit. The model should answer only from retrieved context and say when the policy does not contain enough information.

import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { RunnableSequence } from "@langchain/core/runnables";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const prompt = PromptTemplate.fromTemplate(`
You are a healthcare policy assistant.
Answer only using the provided context.
If the context does not contain enough information, say "I don't have enough policy evidence to answer that."
Always cite policyId and source in your answer.

Context:
{context}

Question:
{question}

Answer:
`);

const formatDocs = (docs: Document[]) =>
  docs
    .map(
      (d) =>
        `[policyId=${d.metadata.policyId}; source=${d.metadata.source}; state=${d.metadata.state}]\n${d.pageContent}`
    )
    .join("\n\n");

const qaChain = RunnableSequence.from([
  async (input: { question: string }) => {
    const docs = await retriever.getRelevantDocuments(input.question);
    return {
      question: input.question,
      context: formatDocs(docs),
      sources: docs.map((d) => d.metadata),
    };
  },
  prompt,
  llm,
]);

const result = await qaChain.invoke({
  question: "Does this PPO plan require prior authorization for MRI?",
});

console.log(result.content);

4) Add a healthcare-specific precheck before retrieval

Do not send raw user input straight into retrieval if it contains patient identifiers. Run a lightweight redaction step first, then keep the original question out of logs unless your compliance team has approved it.

function redactPHI(input: string): string {
  return input
    .replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED_SSN]")
    .replace(/\b\d{10}\b/g, "[REDACTED_PHONE]")
    .replace(/\b[A-Z][a-z]+ [A-Z][a-z]+\b/g, "[REDACTED_NAME]");
}

const safeQuestion = redactPHI(
  "Does John Smith's plan cover MRI without prior auth?"
);

const safeResult = await qaChain.invoke({ question: safeQuestion });
console.log(safeResult.content);

Production Considerations

  • Deploy in-region

    • Keep embeddings, vector store, logs, and model endpoints inside approved regions.
    • Healthcare customers will care about data residency before they care about latency.
  • Audit every answer

    • Persist question hash, retrieved document IDs, model version, timestamp, and final response.
    • You need this for compliance reviews and dispute resolution when an answer is challenged.
  • Add guardrails for PHI

    • Block questions that include patient identifiers unless you have an approved HIPAA workflow.
    • Separate member-facing Q&A from internal clinical operations use cases.
  • Monitor grounding quality

    • Track retrieval hit rate, citation coverage, refusal rate, and hallucination reports.
    • If answers frequently say “I don’t have enough evidence,” your chunking or metadata filters are too strict.
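The audit fields listed above can be captured in a single record per answered question. A minimal sketch, hashing the question so raw PHI never lands in logs (the AuditRecord shape and field names are assumptions, not a standard):

```typescript
import { createHash } from "node:crypto";

// One audit record per answered question, matching the fields listed above.
// Store a hash rather than the raw question so PHI never lands in logs.
interface AuditRecord {
  questionHash: string;
  retrievedPolicyIds: string[];
  modelVersion: string;
  timestamp: string;
  response: string;
}

function buildAuditRecord(
  question: string,
  retrievedPolicyIds: string[],
  modelVersion: string,
  response: string
): AuditRecord {
  return {
    questionHash: createHash("sha256").update(question).digest("hex"),
    retrievedPolicyIds,
    modelVersion,
    timestamp: new Date().toISOString(),
    response,
  };
}
```

Persist these records to append-only storage; compliance reviewers will want the retrieved document IDs and model version together, not reconstructed from separate logs.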

Common Pitfalls

  • Using one global index for all policies

    • This causes cross-plan leakage.
    • Fix it by filtering on planType, state, effectiveDate, and line of business before generating an answer.
  • Letting the model answer without citations

    • In healthcare, uncited answers become liability fast.
    • Force every response to include policyId and source references from retrieved documents.
  • Ignoring stale policies

    • A correct answer against last year’s handbook is still wrong.
    • Version your documents and exclude expired policies at retrieval time.
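A staleness check for retrieved chunks can be as small as a date comparison. This sketch reuses the effectiveDate field from the earlier metadata; expirationDate is a hypothetical field you would add when versioning documents (absent means still in force). ISO date strings compare correctly as plain strings:

```typescript
// Returns true if the policy chunk is in force as of the given ISO date.
// effectiveDate matches the metadata used earlier; expirationDate is a
// hypothetical versioning field (absent = still in force).
function isEffective(
  meta: { effectiveDate: string; expirationDate?: string },
  asOf: string = new Date().toISOString().slice(0, 10)
): boolean {
  if (meta.effectiveDate > asOf) return false; // not yet in force
  if (meta.expirationDate && meta.expirationDate <= asOf) return false; // expired
  return true;
}
```

Run this as a post-retrieval filter (or bake it into the vector store filter) so expired handbook text never reaches the prompt.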

If you build this pattern correctly, you get a system that is useful to operations teams and defensible to compliance teams. That is the bar for healthcare Q&A agents.


By Cyprian Aarons, AI Consultant at Topiax.