How to Build a Policy Q&A Agent Using LangChain in Python for Insurance

By Cyprian Aarons · Updated 2026-04-21
Tags: policy-q-a, langchain, python, insurance, policy-qanda

A policy Q&A agent answers questions like “Does this plan cover outpatient physiotherapy?” or “What’s the waiting period for pre-existing conditions?” by retrieving the right policy clauses and generating a grounded response. For insurance, this matters because policy language is dense, customers need fast answers, and every response has to stay aligned with compliance, auditability, and the actual contract text.

Architecture

  • Policy document loader

    • Ingests PDFs, Word docs, or HTML policy booklets.
    • Normalizes them into text with metadata like policy_id, jurisdiction, product_type, and effective_date.
  • Chunking and embedding pipeline

    • Splits policies into retrievable chunks.
    • Generates embeddings for semantic search over clauses, exclusions, endorsements, and definitions.
  • Vector store retriever

    • Stores chunks in a vector database such as FAISS, Pinecone, or pgvector.
    • Returns top-k relevant passages for each user question.
  • Grounded answer chain

    • Uses retrieved policy text as context.
    • Forces the model to answer only from the supplied evidence.
  • Guardrails layer

    • Detects unsupported questions, missing policy context, and risky outputs.
    • Handles escalation to a human agent when the model cannot answer confidently.
  • Audit and trace logging

    • Stores question, retrieved chunks, answer, policy version, and timestamp.
    • Gives compliance teams a trail for dispute resolution.

Implementation

1) Load policy documents and split them into chunks

Use PyPDFLoader for PDFs and RecursiveCharacterTextSplitter to keep clauses intact as much as possible. In insurance, chunk size matters because you want exclusions and definitions to stay close to the coverage language they modify.

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader("policy_documents/health_policy_v3.pdf")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=900,
    chunk_overlap=150,
    separators=["\n\n", "\n", ". ", " ", ""],
)

chunks = splitter.split_documents(documents)

# Tag each chunk so retrieval can later be filtered by product line, jurisdiction, and policy version
for chunk in chunks:
    chunk.metadata["product"] = "health"
    chunk.metadata["jurisdiction"] = "ZA"
    chunk.metadata["policy_version"] = "v3"

2) Build the vector index with embeddings

For a production setup you can swap FAISS for a managed store later. The API pattern stays the same: embed documents once, then retrieve by similarity at query time.

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
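
Before wiring the retriever into a chain, it is worth a quick sanity check that the index surfaces the right clauses. A minimal check (the question is just an example):

# Quick sanity check: do relevant clauses come back for a typical policy question?
docs = retriever.invoke("What is the waiting period for pre-existing conditions?")
for doc in docs:
    print(doc.metadata, doc.page_content[:120])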

3) Create a grounded LangChain QA chain

Use ChatOpenAI plus RetrievalQA so the model answers from retrieved policy text instead of inventing coverage rules. The prompt should explicitly say that if the answer is not in context, the assistant must say it cannot confirm coverage.

from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain_core.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = PromptTemplate.from_template(
    """You are an insurance policy assistant.
Answer only using the context below.
If the context does not contain enough information, say: "I can't confirm that from this policy."
Do not guess. Do not provide legal advice.

Context:
{context}

Question:
{question}

Answer:"""
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)

result = qa_chain.invoke({"query": "Does this policy cover outpatient physiotherapy?"})

print(result["result"])
for doc in result["source_documents"]:
    print(doc.metadata)

This is the core pattern you want in insurance: retrieve evidence first, then generate a constrained answer. If you need stronger control over formatting or citations, move to create_retrieval_chain with a custom combine-documents chain later.
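
If you do take that route, a rough sketch of the newer pattern looks like the following; the prompt wording here is illustrative and should be replaced with your own grounding template:

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

# Illustrative prompt; reuse the grounding instructions from the template above
grounded_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are an insurance policy assistant. Answer only from the context below. "
     "If the context is insufficient, say: \"I can't confirm that from this policy.\"\n\n"
     "Context:\n{context}"),
    ("human", "{input}"),
])

combine_docs_chain = create_stuff_documents_chain(llm, grounded_prompt)
rag_chain = create_retrieval_chain(retriever, combine_docs_chain)

response = rag_chain.invoke({"input": "Does this policy cover outpatient physiotherapy?"})
print(response["answer"])                          # generated answer
print([d.metadata for d in response["context"]])   # retrieved source documents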

4) Add an audit trail and escalation path

Insurance teams will ask where an answer came from. Persist the query, answer, source metadata, and policy version so you can reconstruct every response later.

import json
from datetime import datetime, timezone

def log_interaction(question: str, result: dict):
    record = {
        "timestamp": datetime.utcnow().isoformat(),
        "question": question,
        "answer": result["result"],
        "sources": [
            {
                "page": doc.metadata.get("page"),
                "source": doc.metadata.get("source"),
                "policy_version": doc.metadata.get("policy_version"),
            }
            for doc in result["source_documents"]
        ],
    }
    with open("audit_log.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

question = "Is pre-existing condition treatment covered after waiting period?"
result = qa_chain.invoke({"query": question})
log_interaction(question, result)
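
The escalation half of this step can start as a simple heuristic: if the chain returns the refusal phrase from the prompt, or retrieves nothing at all, hand the conversation to a person. A minimal sketch, where the print call stands in for whatever ticketing or live-chat handoff your support stack uses:

CANNOT_CONFIRM = "I can't confirm that from this policy."

def needs_escalation(result: dict) -> bool:
    # Escalate when the model refused or no policy text was retrieved
    return CANNOT_CONFIRM in result["result"] or not result["source_documents"]

if needs_escalation(result):
    # Placeholder handoff; swap for your ticketing or live-chat integration
    print(f"Escalating to a human agent: {question}")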

Production Considerations

  • Data residency

    • Keep embeddings and raw policy data inside approved regions.
    • If your insurer operates across jurisdictions, separate indexes by country or legal entity.
  • Compliance controls

    • Add disclaimers where required by regulation.
    • Block answers that drift into legal interpretation or underwriting advice unless reviewed by counsel.
  • Monitoring

    • Track retrieval hit rate, unanswered queries, escalation rate, and source coverage by product line.
    • Alert when answers frequently come from low-confidence or irrelevant chunks.
  • Versioning

    • Index policies by effective date and product version.
    • Never mix retired wording with active wording in the same retrieval namespace; a filtered-retriever sketch follows this list.
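
One lightweight way to enforce these boundaries with the FAISS index from step 2 is a metadata filter on the retriever; a minimal sketch, assuming the metadata keys attached in step 1:

# Only retrieve active wording for one jurisdiction and policy version
za_v3_retriever = vectorstore.as_retriever(
    search_kwargs={
        "k": 4,
        "filter": {"jurisdiction": "ZA", "policy_version": "v3"},
    }
)

Managed stores such as Pinecone and pgvector expose equivalent metadata filters, though separate indexes per country or legal entity remain the stronger isolation boundary for data residency.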

Common Pitfalls

  1. Using too-large chunks

    • Problem: exclusions get detached from coverage clauses.
    • Fix: tune chunk size around clause boundaries and test retrieval on real policy questions.
  2. Letting the model answer without evidence

    • Problem: hallucinated coverage statements create compliance risk.
    • Fix: force grounded prompts like “answer only using context” and return “I can’t confirm that” when evidence is missing.
  3. Ignoring policy versioning

    • Problem: users get answers from outdated terms after endorsements or renewals.
    • Fix: attach policy_version, effective_date, and jurisdiction metadata to every document and filter retrieval accordingly.
  4. Skipping audit logs

    • Problem: claims disputes become impossible to explain.
    • Fix: store question text, retrieved passages, model output, timestamps, and source metadata for every interaction.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
