How to Build a Policy Q&A Agent Using LlamaIndex in Python for Insurance

By Cyprian Aarons · Updated 2026-04-21
Tags: policy-q-a, llamaindex, python, insurance, policy-qanda

A policy Q&A agent answers customer or internal staff questions against insurance policy documents, endorsements, exclusions, and underwriting guides. It matters because most policy confusion comes from buried clauses, and a well-built agent can reduce call center load, speed up claims triage, and give consistent answers with citations.

Architecture

  • Document ingestion layer

    • Loads PDFs, DOCX, and HTML policy docs into LlamaIndex Document objects.
    • Splits by section so exclusions, definitions, and coverage limits stay intact.
  • Indexing layer

    • Uses a vector index for semantic retrieval over policy language.
    • Optionally adds a keyword or metadata filter for product line, jurisdiction, and effective date.
  • Retrieval layer

    • Pulls the top relevant chunks with VectorStoreIndex.as_query_engine().
    • Returns source nodes so every answer can be traced back to the policy wording.
  • Response synthesis layer

    • Generates a grounded answer with citations.
    • Keeps the model constrained to the retrieved policy text instead of general insurance knowledge.
  • Guardrails layer

    • Rejects unsupported questions like legal advice or claim decisions.
    • Detects missing context such as state, product type, or policy period before answering.
  • Audit and observability layer

    • Logs question, retrieved chunks, answer, and source document IDs.
    • Supports compliance review and incident investigation.

Implementation

1) Install dependencies

Use LlamaIndex plus a local or hosted LLM backend. For production insurance workloads, pin versions and keep the model provider configurable.

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai pypdf

Set your API key:

export OPENAI_API_KEY="your-key"

2) Load policy documents with metadata

Insurance queries depend on jurisdiction, product line, and effective date. Put that information in metadata so you can filter later.

from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_dir="./policy_docs",
    filename_as_id=True
).load_data()

for doc in documents:
    doc.metadata.update({
        "product_line": "homeowners",
        "jurisdiction": "CA",
        "source_system": "policy_admin"
    })

If you have multiple products or states in one corpus, split them before indexing. Mixing every policy into one undifferentiated index produces noisy retrieval: chunks from the wrong product or state can outrank the clause you actually need.
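One way to do that split, sketched below, is to bucket the loaded documents by the metadata you just attached. The bucketing keys are assumptions based on the fields set above, and each bucket would get its own index in the next step.

from collections import defaultdict

# Bucket documents by product line and jurisdiction before indexing.
# The metadata keys mirror the ones set in the loop above; adjust them
# if your corpus uses different fields.
corpora = defaultdict(list)
for doc in documents:
    key = (doc.metadata["product_line"], doc.metadata["jurisdiction"])
    corpora[key].append(doc)

# Next step: build one VectorStoreIndex per bucket, for example one for
# ("homeowners", "CA"), instead of a single index over everything.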

3) Build the index and query engine

This is the core pattern: chunk the policies, embed them, then query with source citations. The VectorStoreIndex API is stable and production-friendly for this use case.

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=80)

index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact"
)

response = query_engine.query(
    "Does this homeowners policy cover water damage from a burst pipe?"
)

print(response.response)
for node in response.source_nodes:
    print(node.node.metadata.get("file_name"), node.score)

The important part is similarity_top_k=4. In insurance docs, one clause rarely tells the full story; you usually need coverage grant plus exclusions plus definitions.
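If you do keep several products or jurisdictions in a single index, LlamaIndex's metadata filters can restrict retrieval to the right slice at query time. A minimal sketch, assuming the metadata keys from step 2:

from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

filtered_query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact",
    filters=MetadataFilters(
        filters=[
            ExactMatchFilter(key="product_line", value="homeowners"),
            ExactMatchFilter(key="jurisdiction", value="CA"),
        ]
    ),
)

Filtering before retrieval is cheaper than untangling cross-product answers afterwards, and it maps directly onto the jurisdiction and product-line metadata attached in step 2.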

4) Add a simple insurance guardrail before answering

You should not answer every question directly. If the user asks for claim approval or legal interpretation beyond the document set, route to human review.

def needs_human_review(question: str) -> bool:
    blocked_terms = [
        "approve my claim",
        "should i sue",
        "legal advice",
        "coverage guaranteed",
        "binding decision"
    ]
    q = question.lower()
    return any(term in q for term in blocked_terms)

def answer_question(question: str):
    if needs_human_review(question):
        return {
            "answer": "This question needs human review by claims or legal.",
            "status": "escalated"
        }

    result = query_engine.query(question)
    return {
        "answer": str(result),
        "status": "answered",
        "sources": [
            {
                "file_name": n.node.metadata.get("file_name"),
                "score": n.score
            }
            for n in result.source_nodes
        ]
    }

print(answer_question("Is mold damage covered?"))

That pattern keeps the agent inside its lane. In insurance, being wrong with confidence is worse than escalating early.
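The architecture above also calls for detecting missing context, such as state, product type, or policy period, before answering. Below is a minimal sketch of that check; the required fields and keyword hints are assumptions, and in production you would usually pull this context from the user's session or policy record rather than the question text.

# Hypothetical required-context check; field names and keyword hints are
# illustrative only.
REQUIRED_CONTEXT = {
    "jurisdiction": ["california", "texas"],
    "product_line": ["homeowners", "auto", "commercial"],
}

def missing_context(question: str) -> list[str]:
    q = question.lower()
    return [
        field
        for field, hints in REQUIRED_CONTEXT.items()
        if not any(hint in q for hint in hints)
    ]

def answer_with_context_check(question: str):
    missing = missing_context(question)
    if missing:
        return {
            "answer": f"Please specify: {', '.join(missing)}.",
            "status": "needs_context",
        }
    return answer_question(question)

print(answer_with_context_check("Does my California homeowners policy cover mold damage?"))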

Production Considerations

  • Deploy behind an authenticated internal service

    • Policy Q&A often exposes regulated content.
    • Require SSO or service-to-service auth before any retrieval call.
  • Log every answer with source traceability

    • Store user question, retrieved node IDs, model version, timestamp, and final response.
    • This helps with audit requests and complaint handling; a minimal logging sketch follows this list.
  • Control data residency

    • Keep embeddings and vector stores in-region when policies are tied to specific jurisdictions.
    • If you operate in multiple countries, separate indexes by region instead of one global corpus.
  • Add monitoring for retrieval quality

    • Track “no answer,” low similarity scores, and escalation rate.
    • A spike usually means broken ingestion, stale policies, or bad chunking.
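For the logging point above, here is a minimal audit-trail sketch. The field names and the JSON-lines file sink are assumptions; a real deployment would write to a durable, access-controlled store.

import json
from datetime import datetime, timezone

AUDIT_LOG_PATH = "policy_qa_audit.jsonl"  # hypothetical sink; use a durable store in production

def log_interaction(question: str, result: dict, model_version: str) -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "status": result.get("status"),
        "answer": result.get("answer"),
        "sources": result.get("sources", []),
        "model_version": model_version,
    }
    with open(AUDIT_LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

result = answer_question("Is mold damage covered?")
log_interaction("Is mold damage covered?", result, model_version="gpt-4o-mini")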

Common Pitfalls

  1. Chunking policies too aggressively

    • If you split every paragraph into tiny chunks, exclusions lose context.
    • Use section-aware chunking and keep definitions near coverage clauses; one way to pre-split on section headings is sketched after this list.
  2. Skipping metadata filters

    • A California homeowners policy should not answer a Texas commercial auto question.
    • Filter by product line, jurisdiction, and effective date before retrieval.
  3. Letting the model freewheel without citations

    • Insurance users need defensible answers tied to source text.
    • Always return source nodes from response.source_nodes and show exact document references.
  4. Treating the agent as a claims decision engine

    • Policy Q&A is not claims adjudication.
    • Escalate anything that requires interpretation beyond the retrieved wording or involves reserved rights.

Keep Learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

