How to Integrate LangChain for insurance with PostgreSQL for RAG

By Cyprian Aarons · Updated 2026-04-21

Tags: langchain-for-insurance, postgresql, rag

Combining LangChain for insurance with PostgreSQL gives you a practical RAG stack for policy search, claims support, and underwriting assistants. LangChain handles retrieval and orchestration; PostgreSQL stores your insurance documents, embeddings, and metadata in a system your team already knows how to operate.

Prerequisites

  • Python 3.10+
  • PostgreSQL 14+ with the pgvector extension enabled
  • A working LangChain installation for your insurance agent workflow
  • Access to an embedding model and an LLM provider
  • A PostgreSQL database URL like postgresql+psycopg://user:password@localhost:5432/insurance_rag (the langchain-postgres integration uses the psycopg 3 driver)
  • Insurance documents ready for ingestion:
    • policy wordings
    • claims manuals
    • underwriting guidelines
    • product brochures

Install the core packages:

pip install langchain langchain-community langchain-openai langchain-postgres "psycopg[binary]" psycopg2-binary pgvector sqlalchemy

Integration Steps

1. Prepare PostgreSQL for vector search

You need a table that can store text, metadata, and embeddings. If you are using pgvector, enable the extension first.

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    dbname="insurance_rag",
    user="postgres",
    password="postgres"
)

with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.commit()

conn.close()

If you want LangChain to manage the table shape for you, use its PostgreSQL vector store integration instead of hand-rolling the schema.
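If you do want to own the schema yourself, a minimal hand-rolled table might look like the sketch below. The table and column names are illustrative, not what LangChain's integration creates, and the vector dimension must match your embedding model (1536 matches OpenAI's text-embedding-3-small):

```sql
-- Illustrative schema for a hand-rolled chunk store; requires pgvector.
CREATE TABLE insurance_chunks (
    id        BIGSERIAL PRIMARY KEY,
    content   TEXT NOT NULL,
    metadata  JSONB NOT NULL DEFAULT '{}',
    embedding VECTOR(1536) NOT NULL
);

-- Approximate-nearest-neighbor index for cosine similarity search
-- (HNSW requires pgvector 0.5.0 or later).
CREATE INDEX ON insurance_chunks
    USING hnsw (embedding vector_cosine_ops);
```

The JSONB metadata column is what makes the insurance-specific filtering in step 5 possible, so keep it even in a custom schema.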

2. Load insurance documents into LangChain

Use LangChain loaders to ingest source material. In insurance, keep documents split by section so retrieval returns narrow, auditable chunks.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("data/claims_manual.txt")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=120
)

chunks = splitter.split_documents(documents)
print(f"Loaded {len(chunks)} chunks")

This is the point where you should attach metadata like product_line, jurisdiction, document_type, and version. That metadata becomes critical when adjusters or underwriters ask for jurisdiction-specific answers.
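A small helper makes that concrete. This function is hypothetical, not a LangChain API; it simply merges a shared metadata stamp into each chunk's existing metadata dict, using the key names assumed by the filters later in this guide:

```python
# Hypothetical helper: stamp shared insurance metadata onto every chunk
# before indexing. The key names (document_type, jurisdiction, product_line,
# version) are assumptions -- use whatever keys your filters will query.

def tag_chunks(chunks, *, document_type, jurisdiction, product_line, version):
    """Merge insurance metadata into each chunk's metadata, keeping loader keys."""
    stamp = {
        "document_type": document_type,
        "jurisdiction": jurisdiction,
        "product_line": product_line,
        "version": version,
    }
    for chunk in chunks:
        # Merge rather than replace, so loader-provided keys like
        # "source" survive alongside the insurance metadata.
        chunk.metadata = {**chunk.metadata, **stamp}
    return chunks
```

Call it once per source document before indexing, for example `tag_chunks(chunks, document_type="claims_manual", jurisdiction="US", product_line="home", version="2024.1")`.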

3. Create embeddings and store them in PostgreSQL

LangChain’s PostgreSQL vector stores let you persist embeddings directly into Postgres. Use a real embedding model so similarity search is meaningful.

from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector

CONNECTION_STRING = "postgresql+psycopg://postgres:postgres@localhost:5432/insurance_rag"

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = PGVector(
    connection=CONNECTION_STRING,
    embeddings=embeddings,
    collection_name="insurance_docs",
    use_jsonb=True,
)

vectorstore.add_documents(chunks)

If your stack uses a different embedding provider, swap OpenAIEmbeddings for the equivalent class. The integration pattern stays the same: embed, persist, retrieve.

4. Build the retriever and RAG chain

Now wire retrieval into a LangChain chain that can answer questions from stored insurance content.

from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)

question = "Does this policy cover water damage from burst pipes?"
result = qa_chain.invoke({"query": question})

print(result["result"])

For insurance workflows, keep temperature at zero and return source documents. You want grounded answers with traceability, not creative summaries.
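To make that traceability concrete, a small helper (hypothetical, not part of LangChain) can turn the returned source documents into short citation lines, assuming each document carries the metadata keys attached at ingestion time:

```python
# Hypothetical helper: render source documents as auditable citation lines.
# Assumes the metadata keys attached at ingestion (document_type,
# jurisdiction, version); missing keys fall back to "?".

def format_citations(source_documents):
    """Return one bracketed citation line per retrieved source document."""
    lines = []
    for doc in source_documents:
        meta = doc.metadata
        lines.append(
            f"[{meta.get('document_type', '?')} | "
            f"{meta.get('jurisdiction', '?')} | "
            f"v{meta.get('version', '?')}]"
        )
    return lines
```

After invoking the chain, print `format_citations(result["source_documents"])` alongside the answer so adjusters can see exactly which documents grounded it.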

5. Add metadata filtering for insurance-specific queries

In production, retrieval without filters is too broad. You usually need to scope by line of business or jurisdiction.

retriever = vectorstore.as_retriever(
    search_kwargs={
        "k": 5,
        "filter": {
            "document_type": "policy",
            "jurisdiction": "US"
        }
    }
)

# Rebuild the chain: the earlier qa_chain still holds the old,
# unfiltered retriever, so the filter would otherwise be ignored.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)

answer = qa_chain.invoke({
    "query": "What exclusions apply to flood-related losses?"
})

print(answer["result"])

This pattern matters when one assistant serves multiple products. A life insurance query should not retrieve homeowners policy language just because the semantic match is close.
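One way to enforce that separation is a small factory (illustrative, not a LangChain API) that builds one scoped retriever per line of business, so a filter can never be forgotten:

```python
# Hypothetical factory: one scoped retriever per line of business, so a
# life insurance query can never retrieve homeowners policy language.
# product_line and jurisdiction are the metadata keys assumed to be
# attached at ingestion time.

def scoped_retriever(vectorstore, *, product_line, jurisdiction, k=5):
    """Build a retriever restricted to one product line and jurisdiction."""
    return vectorstore.as_retriever(
        search_kwargs={
            "k": k,
            "filter": {
                "product_line": product_line,
                "jurisdiction": jurisdiction,
            },
        }
    )
```

Each assistant then gets its own retriever, e.g. `scoped_retriever(vectorstore, product_line="life", jurisdiction="US")`, and you build a separate chain per product rather than sharing one unfiltered chain.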

Testing the Integration

Run a direct similarity search first, then test the full QA chain.

results = vectorstore.similarity_search(
    "coverage for burst pipe water damage",
    k=3
)

for i, doc in enumerate(results, start=1):
    print(f"{i}. {doc.metadata}")
    print(doc.page_content[:200])
    print("---")

Expected output:

1. {'document_type': 'policy', 'jurisdiction': 'US', 'version': '2024.1'}
Coverage includes sudden and accidental discharge of water...
---
2. {'document_type': 'claims_manual', 'jurisdiction': 'US', 'version': '2024.1'}
Adjusters should verify cause of loss before approving...
---
3. {'document_type': 'policy', 'jurisdiction': 'US', 'version': '2024.1'}
Exclusions include seepage, wear and tear...
---

Then verify the RAG response includes citations or source docs:

response = qa_chain.invoke({"query": "Is burst pipe damage covered?"})
print(response["result"])
print(len(response["source_documents"]))

If source_documents is empty, your ingestion or retriever configuration is off.
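A quick way to isolate the problem is to count rows directly in Postgres. Assuming langchain-postgres's default table names (langchain_pg_collection and langchain_pg_embedding; verify these against your installed version, as they are internal details):

```sql
-- Did ingestion actually write chunks? Counts rows per collection.
SELECT c.name, COUNT(e.id) AS chunk_count
FROM langchain_pg_collection c
LEFT JOIN langchain_pg_embedding e ON e.collection_id = c.uuid
GROUP BY c.name;
```

If the count for insurance_docs is zero, the problem is ingestion; if it is populated but retrieval returns nothing, suspect the filter keys or the connection string.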

Real-World Use Cases

  • Claims triage assistant
    • Pulls relevant policy clauses and claims guidance from PostgreSQL before suggesting next steps.
  • Underwriting copilot
    • Answers questions about risk appetite, exclusions, and referral rules using versioned underwriting docs.
  • Policy servicing bot
    • Helps customer service teams explain coverage limits, endorsements, deductibles, and waiting periods with grounded responses.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
