How to Integrate LangChain for insurance with PostgreSQL for RAG
Combining LangChain for insurance with PostgreSQL gives you a practical RAG stack for policy search, claims support, and underwriting assistants. LangChain handles retrieval and orchestration; PostgreSQL stores your insurance documents, embeddings, and metadata in a system your team already knows how to operate.
Prerequisites
- Python 3.10+
- PostgreSQL 14+ with the pgvector extension enabled
- A working LangChain installation for your insurance agent workflow
- Access to an embedding model and an LLM provider
- A PostgreSQL database URL like postgresql+psycopg://user:password@localhost:5432/insurance_rag
- Insurance documents ready for ingestion: policy wordings, claims manuals, underwriting guidelines, product brochures
Install the core packages:
pip install langchain langchain-community langchain-openai langchain-postgres psycopg2-binary pgvector sqlalchemy
Integration Steps
1. Prepare PostgreSQL for vector search
You need a table that can store text, metadata, and embeddings. If you are using pgvector, enable the extension first.
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    dbname="insurance_rag",
    user="postgres",
    password="postgres",
)
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.commit()
conn.close()
If you want LangChain to manage the table shape for you, use its PostgreSQL vector store integration instead of hand-rolling schema.
2. Load insurance documents into LangChain
Use LangChain loaders to ingest source material. In insurance, keep documents split by section so retrieval returns narrow, auditable chunks.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("data/claims_manual.txt")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=120,
)
chunks = splitter.split_documents(documents)
print(f"Loaded {len(chunks)} chunks")
This is the point where you should attach metadata like product_line, jurisdiction, document_type, and version. That metadata becomes critical when adjusters or underwriters ask for jurisdiction-specific answers.
3. Create embeddings and store them in PostgreSQL
LangChain’s PostgreSQL vector stores let you persist embeddings directly into Postgres. Use a real embedding model so similarity search is meaningful.
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector

CONNECTION_STRING = "postgresql+psycopg://postgres:postgres@localhost:5432/insurance_rag"

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = PGVector(
    connection=CONNECTION_STRING,
    embeddings=embeddings,
    collection_name="insurance_docs",
    use_jsonb=True,
)
vectorstore.add_documents(chunks)
If your stack uses a different embedding provider, swap OpenAIEmbeddings for the equivalent class. The integration pattern stays the same: embed, persist, retrieve.
4. Build the retriever and RAG chain
Now wire retrieval into a LangChain chain that can answer questions from stored insurance content.
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)

question = "Does this policy cover water damage from burst pipes?"
result = qa_chain.invoke({"query": question})
print(result["result"])
For insurance workflows, keep temperature at zero and return source documents. You want grounded answers with traceability, not creative summaries.
5. Add metadata filtering for insurance-specific queries
In production, retrieval without filters is too broad. You usually need to scope by line of business or jurisdiction.
retriever = vectorstore.as_retriever(
    search_kwargs={
        "k": 5,
        "filter": {
            "document_type": "policy",
            "jurisdiction": "US",
        },
    }
)

# Rebuild the chain so it uses the filtered retriever; the earlier
# qa_chain still holds a reference to the unfiltered one.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)

answer = qa_chain.invoke({
    "query": "What exclusions apply to flood-related losses?"
})
print(answer["result"])
This pattern matters when one assistant serves multiple products. A life insurance query should not retrieve homeowners policy language just because the semantic match is close.
Testing the Integration
Run a direct similarity search first, then test the full QA chain.
results = vectorstore.similarity_search(
    "coverage for burst pipe water damage",
    k=3,
)
for i, doc in enumerate(results, start=1):
    print(f"{i}. {doc.metadata}")
    print(doc.page_content[:200])
    print("---")
Expected output:
1. {'document_type': 'policy', 'jurisdiction': 'US', 'version': '2024.1'}
Coverage includes sudden and accidental discharge of water...
---
2. {'document_type': 'claims_manual', 'jurisdiction': 'US', 'version': '2024.1'}
Adjusters should verify cause of loss before approving...
---
3. {'document_type': 'policy', 'jurisdiction': 'US', 'version': '2024.1'}
Exclusions include seepage, wear and tear...
---
Then verify the RAG response includes citations or source docs:
response = qa_chain.invoke({"query": "Is burst pipe damage covered?"})
print(response["result"])
print(len(response["source_documents"]))
If source_documents is empty, your ingestion or retriever configuration is off.
Real-World Use Cases
- Claims triage assistant: pulls relevant policy clauses and claims guidance from PostgreSQL before suggesting next steps.
- Underwriting copilot: answers questions about risk appetite, exclusions, and referral rules using versioned underwriting docs.
- Policy servicing bot: helps customer service teams explain coverage limits, endorsements, deductibles, and waiting periods with grounded responses.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist + starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit