How to Integrate OpenAI for Insurance with Pinecone for RAG
OpenAI for insurance gives you the generation and reasoning layer. Pinecone gives you the retrieval layer. Put them together and you get a RAG system that can answer policy questions, summarize claims docs, and ground every response in approved insurance knowledge instead of hallucinating.
Prerequisites
- Python 3.10+
- An OpenAI API key with access to the models you plan to use
- A Pinecone account and API key
- A Pinecone index created with the right vector dimension for your embedding model (see the index-creation sketch after this list)
- Insurance content ready for ingestion:
  - policy PDFs
  - underwriting guidelines
  - claims manuals
  - FAQ documents
- Installed packages: openai, pinecone, and python-dotenv

pip install openai pinecone python-dotenv
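If you haven't created the index yet, here is a minimal one-time setup sketch. The index name, cloud, and region are placeholders; the dimension is not. text-embedding-3-small produces 1536-dimensional vectors, so the index must be created with dimension 1536.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")

# Create the index once; skip if it already exists.
# dimension=1536 matches text-embedding-3-small's output size.
if "insurance-rag" not in pc.list_indexes().names():
    pc.create_index(
        name="insurance-rag",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )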
Integration Steps
1. Set up environment variables
Keep credentials out of code. For insurance workloads, this matters because you’ll eventually run this in a controlled environment with audit logs and access boundaries.
import os
from dotenv import load_dotenv

# Load credentials from a local .env file into the process environment
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME")
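The .env file this reads from looks something like the following (the values are placeholders, not real keys):

OPENAI_API_KEY=your-openai-key
PINECONE_API_KEY=your-pinecone-key
PINECONE_INDEX_NAME=insurance-rag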
2. Create clients for OpenAI and Pinecone
Use the official SDKs directly. For OpenAI, the OpenAI() client handles embeddings and chat completions. For Pinecone, create a client and connect to your index.
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=OPENAI_API_KEY)  # handles embeddings and chat completions
pc = Pinecone(api_key=PINECONE_API_KEY)
index = pc.Index(PINECONE_INDEX_NAME)           # handle to your existing index
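Before ingesting anything, it's worth confirming the connection by asking the index for its stats; a fresh index should report a vector count of 0.

# Sanity check: confirms the index is reachable and shows its
# dimension and current vector count.
stats = index.describe_index_stats()
print(stats)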
3. Embed insurance text and upsert it into Pinecone
For RAG, chunk your documents first. Then generate embeddings with OpenAI and store vectors in Pinecone with metadata like document type, policy number, or jurisdiction.
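Chunking strategies vary; here is a minimal sketch that splits extracted text into overlapping fixed-size word windows. The chunk_words and overlap values are illustrative, not tuned for insurance documents. The example that follows hardcodes two small chunks for clarity.

def chunk_text(text: str, chunk_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word-window chunks."""
    words = text.split()
    chunks = []
    step = chunk_words - overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_words])
        if chunk:
            chunks.append(chunk)
    return chunks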
insurance_chunks = [
    {
        "id": "policy_001_chunk_01",
        "text": "The deductible for collision coverage is $500 unless otherwise stated in endorsements.",
        "metadata": {"doc_type": "auto_policy", "policy_id": "POL-001", "page": 4}
    },
    {
        "id": "policy_001_chunk_02",
        "text": "Claims must be reported within 30 days of discovery for theft-related losses.",
        "metadata": {"doc_type": "claims_manual", "policy_id": "CLM-014", "page": 12}
    }
]

# Embed all chunk texts in one batched API call
texts = [chunk["text"] for chunk in insurance_chunks]
embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

# Pair each chunk with its embedding; keep the raw text in metadata
# so retrieved matches can be passed straight into the prompt
vectors = []
for chunk, emb in zip(insurance_chunks, embeddings_response.data):
    vectors.append({
        "id": chunk["id"],
        "values": emb.embedding,
        "metadata": {
            **chunk["metadata"],
            "text": chunk["text"]
        }
    })

index.upsert(vectors=vectors)
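Real document sets will have far more than two chunks. Pinecone accepts upserts in batches, and a common pattern is to send roughly 100 vectors per request; the batch size below is illustrative, not a hard limit.

# Upsert in batches to keep individual requests small
BATCH_SIZE = 100
for i in range(0, len(vectors), BATCH_SIZE):
    index.upsert(vectors=vectors[i:i + BATCH_SIZE])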
4. Retrieve relevant context from Pinecone
At query time, embed the user question with the same embedding model, then ask Pinecone for the top matches. In insurance workflows, use metadata filters when you need jurisdiction-specific or product-specific answers.
query = "What is the deductible for collision coverage?"
query_embedding = openai_client.embeddings.create(
model="text-embedding-3-small",
input=[query]
).data[0].embedding
results = index.query(
vector=query_embedding,
top_k=3,
include_metadata=True,
filter={"doc_type": {"$eq": "auto_policy"}}
)
contexts = [match["metadata"]["text"] for match in results["matches"]]
print(contexts)
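Pinecone treats multiple keys in a filter as an implicit AND, so a jurisdiction-specific query would combine conditions like this. Note the jurisdiction field is hypothetical here: the example chunks above were not upserted with one, so you'd need to include it in your metadata at ingestion time.

# Hypothetical: restrict matches by document type AND jurisdiction.
# Assumes vectors were upserted with a "jurisdiction" metadata field.
results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={
        "doc_type": {"$eq": "auto_policy"},
        "jurisdiction": {"$eq": "CA"},
    },
)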
5. Generate a grounded answer with OpenAI
Pass the retrieved context into the prompt and force the model to answer only from that material. This is where RAG becomes useful for insurance: fewer unsupported claims, better traceability.
context_block = "\n\n".join(contexts)

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "You are an insurance assistant. Answer only using the provided context. "
                "If the context is insufficient, say you don't have enough information."
            )
        },
        {
            "role": "user",
            "content": f"Context:\n{context_block}\n\nQuestion: {query}"
        }
    ],
    temperature=0.1  # low temperature keeps answers close to the source text
)

print(response.choices[0].message.content)
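You can fold the retrieve-then-generate steps into a single helper so the rest of your application has one entry point. The function name and defaults below are just one way to structure it.

def answer_policy_question(question: str, doc_type: str = "auto_policy", top_k: int = 3) -> str:
    """Embed the question, retrieve context from Pinecone, and generate a grounded answer."""
    q_emb = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=[question],
    ).data[0].embedding

    matches = index.query(
        vector=q_emb,
        top_k=top_k,
        include_metadata=True,
        filter={"doc_type": {"$eq": doc_type}},
    )["matches"]
    context = "\n\n".join(m["metadata"]["text"] for m in matches)

    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an insurance assistant. Answer only using the provided context. "
                    "If the context is insufficient, say you don't have enough information."
                )
            },
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0.1,
    )
    return response.choices[0].message.content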
Testing the Integration
Run an end-to-end test: embed a known question, retrieve the snippet it should match, then generate a response from that snippet.
test_question = "What is the deductible for collision coverage?"

q_emb = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_question]
).data[0].embedding

search = index.query(
    vector=q_emb,
    top_k=1,
    include_metadata=True,
    filter={"doc_type": {"$eq": "auto_policy"}}
)
top_text = search["matches"][0]["metadata"]["text"]

answer = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the supplied context."},
        {"role": "user", "content": f"Context: {top_text}\n\nQuestion: {test_question}"}
    ]
)

print("Retrieved:", top_text)
print("Answer:", answer.choices[0].message.content)
Expected output:
Retrieved: The deductible for collision coverage is $500 unless otherwise stated in endorsements.
Answer: The deductible for collision coverage is $500 unless otherwise stated in endorsements.
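It's also worth testing the refusal path: ask something the indexed chunks can't answer and check that the model declines instead of guessing. A quick sketch, using the helper defined earlier:

# The indexed chunks say nothing about flood coverage, so a grounded
# system should decline rather than invent an answer.
off_topic = "What is the waiting period for flood coverage?"
print(answer_policy_question(off_topic))
# Expected: a response along the lines of "I don't have enough information."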
Real-World Use Cases
- Policy Q&A assistant: Let agents or customers ask about deductibles, exclusions, waiting periods, renewals, and endorsements, with answers grounded in approved policy text.
- Claims intake copilot: Retrieve claims procedures, required documents, and SLA rules so adjusters can draft accurate next-step instructions.
- Underwriting knowledge assistant: Surface underwriting guidelines by product line or jurisdiction so underwriters can make faster decisions without manually searching shared drives.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.