How to Integrate OpenAI with Pinecone for Multi-Agent Insurance Systems
Combining OpenAI with Pinecone gives you a clean pattern for building insurance agent systems that can answer policy questions, retrieve claim context, and keep multiple specialized agents on the same source of truth. The OpenAI side handles reasoning and response generation, while Pinecone gives your agents low-latency semantic retrieval over policy docs, claims notes, underwriting guidelines, and customer interactions.
Prerequisites
- Python 3.10+
- An OpenAI API key
- A Pinecone API key
- Access to your insurance knowledge base:
  - policy PDFs or text exports
  - underwriting rules
  - claims procedures
  - FAQ content
- Installed packages:
  - `openai`
  - `pinecone`
  - `tiktoken` or your preferred chunking/tokenization library
- Environment variables set:
  - `OPENAI_API_KEY`
  - `PINECONE_API_KEY`
  - `PINECONE_INDEX_NAME`

Install dependencies:

```bash
pip install openai pinecone tiktoken
```
Integration Steps
1) Initialize both clients
Start by creating the OpenAI client for generation and the Pinecone client for retrieval. In a multi-agent system, this usually lives in a shared infrastructure module so every agent uses the same index and model settings.
```python
import os

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
```
If you are using separate agents for claims, underwriting, and customer service, keep this initialization in a shared package. That avoids drift in embeddings, model versions, and index names.
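A minimal sketch of what that shared module might look like; the file name and the model constants are illustrative assumptions, not part of either SDK:

```python
# shared_infra.py -- hypothetical shared module so every agent
# uses the same clients, index, and model settings
import os

from openai import OpenAI
from pinecone import Pinecone

# Pin model choices in one place to avoid drift across agents
EMBEDDING_MODEL = "text-embedding-3-small"
GENERATION_MODEL = "gpt-4o-mini"

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index(os.environ["PINECONE_INDEX_NAME"])
```

Each agent then does `from shared_infra import openai_client, index, EMBEDDING_MODEL` instead of constructing its own clients.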
2) Chunk insurance documents and create embeddings
You need to turn policy content into searchable vectors before Pinecone can retrieve anything useful. Use OpenAI embeddings for consistent semantic matching across all agents.
```python
from typing import List

def chunk_text(text: str, chunk_size: int = 800) -> List[str]:
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]

policy_text = """
Coverage applies to accidental water damage caused by burst pipes.
Exclusions include gradual leaks, mold-related damage, and wear-and-tear.
Claims must be filed within 30 days of discovery.
"""

chunks = chunk_text(policy_text)

# Embed all chunks in one API call for consistent semantics
embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=chunks
)

# Pair each embedding with an id and metadata for the upsert
vectors = []
for i, item in enumerate(embeddings_response.data):
    vectors.append({
        "id": f"policy-chunk-{i}",
        "values": item.embedding,
        "metadata": {
            "source": "policy_guide",
            "chunk": chunks[i]
        }
    })
```
For insurance use cases, metadata matters as much as the vector. Store document type, product line, jurisdiction, effective date, and claim category so downstream agents can filter correctly.
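As a sketch, the `vectors.append` call in the loop above could carry an enriched payload like this; the field names and values are illustrative assumptions, not a Pinecone requirement:

```python
# Hypothetical enriched metadata for an insurance chunk;
# all field names and values here are illustrative
vectors.append({
    "id": f"policy-chunk-{i}",
    "values": item.embedding,
    "metadata": {
        "chunk": chunks[i],
        "doc_type": "policy",          # policy, claim_note, guideline, faq
        "product_line": "homeowners",
        "jurisdiction": "US-NY",
        "effective_date": "2024-01-01",
        "claim_category": "water_damage"
    }
})
```

At query time, agents can pass these fields as a Pinecone metadata `filter`, e.g. `filter={"product_line": "homeowners", "jurisdiction": "US-NY"}`, so a claims agent never retrieves out-of-jurisdiction guideline text.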
3) Upsert vectors into Pinecone
Once you have embeddings, push them into your Pinecone index. This is the retrieval layer your agents will query later.
```python
upsert_response = index.upsert(vectors=vectors)
print(upsert_response)
```
For production systems, batch upserts by document type or line of business.
| Pattern | When to use | Why |
|---|---|---|
| Single index | Small-to-medium insurance knowledge base | Simpler ops |
| Namespace per tenant | Multi-carrier or multi-region deployments | Isolation |
| Namespace per agent type | Claims vs underwriting vs servicing | Cleaner retrieval boundaries |
A good default is one index with namespaces like `claims`, `underwriting`, and `customer_service`.
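A minimal sketch combining both ideas, batching the upsert and writing into a per-agent-type namespace; the batch size and namespace value are assumptions you would tune:

```python
# Hypothetical batched upsert into a namespace
BATCH_SIZE = 100  # assumption: tune to your vector and metadata sizes

for start in range(0, len(vectors), BATCH_SIZE):
    batch = vectors[start:start + BATCH_SIZE]
    index.upsert(vectors=batch, namespace="claims")
```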
4) Build the retrieval + generation flow
This is where the integration becomes useful. The agent asks a question, Pinecone returns relevant chunks, and OpenAI turns that context into an answer grounded in policy text.
```python
question = "Does this policy cover water damage from a burst pipe?"

# Embed the question with the same model used for the documents
question_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[question]
).data[0].embedding

# Retrieve the most relevant chunks from Pinecone
search_results = index.query(
    vector=question_embedding,
    top_k=3,
    include_metadata=True
)

context_blocks = [
    match["metadata"]["chunk"]
    for match in search_results["matches"]
]

prompt = f"""
You are an insurance assistant.
Answer only using the provided context.

Context:
{chr(10).join(context_blocks)}

Question:
{question}
"""

response = openai_client.responses.create(
    model="gpt-4o-mini",
    input=prompt
)

print(response.output_text)
```
That pattern is what you want across agents:
- retrieval agent finds evidence
- reasoning agent drafts the answer
- policy/compliance agent checks the output before sending it to a user
5) Wire it into a multi-agent workflow
In multi-agent systems, each agent should have a narrow job. One agent retrieves facts from Pinecone; another uses OpenAI to summarize; another validates compliance language.
```python
def retrieve_context(query: str) -> List[str]:
    emb = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=[query]
    ).data[0].embedding

    results = index.query(
        vector=emb,
        top_k=5,
        include_metadata=True,
        namespace="claims"
    )
    return [m["metadata"]["chunk"] for m in results["matches"]]

def answer_claim_question(query: str) -> str:
    context = retrieve_context(query)
    # Join outside the f-string: backslashes inside f-string expressions
    # are a syntax error before Python 3.12
    context_text = "\n".join(context)

    messages = [
        {
            "role": "system",
            "content": "You are a claims assistant. Use only retrieved context."
        },
        {
            "role": "user",
            "content": f"Context:\n{context_text}\n\nQuestion: {query}"
        }
    ]

    result = openai_client.responses.create(
        model="gpt-4o-mini",
        input=messages
    )
    return result.output_text
```
This keeps your architecture clean. Retrieval stays deterministic enough to audit; generation stays flexible enough to handle natural language.
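The compliance agent from the three-role pattern above can be another small function in the same style. This is only a sketch: the `check_compliance` name and the reviewer instructions are assumptions, not a production compliance policy:

```python
def check_compliance(draft: str) -> str:
    # Hypothetical compliance pass: a second model call reviews the draft
    review = openai_client.responses.create(
        model="gpt-4o-mini",
        input=[
            {
                "role": "system",
                "content": (
                    "You are an insurance compliance reviewer. Flag and "
                    "rewrite any coverage guarantees, legal advice, or "
                    "claims decisions stated as final. Return the revised "
                    "answer only."
                )
            },
            {"role": "user", "content": draft}
        ]
    )
    return review.output_text

# Usage: retrieval -> reasoning -> compliance before the user sees anything
draft = answer_claim_question("Is a burst pipe covered under this policy?")
final_answer = check_compliance(draft)
```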
Testing the Integration
Run a simple end-to-end test with a known policy question.
```python
test_question = "Is mold damage covered if it comes from a gradual leak?"
answer = answer_claim_question(test_question)
print("ANSWER:", answer)
```
Expected output:
```
ANSWER: Based on the retrieved policy context, mold-related damage is excluded when it results from gradual leaks. Coverage applies to accidental water damage such as burst pipes.
```
If you get an empty or vague response:
- check that embeddings were upserted into the right namespace
- verify your query text matches the domain language in your documents
- inspect the `top_k` results to confirm relevant chunks are being returned (see the sketch below)
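A quick way to do that inspection is to print similarity scores alongside chunk previews; this sketch assumes the `claims` namespace and a `question_embedding` computed as in step 4:

```python
# Hypothetical retrieval debug: print scores and chunk previews
debug = index.query(
    vector=question_embedding,
    top_k=5,
    include_metadata=True,
    namespace="claims"
)
for match in debug["matches"]:
    print(f"{match['score']:.3f}  {match['metadata']['chunk'][:80]}")
```

Low scores across the board usually mean the query language and the document language don't overlap; zero matches usually mean the wrong namespace.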
Real-World Use Cases
- **Claims triage assistant**
  - Retrieves claim history, policy terms, and adjuster notes from Pinecone.
  - Uses OpenAI to draft next-step recommendations and customer-facing explanations.
- **Underwriting copilot**
  - Searches underwriting guidelines by product line and jurisdiction.
  - Lets an agent explain risk exceptions and required documentation.
- **Customer service knowledge agent**
  - Answers coverage questions using approved policy content only.
  - Routes edge cases to human handlers with retrieved evidence attached.
The main pattern here is simple: Pinecone holds the memory layer, and OpenAI handles reasoning and response generation. For insurance teams building multi-agent systems, that separation is what makes the stack auditable enough for production.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.