How to Integrate Anthropic for insurance with pgvector for production AI

By Cyprian Aarons · Updated 2026-04-21
anthropic-for-insurance · pgvector · production-ai

Combining Anthropic for insurance with pgvector gives you a practical pattern for production AI agents: use Claude to reason over policy language, claims notes, and underwriting docs, then ground those responses in vector search over your own insurance corpus. That means faster claims triage, better policy Q&A, and fewer hallucinations when an agent needs to answer from internal documents instead of guessing.

Prerequisites

  • Python 3.10+
  • A running PostgreSQL 14+ instance
  • The pgvector extension installed in PostgreSQL
  • An Anthropic API key
  • A database user with permission to create tables and extensions
  • Python packages:
    • anthropic
    • psycopg[binary]
    • pgvector
    • python-dotenv

Install the packages:

pip install anthropic "psycopg[binary]" pgvector python-dotenv

Create the extension in your database:

CREATE EXTENSION IF NOT EXISTS vector;

Integration Steps

1) Set up your environment and clients

Use environment variables for both the Anthropic API key and your Postgres connection string. Keep this out of source control.
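For local development, a .env file with placeholder values is enough (the DATABASE_URL follows the standard Postgres connection URI format; both values below are stand-ins, not real credentials):

# .env (never commit this file)
ANTHROPIC_API_KEY=sk-ant-your-key-here
DATABASE_URL=postgresql://app_user:change-me@localhost:5432/insurance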

import os
from dotenv import load_dotenv
from anthropic import Anthropic
import psycopg

load_dotenv()

ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]
DATABASE_URL = os.environ["DATABASE_URL"]

client = Anthropic(api_key=ANTHROPIC_API_KEY)

conn = psycopg.connect(DATABASE_URL)
conn.autocommit = True

For production, use a secrets manager and connection pooling. The code above is enough to prove the integration path.
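As a quick sketch of the pooling side, psycopg's companion package psycopg_pool (installed via pip install "psycopg[pool]" or psycopg-pool) can replace the single connection above; the pool sizes here are placeholders to tune for your workload:

from psycopg_pool import ConnectionPool

# Placeholder sizing; tune min_size/max_size for your traffic.
pool = ConnectionPool(DATABASE_URL, min_size=1, max_size=10)

with pool.connection() as pooled_conn:
    with pooled_conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())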

2) Create a vector table for insurance documents

Store your policy docs, claim summaries, or underwriting notes as embeddings in Postgres. Anthropic’s API is for generation, not embeddings, so pair Claude with a separate embedding model: the embedding model vectorizes your documents, and Claude handles the reasoning.

from pgvector.psycopg import register_vector

register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS insurance_docs (
            id BIGSERIAL PRIMARY KEY,
            doc_type TEXT NOT NULL,
            content TEXT NOT NULL,
            embedding VECTOR(1536) NOT NULL
        )
    """)

If you already have a different embedding dimension, match the VECTOR(n) size to that model. Don’t guess here; mismatched dimensions will fail at insert time.
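If you use a hosted embedding model, derive the VECTOR(n) size from a real response rather than hard-coding it. A minimal sketch, assuming the voyageai package (Voyage AI is the embedding provider Anthropic's docs point to) and a VOYAGE_API_KEY environment variable; swap in whichever provider and model you actually use:

import voyageai  # assumption: pip install voyageai

vo = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])  # hypothetical env var

sample = vo.embed(
    ["Water damage is covered if caused by sudden and accidental discharge."],
    model="voyage-3",          # assumption: substitute the model you standardize on
    input_type="document",
).embeddings[0]

# Size the column from this number, e.g. VECTOR(1024) if len(sample) == 1024.
print("Embedding dimension:", len(sample))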

3) Embed and insert insurance content into pgvector

The example below uses a placeholder embedding function so the pattern stays provider-agnostic; in production you can swap in any embedding provider. The important part is how pgvector stores and searches the vectors.

from typing import List

def fake_embed(text: str) -> List[float]:
    # Replace with your embedding provider.
    # Must return a fixed-length list matching VECTOR(1536).
    return [0.01] * 1536

docs = [
    ("policy", "Water damage is covered if caused by sudden and accidental discharge."),
    ("claims", "A claim requires photos, repair estimates, and the loss date."),
]

with conn.cursor() as cur:
    for doc_type, content in docs:
        embedding = fake_embed(content)
        cur.execute(
            """
            INSERT INTO insurance_docs (doc_type, content, embedding)
            VALUES (%s, %s, %s)
            """,
            (doc_type, content, embedding),
        )

In production, batch inserts and persist document metadata like carrier name, policy number, jurisdiction, and effective date.
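A sketch of that batching, with two illustrative metadata columns (carrier and jurisdiction are example names, not a required schema), using executemany instead of a Python loop of single inserts:

with conn.cursor() as cur:
    # Illustrative metadata columns; rename to match your own schema.
    cur.execute("""
        ALTER TABLE insurance_docs
            ADD COLUMN IF NOT EXISTS carrier TEXT,
            ADD COLUMN IF NOT EXISTS jurisdiction TEXT
    """)

    rows = [
        (doc_type, content, fake_embed(content), "Acme Mutual", "CA")  # placeholder metadata
        for doc_type, content in docs
    ]

    # Batch the statements rather than issuing one execute() per document.
    cur.executemany(
        """
        INSERT INTO insurance_docs (doc_type, content, embedding, carrier, jurisdiction)
        VALUES (%s, %s, %s, %s, %s)
        """,
        rows,
    )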

4) Retrieve relevant context with pgvector similarity search

Use cosine distance to fetch the most relevant documents for a user question. That retrieved context becomes the grounding layer for Claude.

def search_docs(query_embedding: List[float], limit: int = 3):
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT doc_type, content
            FROM insurance_docs
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (query_embedding, limit),
        )
        return cur.fetchall()

question = "Does sudden water discharge count as covered damage?"
query_embedding = fake_embed(question)
matches = search_docs(query_embedding)

for row in matches:
    print(row)

The <=> operator is the pgvector cosine distance operator. That’s the core retrieval primitive you’ll use in an agent pipeline.
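Once the corpus grows past a few thousand rows, add an approximate index so this query doesn't degrade into a sequential scan. A minimal sketch using pgvector's HNSW index with the cosine operator class (available in pgvector 0.5.0+; the default build parameters are a reasonable starting point):

with conn.cursor() as cur:
    # HNSW index for approximate nearest-neighbor search with cosine distance (<=>).
    cur.execute("""
        CREATE INDEX IF NOT EXISTS insurance_docs_embedding_idx
        ON insurance_docs
        USING hnsw (embedding vector_cosine_ops)
    """)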

5) Ask Claude to answer using retrieved evidence

Now pass the retrieved snippets into Anthropic’s Messages API. Keep the prompt strict: answer only from context and call out uncertainty when evidence is missing.

context_text = "\n\n".join(
    f"[{doc_type}] {content}" for doc_type, content in matches
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    messages=[
        {
            "role": "user",
            "content": f"""
You are an insurance assistant.
Answer only from the provided context.

Context:
{context_text}

Question:
{question}

Return:
- direct answer
- supporting evidence from context
- if coverage is unclear, say what is missing
"""
        }
    ],
)

print(response.content[0].text)

That’s the production pattern: retrieve first, then generate. If you skip retrieval and let Claude freewheel on policy questions, you’ll get confident nonsense eventually.
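To keep that contract explicit, it helps to wrap both halves in one function. A sketch that reuses search_docs and the placeholder fake_embed from earlier (swap in your real embedding call):

def answer_insurance_question(question: str, limit: int = 3) -> str:
    # 1) Retrieve grounding context from pgvector.
    matches = search_docs(fake_embed(question), limit=limit)
    context_text = "\n\n".join(f"[{doc_type}] {content}" for doc_type, content in matches)

    # 2) Generate an answer constrained to that context.
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        messages=[{
            "role": "user",
            "content": (
                "You are an insurance assistant. Answer only from the provided context.\n\n"
                f"Context:\n{context_text}\n\n"
                f"Question:\n{question}"
            ),
        }],
    )
    return response.content[0].text

print(answer_insurance_question("Does sudden water discharge count as covered damage?"))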

Testing the Integration

Run a simple end-to-end check: insert one known policy snippet, query it back with similarity search, then ask Claude to summarize coverage.

test_question = "Is sudden accidental water discharge covered?"
test_embedding = fake_embed(test_question)
top_match = search_docs(test_embedding, limit=1)[0]

answer = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=200,
    messages=[
        {
            "role": "user",
            "content": f"""
Use only this context:
{top_match[1]}

Question: {test_question}
"""
        }
    ],
)

print("MATCH:", top_match)
print("ANSWER:", answer.content[0].text)

Expected output:

MATCH: ('policy', 'Water damage is covered if caused by sudden and accidental discharge.')
ANSWER: Based on the provided context, sudden accidental water discharge is covered.

If your result doesn’t resemble this flow, work through these checks (a quick diagnostic sketch follows the list):

  • verify vector extension exists
  • confirm your embedding dimension matches the column definition
  • check that retrieval returns relevant rows before calling Claude
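A diagnostic sketch for the first two checks, run against the table created earlier:

with conn.cursor() as cur:
    # Is the pgvector extension installed, and which version?
    cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector'")
    print("pgvector version:", cur.fetchone())

    # Do stored embeddings have the dimension the column was declared with?
    cur.execute("SELECT vector_dims(embedding) FROM insurance_docs LIMIT 1")
    print("stored embedding dimension:", cur.fetchone())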

Real-World Use Cases

  • Claims intake assistant

    • Retrieve prior claim notes, policy clauses, and adjuster comments from pgvector.
    • Use Claude to draft claim summaries or ask follow-up questions based on missing evidence.
  • Policy interpretation assistant

    • Search internal policy libraries by semantic meaning instead of keyword match.
    • Have Claude explain exclusions, endorsements, and coverage limits in plain language grounded in retrieved text.
  • Underwriting copilot

    • Pull similar historical submissions or risk notes from vector search.
    • Use Claude to summarize risk signals and draft underwriting recommendations for human review.

This stack works because each part does one job well. pgvector handles fast semantic retrieval inside Postgres; Anthropic handles reasoning and response generation; your application layer enforces guardrails around both.
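As a small example of that application-layer control, a claims intake assistant usually scopes retrieval to one document type before handing context to Claude. A sketch that adds a single WHERE clause to the earlier query (the 'claims' value is just the doc_type used above):

def search_docs_by_type(query_embedding: List[float], doc_type: str, limit: int = 3):
    # Same cosine-distance search as before, scoped to a single document type.
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT doc_type, content
            FROM insurance_docs
            WHERE doc_type = %s
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (doc_type, query_embedding, limit),
        )
        return cur.fetchall()

claims_context = search_docs_by_type(fake_embed("missing claim evidence"), "claims")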


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

