How to Integrate Anthropic for insurance with pgvector for RAG

By Cyprian Aarons · Updated 2026-04-21

Combining Anthropic for insurance with pgvector gives you a practical RAG stack for policy Q&A, claims triage, and underwriting support. The database handles retrieval over your internal documents, while Anthropic turns those retrieved chunks into grounded answers your agents can use in production.

Prerequisites

  • Python 3.10+
  • A running PostgreSQL 14+ instance
  • pgvector installed on the database
  • An Anthropic API key
  • Access to your insurance document corpus:
    • policy wordings
    • claims manuals
    • underwriting guidelines
    • broker FAQs
  • Python packages:
    • anthropic
    • psycopg[binary]
    • pgvector
    • python-dotenv

Install the dependencies:

pip install anthropic psycopg[binary] pgvector python-dotenv

Enable the extension in Postgres:

CREATE EXTENSION IF NOT EXISTS vector;

Integration Steps

1) Set up your environment and clients

Keep secrets out of code. Load the Anthropic key and PostgreSQL connection string from environment variables.

import os
from dotenv import load_dotenv
from anthropic import Anthropic
import psycopg

load_dotenv()

ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]
DATABASE_URL = os.environ["DATABASE_URL"]

client = Anthropic(api_key=ANTHROPIC_API_KEY)
conn = psycopg.connect(DATABASE_URL)
conn.autocommit = True

print("Connected to Anthropic and PostgreSQL")

2) Create a pgvector-backed documents table

For RAG, store chunks plus embeddings in PostgreSQL. Use a fixed embedding dimension that matches the model you use for embeddings.

from pgvector.psycopg import register_vector

register_vector(conn)

EMBEDDING_DIM = 1536

with conn.cursor() as cur:
    cur.execute(f"""
        CREATE EXTENSION IF NOT EXISTS vector;

        CREATE TABLE IF NOT EXISTS insurance_docs (
            id BIGSERIAL PRIMARY KEY,
            source TEXT,
            chunk TEXT NOT NULL,
            embedding VECTOR({EMBEDDING_DIM}) NOT NULL,
            metadata JSONB DEFAULT '{{}}'::jsonb
        );
    """)

If you are using a different embedding model, adjust EMBEDDING_DIM accordingly.
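Once the table holds more than a few thousand rows, a sequential scan over the vector column becomes the bottleneck. pgvector supports approximate indexes; an HNSW index on cosine distance (available in pgvector 0.5+) is a common choice when you query with the <=> operator, as the retrieval step below does:

```sql
-- Approximate nearest-neighbor index for cosine-distance queries
CREATE INDEX IF NOT EXISTS insurance_docs_embedding_idx
ON insurance_docs
USING hnsw (embedding vector_cosine_ops);
```

Build the index after bulk ingestion when possible; building on an empty table and inserting afterward is slower overall.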

3) Generate embeddings and store policy chunks

Anthropic is used for generation. For embeddings, use an embedding model from your stack; the pattern below assumes you already have vectors ready to insert. In production, many teams generate embeddings in a separate service.

from typing import List

from psycopg.types.json import Jsonb

def insert_chunk(source: str, chunk: str, embedding: List[float], metadata: dict):
    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO insurance_docs (source, chunk, embedding, metadata)
            VALUES (%s, %s, %s, %s)
            """,
            # psycopg does not adapt plain dicts automatically; wrap in Jsonb
            (source, chunk, embedding, Jsonb(metadata)),
        )

# Example payload from your ingestion pipeline
policy_chunk = """
Coverage applies when the insured event occurs during the policy period and is reported within 30 days.
"""
fake_embedding = [0.01] * EMBEDDING_DIM

insert_chunk(
    source="policy_wording_v3.pdf",
    chunk=policy_chunk,
    embedding=fake_embedding,
    metadata={"line_of_business": "property", "section": "claims_notification"},
)

If you already have an embedding pipeline, wire it here before insert. The important part is that the vectors land in pgvector so retrieval stays inside Postgres.
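Chunking happens upstream of insert_chunk. As an illustration only (chunk_text is not part of the pipeline above), a minimal paragraph-based chunker with a small character overlap might look like:

```python
from typing import List

def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most max_chars, breaking on paragraph
    boundaries and carrying a small tail overlap between consecutive chunks."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    buffer = ""
    for para in paragraphs:
        if len(buffer) + len(para) + 2 <= max_chars:
            buffer = f"{buffer}\n\n{para}" if buffer else para
        else:
            if buffer:
                chunks.append(buffer)
            # carry a tail of the previous chunk so clauses that straddle a
            # chunk boundary stay retrievable from both sides
            buffer = (buffer[-overlap:] + "\n\n" + para) if buffer else para
    if buffer:
        chunks.append(buffer)
    return chunks
```

The overlap matters for policy wordings, where a coverage condition and its exception often sit in adjacent paragraphs.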

4) Retrieve top-k context with pgvector similarity search

Use cosine distance or inner product depending on how your embeddings are normalized. This query returns the most relevant chunks for a user question.

def retrieve_context(query_embedding: List[float], limit: int = 5):
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT source, chunk, metadata,
                   1 - (embedding <=> %s::vector) AS similarity
            FROM insurance_docs
            ORDER BY embedding <=> %s::vector
            LIMIT %s;
            """,
            (query_embedding, query_embedding, limit),
        )
        return cur.fetchall()

query_embedding = [0.01] * EMBEDDING_DIM
rows = retrieve_context(query_embedding)

for row in rows:
    print(row[0], row[3], row[1][:80])

That <=> operator is pgvector's cosine distance operator; <-> is Euclidean distance and <#> is negative inner product. Ordering by cosine distance returns the nearest chunks first and is a solid default for insurance document search, especially when your embeddings are not guaranteed to be normalized.
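To sanity-check similarity scores outside the database, the cosine distance that <=> computes can be reproduced in pure Python (a debugging aid, not part of the pipeline):

```python
import math
from typing import Sequence

def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mirror of pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Identical vectors give a distance of 0, orthogonal vectors 1, and opposite vectors 2; the similarity column in the query above is simply 1 minus this value.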

5) Ask Anthropic to answer using retrieved context

Now pass the retrieved chunks into Anthropic’s Messages API. Keep the answer grounded in source text and force it to cite what it used.

def build_prompt(question: str, contexts):
    context_block = "\n\n".join(
        f"[Source: {source} | Similarity: {similarity:.3f}]\n{chunk}"
        for source, chunk, metadata, similarity in contexts
    )

    return f"""
You are an insurance assistant.
Answer only using the provided context.
If the context is insufficient, say so clearly.

Question:
{question}

Context:
{context_block}
""".strip()

question = "Does this policy cover late claim notification?"
contexts = retrieve_context(query_embedding)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    temperature=0,
    messages=[
        {"role": "user", "content": build_prompt(question, contexts)}
    ],
)

print(message.content[0].text)

This is the core RAG loop:

  • embed question
  • search pgvector
  • send retrieved chunks to Anthropic Messages API
  • return grounded output
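The loop above can be wrapped in a single function. The sketch below is illustrative: answer_question and its injectable callables (embed_fn for whatever embedding service you use, retrieve_fn wrapping the pgvector query, generate_fn wrapping the Anthropic call) are assumed names, not part of the steps above.

```python
from typing import Callable, List, Sequence, Tuple

# (source, chunk, metadata, similarity) - the row shape retrieve_context returns
Row = Tuple[str, str, dict, float]

def answer_question(
    question: str,
    embed_fn: Callable[[str], List[float]],
    retrieve_fn: Callable[[List[float], int], Sequence[Row]],
    generate_fn: Callable[[str], str],
    top_k: int = 5,
) -> str:
    """Embed the question, retrieve top-k chunks, generate a grounded answer."""
    query_embedding = embed_fn(question)
    contexts = retrieve_fn(query_embedding, top_k)
    if not contexts:
        return "No relevant policy context was found for this question."
    context_block = "\n\n".join(
        f"[Source: {source}]\n{chunk}" for source, chunk, _meta, _sim in contexts
    )
    prompt = (
        "You are an insurance assistant. Answer only using the provided context.\n\n"
        f"Question:\n{question}\n\nContext:\n{context_block}"
    )
    return generate_fn(prompt)
```

Keeping the three stages injectable makes the loop unit-testable with stubs and lets you swap the embedding provider without touching retrieval or generation.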

Testing the Integration

Run a full smoke test with one known chunk and one question that should match it.

test_question = "How long do I have to report a claim?"
test_query_embedding = [0.01] * EMBEDDING_DIM

results = retrieve_context(test_query_embedding, limit=3)

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=200,
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": build_prompt(test_question, results),
        }
    ],
)

print("Retrieved rows:", len(results))
print("Answer:", response.content[0].text)

Expected output (exact answer wording will vary):

Retrieved rows: 1
Answer: The policy requires claims to be reported within 30 days of the insured event occurring.

If you get zero rows back or an irrelevant answer:

  • verify vectors were inserted correctly
  • confirm your embedding dimension matches the column definition
  • check your distance operator choice (<=>, <->, or <#>)
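A cheap guard at ingestion time catches the dimension mismatch before it surfaces as a confusing SQL error. validate_embedding below is an illustrative helper, not part of the steps above:

```python
from typing import List

EMBEDDING_DIM = 1536  # must match the VECTOR(...) column definition

def validate_embedding(
    embedding: List[float], expected_dim: int = EMBEDDING_DIM
) -> List[float]:
    """Raise early if a vector's dimension does not match the table schema."""
    if len(embedding) != expected_dim:
        raise ValueError(
            f"Embedding has {len(embedding)} dimensions, expected {expected_dim}"
        )
    return embedding
```

Call it inside insert_chunk (or your ingestion service) so every vector is checked before it reaches Postgres.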

Real-World Use Cases

  • Claims intake assistant

    • Pull claim-handling procedures from internal docs and have Anthropic draft next-step guidance for adjusters.
  • Underwriting policy copilot

    • Search underwriting rules by risk type and ask Anthropic to summarize eligibility constraints for brokers or underwriters.
  • Customer service policy Q&A

    • Let service agents ask natural-language questions about coverage limits, exclusions, deductibles, and reporting windows without digging through PDFs.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
