How to Integrate Anthropic for pension funds with pgvector for startups

By Cyprian Aarons · Updated 2026-04-21

Tags: anthropic-for-pension-funds · pgvector · startups

Anthropic gives you the reasoning layer for policy-heavy workflows. pgvector gives you persistent semantic memory. Put them together and you can build an AI agent for pension-fund operations that answers document-grounded questions, retrieves prior cases, and keeps responses tied to approved internal knowledge instead of free-form guessing.

Prerequisites

  • Python 3.10+
  • A running PostgreSQL instance with the pgvector extension enabled
  • An Anthropic API key
  • pip installed
  • A database user with permission to create tables and extensions
  • Basic familiarity with embeddings and retrieval-augmented generation

Install the Python packages:

pip install anthropic psycopg[binary] pgvector

If you’re using Docker for Postgres, make sure the image includes pgvector, or install it on the server side before creating vector columns.
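If you're starting from scratch, the official `pgvector/pgvector` image on Docker Hub ships with the extension precompiled. The container name, credentials, and database name below are illustrative and should match your `DATABASE_URL`:

```shell
# Postgres 16 with pgvector preinstalled; credentials are examples only
docker run -d \
  --name pension-ai-db \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=pension_ai \
  -p 5432:5432 \
  pgvector/pgvector:pg16
```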

Integration Steps

  1. Set up your environment variables.
import os

os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"
os.environ["DATABASE_URL"] = "postgresql://postgres:postgres@localhost:5432/pension_ai"

In production, load these from your secret manager, not hardcoded values.
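A small helper that fails fast on a missing variable surfaces misconfiguration at startup instead of as a confusing error deeper in the stack. This is a minimal sketch; `require_env` is a hypothetical helper, not part of any SDK:

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable, or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Fetch once at startup so misconfiguration surfaces immediately:
# api_key = require_env("ANTHROPIC_API_KEY")
# db_url = require_env("DATABASE_URL")
```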

  2. Create the vector table in Postgres.

Use pgvector’s Python type plus a standard Postgres connection. For startup-scale systems, keep metadata alongside embeddings so you can filter by document type, tenant, or pension plan.

import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect(os.environ["DATABASE_URL"])

# Enable the extension first: register_vector fails if the
# vector type does not yet exist in the database
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.commit()

register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS knowledge_chunks (
            id SERIAL PRIMARY KEY,
            source TEXT,
            chunk_text TEXT NOT NULL,
            embedding VECTOR(1024)
        );
    """)
    conn.commit()

If your embedding model returns a different dimension, change VECTOR(1024) to match it exactly.
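A cheap guard is to validate dimensions in application code before inserting, so a mismatched vector fails loudly with a readable message instead of erroring inside Postgres. A minimal sketch; `EMBEDDING_DIM` mirrors the `VECTOR(1024)` column above:

```python
EMBEDDING_DIM = 1024  # must match VECTOR(1024) in the table definition

def check_dimension(embedding: list[float]) -> list[float]:
    """Raise early if a vector will not fit the table's vector column."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"Expected {EMBEDDING_DIM}-dim embedding, got {len(embedding)}"
        )
    return embedding
```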

  3. Generate embeddings and store them in pgvector.

Anthropic’s core SDK is for message generation, not embeddings. For this pattern, use a dedicated embedding model from your stack for vectors, then use Anthropic for reasoning over retrieved context. If you already have embeddings from another service, store them here; if not, wire in your embedding provider at this step.
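If you want to exercise the pipeline before wiring in a real provider, a deterministic fake embedder keeps everything runnable end to end. The vectors carry no semantic meaning, so swap in a real embedding model before judging retrieval quality; this is purely illustrative plumbing:

```python
import hashlib
import struct

def fake_embed(text: str, dim: int = 1024) -> list[float]:
    """Deterministic placeholder embedding: same text -> same vector.
    Not semantically meaningful; for plumbing tests only."""
    values: list[float] = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{text}:{counter}".encode()).digest()
        # A sha256 digest is 32 bytes = eight uint32s, scaled into [0, 1)
        values.extend(v / 2**32 for v in struct.unpack(">8I", digest))
        counter += 1
    return values[:dim]
```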

from anthropic import Anthropic
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
conn = psycopg.connect(os.environ["DATABASE_URL"])
register_vector(conn)

def store_chunk(source: str, chunk_text: str, embedding: list[float]):
    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO knowledge_chunks (source, chunk_text, embedding)
            VALUES (%s, %s, %s)
            """,
            # register_vector adapts numpy arrays to the vector type;
            # a plain Python list would be sent as a Postgres array
            (source, chunk_text, np.array(embedding)),
        )
        conn.commit()

# Example placeholder embedding vector from your embedding service
store_chunk(
    "pension_policy_2025.pdf",
    "Members may request a lump-sum transfer under approved conditions.",
    [0.01] * 1024,
)

For an actual production agent system, keep Anthropic on the generation side and use a separate embedding model or provider for the vector side.

  4. Retrieve relevant chunks from pgvector and pass them to Anthropic.

This is the core integration: similarity search first, then grounded generation second.

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect(os.environ["DATABASE_URL"])
register_vector(conn)

def retrieve_context(query_embedding: list[float], limit: int = 5):
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT source, chunk_text
            FROM knowledge_chunks
            ORDER BY embedding <-> %s
            LIMIT %s
            """,
            # <-> is L2 distance; use <=> for cosine distance
            (np.array(query_embedding), limit),
        )
        return cur.fetchall()

# Example query embedding from your embedding service
matches = retrieve_context([0.02] * 1024)
context = "\n\n".join([f"Source: {row[0]}\nText: {row[1]}" for row in matches])

Then call Anthropic with that context:

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=500,
    messages=[
        {
            "role": "user",
            "content": f"""
You are assisting with pension-fund operations.
Answer only using the provided context.

Context:
{context}

Question:
Can a member request a lump-sum transfer under approved conditions?
"""
        }
    ],
)

print(response.content[0].text)

This pattern keeps answers anchored to retrieved policy text instead of relying on model memory.
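One practical refinement: the retrieved chunks can collectively exceed your prompt budget. A character-based cap is a crude but serviceable stand-in for token counting; this sketch and its 8,000-character default are assumptions, not an Anthropic recommendation:

```python
def build_context(rows: list[tuple[str, str]], max_chars: int = 8000) -> str:
    """Join (source, chunk_text) rows into a prompt context block,
    dropping lowest-ranked chunks once the character budget is spent."""
    parts: list[str] = []
    used = 0
    for source, text in rows:
        entry = f"Source: {source}\nText: {text}"
        if used + len(entry) > max_chars:
            break  # rows are ordered by similarity, so stop at the budget
        parts.append(entry)
        used += len(entry)
    return "\n\n".join(parts)
```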

  5. Wrap retrieval + generation into one agent function.

This is what you actually ship inside a startup AI agent service.

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def answer_question(question: str, question_embedding: list[float]) -> str:
    matches = retrieve_context(question_embedding)
    context = "\n\n".join([f"[{src}] {txt}" for src, txt in matches])

    resp = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=400,
        messages=[
            {
                "role": "user",
                "content": f"""
Use only this context to answer the question.

Context:
{context}

Question:
{question}
"""
            }
        ],
    )
    return resp.content[0].text

print(answer_question(
    "What are the approved conditions for a lump-sum transfer?",
    [0.02] * 1024,
))

Testing the Integration

Run a quick end-to-end check by inserting one known policy chunk and querying it back through the agent function.

store_chunk(
    "policy_note.txt",
    "A lump-sum transfer is allowed only after compliance review and trustee approval.",
    [0.03] * 1024,
)

result = answer_question(
    "When is a lump-sum transfer allowed?",
    [0.03] * 1024,
)

print(result)

Expected output (the exact wording may vary):

A lump-sum transfer is allowed only after compliance review and trustee approval.

If you get an unrelated answer, check three things:

  • Your query embedding dimension matches the table definition
  • The similarity search is returning the right chunks
  • Your prompt says to answer only from context

Real-World Use Cases

  • Pension policy assistant: Let staff ask natural-language questions about contribution rules, transfer conditions, benefit eligibility, and internal procedures.
  • Document-grounded case triage: Retrieve similar historical cases from pgvector and have Anthropic draft next-step recommendations for analysts.
  • Compliance support bot: Build an internal agent that answers only from approved fund documents and flags missing evidence before escalation.

By Cyprian Aarons, AI Consultant at Topiax.