How to Integrate Anthropic with pgvector for Investment Banking RAG
Investment banking teams need answers grounded in internal documents, not generic model output. Combining Anthropic with pgvector gives you a clean RAG path: store deal docs, CIMs, earnings call notes, and policy memos as embeddings in Postgres, then let Claude answer with retrieved context and tighter auditability.
Prerequisites
- Python 3.10+
- PostgreSQL 14+ with the pgvector extension installed
- An Anthropic API key
- A database user with permission to create tables and extensions
- These Python packages:
  - anthropic
  - psycopg[binary]
  - pgvector
  - (openai is not needed here; keep the stack tight)
- A corpus of investment banking documents:
  - pitch books
  - company filings
  - credit memos
  - internal research notes
Install the dependencies:
pip install anthropic "psycopg[binary]" pgvector
Integration Steps
1) Create the vector table in Postgres
Use pgvector to store embeddings alongside metadata you can filter on later, like deal name, sector, or document type.
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("postgresql://bank_user:bank_pass@localhost:5432/banking_rag")

# Create the extension first: register_vector() needs the vector type to exist.
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.commit()
register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS ib_documents (
            id BIGSERIAL PRIMARY KEY,
            doc_id TEXT NOT NULL,
            title TEXT NOT NULL,
            content TEXT NOT NULL,
            sector TEXT,
            source TEXT,
            embedding vector(1536)
        );
    """)
    cur.execute("""
        CREATE INDEX IF NOT EXISTS ib_documents_embedding_idx
        ON ib_documents USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100);
    """)
conn.commit()
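One operational note on the ivfflat index: it clusters rows that exist when the index is built, and each query searches only `ivfflat.probes` clusters (default 1). A sketch of session-level tuning, assuming the `lists = 100` setting above; the probe count here is illustrative:

```sql
-- Build (or rebuild) the index after bulk-loading so the clusters reflect real data,
-- and refresh planner statistics.
ANALYZE ib_documents;

-- Trade speed for recall per session: higher probes = better recall, slower queries.
SET ivfflat.probes = 10;
```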
2) Generate embeddings for your documents
Anthropic’s SDK is built for message generation; it does not expose an embeddings endpoint. For embeddings, use a dedicated embedding provider in your pipeline (Anthropic’s docs point to third-party services such as Voyage AI) or precompute vectors in your document-processing layer. If your architecture already standardizes on Anthropic for generation, keep Claude for reasoning and pull embeddings from a separate service that returns fixed-size vectors for pgvector.
Here’s the ingestion pattern with a placeholder embed_text() function that returns a 1536-dimensional vector:
from typing import List
import psycopg
from pgvector.psycopg import register_vector
def embed_text(text: str) -> List[float]:
    # Replace with your embedding provider call.
    # Must return a list[float] matching vector(1536).
    raise NotImplementedError

docs = [
    {
        "doc_id": "cim_001",
        "title": "Acme Corp CIM",
        "content": "Acme reported EBITDA growth of 18% YoY ...",
        "sector": "Industrials",
        "source": "cim",
    }
]

conn = psycopg.connect("postgresql://bank_user:bank_pass@localhost:5432/banking_rag")
register_vector(conn)

with conn.cursor() as cur:
    for doc in docs:
        embedding = embed_text(doc["content"])
        cur.execute(
            """
            INSERT INTO ib_documents (doc_id, title, content, sector, source, embedding)
            VALUES (%s, %s, %s, %s, %s, %s)
            """,
            (doc["doc_id"], doc["title"], doc["content"], doc["sector"], doc["source"], embedding),
        )
conn.commit()
3) Retrieve the top-k relevant chunks from pgvector
At query time, embed the user question and run cosine similarity search against Postgres.
import psycopg
from pgvector.psycopg import register_vector
def embed_query(query: str):
    # Same embedding provider as ingestion.
    raise NotImplementedError

query = "What was Acme's EBITDA margin trend last quarter?"

conn = psycopg.connect("postgresql://bank_user:bank_pass@localhost:5432/banking_rag")
register_vector(conn)

query_vec = embed_query(query)

with conn.cursor() as cur:
    cur.execute(
        """
        SELECT doc_id, title, content,
               1 - (embedding <=> %s) AS similarity
        FROM ib_documents
        ORDER BY embedding <=> %s
        LIMIT 5;
        """,
        (query_vec, query_vec),
    )
    rows = cur.fetchall()

context = "\n\n".join(
    f"[{r[0]}] {r[1]} (similarity={r[3]:.3f})\n{r[2]}"
    for r in rows
)
print(context)
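Because the table also stores sector and source, you can narrow the search with ordinary SQL predicates before ranking by distance; this is the metadata filtering mentioned in step 1. A sketch of the same query scoped to one sector (the literal filter values are illustrative):

```sql
SELECT doc_id, title, content,
       1 - (embedding <=> %s) AS similarity
FROM ib_documents
WHERE sector = 'Industrials'
ORDER BY embedding <=> %s
LIMIT 5;
```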
4) Send retrieved context to Anthropic Claude
Now pass the retrieved context into Claude using the Messages API. For investment banking workflows, keep the prompt strict: cite sources and avoid unsupported claims.
import os
import anthropic

# Reading the key from the environment; the SDK also picks up
# ANTHROPIC_API_KEY automatically if api_key is omitted.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

messages = [
    {
        "role": "user",
        "content": f"""
You are an investment banking analyst assistant.
Answer only using the provided context.
If the answer is not in the context, say you don't have enough information.

Question:
What was Acme's EBITDA margin trend last quarter?

Context:
{context}
""",
    }
]

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=400,
    temperature=0.1,
    messages=messages,
)
print(response.content[0].text)
5) Wrap it into one RAG function
This is the production shape: retrieve from Postgres first, then ask Claude to synthesize. Keep retrieval and generation separate so you can inspect both steps during audits.
import os
import anthropic
import psycopg
from pgvector.psycopg import register_vector

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def rag_answer(question: str) -> str:
    qvec = embed_query(question)
    conn = psycopg.connect("postgresql://bank_user:bank_pass@localhost:5432/banking_rag")
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT title, content
            FROM ib_documents
            ORDER BY embedding <=> %s
            LIMIT 4;
            """,
            (qvec,),
        )
        docs = cur.fetchall()
    conn.close()
    context = "\n\n".join(f"{title}\n{content}" for title, content in docs)
    resp = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"Use only this context:\n\n{context}\n\nQuestion: {question}",
        }],
    )
    return resp.content[0].text
print(rag_answer("Summarize Acme's margin performance and any headwinds mentioned."))
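To make that audit separation concrete, each round-trip can be logged as one structured record pairing the question, the retrieved titles, and the final answer. A sketch of such a helper; `audit_record` is a hypothetical name, not part of any SDK:

```python
import json
import datetime

def audit_record(question: str, retrieved: list[tuple[str, str]], answer: str) -> str:
    """Serialize one question/retrieval/answer round-trip as a JSON line."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "question": question,
        # Titles only; store doc_ids as well if reviewers need exact provenance.
        "retrieved_titles": [title for title, _ in retrieved],
        "answer": answer,
    }
    return json.dumps(record)
```

Writing these lines to append-only storage lets reviewers replay whether a given answer was actually supported by what retrieval returned.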
Testing the Integration
Run a simple end-to-end test with one known document and one known question.
test_question = "What did Acme say about margin pressure?"
answer = rag_answer(test_question)
print(answer)
Expected output:
Acme noted margin pressure from higher input costs and lower utilization. The document indicates EBITDA margins compressed last quarter and management expects partial recovery next quarter.
If you want to validate retrieval separately before calling Claude:
qvec = embed_query("What did Acme say about margin pressure?")
with conn.cursor() as cur:
cur.execute(
"""
SELECT title, content
FROM ib_documents
ORDER BY embedding <=> %s
LIMIT 1;
""",
(qvec,),
)
print(cur.fetchone())
Real-World Use Cases
- Deal team Q&A assistant: Ask questions across CIMs, diligence notes, and board materials without manually searching shared drives.
- Earnings call and filing copilot: Retrieve prior commentary from stored transcripts and filings to draft concise market updates or management summaries.
- Credit memo drafting support: Pull relevant historical risk factors, covenant language, and financial trends into one grounded response for analysts.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.