How to Integrate Anthropic for investment banking with pgvector for production AI

By Cyprian Aarons · Updated 2026-04-21

Combining Anthropic with pgvector gives you a practical pattern for investment banking workflows that need both reasoning and retrieval. Anthropic handles the language side: summarizing deals, drafting IC memos, answering analyst questions. pgvector handles the memory side: storing embeddings for pitch books, filings, comps, research notes, and prior deal context so your agent can retrieve the right material before generating an answer.

Prerequisites

  • Python 3.10+
  • An Anthropic API key
  • PostgreSQL 14+ with the pgvector extension installed
  • A database user with permission to create tables and extensions
  • pip packages:
    • anthropic
    • psycopg[binary]
    • pgvector
    • openai or another embedding provider if you are not using Anthropic for embeddings
  • A clear document chunking strategy for investment banking content:
    • earnings call transcripts
    • CIMs
    • company filings
    • broker research
    • internal notes
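A chunking strategy can start as a simple fixed-size splitter with overlap. This is a simplified sketch (the `max_chars` and `overlap` values are illustrative); production pipelines for filings and transcripts usually split on section or paragraph boundaries instead.

```python
def chunk_text(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    # Fixed-size windows with overlap so context spanning a boundary
    # still appears intact in at least one chunk.
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap
    return chunks
```

Keeping `max_chars` constant across the corpus matters more than the exact value: retrieval quality degrades when chunks of wildly different sizes compete in the same similarity search.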

Integration Steps

  1. Install dependencies and create the vector table.

pip install anthropic psycopg[binary] pgvector openai

import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("postgresql://bank_user:bank_pass@localhost:5432/ib_ai")

# Create the extension first: register_vector looks up the vector type
# in the database and fails if the extension does not exist yet.
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.commit()

register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id BIGSERIAL PRIMARY KEY,
            source TEXT NOT NULL,
            chunk TEXT NOT NULL,
            embedding vector(1536) NOT NULL
        );
    """)
    conn.commit()
  2. Generate embeddings for your banking documents and store them in pgvector.

For production, use a stable embedding model and keep chunk sizes consistent. The example below uses OpenAI embeddings because Anthropic does not provide a native embedding API.

from openai import OpenAI
from pgvector.psycopg import Vector

embed_client = OpenAI()

def embed_text(text: str) -> list[float]:
    resp = embed_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return resp.data[0].embedding

chunks = [
    ("q2_earnings_call", "Revenue grew 12% year over year driven by enterprise demand."),
    ("cim_section_3", "The target company has recurring revenue and low customer concentration."),
]

with conn.cursor() as cur:
    for source, chunk in chunks:
        embedding = embed_text(chunk)
        cur.execute(
            "INSERT INTO documents (source, chunk, embedding) VALUES (%s, %s, %s)",
            (source, chunk, Vector(embedding)),
        )
    conn.commit()
  3. Retrieve the most relevant context from pgvector before calling Anthropic.

This is the core RAG pattern: run the similarity search first, then pass only the top matches into Claude so the model answers with deal-specific context instead of guessing.

def retrieve_context(query: str, limit: int = 3) -> list[str]:
    query_embedding = embed_text(query)

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT source, chunk
            FROM documents
            ORDER BY embedding <-> %s::vector
            LIMIT %s;
            """,
            (Vector(query_embedding), limit),
        )
        rows = cur.fetchall()

    return [f"[{source}] {chunk}" for source, chunk in rows]

context = retrieve_context("What drove revenue growth in the latest quarter?")
print(context)
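At production scale, a sequential scan over every embedding becomes the bottleneck, and pgvector supports approximate indexes to avoid it. A minimal sketch adding an HNSW index matched to the `<->` (L2) operator used above; the `m` and `ef_construction` values are illustrative defaults, and the index name is an assumption:

```python
INDEX_DDL = """
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
ON documents USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);
"""

def ensure_hnsw_index(conn) -> None:
    # vector_l2_ops matches the <-> operator in retrieve_context;
    # switch to vector_cosine_ops if you move to <=> cosine distance.
    with conn.cursor() as cur:
        cur.execute(INDEX_DDL)
    conn.commit()
```

The operator class must match the distance operator in your queries, or Postgres will silently fall back to a sequential scan.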
  4. Send retrieved context to Anthropic for a grounded response.

Use the Messages API and keep the prompt structured. In banking workflows, you want concise answers with citations back to retrieved snippets.

import anthropic

anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment; avoid hardcoding keys

def answer_question(question: str) -> str:
    context_blocks = retrieve_context(question)

    prompt = f"""
Use only the context below to answer the question.
If the answer is not in the context, say you don't have enough information.

Context:
{chr(10).join(context_blocks)}

Question:
{question}
"""

    response = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=300,
        temperature=0,
        messages=[
            {"role": "user", "content": prompt}
        ],
    )
    return response.content[0].text

print(answer_question("What drove revenue growth in the latest quarter?"))
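API calls fail transiently in production, so it is worth wrapping generation in a retry. A generic sketch with exponential backoff; which exception types to retry on is an assumption you should adapt (the Anthropic SDK raises `anthropic.APIStatusError` and `anthropic.APIConnectionError`, among others):

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    # Call fn, retrying with exponential backoff on the given exceptions;
    # the last failure is re-raised so callers still see real errors.
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Usage in this pipeline might look like `with_retries(lambda: answer_question(q), retry_on=(anthropic.APIConnectionError,))`; retry only on transient errors, never on authentication or validation failures.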
  5. Wrap retrieval + generation into a production-ready service function.

This is where you enforce guardrails: deterministic temperature, limited context window, and logging for auditability.

import json
from datetime import datetime, timezone

def investment_banking_assistant(question: str) -> dict:
    retrieved = retrieve_context(question)
    # Note: answer_question retrieves again internally; in production,
    # refactor so both share a single retrieval pass.
    answer = answer_question(question)

    result = {
        "question": question,
        "answer": answer,
        "retrieved_context": retrieved,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    }

    # Persist audit trail or send to your observability stack here.
    return result

print(json.dumps(investment_banking_assistant("Summarize key risks in this target company."), indent=2))

Testing the Integration

Run a simple end-to-end check with one known document and one targeted question. You should see Claude reference only what was stored in pgvector.

test_result = investment_banking_assistant("What is driving enterprise demand?")
print(test_result["answer"])

Expected output (exact wording will vary, but the answer should be grounded in the stored chunk):

Revenue grew 12% year over year driven by enterprise demand.

If you get a vague answer or hallucinated details, check these first:

  • Your chunks are too large or too small
  • The query is retrieving irrelevant rows
  • Your prompt does not constrain the model to retrieved context only
  • The embedding dimension does not match your vector(n) column
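The last point is easy to guard against in code: validate each embedding's length against the `vector(n)` column before inserting, so a model swap fails fast instead of surfacing as an opaque Postgres error. A small sketch (1536 matches `text-embedding-3-small` and the table above):

```python
EXPECTED_DIM = 1536  # must match vector(1536) on the documents table

def validate_embedding(embedding: list[float], expected: int = EXPECTED_DIM) -> list[float]:
    # Fail fast at insert time rather than on a cryptic database error.
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, expected {expected}"
        )
    return embedding
```

Wrap `embed_text` output in this check inside the ingestion loop; it costs nothing and catches the most common silent misconfiguration when switching embedding models.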

Real-World Use Cases

  • Deal team copilot

    • Pull relevant sections from CIMs, diligence notes, and filings before Claude drafts an IC memo or management Q&A prep.
  • Comparable company research assistant

    • Store analyst notes and earnings transcripts in pgvector so Claude can summarize valuation drivers across peers.
  • Risk and compliance review

    • Retrieve policy docs, prior approvals, and exception logs before generating a compliance response or escalation summary.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
