How to Integrate Anthropic for retail banking with pgvector for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: anthropic-for-retail-banking, pgvector, rag

Combining Anthropic's Claude models with pgvector gives retail banking teams a clean pattern for bank-grade RAG: store policy, product, and customer-service knowledge in Postgres, then let Claude answer from that grounded context instead of hallucinating from memory. It is the difference between a generic chatbot and an agent that can explain overdraft fees, card replacement steps, or mortgage document requirements using approved source material.

Prerequisites

  • Python 3.10+
  • PostgreSQL 14+ with the pgvector extension installed
  • An Anthropic API key
  • Access to your retail banking knowledge base:
    • FAQ docs
    • product guides
    • policy PDFs converted to text chunks
  • Python packages:
    • anthropic
    • psycopg[binary]
    • pgvector
    • python-dotenv

Install them:

pip install anthropic psycopg[binary] pgvector python-dotenv

Integration Steps

  1. Enable pgvector and create a table for embeddings

    Start by enabling the extension and creating a simple documents table. Use one row per chunk so retrieval stays precise.

import psycopg

# psycopg 3 connection; adjust the DSN for your environment.
conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")

# One row per chunk. The vector dimension (1536 here) must match
# the embedding model you use.
conn.execute("""
CREATE TABLE IF NOT EXISTS banking_docs (
    id SERIAL PRIMARY KEY,
    source TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536)
)
""")
conn.commit()
conn.close()
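
    Exact scans are fine while the table is small, but once it holds more than a few thousand chunks an approximate index keeps retrieval fast. A minimal sketch, assuming pgvector 0.5+ (which added HNSW support) and the cosine operator used in the retrieval step below:

import psycopg

conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
# HNSW index over cosine distance; pairs with the <=> queries in step 4.
conn.execute("""
CREATE INDEX IF NOT EXISTS banking_docs_embedding_idx
ON banking_docs
USING hnsw (embedding vector_cosine_ops)
""")
conn.commit()
conn.close()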
  2. Generate embeddings with Anthropic-compatible workflows

    Anthropic’s Claude models are for generation, not embeddings. In production RAG, pair Claude with an embedding model from your stack and reserve Claude for the answer step. What matters is that the retrieved context eventually gets passed into client.messages.create(). Start with the prompt builder; a concrete embedding helper sketch follows after it.

from anthropic import Anthropic
import os

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def build_prompt(question: str, context: str) -> str:
    return f"""
You are a retail banking assistant.
Answer only using the provided context.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question:
{question}
""".strip()
  3. Insert chunks and vectors into pgvector

    Here’s a practical pattern using OpenAI-style embeddings or any internal embedding service. The storage layer doesn’t care where the vector came from as long as dimensions match.

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

def save_chunk(source: str, content: str, embedding: list[float]):
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)

    with conn.cursor() as cur:
        # Wrap the list in a numpy array so the pgvector adapter sends it
        # as a vector; psycopg would otherwise adapt a plain list as a
        # Postgres array, which does not match the vector type.
        cur.execute(
            "INSERT INTO banking_docs (source, content, embedding) VALUES (%s, %s, %s)",
            (source, content, np.array(embedding)),
        )
    conn.commit()
    conn.close()
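
    When backfilling a whole knowledge base, opening a connection per chunk adds up. A sketch of a batched variant (save_chunks is our own helper name) that reuses one connection with executemany():

def save_chunks(rows: list[tuple[str, str, list[float]]]):
    # rows holds (source, content, embedding) triples.
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)

    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO banking_docs (source, content, embedding) VALUES (%s, %s, %s)",
            [(s, c, np.array(e)) for s, c, e in rows],
        )
    conn.commit()
    conn.close()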
  4. Retrieve top-k similar chunks from pgvector

    Use cosine distance to fetch the most relevant policy snippets for each user question. In pgvector, <=> is the cosine-distance operator, so smaller values mean closer matches.

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

def search_docs(query_embedding: list[float], k: int = 5):
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)

    with conn.cursor() as cur:
        # np.array for the same reason as in save_chunk: the adapter
        # needs it to serialize the parameter as a vector.
        cur.execute(
            """
            SELECT source, content
            FROM banking_docs
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (np.array(query_embedding), k),
        )
        rows = cur.fetchall()

    conn.close()
    return rows
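
    While tuning retrieval, it helps to see the distances themselves. A sketch of a variant (search_docs_with_distance is our own name) that returns them alongside each chunk:

def search_docs_with_distance(query_embedding: list[float], k: int = 5):
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT source, content, embedding <=> %s AS distance
            FROM banking_docs
            ORDER BY distance
            LIMIT %s
            """,
            (np.array(query_embedding), k),
        )
        rows = cur.fetchall()

    conn.close()
    return rows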
  5. Send retrieved context to Anthropic for grounded answers

    This is the actual RAG loop. Retrieve first, then ask Claude to answer strictly from those chunks.

import os
from anthropic import Anthropic

# Read the key from the environment, as in step 2; never hardcode it.
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def answer_question(question: str, query_embedding: list[float]):
    # Retrieve the most relevant chunks, then ask Claude to answer from them.
    docs = search_docs(query_embedding, k=4)
    context = "\n\n".join(f"[{source}] {content}" for source, content in docs)

    prompt = build_prompt(question, context)

    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        messages=[
            {"role": "user", "content": prompt}
        ],
    )

    return response.content[0].text
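
    One design note: the Messages API also takes a top-level system parameter, which is a natural home for the guardrail instructions instead of repeating them in every user message. A sketch of the same call with the persona split out (answer_question_with_system is our own name):

SYSTEM_PROMPT = (
    "You are a retail banking assistant. Answer only using the provided "
    "context. If the context does not contain the answer, say you don't know."
)

def answer_question_with_system(question: str, context: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        system=SYSTEM_PROMPT,
        messages=[
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion:\n{question}"}
        ],
    )
    return response.content[0].text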

Testing the Integration

Run a quick end-to-end test with one known policy chunk and one question. Replace embed_text() with your real embedding function.

def embed_text(text: str) -> list[float]:
    # Placeholder: replace with your real embedding provider.
    # A constant vector makes every document equidistant, so this only
    # works as a smoke test against a single stored chunk.
    return [0.01] * 1536

save_chunk(
    source="card_policy",
    content="Debit card replacement takes 3 to 5 business days and costs $10 unless waived.",
    embedding=embed_text("Debit card replacement takes 3 to 5 business days and costs $10 unless waived.")
)

question = "How long does debit card replacement take?"
result = answer_question(question, embed_text(question))
print(result)

Expected output:

Debit card replacement takes 3 to 5 business days and costs $10 unless waived.

If your retrieval is working correctly, Claude should echo the policy-backed answer instead of inventing one.

Real-World Use Cases

  • Customer support copilot

    • Answer questions about fees, limits, account opening requirements, and dispute timelines from approved bank documents.
  • Branch staff assistant

    • Help frontline staff retrieve product rules quickly during customer conversations without searching multiple internal systems.
  • Policy-grounded virtual agent

    • Build an agent that handles repetitive retail banking queries while staying aligned with compliance-approved knowledge stored in Postgres.

By Cyprian Aarons, AI Consultant at Topiax.