How to Integrate Anthropic for retail banking with pgvector for RAG
Combining Anthropic for retail banking with pgvector gives you a clean pattern for bank-grade RAG: store policy, product, and customer-service knowledge in Postgres, then let Anthropic answer with grounded context instead of hallucinating from memory. For retail banking teams, this is the difference between a generic chatbot and an agent that can explain overdraft fees, card replacement steps, or mortgage document requirements using approved source material.
Prerequisites

- Python 3.10+
- PostgreSQL 14+ with the `pgvector` extension installed
- An Anthropic API key
- Access to your retail banking knowledge base: FAQ docs, product guides, and policy PDFs converted to text chunks
- Python packages: `anthropic`, `psycopg[binary]`, `pgvector`, `python-dotenv`

Install them:

```bash
pip install anthropic "psycopg[binary]" pgvector python-dotenv
```
Integration Steps

Step 1: Enable pgvector and create a table for embeddings

Start by enabling the extension and creating a simple documents table. Use one row per chunk so retrieval stays precise.
```python
import psycopg

conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.execute("""
    CREATE TABLE IF NOT EXISTS banking_docs (
        id SERIAL PRIMARY KEY,
        source TEXT NOT NULL,
        content TEXT NOT NULL,
        embedding vector(1536)
    )
""")
conn.commit()
conn.close()
```
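The one-row-per-chunk layout assumes your documents are already split into chunks. A minimal character-based chunker might look like this; the chunk size, overlap, and function name are illustrative choices, not prescribed by the article:

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character-based chunks."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Overlap keeps sentences that straddle a boundary retrievable.
        start = end - overlap
    return chunks
```

In practice you would chunk on sentence or section boundaries rather than raw characters, but the storage pattern is the same either way.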
Step 2: Generate embeddings with Anthropic-compatible workflows

Anthropic's Claude models are for generation, not embeddings. In production RAG, pair Claude with an embedding model from your stack, then use Anthropic for the answer step. The important part is that the retrieved context later gets passed into `client.messages.create()`.
```python
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def build_prompt(question: str, context: str) -> str:
    return f"""
You are a retail banking assistant.
Answer only using the provided context.
If the context does not contain the answer, say you don't know.
Context:
{context}
Question:
{question}
""".strip()
```
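It helps to inspect what Claude will actually receive. This standalone check repeats `build_prompt` so the snippet runs on its own; the sample context string is made up for illustration:

```python
def build_prompt(question: str, context: str) -> str:
    return f"""
You are a retail banking assistant.
Answer only using the provided context.
If the context does not contain the answer, say you don't know.
Context:
{context}
Question:
{question}
""".strip()

# Hypothetical retrieved chunk, tagged with its source name.
context = "[card_policy] Debit card replacement takes 3 to 5 business days."
prompt = build_prompt("How long does card replacement take?", context)
print(prompt)
```

Tagging each chunk with its source (here `[card_policy]`) makes it easy to ask Claude to cite where an answer came from.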
Step 3: Insert chunks and vectors into pgvector

Here's a practical pattern using OpenAI-style embeddings or any internal embedding service. The storage layer doesn't care where the vector came from as long as the dimensions match.
```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

def save_chunk(source: str, content: str, embedding: list[float]):
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO banking_docs (source, content, embedding) VALUES (%s, %s, %s)",
            # register_vector adapts numpy arrays to the Postgres vector type.
            (source, content, np.array(embedding)),
        )
    conn.commit()
    conn.close()
```
Step 4: Retrieve top-k similar chunks from pgvector

Use cosine distance to fetch the most relevant policy snippets for each user question.
```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

def search_docs(query_embedding: list[float], k: int = 5):
    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/banking")
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT source, content
            FROM banking_docs
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (np.array(query_embedding), k),
        )
        rows = cur.fetchall()
    conn.close()
    return rows
```
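The `<=>` operator computes cosine distance, i.e. 1 minus cosine similarity, so smaller is more similar. A small in-memory sketch of the same ranking logic is handy for unit-testing retrieval without a database; the function names here are illustrative:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    # Mirrors pgvector's <=> operator: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k(query: list[float], docs: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    # Sort ascending by distance, exactly like ORDER BY embedding <=> %s.
    ranked = sorted(docs, key=lambda d: cosine_distance(query, d[1]))
    return [name for name, _ in ranked[:k]]
```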
Step 5: Send retrieved context to Anthropic for grounded answers

This is the actual RAG loop: retrieve first, then ask Claude to answer strictly from those chunks.
```python
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def answer_question(question: str, query_embedding: list[float]):
    docs = search_docs(query_embedding, k=4)
    context = "\n\n".join(f"[{source}] {content}" for source, content in docs)
    prompt = build_prompt(question, context)
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```
Testing the Integration

Run a quick end-to-end test with one known policy chunk and one question. Replace `embed_text()` with your real embedding function.

```python
def embed_text(text: str) -> list[float]:
    # Placeholder: replace with your embedding provider.
    return [0.01] * 1536
```
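One caveat with the constant placeholder: every text maps to the same vector, so retrieval order is meaningless once you store more than one chunk. For wiring tests with several documents, a deterministic hash-based stub (purely illustrative, with no semantic meaning) at least gives different texts different vectors:

```python
import hashlib

def fake_embed(text: str, dims: int = 1536) -> list[float]:
    # Deterministic pseudo-embedding: same text -> same vector.
    # No semantic meaning; only useful for plumbing tests, never production.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 for i in range(dims)]
```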
```python
save_chunk(
    source="card_policy",
    content="Debit card replacement takes 3 to 5 business days and costs $10 unless waived.",
    embedding=embed_text("Debit card replacement takes 3 to 5 business days and costs $10 unless waived."),
)

question = "How long does debit card replacement take?"
result = answer_question(question, embed_text(question))
print(result)
```

Expected output:

```
Debit card replacement takes 3 to 5 business days and costs $10 unless waived.
```

If your retrieval is working correctly, Claude should echo the policy-backed answer instead of inventing one. Note that this single-chunk test works even with the constant placeholder embedding; once you store multiple chunks, swap in a real embedder or retrieval order will be arbitrary.
Real-World Use Cases

- Customer support copilot: answer questions about fees, limits, account opening requirements, and dispute timelines from approved bank documents.
- Branch staff assistant: help frontline staff retrieve product rules quickly during customer conversations without searching multiple internal systems.
- Policy-grounded virtual agent: handle repetitive retail banking queries while staying aligned with compliance-approved knowledge stored in Postgres.
Keep learning

- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap? Grab the free AI Agent Starter Kit: architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit