How to Integrate Anthropic for banking with pgvector for production AI
Combining Anthropic for banking with pgvector gives you a practical pattern for production AI agents: use Anthropic to reason over customer intent, policy, and workflow decisions, then use pgvector to retrieve the right bank documents, product terms, and case history from your own data. That is the difference between a chat demo and an agent that can answer regulated questions with grounded context.
Prerequisites
- Python 3.10+
- PostgreSQL 14+ with the pgvector extension installed
- An Anthropic API key
- A database user with permission to create extensions, tables, and indexes
- A local or hosted Postgres connection string
- pip packages: anthropic, psycopg[binary], pgvector, python-dotenv, numpy
Integration Steps
- Install dependencies and set environment variables. numpy is included because the pgvector Python adapter accepts NumPy arrays for embedding parameters.

```bash
pip install anthropic psycopg[binary] pgvector python-dotenv numpy
export ANTHROPIC_API_KEY="your_key"
export DATABASE_URL="postgresql://postgres:password@localhost:5432/bank_ai"
```
- Create the vector table in Postgres.

Use pgvector to store embeddings for bank policies, FAQs, product docs, and case notes.

```python
import os

import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect(os.environ["DATABASE_URL"])

# The extension must exist before register_vector can look up the vector type.
conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS bank_docs (
            id SERIAL PRIMARY KEY,
            doc_type TEXT NOT NULL,
            content TEXT NOT NULL,
            embedding VECTOR(1536)
        );
    """)
    # ivfflat learns its list centroids at build time, so for best recall
    # create or rebuild this index after the table is populated.
    cur.execute("""
        CREATE INDEX IF NOT EXISTS bank_docs_embedding_idx
        ON bank_docs USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100);
    """)

conn.commit()
conn.close()
```
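If you bulk-load documents after the index already exists, it helps to rebuild the index so ivfflat learns representative centroids, and to tune recall at query time. A minimal sketch; the probes value is an illustrative starting point, not a tuned setting:

```sql
-- Rebuild after bulk-loading so the list centroids reflect real data.
REINDEX INDEX bank_docs_embedding_idx;

-- Raise probes for better recall at some latency cost (default is 1).
SET ivfflat.probes = 10;
```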
- Generate embeddings and insert your banking content.

For production, keep chunk sizes consistent and embed each chunk separately. I'm using Anthropic for generation decisions later; for embeddings you typically use a dedicated embedding model or service. If your stack already produces embeddings from another provider, store them in pgvector the same way.

```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

docs = [
    ("kba", "Customers can reset their debit card PIN through mobile banking after identity verification."),
    ("policy", "Wire transfers above $10,000 require enhanced due diligence and manual review."),
]

# Example embedding placeholder: replace with your embedding pipeline output.
# register_vector adapts NumPy arrays to the Postgres vector type.
def fake_embedding(text):
    return np.array([0.01] * 1536)

conn = psycopg.connect(os.environ["DATABASE_URL"])
register_vector(conn)

with conn.cursor() as cur:
    for doc_type, content in docs:
        cur.execute(
            "INSERT INTO bank_docs (doc_type, content, embedding) VALUES (%s, %s, %s)",
            (doc_type, content, fake_embedding(content)),
        )

conn.commit()
conn.close()
```
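The note above about consistent chunk sizes can be made concrete with a simple chunker. This is a hypothetical sketch, not a tuned implementation: fixed-size character windows with a small overlap so sentences cut at a boundary still appear intact in the next chunk. The sizes are illustrative.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks with overlap.

    Consistent chunk sizes keep embeddings comparable across documents.
    chunk_size and overlap here are illustrative defaults, not tuned values.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Example: chunk a long policy document before embedding each piece.
policy = "Wire transfers above $10,000 require enhanced due diligence. " * 20
chunks = chunk_text(policy, chunk_size=200, overlap=20)
```

Each chunk then goes through the same embed-and-insert loop as a separate `bank_docs` row.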
- Query pgvector for relevant context, then pass it to Anthropic.

This is the core RAG loop: retrieve first, then ask Claude to answer only from the retrieved context.

```python
import os

import anthropic
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Same placeholder as the insert step: swap in your real embedding pipeline.
def fake_embedding(text):
    return np.array([0.01] * 1536)

query = "How do customers reset their debit card PIN?"
query_embedding = fake_embedding(query)

conn = psycopg.connect(os.environ["DATABASE_URL"])
register_vector(conn)

with conn.cursor() as cur:
    # <=> is pgvector's cosine distance operator; lower means more similar.
    cur.execute(
        """
        SELECT doc_type, content
        FROM bank_docs
        ORDER BY embedding <=> %s
        LIMIT 3;
        """,
        (query_embedding,),
    )
    rows = cur.fetchall()

context = "\n".join(f"[{doc_type}] {content}" for doc_type, content in rows)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": f"""Use only the context below to answer the banking question.

Context:
{context}

Question:
{query}
""",
        }
    ],
)

print(response.content[0].text)
conn.close()
```
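A top-k query always returns k rows, even when nothing is actually relevant. One way to guard against off-topic context is to also select the distance (`embedding <=> %s AS distance` in the SQL above) and drop rows beyond a cutoff before building the prompt. A minimal sketch; the 0.35 threshold is an illustrative assumption you would calibrate on your own data:

```python
def filter_by_distance(rows, max_distance=0.35):
    """Keep only rows whose cosine distance is below the threshold.

    rows: (doc_type, content, distance) tuples, as returned by a query
    that selects `embedding <=> %s AS distance`. The 0.35 cutoff is an
    illustrative starting point, not a tuned value.
    """
    return [(doc_type, content) for doc_type, content, distance in rows
            if distance < max_distance]

# Example: the mortgage clause is too far from the query and gets dropped.
rows = [
    ("kba", "Reset PIN via mobile banking.", 0.12),
    ("policy", "Unrelated mortgage clause.", 0.61),
]
relevant = filter_by_distance(rows)
```

If the filtered list is empty, skip the generation call and route to human review instead of prompting Claude with noise.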
- Add guardrails for banking workflows.

In production, separate retrieval from decisioning: let pgvector provide the evidence, then let Anthropic classify intent or draft a response under strict instructions.

```python
import os

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = """
You are a banking assistant.
Only answer using provided context.
If the context is insufficient, say you need a human review.
Never invent policy details.
"""

result = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=200,
    temperature=0,
    system=system_prompt,
    messages=[
        {
            "role": "user",
            "content": "Classify this request: customer wants to increase wire transfer limit to $50,000.",
        }
    ],
)

print(result.content[0].text)
```
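When the model's reply drives a workflow decision, parse it defensively rather than trusting free text. A sketch of the fail-closed pattern, assuming you prompt Claude to reply with JSON like `{"intent": "..."}`; the intent labels here are hypothetical examples, not a fixed taxonomy:

```python
import json

# Hypothetical allow-list: anything outside it goes to a human.
ALLOWED_INTENTS = {"limit_increase", "pin_reset", "wire_transfer", "other"}

def parse_intent(raw):
    """Parse a model reply expected to be JSON like {"intent": "..."}.

    Malformed output or an intent outside the allow-list is routed to
    human review: in a banking workflow, failing closed is the safe default.
    """
    try:
        intent = json.loads(raw).get("intent")
    except (json.JSONDecodeError, AttributeError):
        return "human_review"
    return intent if intent in ALLOWED_INTENTS else "human_review"
```

For example, `parse_intent(result.content[0].text)` would map a well-formed classification to an allowed label and everything else, including prose explanations, to `"human_review"`.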
Testing the Integration
Run a retrieval-plus-generation test against one known policy record. In the retrieval script from the query step, swap in a question that maps to a record you inserted, then re-run it:

```python
query = "What happens when a wire transfer is above $10,000?"
# Expected: the policy chunk is retrieved and the answer stays grounded in it.
print("Question:", query)
print("Answer:", response.content[0].text)
```

Expected output:

```
Question: What happens when a wire transfer is above $10,000?
Answer: Wire transfers above $10,000 require enhanced due diligence and manual review.
```
If you get an empty or off-topic answer:
- Check that your embeddings are populated correctly
- Verify the vector extension is enabled
- Confirm the cosine distance query uses the <=> operator
- Make sure Claude receives the retrieved context in the prompt
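The first check can be automated before data ever reaches Postgres. A small sanity-check helper, a sketch under the assumption that your records are (doc_type, content, embedding) tuples destined for bank_docs:

```python
def validate_embeddings(records, expected_dim=1536):
    """Return the indexes of records whose embedding is missing or has
    the wrong dimension. Run this before inserting into bank_docs, since
    a NULL or mis-sized embedding silently breaks retrieval."""
    bad = []
    for i, (doc_type, content, embedding) in enumerate(records):
        if embedding is None or len(embedding) != expected_dim:
            bad.append(i)
    return bad

# Example: one good record, one truncated embedding, one missing embedding.
records = [
    ("kba", "ok", [0.01] * 1536),
    ("policy", "truncated", [0.01] * 10),
    ("faq", "missing", None),
]
problems = validate_embeddings(records)
```

An empty result means every record is safe to insert; otherwise fix the listed indexes first.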
Real-World Use Cases
- Customer support copilots that answer account-policy questions from internal docs instead of guessing.
- Compliance assistants that retrieve KYC/AML procedures from pgvector and have Claude draft reviewer notes.
- Agent workflows that classify incoming requests, fetch the right policy snippet, and generate a human-readable action summary for ops teams.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit