How to Integrate Anthropic with pgvector for Banking AI Agents
Combining Anthropic's Claude models with pgvector gives you a clean pattern for building AI agents that can answer policy-heavy banking questions, retrieve relevant internal knowledge, and keep responses grounded in approved source material. Anthropic handles the reasoning and response generation, while pgvector gives your agent semantic retrieval over policies, product docs, call notes, KYC playbooks, and risk guidance.
That matters in banking because most agent failures come from hallucinated answers or missing context. With this setup, the agent can pull the right documents first, then ask Anthropic to draft a response that stays inside your compliance boundaries.
Prerequisites
- Python 3.10+
- PostgreSQL 15+ with the `pgvector` extension enabled
- An Anthropic API key
- A banking knowledge base you can index:
  - policy PDFs
  - internal SOPs
  - product FAQs
  - compliance notes
- Python packages:
  - `anthropic`
  - `psycopg[binary]`
  - `pgvector`
  - `python-dotenv`
- A database user with permission to create tables and enable extensions
Install the dependencies:
```bash
pip install anthropic psycopg[binary] pgvector python-dotenv
```
Integration Steps
1) Set up PostgreSQL with pgvector
Create the extension and a table for embeddings. Use a fixed embedding size that matches your model output.
```python
import os

import psycopg
from dotenv import load_dotenv
from pgvector.psycopg import register_vector

load_dotenv()  # loads DATABASE_URL (and later ANTHROPIC_API_KEY) from .env
DB_URL = os.getenv("DATABASE_URL")

with psycopg.connect(DB_URL) as conn:
    with conn.cursor() as cur:
        # The extension must exist before register_vector() can look up
        # the vector type on this connection.
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.commit()
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS bank_docs (
                id SERIAL PRIMARY KEY,
                title TEXT NOT NULL,
                content TEXT NOT NULL,
                embedding VECTOR(1536)
            );
        """)
    conn.commit()
```
If you already have a schema, keep the same idea: store text plus a vector column, and index it later with HNSW or IVFFlat.
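For the L2 `<->` operator used in the queries below, an HNSW index would look like the following sketch (the index name and `IF NOT EXISTS` guard are my additions; swap in `vector_cosine_ops` if you query with `<=>`):

```python
import os

import psycopg

DB_URL = os.getenv("DATABASE_URL")

with psycopg.connect(DB_URL) as conn:
    with conn.cursor() as cur:
        # HNSW index for L2 distance, matching the <-> operator used later.
        cur.execute(
            "CREATE INDEX IF NOT EXISTS bank_docs_embedding_idx "
            "ON bank_docs USING hnsw (embedding vector_l2_ops);"
        )
    conn.commit()
```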
2) Generate embeddings for your banking documents
Anthropic is used here for generation and reasoning; its API does not offer an embeddings endpoint. For embeddings, use a dedicated embedding model from your stack if you already have one; if not, keep the retrieval layer separate from the chat model. The important part is that whatever embedding service you use must produce vectors with the same dimension as your `VECTOR(...)` column.
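A cheap way to enforce that invariant is to fail fast at startup. In this sketch, `embed()` is a hypothetical stand-in for your real provider call, not a real API:

```python
def embed(text: str) -> list[float]:
    # Hypothetical stand-in for your real embedding provider call.
    return [0.0] * 1536

EXPECTED_DIM = 1536  # must match VECTOR(1536) in the schema

vec = embed("dimension check")
if len(vec) != EXPECTED_DIM:
    raise ValueError(f"embedding dim {len(vec)} != column dim {EXPECTED_DIM}")
```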
Here is a simple ingestion pattern using placeholder embeddings so the database wiring is clear:
```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.getenv("DATABASE_URL")

def fake_embedding(text: str) -> list[float]:
    # Replace with real embeddings from your embedding provider.
    return [0.0] * 1536

docs = [
    ("KYC Policy", "Customers must provide government ID and proof of address."),
    ("Wire Transfer Limits", "Daily outbound wire limit is $25,000 for retail accounts."),
]

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        for title, content in docs:
            # register_vector() adapts numpy arrays to the vector type.
            emb = np.array(fake_embedding(content))
            cur.execute(
                "INSERT INTO bank_docs (title, content, embedding) VALUES (%s, %s, %s)",
                (title, content, emb),
            )
    conn.commit()
```
In production, swap `fake_embedding()` for your real embedding pipeline and batch your inserts.
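A hedged sketch of that batched variant, where `embed_batch()` is a hypothetical stand-in for your provider's batch call:

```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.getenv("DATABASE_URL")

def embed_batch(texts: list[str]) -> list[list[float]]:
    # Hypothetical stand-in: one vector per input text from your provider.
    return [[0.0] * 1536 for _ in texts]

docs = [
    ("KYC Policy", "Customers must provide government ID and proof of address."),
    ("Wire Transfer Limits", "Daily outbound wire limit is $25,000 for retail accounts."),
]

embeddings = embed_batch([content for _, content in docs])
rows = [(t, c, np.array(e)) for (t, c), e in zip(docs, embeddings)]

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        # executemany inserts all rows without a Python-level execute per doc.
        cur.executemany(
            "INSERT INTO bank_docs (title, content, embedding) VALUES (%s, %s, %s)",
            rows,
        )
    conn.commit()
```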
3) Query pgvector for relevant context
When a user asks a question, embed the query and retrieve the nearest documents. This is the retrieval step your agent depends on before calling Anthropic.
```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.getenv("DATABASE_URL")

def fake_embedding(text: str) -> list[float]:
    return [0.0] * 1536

question = "What documents are required for KYC onboarding?"
query_vec = np.array(fake_embedding(question))

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT title, content
            FROM bank_docs
            ORDER BY embedding <-> %s
            LIMIT 3;
            """,
            (query_vec,),
        )
        results = cur.fetchall()

context = "\n\n".join([f"{title}: {content}" for title, content in results])
print(context)
```
The `<->` operator is pgvector's Euclidean (L2) distance operator; pgvector also ships `<=>` for cosine distance and `<#>` for negative inner product. In most agent systems, this retrieval step runs on every user turn.
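If your embedding model is tuned for cosine similarity, the same query works with `<=>`. A sketch, using a nonzero placeholder vector because cosine distance is undefined for the all-zero vector:

```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.getenv("DATABASE_URL")

# Nonzero placeholder; swap in a real query embedding in practice.
query_vec = np.full(1536, 0.1)

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        # <=> is cosine distance; <#> is negative inner product.
        cur.execute(
            "SELECT title, content FROM bank_docs ORDER BY embedding <=> %s LIMIT 3;",
            (query_vec,),
        )
        print(cur.fetchall())
```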
4) Call Anthropic with retrieved context
Now pass the retrieved context into Anthropic’s Messages API. Keep the prompt tight: define role, constraints, and expected output format.
```python
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

question = "What documents are required for KYC onboarding?"
context = """
KYC Policy: Customers must provide government ID and proof of address.
Wire Transfer Limits: Daily outbound wire limit is $25,000 for retail accounts.
"""

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    system=(
        "You are a banking assistant. Answer only using provided context. "
        "If context is insufficient, say what is missing."
    ),
    messages=[
        {
            "role": "user",
            "content": f"Question: {question}\n\nContext:\n{context}",
        }
    ],
)
print(response.content[0].text)
```
This is the core pattern: retrieve first, generate second. That keeps responses grounded and audit-friendly.
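One way to keep it audit-friendly in practice is to log which documents grounded each answer. A minimal sketch; the `log_grounding()` helper and its JSON shape are illustrative, not a prescribed schema:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)

def log_grounding(question: str, doc_titles: list[str], answer: str) -> None:
    # One JSON line per turn: what was asked, what grounded it, what was said.
    logging.info(json.dumps({
        "ts": time.time(),
        "question": question,
        "grounding_docs": doc_titles,
        "answer": answer,
    }))

log_grounding(
    "What documents are required for KYC onboarding?",
    ["KYC Policy"],
    "Customers must provide government ID and proof of address.",
)
```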
5) Wrap both pieces into an AI agent function
For an actual agent service, hide retrieval and generation behind one function. This makes it easy to plug into FastAPI, Celery workers, or your orchestration layer.
```python
import os

import numpy as np
import psycopg
from anthropic import Anthropic
from pgvector.psycopg import register_vector

client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
DB_URL = os.getenv("DATABASE_URL")

def embed_query(text: str) -> list[float]:
    # Replace with real embeddings.
    return [0.0] * 1536

def answer_banking_question(question: str) -> str:
    qvec = np.array(embed_query(question))
    with psycopg.connect(DB_URL) as conn:
        register_vector(conn)
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT title, content
                FROM bank_docs
                ORDER BY embedding <-> %s
                LIMIT 3;
                """,
                (qvec,),
            )
            rows = cur.fetchall()
    context = "\n".join([f"{t}: {c}" for t, c in rows])
    resp = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=250,
        temperature=0,
        system="Answer strictly from retrieved banking context.",
        messages=[{"role": "user", "content": f"{question}\n\n{context}"}],
    )
    return resp.content[0].text.strip()
```
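As one example of that plumbing, a minimal FastAPI wrapper could look like this sketch; it assumes `answer_banking_question()` above lives in a hypothetical `agent` module:

```python
from fastapi import FastAPI
from pydantic import BaseModel

from agent import answer_banking_question  # hypothetical module path

app = FastAPI()

class AskRequest(BaseModel):
    question: str

@app.post("/ask")
def ask(req: AskRequest) -> dict:
    # Retrieval and generation both happen inside answer_banking_question().
    return {"answer": answer_banking_question(req.question)}
```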
Testing the Integration
Run a direct smoke test against both systems:
```python
answer = answer_banking_question("What documents do I need for KYC onboarding?")
print(answer)
```
Expected output (exact wording will vary by model version):

```text
Customers must provide government ID and proof of address for KYC onboarding.
If additional verification is needed based on risk profile, request supporting documentation from compliance.
```
If you get empty or irrelevant answers:

- check that your vectors are being stored correctly
- verify your query embedding matches the document embedding model and dimension
- confirm pgvector ordering is returning sensible nearest neighbors (a quick sanity check is sketched below)
- keep `temperature=0` while validating behavior
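Here is the sanity check mentioned above: a hedged diagnostic that prints the row count and the raw distances for a probe vector, so you can see whether nearest-neighbor ordering looks sensible:

```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.getenv("DATABASE_URL")

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        # How many rows actually made it into the table?
        cur.execute("SELECT count(*) FROM bank_docs;")
        print("rows:", cur.fetchone()[0])
        # Raw distances for a probe vector: near-identical values for
        # unrelated documents usually point at an embedding problem.
        probe = np.full(1536, 0.1)  # stand-in for a real query embedding
        cur.execute(
            "SELECT title, embedding <-> %s AS distance "
            "FROM bank_docs ORDER BY distance LIMIT 5;",
            (probe,),
        )
        for title, distance in cur.fetchall():
            print(f"{distance:.4f}  {title}")
```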
Real-World Use Cases
- Compliance assistant: retrieve AML/KYC policies from pgvector and have Anthropic draft compliant responses for ops teams.
- Agentic customer support: let an internal support agent answer product questions using approved bank documentation only.
- Analyst copilot: search call transcripts or incident notes semantically and generate concise summaries for relationship managers or risk analysts.
This pattern scales well because each part has one job. pgvector handles grounding; Anthropic handles language and reasoning; your agent layer handles orchestration and guardrails.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist + starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit