How to Integrate Anthropic with pgvector for Retail Banking Startups
Combining Anthropic's Claude models with pgvector gives you a practical pattern for building bank-grade AI agents that answer customer questions from internal knowledge, retrieve relevant policy snippets, and keep responses grounded in your own data. For startups, this is the fastest way to ship a support or operations assistant that handles retail banking workflows without stuffing everything into the prompt.
Prerequisites
- Python 3.10+
- A PostgreSQL database with the pgvector extension enabled
- An Anthropic API key
- Access to your retail banking knowledge base, such as:
  - product FAQs
  - fee schedules
  - KYC/AML policy docs
  - card dispute procedures
- Installed packages:
  - anthropic
  - psycopg2-binary
  - pgvector
  - python-dotenv
- A valid embedding strategy for your documents
- Network access from your app to PostgreSQL and Anthropic
Install the dependencies:
pip install anthropic psycopg2-binary pgvector python-dotenv
Integration Steps
1) Set up pgvector in PostgreSQL
Create the extension and a table for document chunks plus embeddings.
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
conn.autocommit = True

with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS bank_docs (
            id SERIAL PRIMARY KEY,
            doc_type TEXT NOT NULL,
            content TEXT NOT NULL,
            embedding VECTOR(1536)
        );
    """)

conn.close()
If you are using OpenAI-style embeddings elsewhere in your stack, VECTOR(1536) is a common size. Match this dimension to the embedding model you actually use.
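A mismatch between the embedding dimension and the column definition is one of the most common failure modes in this setup. A small guard like the following fails fast before the INSERT ever runs (`validate_embedding` is a sketch, not part of any library here):

```python
def validate_embedding(emb: list[float], expected_dim: int = 1536) -> list[float]:
    """Fail fast if an embedding's length does not match the table's VECTOR(n) column."""
    if len(emb) != expected_dim:
        raise ValueError(
            f"embedding has {len(emb)} dimensions, table expects {expected_dim}"
        )
    return emb
```

Calling this on every vector before insertion turns a confusing Postgres error into an immediate, readable one.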
2) Generate embeddings for retail banking content
Anthropic is used for generation and reasoning. For vector search, you still need embeddings, so store document vectors in pgvector using an embedding model from your stack.
from openai import OpenAI
import psycopg2
from pgvector.psycopg2 import register_vector

client = OpenAI()

def embed_text(text: str):
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return resp.data[0].embedding

docs = [
    ("faq", "Debit card replacements take 5 to 7 business days."),
    ("policy", "Cash deposits above $10,000 require enhanced due diligence."),
]

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
register_vector(conn)

with conn.cursor() as cur:
    for doc_type, content in docs:
        emb = embed_text(content)
        cur.execute(
            "INSERT INTO bank_docs (doc_type, content, embedding) VALUES (%s, %s, %s)",
            (doc_type, content, emb)
        )

conn.commit()
conn.close()
This gives you retrieval over policy snippets, product terms, and operational runbooks.
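The example above stores one sentence per row; real policy documents need to be split into chunks before embedding. A minimal fixed-size chunker with overlap looks like this (a sketch; tune `max_chars` and `overlap` for your own documents, or swap in a sentence-aware splitter):

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping fixed-size chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Overlap keeps a sentence from being cut cleanly between two chunks.
        start = end - overlap
    return chunks
```

Each chunk then gets its own row and embedding in bank_docs, so retrieval can surface the specific passage that answers a question.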
3) Retrieve relevant context from pgvector
Use cosine distance to fetch the closest chunks for a customer question.
import psycopg2
from pgvector.psycopg2 import register_vector
from openai import OpenAI

embed_client = OpenAI()

def embed_query(query: str):
    resp = embed_client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    return resp.data[0].embedding

query = "How long does it take to replace a lost debit card?"
query_vec = embed_query(query)

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        SELECT doc_type, content
        FROM bank_docs
        ORDER BY embedding <=> %s::vector
        LIMIT 3;
    """, (query_vec,))
    results = cur.fetchall()

conn.close()

context = "\n".join(f"[{doc_type}] {content}" for doc_type, content in results)
print(context)
The <=> operator is the standard pgvector cosine distance operator. That is the retrieval layer your agent will depend on.
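To make the ranking concrete: <=> returns 1 minus cosine similarity, so vectors pointing the same way score 0 and orthogonal vectors score 1. A pure-Python equivalent, for intuition only (Postgres computes this natively and far faster):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Pure-Python equivalent of pgvector's <=>: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

cosine_distance([1.0, 0.0], [1.0, 0.0])  # 0.0: same direction
cosine_distance([1.0, 0.0], [0.0, 1.0])  # 1.0: orthogonal
```

Because ORDER BY sorts ascending, the smallest distances, i.e. the most similar chunks, come back first.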
4) Call Anthropic with retrieved context
Now pass the retrieved banking context into Anthropic’s Messages API so the model answers from your data instead of guessing.
import anthropic

client = anthropic.Anthropic(api_key="your-anthropic-api-key")

prompt_context = """
[faq] Debit card replacements take 5 to 7 business days.
[policy] Cash deposits above $10,000 require enhanced due diligence.
"""

question = "How long does it take to replace a lost debit card?"

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": f"""You are a retail banking support assistant.
Answer only using the provided context.

Context:
{prompt_context}

Question:
{question}"""
        }
    ]
)

print(message.content[0].text)
Use temperature=0 for support workflows where consistency matters more than creativity.
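Inlining the prompt works for a demo, but pulling the template into a helper keeps the grounding instruction identical across every call site. A sketch (the wording mirrors the prompt above; adjust it to your compliance-approved tone):

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble the grounded prompt sent to the Messages API."""
    return (
        "You are a retail banking support assistant.\n"
        "Answer only using the provided context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question}"
    )
```

Centralizing the template also makes it trivial to version prompts and diff changes during compliance review.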
5) Wrap retrieval + generation into one agent function
This is the production shape: retrieve first, then generate grounded output.
import anthropic
import psycopg2
from openai import OpenAI
from pgvector.psycopg2 import register_vector

anthropic_client = anthropic.Anthropic(api_key="your-anthropic-api-key")
embed_client = OpenAI()

def get_embedding(text: str):
    resp = embed_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return resp.data[0].embedding

def retrieve_context(question: str):
    qvec = get_embedding(question)
    conn = psycopg2.connect(
        host="localhost",
        dbname="bank_ai",
        user="postgres",
        password="postgres"
    )
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute("""
            SELECT content
            FROM bank_docs
            ORDER BY embedding <=> %s::vector
            LIMIT 3;
        """, (qvec,))
        rows = cur.fetchall()
    conn.close()
    return "\n".join(row[0] for row in rows)

def answer_question(question: str):
    context = retrieve_context(question)
    response = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=250,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""You are a retail banking assistant.
Use only this context:
{context}

Question: {question}"""
        }]
    )
    return response.content[0].text

print(answer_question("What is the timeline for debit card replacement?"))
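One guardrail worth adding before the Messages call: when retrieval returns nothing, skip generation entirely and return a safe fallback rather than letting the model improvise. A sketch, with the model call abstracted behind a `generate` callable (a hypothetical parameter for this example, not an Anthropic API):

```python
FALLBACK = "I can't find that in our documentation. Please contact support."

def answer_with_guardrail(context: str, generate, fallback: str = FALLBACK) -> str:
    """Only call the model when retrieval actually produced context."""
    if not context.strip():
        return fallback
    return generate(context)
```

In a banking setting, a predictable "I don't know" is almost always safer than a fluent guess about fees or compliance rules.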
Testing the Integration
Run a simple end-to-end check against a known banking question.
result = answer_question("How long does it take to replace a lost debit card?")
print(result)
Expected output:
Debit card replacements take 5 to 7 business days.
If you get an unrelated answer, check these first:
- Your embedding dimension matches the table definition.
- The retrieval query returns the right chunks.
- The prompt says "use only this context."
- Your database has enough domain-specific documents loaded.
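To make these checks repeatable, a tiny smoke-test harness over question/expected-phrase pairs helps; `answer_question` is the function from step 5, and the harness itself is a sketch:

```python
def run_smoke_tests(answer_fn, cases: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Run (question, expected_substring) cases; return the ones that failed."""
    failures = []
    for question, expected in cases:
        answer = answer_fn(question)
        if expected.lower() not in answer.lower():
            failures.append((question, answer))
    return failures
```

Run it after every document reload or prompt change, e.g. `run_smoke_tests(answer_question, [("lost debit card", "5 to 7 business days")])`, and treat a non-empty result as a broken deployment.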
Real-World Use Cases
- Retail banking support agent: answers questions about fees, limits, replacement cards, account opening steps, and wire transfer timelines using approved internal docs.
- Ops copilot for bank staff: retrieves policy snippets for KYC/AML checks, dispute handling, and escalation paths before generating staff-facing guidance.
- Customer self-service assistant: handles common account servicing questions while keeping responses anchored in product terms and compliance-approved language.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit