How to Integrate Anthropic with pgvector for Retail Banking Startups

By Cyprian Aarons · Updated 2026-04-21
Tags: anthropic-for-retail-banking, pgvector, startups

Combining Anthropic's Claude models with pgvector gives you a practical pattern for building bank-grade AI agents that answer customer questions from internal knowledge, retrieve relevant policy snippets, and keep responses grounded in your own data. For a startup, this is the fastest way to ship a retail banking support or operations assistant without stuffing everything into the prompt.

Prerequisites

  • Python 3.10+
  • A PostgreSQL database with the pgvector extension enabled
  • An Anthropic API key
  • Access to your retail banking knowledge base, such as:
    • product FAQs
    • fee schedules
    • KYC/AML policy docs
    • card dispute procedures
  • Installed packages:
    • anthropic
    • psycopg2-binary
    • pgvector
    • python-dotenv
  • A valid embedding strategy for your documents
  • Network access from your app to PostgreSQL and Anthropic

Install the dependencies:

pip install anthropic psycopg2-binary pgvector python-dotenv
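
Since python-dotenv is in the dependency list, it is worth wiring credentials through a .env file from the start rather than hardcoding them. A minimal sketch (the variable names are assumptions; adjust to your setup):

# .env (illustrative contents, not from this guide)
# ANTHROPIC_API_KEY=...
# OPENAI_API_KEY=...

import os
from dotenv import load_dotenv

load_dotenv()  # loads .env into the process environment

# The Anthropic and OpenAI SDKs read ANTHROPIC_API_KEY and OPENAI_API_KEY
# from the environment automatically, so no keys need to appear in code.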

Integration Steps

1) Set up pgvector in PostgreSQL

Create the extension and a table for document chunks plus embeddings.

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
conn.autocommit = True

with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS bank_docs (
            id SERIAL PRIMARY KEY,
            doc_type TEXT NOT NULL,
            content TEXT NOT NULL,
            embedding VECTOR(1536)
        );
    """)

conn.close()

VECTOR(1536) matches OpenAI's text-embedding-3-small, which the next step uses. Match this dimension to the embedding model you actually deploy; pgvector rejects inserts whose length differs from the column definition.
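
Retrieval below uses cosine distance, so once the table grows past a few thousand rows it is worth adding an approximate index. A minimal sketch using pgvector's HNSW index (requires pgvector 0.5.0+; the index name is an assumption):

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
conn.autocommit = True

with conn.cursor() as cur:
    # vector_cosine_ops matches the <=> operator used in the queries below
    cur.execute("""
        CREATE INDEX IF NOT EXISTS bank_docs_embedding_idx
        ON bank_docs USING hnsw (embedding vector_cosine_ops);
    """)

conn.close()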

2) Generate embeddings for retail banking content

Anthropic handles generation and reasoning, but its API does not include an embeddings endpoint. For vector search you still need embeddings, so generate document vectors with an embedding model from elsewhere in your stack and store them in pgvector; the examples here use OpenAI's text-embedding-3-small.

from openai import OpenAI
import psycopg2
from pgvector.psycopg2 import register_vector

client = OpenAI()

def embed_text(text: str):
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return resp.data[0].embedding

docs = [
    ("faq", "Debit card replacements take 5 to 7 business days."),
    ("policy", "Cash deposits above $10,000 require enhanced due diligence."),
]

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
register_vector(conn)

with conn.cursor() as cur:
    for doc_type, content in docs:
        emb = embed_text(content)
        # the embedding list is stored in the VECTOR(1536) column
        cur.execute(
            "INSERT INTO bank_docs (doc_type, content, embedding) VALUES (%s, %s, %s)",
            (doc_type, content, emb)
        )

conn.commit()
conn.close()

This gives you retrieval over policy snippets, product terms, and operational runbooks.
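
The sample docs above are single sentences, but real fee schedules and policy documents are long, so you will normally split them into chunks before embedding. A naive paragraph-based chunker as a sketch (the 800-character limit is an assumption to tune against your own documents):

def chunk_document(text: str, max_chars: int = 800):
    """Split a document into rough chunks on paragraph boundaries."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks  # a paragraph longer than max_chars becomes its own chunk

Each chunk is then embedded and inserted exactly like the docs above.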

3) Retrieve relevant context from pgvector

Use cosine distance to fetch the closest chunks for a customer question.

import psycopg2
from pgvector.psycopg2 import register_vector
from openai import OpenAI

embed_client = OpenAI()

def embed_query(query: str):
    resp = embed_client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    return resp.data[0].embedding

query = "How long does it take to replace a lost debit card?"
query_vec = embed_query(query)

conn = psycopg2.connect(
    host="localhost",
    dbname="bank_ai",
    user="postgres",
    password="postgres"
)
register_vector(conn)

with conn.cursor() as cur:
    cur.execute("""
        SELECT doc_type, content
        FROM bank_docs
        ORDER BY embedding <=> %s::vector
        LIMIT 3;
    """, (query_vec,))
    results = cur.fetchall()

conn.close()

context = "\n".join([f"[{doc_type}] {content}" for doc_type, content in results])
print(context)

The <=> operator is pgvector's cosine distance operator (<-> is Euclidean distance, <#> is negative inner product). That is the retrieval layer your agent will depend on.
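
Cosine distance runs from 0 (same direction) to 2 (opposite), so you can also return the distance and drop weak matches before they reach the prompt. A sketch reusing query_vec and the connection setup above (the 0.5 cutoff is an assumption to calibrate on your own data):

with conn.cursor() as cur:
    cur.execute("""
        SELECT doc_type, content, embedding <=> %s::vector AS distance
        FROM bank_docs
        ORDER BY distance
        LIMIT 3;
    """, (query_vec,))
    # Keep only reasonably close matches; tune the threshold empirically
    results = [(t, c) for t, c, d in cur.fetchall() if d < 0.5]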

4) Call Anthropic with retrieved context

Now pass the retrieved banking context into Anthropic’s Messages API so the model answers from your data instead of guessing.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt_context = """
[faq] Debit card replacements take 5 to 7 business days.
[policy] Cash deposits above $10,000 require enhanced due diligence.
"""

question = "How long does it take to replace a lost debit card?"

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": f"""
You are a retail banking support assistant.
Answer only using the provided context.

Context:
{prompt_context}

Question:
{question}
"""
        }
    ]
)

print(message.content[0].text)

Use temperature=0 for support workflows where consistency matters more than creativity.
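
The Messages API also takes a top-level system parameter, which is a cleaner home for standing instructions than the user turn. A variant of the call above using it, with an explicit fallback instruction added:

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    system=(
        "You are a retail banking support assistant. "
        "Answer only using the provided context. "
        "If the context does not contain the answer, say so."
    ),
    messages=[{
        "role": "user",
        "content": f"Context:\n{prompt_context}\n\nQuestion:\n{question}"
    }]
)
print(message.content[0].text)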

5) Wrap retrieval + generation into one agent function

This is the production shape: retrieve first, then generate grounded output.

import anthropic
import psycopg2
from openai import OpenAI
from pgvector.psycopg2 import register_vector

anthropic_client = anthropic.Anthropic()  # ANTHROPIC_API_KEY from the environment
embed_client = OpenAI()                   # OPENAI_API_KEY from the environment

def get_embedding(text: str):
    resp = embed_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return resp.data[0].embedding

def retrieve_context(question: str):
    qvec = get_embedding(question)

    conn = psycopg2.connect(
        host="localhost",
        dbname="bank_ai",
        user="postgres",
        password="postgres"
    )
    register_vector(conn)

    with conn.cursor() as cur:
        cur.execute("""
            SELECT content
            FROM bank_docs
            ORDER BY embedding <=> %s::vector
            LIMIT 3;
        """, (qvec,))
        rows = cur.fetchall()

    conn.close()
    return "\n".join(row[0] for row in rows)

def answer_question(question: str):
    context = retrieve_context(question)

    response = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=250,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""
You are a retail banking assistant.
Use only this context:

{context}

Question: {question}
"""
        }]
    )
    return response.content[0].text

print(answer_question("What is the timeline for debit card replacement?"))
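
Because each chunk carries a doc_type, you can also scope retrieval to a document class, for example limiting a staff-facing compliance copilot to policy chunks. A sketch of the retrieval function with an optional filter (the function and parameter names are assumptions):

def retrieve_context_filtered(question: str, doc_type: str | None = None):
    qvec = get_embedding(question)
    conn = psycopg2.connect(host="localhost", dbname="bank_ai",
                            user="postgres", password="postgres")
    register_vector(conn)
    with conn.cursor() as cur:
        if doc_type:
            # Only search chunks of the requested type, e.g. "policy"
            cur.execute("""
                SELECT content FROM bank_docs
                WHERE doc_type = %s
                ORDER BY embedding <=> %s::vector
                LIMIT 3;
            """, (doc_type, qvec))
        else:
            cur.execute("""
                SELECT content FROM bank_docs
                ORDER BY embedding <=> %s::vector
                LIMIT 3;
            """, (qvec,))
        rows = cur.fetchall()
    conn.close()
    return "\n".join(row[0] for row in rows)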

Testing the Integration

Run a simple end-to-end check against a known banking question.

result = answer_question("How long does it take to replace a lost debit card?")
print(result)

Expected output:

Debit card replacements take 5 to 7 business days.

If you get an unrelated answer, check these first:

  • Your embedding dimension matches the table definition (a quick check follows this list).
  • The retrieval query returns the right chunks.
  • The prompt says “use only this context.”
  • Your database has enough domain-specific documents loaded.
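
A quick way to rule out the first item is to compare the model's output length with the column definition, using the helper defined earlier:

dim = len(get_embedding("dimension check"))
print(dim)  # must equal the 1536 in VECTOR(1536), or inserts will fail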

Real-World Use Cases

  • Retail banking support agent: answers questions about fees, limits, replacement cards, account opening steps, and wire transfer timelines using approved internal docs.
  • Ops copilot for bank staff: retrieves policy snippets for KYC/AML checks, dispute handling, and escalation paths before generating staff-facing guidance.
  • Customer self-service assistant: handles common account servicing questions while keeping responses anchored in product terms and compliance-approved language.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

