How to Integrate Anthropic with pgvector for Lending AI Agents
Combining Anthropic for lending workflows with pgvector gives you a practical pattern for AI agents that need both reasoning and retrieval. In lending, that usually means answering policy questions, summarizing borrower context, and pulling the right supporting documents without stuffing everything into the prompt.
The useful part is simple: Anthropic handles the language reasoning, while pgvector stores embeddings for loan docs, underwriting notes, product guides, and compliance policies. Your agent can retrieve the most relevant context first, then ask Claude to produce a grounded answer or action.
Prerequisites
- Python 3.10+
- A running PostgreSQL 15+ instance
- pgvector installed in PostgreSQL
- An Anthropic API key
- Access to an embedding model for vector generation
- psycopg[binary], pgvector, and anthropic Python packages installed
- A lending knowledge base to index: policy PDFs, underwriting rules, product sheets, servicing playbooks
Install the packages:
pip install anthropic psycopg[binary] pgvector
Enable the extension in Postgres:
CREATE EXTENSION IF NOT EXISTS vector;
Integration Steps
1) Create the table for lending documents
Store text chunks and embeddings together. Keep metadata in JSONB so you can filter by document type, product line, or jurisdiction.
import psycopg
conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending")
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.execute("""
CREATE TABLE IF NOT EXISTS lending_chunks (
id BIGSERIAL PRIMARY KEY,
doc_id TEXT NOT NULL,
chunk TEXT NOT NULL,
metadata JSONB DEFAULT '{}'::jsonb,
embedding vector(1536)
)
""")
conn.commit()
Anthropic does not ship its own embeddings model, so you will pair Claude with a third-party embedding provider. Whichever model you choose, keep the column dimension aligned with its output size; the table design stays the same.
2) Generate embeddings and insert chunks into pgvector
Use an embedding model to convert lending content into vectors. If your pipeline already chunks documents, insert each chunk with its embedding.
import os
import json
import anthropic
import psycopg
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending")
def embed_text(text: str) -> list[float]:
    # Replace with your embedding provider call.
    # Keep this function isolated so you can swap models without changing retrieval code.
    raise NotImplementedError("Use your embedding model here")
chunks = [
{
"doc_id": "underwriting_policy_2025",
"chunk": "Debt-to-income ratio must be below 43% for standard consumer loans.",
"metadata": {"type": "policy", "product": "consumer_loan"}
},
{
"doc_id": "servicing_playbook",
"chunk": "Escalate any payment dispute older than 30 days to servicing operations.",
"metadata": {"type": "playbook", "team": "servicing"}
}
]
with conn.cursor() as cur:
    for item in chunks:
        embedding = embed_text(item["chunk"])
        cur.execute(
            """
            INSERT INTO lending_chunks (doc_id, chunk, metadata, embedding)
            VALUES (%s, %s, %s::jsonb, %s::vector)
            """,
            (item["doc_id"], item["chunk"], json.dumps(item["metadata"]), embedding)
        )
conn.commit()
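If you want to exercise the pipeline before wiring a real provider, a deterministic stand-in for embed_text is enough. This is a hash-based sketch: the vectors carry no semantic similarity, so use it only to verify plumbing end to end, then swap in a real model.

```python
import hashlib
import math

def embed_text_stub(text: str, dim: int = 1536) -> list[float]:
    # Deterministic placeholder: hash the text into a unit-length vector.
    # Useful for plumbing tests only; hash vectors have no semantic meaning.
    raw = hashlib.sha256(text.encode("utf-8")).digest()
    values = [(raw[i % len(raw)] - 127.5) / 127.5 for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]
```

The same text always maps to the same vector, which makes retrieval tests reproducible.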
For production, batch inserts and normalize chunk size around 300–800 tokens. That keeps retrieval precise.
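Exact token-aware chunking depends on your tokenizer, but a word-window approximation is a reasonable sketch. Word counts here stand in for tokens (an assumption); tune max_words so chunks land in the 300–800 token range for your model.

```python
def chunk_words(text: str, max_words: int = 300, overlap: int = 30) -> list[str]:
    # Split text into overlapping word-window chunks.
    # Overlap keeps policy sentences that straddle a boundary retrievable.
    words = text.split()
    if not words:
        return []
    step = max(max_words - overlap, 1)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

In practice you would also split on document structure (headings, clauses) before falling back to fixed windows.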
3) Retrieve relevant context with pgvector similarity search
Use cosine distance to pull the nearest chunks for a user question. This is the retrieval step your agent will run before calling Claude.
import psycopg
def search_lending_context(query_embedding: list[float], limit: int = 5):
    # Open a short-lived connection; the context manager closes it on exit.
    with psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending") as conn:
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT doc_id, chunk, metadata,
                       1 - (embedding <=> %s::vector) AS similarity
                FROM lending_chunks
                ORDER BY embedding <=> %s::vector
                LIMIT %s
                """,
                (query_embedding, query_embedding, limit)
            )
            return cur.fetchall()
Add an index once your table grows:
CREATE INDEX IF NOT EXISTS lending_chunks_embedding_idx
ON lending_chunks USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
That matters when you move beyond a few thousand chunks.
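At query time, ivfflat recall can be tuned with the probes setting. A small helper, sketched on the assumption that you pass in a psycopg connection; more probes scan more lists, trading latency for recall.

```python
def set_ivfflat_probes(conn, probes: int = 10) -> str:
    # SET does not accept bind parameters, so validate and inline the value.
    probes = int(probes)
    if probes < 1:
        raise ValueError("probes must be >= 1")
    stmt = f"SET ivfflat.probes = {probes}"
    conn.execute(stmt)
    return stmt
```

The setting applies per session, so call it after connecting, before running searches.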
4) Call Anthropic with retrieved lending context
Now pass only the top matches into Claude. Use Anthropic’s Messages API so the agent can answer based on retrieved policy text instead of guessing.
import os
import anthropic
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
def answer_lending_question(question: str, contexts: list[tuple]) -> str:
    context_text = "\n\n".join(
        f"[{row[0]} | {row[2]} | score={row[3]:.3f}]\n{row[1]}"
        for row in contexts
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=400,
        temperature=0,
        system="You are a lending operations assistant. Use only provided context when possible.",
        messages=[
            {
                "role": "user",
                "content": f"""Question: {question}

Retrieved context:
{context_text}

Answer clearly and cite which chunk supports your answer."""
            }
        ]
    )
    return response.content[0].text
This pattern works well for underwriting Q&A, exception handling, and compliance support because it keeps answers tied to source material.
5) Wrap it in one agent function
Put retrieval and generation behind a single function so your agent runtime can call it directly.
def handle_lending_query(question: str):
    query_embedding = embed_text(question)
    contexts = search_lending_context(query_embedding=query_embedding, limit=4)
    return answer_lending_question(question, contexts)

print(handle_lending_query("What is our DTI threshold for standard consumer loans?"))
Keep this boundary clean. Retrieval belongs to pgvector; reasoning belongs to Anthropic; orchestration belongs to your agent layer.
Testing the Integration
Run a focused test against a known policy statement. You want to verify both retrieval relevance and answer grounding.
question = "What is our DTI threshold for standard consumer loans?"
result = handle_lending_query(question)
print(result)
Expected output:
The standard consumer loan DTI threshold is below 43%.
Source support: underwriting_policy_2025.
If the model starts hallucinating or ignoring retrieved text:
- lower temperature to 0
- reduce retrieved chunks to the top 3–5 results
- improve chunking around policy boundaries
- add metadata filters like product='consumer_loan'
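A metadata filter slots straight into the same query. Here is a sketch of a product-scoped search, assuming a psycopg connection is passed in:

```python
FILTERED_SEARCH_SQL = """
SELECT doc_id, chunk, metadata,
       1 - (embedding <=> %s::vector) AS similarity
FROM lending_chunks
WHERE metadata->>'product' = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""

def search_by_product(conn, query_embedding, product: str, limit: int = 5):
    # JSONB ->> extracts the product key as text for an exact match.
    with conn.cursor() as cur:
        cur.execute(FILTERED_SEARCH_SQL, (query_embedding, product, query_embedding, limit))
        return cur.fetchall()
```

Narrowing by product or jurisdiction before ranking keeps irrelevant policy text out of the prompt entirely.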
Real-World Use Cases
- Underwriting copilot: agents answer policy questions from loan officers using current underwriting docs stored in pgvector.
- Servicing assistant: the agent retrieves payment dispute procedures or escalation rules before drafting responses.
- Compliance QA: ask natural-language questions like “Can we approve this exception?” and ground answers in internal policy snippets plus Claude reasoning.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit