How to Integrate Anthropic for lending with pgvector for RAG

By Cyprian Aarons · Updated 2026-04-21

Tags: anthropic-for-lending, pgvector, rag

If you’re building lending workflows, the hard part is not generating text. It’s grounding decisions in policy docs, loan product terms, underwriting rules, and customer history without letting the model improvise. Pairing Anthropic with pgvector gives you a clean RAG stack: Anthropic handles reasoning and response generation, while pgvector stores and retrieves the most relevant lending knowledge.

Prerequisites

  • Python 3.10+
  • A running PostgreSQL instance with the pgvector extension enabled
  • An Anthropic API key
  • Access to your lending documents:
    • credit policy PDFs
    • underwriting guidelines
    • product terms
    • FAQ and servicing playbooks
  • These Python packages:
    • anthropic
    • psycopg[binary]
    • pgvector
    • sentence-transformers or another embedding provider
  • A basic understanding of:
    • SQL
    • embeddings
    • retrieval-augmented generation

Install dependencies:

pip install anthropic psycopg[binary] pgvector sentence-transformers

Integration Steps

1) Set up pgvector in PostgreSQL

Create the extension and a table for chunks plus embeddings. Use a fixed embedding dimension that matches your model.

import psycopg

conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending")
conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")

conn.execute("""
CREATE TABLE IF NOT EXISTS lending_chunks (
    id SERIAL PRIMARY KEY,
    doc_name TEXT NOT NULL,
    chunk_text TEXT NOT NULL,
    embedding vector(384)
);
""")

conn.commit()
conn.close()

If you use sentence-transformers/all-MiniLM-L6-v2, the embedding size is 384. Match the column dimension to your model or inserts will fail.
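Once the table grows past a few thousand rows, a sequential scan over embeddings becomes the bottleneck. pgvector supports approximate nearest-neighbor indexes; this is a minimal sketch, assuming pgvector 0.5+ (which introduced HNSW) and the cosine operator class to match the `<=>` queries used later in this guide:

```sql
-- Approximate nearest-neighbor index for cosine-distance queries.
-- vector_cosine_ops must match the operator (<=>) used at query time.
CREATE INDEX IF NOT EXISTS lending_chunks_embedding_idx
ON lending_chunks USING hnsw (embedding vector_cosine_ops);
```

HNSW trades a small amount of recall for much faster lookups; for a corpus this small you can skip the index entirely.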

2) Chunk and embed lending documents

Split your policy text into chunks, generate embeddings, and store them in pgvector.

from sentence_transformers import SentenceTransformer
import psycopg

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = [
    {
        "doc_name": "underwriting_policy",
        "text": "Minimum FICO score is 680 for unsecured personal loans. Debt-to-income ratio must be below 40%..."
    },
    {
        "doc_name": "product_terms",
        "text": "Loan amounts range from $5,000 to $50,000. Terms are available from 24 to 60 months..."
    }
]

def chunk_text(text, size=300):
    return [text[i:i+size] for i in range(0, len(text), size)]

from pgvector.psycopg import register_vector

conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending")
register_vector(conn)  # teaches psycopg to send numpy arrays as pgvector values

for doc in docs:
    for chunk in chunk_text(doc["text"]):
        embedding = model.encode(chunk)  # numpy array; adapted by register_vector
        conn.execute(
            "INSERT INTO lending_chunks (doc_name, chunk_text, embedding) VALUES (%s, %s, %s)",
            (doc["doc_name"], chunk, embedding)
        )

conn.commit()
conn.close()

This is the ingestion side of RAG. In production, do this asynchronously and store source metadata like version, effective date, and jurisdiction.
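One refinement worth making early: the fixed-size chunk_text helper above can cut a policy clause exactly at a chunk boundary, so neither chunk retrieves well for questions about that clause. A common variant adds overlap between consecutive chunks; this is a sketch, and the 300/50 sizes are arbitrary defaults you should tune for your documents:

```python
def chunk_text_overlap(text: str, size: int = 300, overlap: int = 50):
    """Fixed-size chunks where each chunk repeats the last `overlap`
    characters of the previous one, so a clause split at a boundary
    is still fully contained in at least one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# 700 characters of distinguishable text to show the overlap
sample = "".join(str(i % 10) for i in range(700))
chunks = chunk_text_overlap(sample)  # three chunks: [0:300], [250:550], [500:700]
```

Drop this in as a replacement for chunk_text and the ingestion loop needs no other changes.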

3) Retrieve relevant context from pgvector

At query time, embed the user question and pull back the closest chunks using cosine distance.

from sentence_transformers import SentenceTransformer
import psycopg

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def retrieve_context(question: str, limit: int = 3):
    q_emb = model.encode(question).tolist()

    conn = psycopg.connect("postgresql://postgres:postgres@localhost:5432/lending")
    rows = conn.execute(
        """
        SELECT doc_name, chunk_text
        FROM lending_chunks
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
        """,
        (q_emb, limit)
    ).fetchall()
    conn.close()

    return rows

question = "What are the eligibility requirements for an unsecured personal loan?"
context_rows = retrieve_context(question)

for row in context_rows:
    print(row)

In pgvector, <=> computes cosine distance. That's the core retrieval primitive you'll use for RAG.
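For intuition, cosine distance is 1 minus cosine similarity: 0 for vectors pointing the same direction, 1 for orthogonal vectors, 2 for opposite ones. A pure-Python sketch of the same computation pgvector performs:

```python
import math

def cosine_distance(a, b):
    """Mirror of pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0 (same direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```

Because the metric ignores magnitude, two chunks about the same topic score close even if one is much longer than the other, which is usually what you want for policy text.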

4) Call Anthropic with retrieved context

Now send the retrieved chunks into Anthropic as grounded context. Use the Messages API and keep the prompt structured.

import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def answer_lending_question(question: str):
    context_rows = retrieve_context(question)

    context_block = "\n\n".join(
        [f"[Source: {doc_name}]\n{chunk_text}" for doc_name, chunk_text in context_rows]
    )

    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=400,
        temperature=0,
        messages=[
            {
                "role": "user",
                "content": f"""
You are a lending assistant.
Use only the provided policy context to answer.
If the context does not contain enough information, say so clearly.

Question:
{question}

Policy context:
{context_block}
"""
            }
        ]
    )

    return response.content[0].text

print(answer_lending_question("What are the eligibility requirements for an unsecured personal loan?"))

This is where Anthropic adds value: it synthesizes the retrieved policy snippets into a clean answer, and the instruction to use only the provided context reduces (though does not eliminate) the risk of the model improvising beyond your stored documents.

5) Add a production guardrail for citations

For lending systems, don’t ship answers without traceability. Return sources alongside the generated response so compliance teams can audit them.

def answer_with_sources(question: str):
    context_rows = retrieve_context(question)

    sources = [{"doc_name": d, "snippet": c[:200]} for d, c in context_rows]
    answer = answer_lending_question(question)

    return {
        "answer": answer,
        "sources": sources
    }

result = answer_with_sources("Can a borrower qualify with DTI above 40%?")
print(result["answer"])
print(result["sources"])

That pattern matters when underwriters or ops teams need to verify why an agent gave a specific recommendation.

Testing the Integration

Run a simple end-to-end check against a known policy question.

test_question = "What is the minimum FICO score for an unsecured personal loan?"
result = answer_with_sources(test_question)

print("ANSWER:")
print(result["answer"])
print("\nSOURCES:")
for src in result["sources"]:
    print(src["doc_name"], "-", src["snippet"])

Expected output:

ANSWER:
The minimum FICO score for an unsecured personal loan is 680 based on the underwriting policy provided.

SOURCES:
underwriting_policy - Minimum FICO score is 680 for unsecured personal loans...

If you get irrelevant sources back:

  • verify your chunking strategy
  • confirm embedding dimensions match your pgvector column
  • check that your query embedding model matches ingestion embeddings
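A cheap guard during ingestion catches the dimension mismatch before PostgreSQL rejects the insert with a less helpful error. This is a sketch; EXPECTED_DIM assumes the 384-dimension column created in step 1:

```python
EXPECTED_DIM = 384  # must match vector(384) in the lending_chunks table

def validate_embedding(embedding):
    """Fail fast with a clear message instead of a database error."""
    if len(embedding) != EXPECTED_DIM:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, expected {EXPECTED_DIM}; "
            "check that ingestion and query use the same embedding model"
        )
    return embedding

validate_embedding([0.0] * 384)  # passes silently
```

Call it on every embedding in the ingestion loop and once per query before hitting pgvector.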

Real-World Use Cases

  • Loan policy assistant
    • Answer underwriter questions about eligibility rules, exceptions, pricing bands, and document requirements with citations back to source policy.
  • Customer servicing agent
    • Let support bots explain repayment terms, late fee rules, payoff procedures, and hardship options grounded in approved product docs.
  • Pre-screening copilot
    • Combine customer inputs with retrieved lending criteria to flag likely qualification issues before formal application submission.

This stack works because each component does one job well. pgvector handles retrieval over private lending knowledge; Anthropic turns that retrieved context into usable answers with controlled language and better reasoning.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
