How to Integrate FastAPI for retail banking with PostgreSQL for RAG

By Cyprian Aarons · Updated 2026-04-21

FastAPI for retail banking gives you a clean way to expose banking workflows as APIs. PostgreSQL gives you durable storage for customer profiles, transaction history, and embeddings for retrieval. Put them together and you get a solid backend for RAG agents that can answer account questions, surface relevant policy snippets, and pull structured banking data without bolting on a separate search stack.

Prerequisites

  • Python 3.10+
  • FastAPI installed and configured
  • PostgreSQL 14+ running locally or in your cloud environment
  • psycopg2-binary or psycopg installed
  • sqlalchemy installed if you want ORM-backed access
  • uvicorn for serving the API
  • A PostgreSQL database with:
    • a customers table
    • a transactions table
    • a documents or embeddings table for RAG chunks
  • Banking API credentials or service account access if your FastAPI layer talks to internal retail banking services
  • Optional but recommended:
    • pgvector extension enabled in PostgreSQL for vector search
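
If you go the pgvector route, the extension and tables need a one-time setup. A minimal sketch, assuming the local connection URL used later in this guide (adjust user, password, and database name to your environment):

```python
from sqlalchemy import create_engine, text

# DDL for the RAG side of the schema; customers/transactions are assumed
# to exist already as part of your banking data model.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    """
    CREATE TABLE IF NOT EXISTS knowledge_chunks (
        id SERIAL PRIMARY KEY,
        doc_id VARCHAR(64) NOT NULL,
        chunk_text TEXT NOT NULL,
        embedding vector(1536) NOT NULL
    )
    """,
]

def run_setup(url: str = "postgresql+psycopg2://bank_user:bank_pass@localhost:5432/banking_rag"):
    engine = create_engine(url)
    with engine.begin() as conn:  # begin() commits on success, rolls back on error
        for stmt in SETUP_STATEMENTS:
            conn.execute(text(stmt))
```

Enabling the extension requires a role with sufficient privileges, so many teams run this once in a migration rather than at application startup.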

Integration Steps

1) Define the FastAPI service boundary

Start by exposing only the banking operations your agent actually needs. In retail banking, that usually means customer lookup, account summary, and transaction retrieval.

from fastapi import FastAPI, HTTPException

app = FastAPI(title="Retail Banking RAG API")

@app.get("/customers/{customer_id}")
def get_customer(customer_id: str):
    # Replace with real bank service call or repository lookup
    if customer_id != "cust_001":
        raise HTTPException(status_code=404, detail="Customer not found")
    return {"customer_id": customer_id, "name": "Amina Patel", "segment": "retail"}

@app.get("/transactions/{customer_id}")
def get_transactions(customer_id: str):
    # Replace with a real repository lookup; static data shown for illustration
    return {
        "customer_id": customer_id,
        "transactions": [
            {"id": "txn_1", "amount": -120.50, "description": "Grocery Store"},
            {"id": "txn_2", "amount": 2500.00, "description": "Salary Credit"},
        ],
    }

Keep this layer thin. The agent should call stable endpoints like /customers/{customer_id} and /transactions/{customer_id}, not internal database tables directly.
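
To make that contract explicit, you can pin down each endpoint's shape with Pydantic response models. A sketch (the model names here are illustrative, not part of the guide's schema):

```python
from pydantic import BaseModel

class CustomerOut(BaseModel):
    customer_id: str
    name: str
    segment: str

class TransactionOut(BaseModel):
    id: str
    amount: float
    description: str

# Attach to a route so FastAPI validates the payload and documents it in /docs:
# @app.get("/customers/{customer_id}", response_model=CustomerOut)

profile = CustomerOut(customer_id="cust_001", name="Amina Patel", segment="retail")
```

With response_model set, a drifting internal schema fails loudly at the API boundary instead of silently confusing the agent downstream.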

2) Connect PostgreSQL with SQLAlchemy

Use PostgreSQL as the system of record for both structured banking data and RAG artifacts. For production, SQLAlchemy gives you pooling and predictable query behavior.

import os

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

# Read the URL (and credentials) from the environment in production;
# the default below is for local development only
POSTGRES_URL = os.environ.get(
    "POSTGRES_URL",
    "postgresql+psycopg2://bank_user:bank_pass@localhost:5432/banking_rag",
)

engine = create_engine(
    POSTGRES_URL,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
)

SessionLocal = sessionmaker(bind=engine)

def fetch_customer_profile(customer_id: str):
    with SessionLocal() as session:
        row = session.execute(
            text("""
                SELECT customer_id, full_name, segment
                FROM customers
                WHERE customer_id = :customer_id
            """),
            {"customer_id": customer_id},
        ).mappings().first()

        return dict(row) if row else None

For RAG, store chunked policy text or product docs in PostgreSQL too. If you enable pgvector, keep embeddings in the same database so retrieval stays close to the source data.

3) Store document chunks and embeddings for retrieval

This is where PostgreSQL becomes part of the RAG pipeline. You insert normalized document chunks and their vectors, then retrieve top matches at query time.

from sqlalchemy import Column, String, Text, Integer
from sqlalchemy.orm import declarative_base
from pgvector.sqlalchemy import Vector

Base = declarative_base()

class KnowledgeChunk(Base):
    __tablename__ = "knowledge_chunks"

    id = Column(Integer, primary_key=True)
    doc_id = Column(String(64), nullable=False)
    chunk_text = Column(Text, nullable=False)
    embedding = Column(Vector(1536), nullable=False)

def insert_chunk(session, doc_id: str, chunk_text: str, embedding: list[float]):
    chunk = KnowledgeChunk(
        doc_id=doc_id,
        chunk_text=chunk_text,
        embedding=embedding,
    )
    session.add(chunk)
    session.commit()

A typical flow is:

  • split policy docs into chunks
  • generate embeddings with your model provider
  • store them in knowledge_chunks
  • query nearest neighbors when the agent asks a question
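
The last step of that flow, the nearest-neighbor query, can be sketched as follows. This assumes pgvector's cosine-distance operator (<=>); swap in <-> if your embeddings are tuned for L2 distance:

```python
from sqlalchemy import text

# Query embedding is passed as pgvector's text literal form '[0.1, 0.2, ...]',
# then cast to the vector type server-side.
SIMILARITY_SQL = text("""
    SELECT doc_id, chunk_text
    FROM knowledge_chunks
    ORDER BY embedding <=> CAST(:query_embedding AS vector)
    LIMIT :top_k
""")

def retrieve_chunks(session, query_embedding: list[float], top_k: int = 5):
    rows = session.execute(
        SIMILARITY_SQL,
        {"query_embedding": str(query_embedding), "top_k": top_k},
    ).mappings().all()
    return [dict(r) for r in rows]
```

For larger corpora, an approximate index (HNSW or IVFFlat) on the embedding column keeps this query fast; without one, pgvector falls back to an exact sequential scan.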

4) Add a retrieval endpoint that joins bank data with context

Now wire FastAPI to PostgreSQL so the agent can fetch both structured account data and relevant knowledge base context in one request path.

from fastapi import Depends

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

@app.get("/rag/context/{customer_id}")
def rag_context(customer_id: str, db=Depends(get_db)):
    customer = db.execute(
        text("SELECT customer_id, full_name, segment FROM customers WHERE customer_id=:customer_id"),
        {"customer_id": customer_id},
    ).mappings().first()

    if not customer:
        raise HTTPException(status_code=404, detail="Customer not found")

    transactions = db.execute(
        text("""
            SELECT id, amount, description
            FROM transactions
            WHERE customer_id=:customer_id
            ORDER BY created_at DESC
            LIMIT 5
        """),
        {"customer_id": customer_id},
    ).mappings().all()

    return {
        "customer": dict(customer),
        "recent_transactions": [dict(t) for t in transactions],
    }

In an agent system, this endpoint becomes a tool call. The LLM gets concise structured facts instead of guessing from raw prompts.
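
Before the payload reaches the model, it usually gets flattened into a short context block. The formatting below is one illustrative convention, not a prescribed format:

```python
def format_context(payload: dict) -> str:
    """Render the /rag/context response as compact lines for an LLM prompt."""
    customer = payload["customer"]
    lines = [
        f"Customer: {customer['full_name']} ({customer['customer_id']}, {customer['segment']})",
        "Recent transactions:",
    ]
    for txn in payload["recent_transactions"]:
        # :+.2f keeps the sign visible so debits and credits read unambiguously
        lines.append(f"- {txn['description']}: {txn['amount']:+.2f}")
    return "\n".join(lines)

payload = {
    "customer": {"customer_id": "cust_001", "full_name": "Amina Patel", "segment": "retail"},
    "recent_transactions": [{"id": 1, "amount": -120.5, "description": "Grocery Store"}],
}
context = format_context(payload)
```

Keeping this rendering in one function also gives you a single place to redact fields that should never reach the model.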

5) Serve the API and connect your agent runtime

Run FastAPI with Uvicorn and point your agent runtime at these endpoints. This keeps the retrieval path explicit and observable.

# app.py already contains the routes above

# Run from terminal:
# uvicorn app:app --host 0.0.0.0 --port 8000 --reload

If your agent framework supports tool calling, define tools around these endpoints rather than embedding SQL in prompts. That keeps authentication, retries, and audit logging inside the API boundary.
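
What such a tool definition looks like depends on your agent framework; as one example, an OpenAI-style function-calling schema wrapping the context endpoint might be declared like this (the tool name and description are illustrative):

```python
# Declarative schema the LLM sees; the actual HTTP call to
# GET /rag/context/{customer_id} happens in your tool-execution code.
RAG_CONTEXT_TOOL = {
    "type": "function",
    "function": {
        "name": "get_rag_context",
        "description": (
            "Fetch a customer's profile and recent transactions "
            "from the retail banking API."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "Internal customer identifier, e.g. cust_001",
                },
            },
            "required": ["customer_id"],
        },
    },
}
```

The model only ever sees this schema and the JSON result, so credentials, retries, and audit logging all stay behind the endpoint.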

Testing the Integration

Use FastAPI’s test client to verify that PostgreSQL-backed routes return usable data.

from fastapi.testclient import TestClient

client = TestClient(app)

def test_rag_context():
    response = client.get("/rag/context/cust_001")
    assert response.status_code == 200

    payload = response.json()
    assert payload["customer"]["customer_id"] == "cust_001"
    assert len(payload["recent_transactions"]) > 0

print(client.get("/customers/cust_001").json())
print(client.get("/rag/context/cust_001").json())

Expected output:

{
  "customer_id": "cust_001",
  "name": "Amina Patel",
  "segment": "retail"
}
{
  "customer": {
    "customer_id": "cust_001",
    "full_name": "Amina Patel",
    "segment": "retail"
  },
  "recent_transactions": [
    {
      "id": 1,
      "amount": -120.5,
      "description": "Grocery Store"
    }
  ]
}

Real-World Use Cases

  • Customer support copilot
    • Answer balance-related questions by combining live account data from FastAPI with policy snippets retrieved from PostgreSQL.
  • Fraud triage assistant
    • Pull recent transactions from PostgreSQL and surface related internal playbooks for suspicious activity review.
  • Product recommendation agent
    • Use customer segment data plus stored product knowledge to recommend savings accounts, cards, or loan products with grounded explanations.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

