How to Integrate FastAPI for pension funds with PostgreSQL for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: fastapi-for-pension-funds, postgresql, rag

Why this integration matters

If you’re building an AI agent for pension operations, you need two things working together: a stable API layer for business workflows and a durable retrieval store for policy, member records, and document context. FastAPI for pension funds gives you the HTTP surface for plan administration flows, while PostgreSQL gives you structured storage plus vector search for RAG.

That combination lets an agent answer questions like “What’s the withdrawal policy for this plan?” or “Summarize the member’s latest contribution history” without hardcoding logic into prompts.

Prerequisites

  • Python 3.11+
  • A running PostgreSQL 15+ instance
  • A PostgreSQL database with pgvector enabled
  • FastAPI installed in your service
  • psycopg or asyncpg installed for database access
  • sqlalchemy if you want ORM-backed models
  • Access to your FastAPI for pension funds app and its OpenAPI docs
  • A set of pension policy documents, FAQs, or plan rules to embed

Install the core packages:

pip install fastapi uvicorn psycopg[binary] sqlalchemy pgvector openai pydantic

Integration Steps

1) Create the PostgreSQL schema for RAG

Store pension documents as chunks with embeddings. For production, keep metadata separate from raw text so you can filter by plan, jurisdiction, or document type.

from sqlalchemy import create_engine, Column, Integer, Text, String
from sqlalchemy.orm import declarative_base, Session
from pgvector.sqlalchemy import Vector

DATABASE_URL = "postgresql+psycopg://rag_user:rag_pass@localhost:5432/pension_rag"

engine = create_engine(DATABASE_URL)
Base = declarative_base()

class PensionChunk(Base):
    __tablename__ = "pension_chunks"

    id = Column(Integer, primary_key=True)
    plan_id = Column(String(50), nullable=False)
    source_type = Column(String(50), nullable=False)  # policy | faq | statement
    content = Column(Text, nullable=False)
    embedding = Column(Vector(1536), nullable=False)

Base.metadata.create_all(engine)

This table is enough to support semantic retrieval with filters like plan_id='UK-PLAN-01'.
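
For production retrieval speed, you will usually also want an approximate-nearest-neighbor index on the embedding column. Below is a minimal sketch using raw SQL through SQLAlchemy; it assumes pgvector 0.5+ (for HNSW) and a database role allowed to create extensions and indexes, and the index name idx_pension_chunks_embedding is illustrative.

from sqlalchemy import text

with engine.connect() as conn:
    # No-op if the extension is already enabled (see prerequisites).
    conn.execute(text("CREATE EXTENSION IF NOT EXISTS vector"))
    # HNSW index over cosine distance; requires pgvector 0.5 or later.
    conn.execute(text(
        "CREATE INDEX IF NOT EXISTS idx_pension_chunks_embedding "
        "ON pension_chunks USING hnsw (embedding vector_cosine_ops)"
    ))
    conn.commit()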

2) Add a FastAPI endpoint that accepts pension queries

Your FastAPI app should expose a route that receives the user question and returns an answer from the agent layer. The key is to keep the API thin and let retrieval happen before generation.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Pension Funds Agent API")

class QueryRequest(BaseModel):
    plan_id: str
    question: str

@app.post("/v1/pension/query")
async def query_pension_fund(payload: QueryRequest):
    return {
        "plan_id": payload.plan_id,
        "question": payload.question,
        "answer": "stub"
    }

In a real deployment, this endpoint becomes the entry point for member-service bots, advisor tools, or internal ops assistants.

3) Embed documents and store them in PostgreSQL

Use your embedding model once during ingestion. Then write chunks into PostgreSQL with their vectors attached.

from openai import OpenAI
from sqlalchemy.orm import Session

client = OpenAI()
text = "Members may withdraw benefits only after reaching retirement age subject to vesting rules."

embedding_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=text
)

vector = embedding_response.data[0].embedding

with Session(engine) as session:
    row = PensionChunk(
        plan_id="UK-PLAN-01",
        source_type="policy",
        content=text,
        embedding=vector,
    )
    session.add(row)
    session.commit()

For RAG systems in regulated environments, ingestion should be deterministic. Chunk sizes should be stable, and metadata should always include source traceability.
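
One way to keep ingestion deterministic is a fixed-size chunker with a small overlap, applied per document before embedding. The sketch below is illustrative: chunk_text and ingest_document are hypothetical helpers, the size and overlap values are starting points to tune rather than recommendations, and client, engine, and PensionChunk come from the earlier snippets.

from sqlalchemy.orm import Session

def chunk_text(document: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    # Deterministic fixed-size chunking with a small overlap between chunks.
    chunks = []
    start = 0
    while start < len(document):
        chunks.append(document[start:start + max_chars])
        start += max_chars - overlap
    return chunks

def ingest_document(plan_id: str, source_type: str, document_text: str) -> None:
    # Embed each chunk once and write it with its plan and source metadata.
    with Session(engine) as session:
        for chunk in chunk_text(document_text):
            vector = client.embeddings.create(
                model="text-embedding-3-small",
                input=chunk,
            ).data[0].embedding
            session.add(PensionChunk(
                plan_id=plan_id,
                source_type=source_type,
                content=chunk,
                embedding=vector,
            ))
        session.commit()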

4) Retrieve relevant context from PostgreSQL inside your agent flow

When the user asks a question, embed the query and perform vector similarity search against stored chunks. With pgvector, this is straightforward.

from sqlalchemy import select
from sqlalchemy.orm import Session

query_text = "When can a member withdraw benefits?"
query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=query_text
).data[0].embedding

with Session(engine) as session:
    stmt = (
        select(PensionChunk)
        .where(PensionChunk.plan_id == "UK-PLAN-01")
        .order_by(PensionChunk.embedding.cosine_distance(query_embedding))
        .limit(3)
    )

    results = session.execute(stmt).scalars().all()

context = "\n\n".join([r.content for r in results])
print(context)

That retrieved context becomes the grounding material for your LLM prompt. In practice, this is where most pension-agent failures are prevented: bad answers usually come from weak retrieval.
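
One retrieval safeguard worth considering is a distance cutoff, so off-topic chunks never reach the prompt when nothing in the store is actually relevant. A minimal sketch reusing query_embedding from the previous snippet; the 0.6 cutoff is a placeholder you would tune against your own corpus.

from sqlalchemy import select
from sqlalchemy.orm import Session

MAX_DISTANCE = 0.6  # illustrative cutoff; tune against your own corpus

distance = PensionChunk.embedding.cosine_distance(query_embedding)

with Session(engine) as session:
    stmt = (
        select(PensionChunk, distance.label("distance"))
        .where(PensionChunk.plan_id == "UK-PLAN-01")
        .where(distance < MAX_DISTANCE)   # drop weak matches entirely
        .order_by(distance)
        .limit(3)
    )
    rows = session.execute(stmt).all()

for chunk, dist in rows:
    print(f"{dist:.3f}  {chunk.source_type}: {chunk.content[:80]}")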

5) Wire retrieval into the FastAPI handler

Now connect the API endpoint to retrieval and response generation. This keeps your service stateless and easy to scale behind a load balancer.

from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI
from sqlalchemy import select
from sqlalchemy.orm import Session

# engine and PensionChunk come from the schema defined in step 1.

app = FastAPI()
client = OpenAI()

class QueryRequest(BaseModel):
    plan_id: str
    question: str

@app.post("/v1/pension/query")
async def query_pension_fund(payload: QueryRequest):
    q_embed = client.embeddings.create(
        model="text-embedding-3-small",
        input=payload.question
    ).data[0].embedding

    with Session(engine) as session:
        stmt = (
            select(PensionChunk)
            .where(PensionChunk.plan_id == payload.plan_id)
            .order_by(PensionChunk.embedding.cosine_distance(q_embed))
            .limit(3)
        )
        docs = session.execute(stmt).scalars().all()

    context = "\n".join([d.content for d in docs])

    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided pension context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {payload.question}"}
        ]
    )

    return {
        "plan_id": payload.plan_id,
        "answer": completion.choices[0].message.content,
        "sources": [d.source_type for d in docs],
    }

Testing the Integration

Run the API:

uvicorn main:app --reload --port 8000

Then verify both retrieval and response generation with a simple request:

import requests

resp = requests.post(
    "http://localhost:8000/v1/pension/query",
    json={
        "plan_id": "UK-PLAN-01",
        "question": "When can a member withdraw benefits?"
    }
)

print(resp.status_code)
print(resp.json())

Expected output:

200
{
  "plan_id": "UK-PLAN-01",
  "answer": "...member may withdraw benefits after reaching retirement age...",
  "sources": ["policy", "faq", "policy"]
}

If you get empty sources or irrelevant answers, check these first (a quick diagnostic sketch follows the list):

  • Embedding dimensions match your Vector(1536) column size.
  • plan_id filters are correct.
  • Your chunks are small enough to keep retrieval specific (a few hundred tokens each is a sensible starting point).
  • The OpenAI key and DB credentials are loaded correctly.
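
The first two checks can be scripted. A minimal diagnostic sketch against the objects from step 1; the plan ID is the same illustrative value used above.

from sqlalchemy import select, func
from sqlalchemy.orm import Session

with Session(engine) as session:
    # How many chunks exist for the plan you are querying?
    count = session.scalar(
        select(func.count())
        .select_from(PensionChunk)
        .where(PensionChunk.plan_id == "UK-PLAN-01")
    )
    print("chunks for UK-PLAN-01:", count)

    # Does a stored embedding match the declared Vector(1536) dimension?
    sample = session.scalars(select(PensionChunk).limit(1)).first()
    if sample is not None:
        print("embedding dimension:", len(sample.embedding))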

Real-World Use Cases

  • Member self-service assistant

    • Answer benefit eligibility questions from policy documents and stored member data.
    • Reduce manual back-and-forth with operations teams.
  • Advisor support workflow

    • Let advisors query plan rules, vesting schedules, and contribution policies through one API.
    • Return grounded answers with citations from PostgreSQL-stored sources.
  • Internal compliance copilot

    • Search historical notices, fund policies, and regulatory updates.
    • Help compliance teams trace every answer back to source documents stored in Postgres.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
