How to Integrate FastAPI for pension funds with PostgreSQL for RAG
Why this integration matters
If you’re building an AI agent for pension operations, you need two things working together: a stable API layer for business workflows and a durable retrieval store for policy, member records, and document context. FastAPI for pension funds gives you the HTTP surface for plan administration flows, while PostgreSQL gives you structured storage plus vector search for RAG.
That combination lets an agent answer questions like “What’s the withdrawal policy for this plan?” or “Summarize the member’s latest contribution history” without hardcoding logic into prompts.
Prerequisites
- Python 3.11+
- A running PostgreSQL 15+ instance
- A PostgreSQL database with pgvector enabled
- FastAPI installed in your service
- psycopg or asyncpg installed for database access
- sqlalchemy if you want ORM-backed models
- Access to your FastAPI for pension funds app and its OpenAPI docs
- A set of pension policy documents, FAQs, or plan rules to embed
Install the core packages:
pip install fastapi uvicorn psycopg[binary] sqlalchemy pgvector openai pydantic
Integration Steps
1) Create the PostgreSQL schema for RAG
Store pension documents as chunks with embeddings. For production, keep metadata separate from raw text so you can filter by plan, jurisdiction, or document type.
from sqlalchemy import create_engine, Column, Integer, Text, String
from sqlalchemy.orm import declarative_base
from pgvector.sqlalchemy import Vector

DATABASE_URL = "postgresql+psycopg://rag_user:rag_pass@localhost:5432/pension_rag"

engine = create_engine(DATABASE_URL)
Base = declarative_base()

class PensionChunk(Base):
    __tablename__ = "pension_chunks"

    id = Column(Integer, primary_key=True)
    plan_id = Column(String(50), nullable=False)
    source_type = Column(String(50), nullable=False)  # policy | faq | statement
    content = Column(Text, nullable=False)
    embedding = Column(Vector(1536), nullable=False)

Base.metadata.create_all(engine)
This table is enough to support semantic retrieval with filters like plan_id='UK-PLAN-01'.
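The ordering metric behind that retrieval is cosine distance: 1 minus cosine similarity, so the most relevant chunk has the smallest distance. A minimal pure-Python sketch of the metric, for intuition only; in the real query, pgvector computes this inside PostgreSQL:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, the metric pgvector uses for cosine ordering."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Same direction -> distance 0; orthogonal -> distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 5.0]))  # 1.0
```

Because the metric ignores vector magnitude, two chunks saying the same thing at different lengths still land close together.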
2) Add a FastAPI endpoint that accepts pension queries
Your FastAPI app should expose a route that receives the user question and returns an answer from the agent layer. The key is to keep the API thin and let retrieval happen before generation.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Pension Funds Agent API")

class QueryRequest(BaseModel):
    plan_id: str
    question: str

@app.post("/v1/pension/query")
async def query_pension_fund(payload: QueryRequest):
    return {
        "plan_id": payload.plan_id,
        "question": payload.question,
        "answer": "stub",
    }
In a real deployment, this endpoint becomes the entry point for member-service bots, advisor tools, or internal ops assistants.
3) Embed documents and store them in PostgreSQL
Use your embedding model once during ingestion. Then write chunks into PostgreSQL with their vectors attached.
from openai import OpenAI
from sqlalchemy.orm import Session

client = OpenAI()

text = "Members may withdraw benefits only after reaching retirement age subject to vesting rules."

embedding_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=text,
)
vector = embedding_response.data[0].embedding

with Session(engine) as session:
    row = PensionChunk(
        plan_id="UK-PLAN-01",
        source_type="policy",
        content=text,
        embedding=vector,
    )
    session.add(row)
    session.commit()
For RAG systems in regulated environments, ingestion should be deterministic. Chunk sizes should be stable, and metadata should always include source traceability.
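One way to make ingestion deterministic is fixed-size chunking with a content hash as the chunk ID, so re-ingesting the same document always produces the same rows. A minimal sketch; the function name and metadata fields are illustrative, not part of any library:

```python
import hashlib

def chunk_document(text: str, source_id: str, chunk_size: int = 500) -> list[dict]:
    """Split text into fixed-size chunks with stable, hash-based IDs."""
    chunks = []
    for start in range(0, len(text), chunk_size):
        body = text[start:start + chunk_size]
        # Hashing source + offset + content makes the ID stable across re-runs.
        chunk_id = hashlib.sha256(f"{source_id}:{start}:{body}".encode()).hexdigest()[:16]
        chunks.append({
            "chunk_id": chunk_id,
            "source_id": source_id,  # traceability back to the original document
            "offset": start,
            "content": body,
        })
    return chunks
```

Stable IDs also make upserts trivial: a re-ingested chunk overwrites its previous version instead of creating a duplicate row.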
4) Retrieve relevant context from PostgreSQL inside your agent flow
When the user asks a question, embed the query and perform vector similarity search against stored chunks. With pgvector, this is straightforward.
from sqlalchemy import select
from sqlalchemy.orm import Session

query_text = "When can a member withdraw benefits?"
query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=query_text,
).data[0].embedding

with Session(engine) as session:
    stmt = (
        select(PensionChunk)
        .where(PensionChunk.plan_id == "UK-PLAN-01")
        .order_by(PensionChunk.embedding.cosine_distance(query_embedding))
        .limit(3)
    )
    results = session.execute(stmt).scalars().all()

context = "\n\n".join([r.content for r in results])
print(context)
That retrieved context becomes the grounding material for your LLM prompt. In practice, this is where most pension-agent failures are prevented: bad answers usually come from weak retrieval.
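A small improvement that pays off in regulated settings: label each retrieved chunk with its source before it reaches the prompt, so the model's answer and your audit trail can both point back to documents. A minimal sketch, assuming chunks shaped like the rows above (dicts stand in for the ORM objects here):

```python
def build_context(chunks: list[dict]) -> str:
    """Number each chunk and tag it with its source type for citation."""
    parts = []
    for i, chunk in enumerate(chunks, start=1):
        parts.append(f"[{i}] ({chunk['source_type']}) {chunk['content']}")
    return "\n\n".join(parts)

chunks = [
    {"source_type": "policy", "content": "Withdrawals allowed after retirement age."},
    {"source_type": "faq", "content": "Vesting completes after five years."},
]
print(build_context(chunks))
```

With numbered context, the system prompt can also instruct the model to cite chunk numbers in its answer.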
5) Wire retrieval into the FastAPI handler
Now connect the API endpoint to retrieval and response generation. This keeps your service stateless and easy to scale behind a load balancer.
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI
from sqlalchemy import select
from sqlalchemy.orm import Session

app = FastAPI()
client = OpenAI()

class QueryRequest(BaseModel):
    plan_id: str
    question: str

@app.post("/v1/pension/query")
async def query_pension_fund(payload: QueryRequest):
    q_embed = client.embeddings.create(
        model="text-embedding-3-small",
        input=payload.question,
    ).data[0].embedding

    with Session(engine) as session:
        stmt = (
            select(PensionChunk)
            .where(PensionChunk.plan_id == payload.plan_id)
            .order_by(PensionChunk.embedding.cosine_distance(q_embed))
            .limit(3)
        )
        docs = session.execute(stmt).scalars().all()

    context = "\n".join([d.content for d in docs])

    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided pension context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {payload.question}"},
        ],
    )

    return {
        "plan_id": payload.plan_id,
        "answer": completion.choices[0].message.content,
        "sources": [d.source_type for d in docs],
    }
Testing the Integration
Run the API:
uvicorn main:app --reload --port 8000
Then verify both retrieval and response generation with a simple request:
import requests

resp = requests.post(
    "http://localhost:8000/v1/pension/query",
    json={
        "plan_id": "UK-PLAN-01",
        "question": "When can a member withdraw benefits?",
    },
)
print(resp.status_code)
print(resp.json())
Expected output:
200
{
  "plan_id": "UK-PLAN-01",
  "answer": "...member may withdraw benefits after reaching retirement age...",
  "sources": ["policy", "faq", "policy"]
}
If you get empty sources or irrelevant answers, check these first:
- Embedding dimensions match your Vector(1536) column size.
- plan_id filters are correct.
- Your chunks are not too large.
- The OpenAI key and DB credentials are loaded correctly.
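The first check, a dimension mismatch, is easiest to catch at ingestion time rather than as an opaque database error. A small validation helper (illustrative, assuming the Vector(1536) column from step 1):

```python
EXPECTED_DIM = 1536  # must match the Vector(1536) column in pension_chunks

def validate_embedding(vector: list[float], expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Fail fast before insert rather than letting PostgreSQL reject the row."""
    if len(vector) != expected_dim:
        raise ValueError(f"embedding has {len(vector)} dims, expected {expected_dim}")
    return vector
```

Call it on every vector before session.add() so a model swap that changes embedding size surfaces immediately.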
Real-World Use Cases
- Member self-service assistant
  - Answer benefit eligibility questions from policy documents and stored member data.
  - Reduce manual back-and-forth with operations teams.
- Advisor support workflow
  - Let advisors query plan rules, vesting schedules, and contribution policies through one API.
  - Return grounded answers with citations from PostgreSQL-stored sources.
- Internal compliance copilot
  - Search historical notices, fund policies, and regulatory updates.
  - Help compliance teams trace every answer back to source documents stored in Postgres.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.