How to Integrate FastAPI for insurance with PostgreSQL for RAG
FastAPI for insurance gives you a clean API layer for policy, claims, and underwriting workflows. PostgreSQL gives you durable storage for documents, embeddings, and retrieval state. Put them together and you get a practical RAG backend for insurance agents that can answer policy questions, retrieve claim history, and ground responses in internal records instead of guessing.
Prerequisites
- Python 3.10+
- FastAPI installed and running
- PostgreSQL 14+
- A PostgreSQL database with the `pgvector` extension enabled
- `psycopg` or `asyncpg` for database access
- An embedding model provider, such as OpenAI or a local embedding service
- Insurance data sources ready to ingest:
  - policy documents
  - claims notes
  - underwriting guidelines
  - customer correspondence
Install the core packages:
```bash
pip install fastapi uvicorn "psycopg[binary,pool]" sqlalchemy pgvector openai python-dotenv
```
Integration Steps
1. Set up FastAPI with a database connection pool.

Use FastAPI to expose endpoints for ingestion and retrieval. In production, tie the PostgreSQL connection pool to the app lifecycle so each request does not open a new connection.

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from psycopg_pool import AsyncConnectionPool

DATABASE_URL = "postgresql://postgres:postgres@localhost:5432/insurance_rag"

# open=False defers connecting until the lifespan hook runs
pool = AsyncConnectionPool(conninfo=DATABASE_URL, min_size=1, max_size=10, open=False)

@asynccontextmanager
async def lifespan(app: FastAPI):
    await pool.open()
    yield
    await pool.close()

app = FastAPI(lifespan=lifespan)
```
2. Create the PostgreSQL schema for RAG storage.

Store chunks of insurance content plus their embeddings in PostgreSQL. With pgvector, the vector column is what you query for similarity search.

```python
import psycopg

DDL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS insurance_chunks (
    id BIGSERIAL PRIMARY KEY,
    doc_type TEXT NOT NULL,
    doc_id TEXT NOT NULL,
    chunk TEXT NOT NULL,
    embedding VECTOR(1536) NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
"""

with psycopg.connect(DATABASE_URL) as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
```
If your embedding dimension differs, adjust VECTOR(1536) to match your model.
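Without a vector index, similarity search scans every row. For larger tables you can add an approximate index; a minimal sketch, assuming pgvector 0.5+ and the cosine-distance operator (`<=>`) used in the retrieval step (the index name is illustrative):

```sql
-- optional: HNSW index for faster approximate cosine search
CREATE INDEX IF NOT EXISTS insurance_chunks_embedding_idx
    ON insurance_chunks
    USING hnsw (embedding vector_cosine_ops);
```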
3. Ingest insurance documents from FastAPI into PostgreSQL.

This is the part that turns an API into a RAG system. The FastAPI endpoint receives a policy document or claim note, chunks it, generates embeddings, and writes both content and vectors to PostgreSQL.

```python
from fastapi import File, UploadFile
from openai import OpenAI

client = OpenAI()

def embed_text(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

@app.post("/ingest")
async def ingest_document(doc_type: str, doc_id: str, file: UploadFile = File(...)):
    raw_text = (await file.read()).decode("utf-8")
    chunks = [raw_text[i:i + 1000] for i in range(0, len(raw_text), 1000)]
    async with pool.connection() as conn:
        async with conn.cursor() as cur:
            for chunk in chunks:
                emb = embed_text(chunk)
                await cur.execute(
                    """
                    INSERT INTO insurance_chunks (doc_type, doc_id, chunk, embedding)
                    VALUES (%s, %s, %s, %s::vector)
                    """,
                    # pgvector accepts the text form "[0.1, 0.2, ...]",
                    # so a stringified Python list can be cast to vector
                    (doc_type, doc_id, chunk, str(emb)),
                )
        await conn.commit()
    return {"status": "ok", "chunks_saved": len(chunks)}
```
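Fixed-size slicing can cut a coverage clause in half at a chunk boundary. A common refinement is to overlap consecutive chunks so boundary text survives intact in at least one of them; a minimal sketch (the `chunk_text` helper and its default sizes are illustrative, not part of the endpoint above):

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks, overlapping neighbours by
    `overlap` characters so clauses near a boundary stay whole in
    at least one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

If you adopt it, swap it in for the plain list comprehension in the ingest endpoint.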
4. Add a retrieval endpoint that queries PostgreSQL by vector similarity.

For RAG, retrieval is the critical step. The query comes in through FastAPI, gets embedded, and PostgreSQL returns the closest matching insurance chunks.

```python
from psycopg.rows import dict_row
from pydantic import BaseModel

class QueryRequest(BaseModel):
    question: str
    top_k: int = 3

@app.post("/retrieve")
async def retrieve(req: QueryRequest):
    q_emb = str(embed_text(req.question))
    sql = """
        SELECT doc_type, doc_id, chunk,
               1 - (embedding <=> %s::vector) AS similarity
        FROM insurance_chunks
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
    """
    async with pool.connection() as conn:
        async with conn.cursor(row_factory=dict_row) as cur:
            # three placeholders: the query embedding twice, then the limit
            await cur.execute(sql, (q_emb, q_emb, req.top_k))
            rows = await cur.fetchall()
    return {"matches": rows}
```
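pgvector's `<=>` operator returns cosine distance, which is why the query reports `1 - distance` as the similarity score. A pure-Python illustration of that relationship (not code the endpoint uses):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """What pgvector's <=> operator computes: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# vectors pointing the same way: distance near 0, similarity near 1
same = 1 - cosine_distance([1.0, 2.0], [2.0, 4.0])
# orthogonal vectors: distance near 1, similarity near 0
ortho = 1 - cosine_distance([1.0, 0.0], [0.0, 1.0])
```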
5. Wrap retrieval into an answer endpoint for your agent.

This is where the system becomes useful to downstream AI agents. An agent can call the FastAPI /answer endpoint, get grounded context from PostgreSQL, and generate a response using only retrieved insurance evidence.

```python
from openai import OpenAI

llm_client = OpenAI()

@app.post("/answer")
async def answer(req: QueryRequest):
    retrieved = await retrieve(req)
    context = "\n\n".join(
        f"[{r['doc_type']}:{r['doc_id']}] {r['chunk']}"
        for r in retrieved["matches"]
    )
    prompt = f"""
You are an insurance assistant.
Answer only from the context below.

Context:
{context}

Question:
{req.question}
"""
    completion = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return {
        "answer": completion.choices[0].message.content,
        "sources": retrieved["matches"],
    }
```
Testing the Integration
Start the app:

```bash
uvicorn main:app --reload
```

Then test ingestion and retrieval:

```python
import requests

with open("sample_policy.txt", "rb") as f:
    r1 = requests.post(
        "http://localhost:8000/ingest",
        params={"doc_type": "policy", "doc_id": "POL123"},
        files={"file": f},
    )
print(r1.json())

r2 = requests.post(
    "http://localhost:8000/retrieve",
    json={"question": "Does this policy cover water damage?", "top_k": 2},
)
print(r2.json())
```
Expected output:
```json
{
  "status": "ok",
  "chunks_saved": 4
}
```

```json
{
  "matches": [
    {
      "doc_type": "policy",
      "doc_id": "POL123",
      "chunk": "...",
      "similarity": 0.87
    }
  ]
}
```
Real-World Use Cases
- Policy Q&A agents that answer coverage questions using stored policy language and endorsements.
- Claims triage assistants that retrieve prior claim notes before routing cases to adjusters.
- Underwriting copilots that pull risk rules and historical decisions from PostgreSQL before making recommendations.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit