How to Integrate LlamaIndex for insurance with Supabase for RAG

By Cyprian Aarons
Updated 2026-04-22

Combining LlamaIndex for insurance with Supabase gives you a practical RAG stack for policy search, claims triage, and underwriting support. LlamaIndex handles document ingestion, chunking, retrieval, and query orchestration; Supabase gives you Postgres-backed storage with pgvector for persistent embeddings and metadata filters.

Prerequisites

  • Python 3.10+
  • A Supabase project with:
    • a database URL
    • an API key
    • pgvector enabled
  • Access to your insurance documents:
    • policy wordings
    • claims manuals
    • underwriting guidelines
    • broker FAQs
  • An embedding model provider configured for LlamaIndex:
    • OpenAI, Azure OpenAI, or another supported embedder
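  • Installed packages:
    • llama-index
    • llama-index-vector-stores-supabase
    • llama-index-embeddings-huggingface (needed only for the local embedding model used when configuring the vector store)
    • supabase
    • python-dotenv

Install them:

pip install llama-index llama-index-vector-stores-supabase llama-index-embeddings-huggingface supabase python-dotenv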

Integration Steps

  1. Create the Supabase vector table

You need a table that can store chunks, metadata, and embeddings. In the Supabase SQL editor, run this:

create extension if not exists vector;

create table if not exists insurance_docs (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(1536)
);

create index if not exists insurance_docs_embedding_idx
on insurance_docs using ivfflat (embedding vector_cosine_ops)
with (lists = 100);

If your embedding model uses a different dimension, change 1536 to match it.
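
Because both the vector dimension and the lists setting depend on your data, it can help to generate the SQL from parameters instead of editing it by hand. The following is a minimal sketch, not part of either library: build_vector_table_ddl and suggest_ivfflat_lists are hypothetical helper names, and the lists heuristic follows the rule of thumb in the pgvector README (rows / 1000 up to about one million rows, the square root of the row count beyond that).

```python
import math


def suggest_ivfflat_lists(row_count: int) -> int:
    """Rule of thumb: rows/1000 up to 1M rows, sqrt(rows) above that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, int(math.sqrt(row_count)))


def build_vector_table_ddl(table: str, dimension: int, row_count: int) -> str:
    """Return the SQL from this step with the dimension and lists filled in."""
    # Hypothetical safety check: only allow plain identifiers as table names.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    lists = suggest_ivfflat_lists(row_count)
    return (
        "create extension if not exists vector;\n\n"
        f"create table if not exists {table} (\n"
        "  id bigserial primary key,\n"
        "  content text,\n"
        "  metadata jsonb,\n"
        f"  embedding vector({dimension})\n"
        ");\n\n"
        f"create index if not exists {table}_embedding_idx\n"
        f"on {table} using ivfflat (embedding vector_cosine_ops)\n"
        f"with (lists = {lists});"
    )


# Example: a 384-dimensional model and an estimated 100,000 chunks.
print(build_vector_table_ddl("insurance_docs", 384, 100_000))
```

With an estimate of 100,000 chunks the heuristic yields lists = 100, the same value used in the SQL above.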

  2. Connect to Supabase from Python

Use the Supabase client for auth and basic connectivity. Keep secrets in environment variables.

import os
from dotenv import load_dotenv
from supabase import create_client, Client

load_dotenv()

SUPABASE_URL = os.environ["SUPABASE_URL"]
SUPABASE_KEY = os.environ["SUPABASE_SERVICE_ROLE_KEY"]

supabase: Client = create_client(SUPABASE_URL, SUPABASE_KEY)

# simple smoke test
response = supabase.table("insurance_docs").select("id").limit(1).execute()
print(response.data)

For production agents, use the service role key only on the backend, because it bypasses Row Level Security. Never ship it to the browser.
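
Since every later step depends on these environment variables, failing fast with a clear message is friendlier than a KeyError halfway through ingestion. A small sketch of that check follows; missing_env_vars and require_env are hypothetical helpers, and the variable names are the ones used in this guide.

```python
import os

# Variable names used in this guide; adjust to your own .env file.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY", "SUPABASE_DB_URL")


def missing_env_vars(required=REQUIRED_VARS, env=None) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


def require_env(required=REQUIRED_VARS, env=None) -> None:
    """Raise one error listing every missing variable."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))


# Demo with a plain dict so the example runs without real secrets.
demo_env = {"SUPABASE_URL": "https://example.supabase.co", "SUPABASE_SERVICE_ROLE_KEY": ""}
print(missing_env_vars(env=demo_env))
# prints ['SUPABASE_SERVICE_ROLE_KEY', 'SUPABASE_DB_URL']
```

Calling require_env() right after load_dotenv() would surface all configuration gaps at once.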

  3. Configure LlamaIndex with a Supabase vector store

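LlamaIndex can write embeddings directly into Supabase through its vector store integration. The integration is built on Supabase's vecs client, which creates and manages its own collection (in the vecs schema) under the name you pass as collection_name, rather than writing to a hand-created table.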

import os
from dotenv import load_dotenv

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.core.embeddings import resolve_embed_model
from llama_index.vector_stores.supabase import SupabaseVectorStore

load_dotenv()

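# Full Postgres connection string, including the database password.
# SupabaseVectorStore has no separate password argument.
db_url = os.environ["SUPABASE_DB_URL"]

# bge-small-en-v1.5 produces 384-dimensional vectors.
EMBED_DIM = 384

vector_store = SupabaseVectorStore(
    postgres_connection_string=db_url,
    collection_name="insurance_docs",
    dimension=EMBED_DIM,  # must match the embedding model's output size
)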

embed_model = resolve_embed_model("local:BAAI/bge-small-en-v1.5")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

This pattern works well when you want persistent retrieval across multiple agents or workflows.
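
The connection string has to carry the database credentials, and a password containing characters such as @ or / must be percent-encoded or the URL will be parsed incorrectly. Here is a sketch of assembling it safely; build_postgres_url is a hypothetical helper and the host shown is a placeholder, not a real project.

```python
from urllib.parse import quote


def build_postgres_url(
    user: str,
    password: str,
    host: str,
    port: int = 5432,
    database: str = "postgres",
) -> str:
    """Build a postgresql:// URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/" and "@".
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{database}"


# Placeholder values for illustration only.
print(build_postgres_url("postgres", "p@ss/word", "db.example.supabase.co"))
# prints postgresql://postgres:p%40ss%2Fword@db.example.supabase.co:5432/postgres
```

The resulting string is what you would store in SUPABASE_DB_URL.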

  4. Ingest insurance documents into the index

Load PDFs or text files, chunk them, embed them, and persist them in Supabase.

from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("./insurance_policies").load_data()

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model,
    transformations=[SentenceSplitter(chunk_size=512, chunk_overlap=64)],
)

print("Indexed documents:", len(documents))
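
To get a feel for what chunk_size=512 and chunk_overlap=64 mean, here is a deliberately simplified sliding-window sketch. It is not how SentenceSplitter works internally (the real splitter counts tokens with a tokenizer and tries to respect sentence boundaries); sliding_window_chunks is a hypothetical function that only illustrates how the overlap carries context from one chunk into the next.

```python
def sliding_window_chunks(
    tokens: list[str], chunk_size: int, chunk_overlap: int
) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


# 1,000 dummy tokens split with the same settings as the ingestion code.
tokens = [str(i) for i in range(1000)]
chunks = sliding_window_chunks(tokens, chunk_size=512, chunk_overlap=64)
print([len(c) for c in chunks])
# prints [512, 512, 104]
```

Larger overlaps reduce the chance that a clause such as an exclusion is cut in half, at the cost of storing more vectors.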

If you have structured metadata like policy_type, region, or effective_date, attach it before indexing so retrieval can filter on business rules later.
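
One way to attach that metadata is to derive it from the file name at load time. The sketch below assumes a hypothetical naming convention of policy_type_region_effective-date (for example homeowners_US_2025-01-01.pdf); policy_metadata is an illustrative helper, and your own convention will likely differ.

```python
from pathlib import Path


def policy_metadata(file_path: str) -> dict:
    """Derive metadata from a file named <policy_type>_<region>_<effective_date>.<ext>."""
    path = Path(file_path)
    parts = path.stem.split("_")
    metadata = {"file_name": path.name}
    # Only trust the convention when the name has exactly three parts.
    if len(parts) == 3:
        policy_type, region, effective_date = parts
        metadata.update(
            policy_type=policy_type.lower(),
            region=region.upper(),
            effective_date=effective_date,
        )
    return metadata


print(policy_metadata("./insurance_policies/homeowners_US_2025-01-01.pdf"))

# Usage with the reader from this step (requires llama-index):
# SimpleDirectoryReader("./insurance_policies", file_metadata=policy_metadata).load_data()
```

SimpleDirectoryReader accepts a file_metadata callable that receives each file path and returns a dict, so a function like this can be passed in directly.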

  5. Query the index with RAG

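Now build a retriever-backed query engine that pulls relevant policy sections from Supabase and generates answers. Unless you configure a different LLM, the query engine uses LlamaIndex's default (OpenAI), so OPENAI_API_KEY must be set.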

query_engine = index.as_query_engine(similarity_top_k=3)

response = query_engine.query(
    "Does this travel policy cover emergency medical evacuation?"
)

print(response)

For insurance workflows, keep prompts grounded. Ask for citations or source nodes so adjusters and underwriters can trace answers back to policy text.
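
In LlamaIndex the response object exposes the retrieved chunks as response.source_nodes, which is enough to print a simple citation list next to each answer. The sketch below shows one possible format; format_citations is a hypothetical helper, and the stand-in objects mimic only the attributes it reads (score, node.metadata, node.get_content()) so the example runs without a live index.

```python
from types import SimpleNamespace


def format_citations(source_nodes, max_chars: int = 120) -> list[str]:
    """Turn retrieved nodes into numbered, human-readable citation lines."""
    lines = []
    for rank, item in enumerate(source_nodes, start=1):
        meta = item.node.metadata or {}
        source = meta.get("file_name", "unknown source")
        score = "n/a" if item.score is None else f"{item.score:.2f}"
        # Collapse whitespace so multi-line chunks fit on one line.
        snippet = " ".join(item.node.get_content().split())[:max_chars]
        lines.append(f"[{rank}] {source} (score {score}): {snippet}")
    return lines


# Stand-in for response.source_nodes, for illustration only.
fake_nodes = [
    SimpleNamespace(
        score=0.84,
        node=SimpleNamespace(
            metadata={"file_name": "travel.pdf"},
            get_content=lambda: "Emergency medical\nevacuation is covered",
        ),
    )
]
for line in format_citations(fake_nodes):
    print(line)
# prints [1] travel.pdf (score 0.84): Emergency medical evacuation is covered
```

In real use you would call format_citations(response.source_nodes) after query_engine.query(...).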

Testing the Integration

Run a direct retrieval test first. This checks that embeddings were stored and that similarity search returns relevant chunks.

retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("water damage exclusion in homeowners policy")

for node in nodes:
    print("SCORE:", node.score)
    print("TEXT:", node.node.text[:200])
    print("META:", node.node.metadata)
    print("---")

Expected output looks something like this (scores and text depend on your documents):

SCORE: 0.84
TEXT: Water damage caused by gradual seepage is excluded unless...
META: {'policy_type': 'homeowners', 'region': 'US'}
---
SCORE: 0.79
TEXT: This policy does not cover losses arising from flooding...
META: {'policy_type': 'homeowners', 'region': 'US'}
---
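
As a reminder of what a similarity score measures, here is cosine similarity in plain Python. This is only an illustration of the metric behind vector_cosine_ops; the exact score scale a given vector store reports may differ, and cosine_similarity here is a hypothetical stand-alone function, not a library call.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 same direction, 0.0 orthogonal."""
    if len(a) != len(b):
        # The same mismatch breaks retrieval when the model and table dimensions differ.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
print(round(cosine_similarity([1.0, 1.0], [1.0, 0.0]), 4))  # 45 degrees apart
```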

If you get empty results:

  • confirm the collection_name is the same at ingestion time and at query time
  • verify the embedding dimension matches your model
  • check that your documents were actually loaded from disk
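
For the last check, a quick inventory of the input folder can rule out an empty or misnamed directory before you look at the database. The sketch below is one way to do it; summarize_input_dir is a hypothetical helper, and it looks only at the top level and skips dotfiles and empty files, which is an assumption about how you call the reader.

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarize_input_dir(directory: str) -> Counter:
    """Count non-empty, non-hidden files in a folder, grouped by extension."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {directory}")
    counts = Counter()
    for path in root.iterdir():
        if not path.is_file() or path.name.startswith("."):
            continue
        if path.stat().st_size == 0:
            continue  # empty files produce no chunks
        counts[path.suffix.lower() or "<no extension>"] += 1
    return counts


# Demo on a temporary folder so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "homeowners.txt").write_text("Water damage exclusion ...")
    (Path(tmp) / "travel.txt").write_text("Emergency evacuation ...")
    (Path(tmp) / "empty.pdf").write_text("")
    print(dict(summarize_input_dir(tmp)))
# prints {'.txt': 2}
```

Running summarize_input_dir("./insurance_policies") before ingestion shows at a glance whether there is anything to index.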

Real-World Use Cases

  • Claims intake assistant

    • Retrieve relevant coverage clauses from policy docs while a claims handler asks questions.
    • Use metadata filters like line of business or jurisdiction to narrow results.
  • Underwriting copilot

    • Search underwriting manuals and referral rules before approving edge-case risks.
    • Keep answers tied to source passages so underwriters can audit decisions.
  • Broker support bot

    • Answer questions about endorsements, exclusions, waiting periods, and renewal terms.
    • Persist all indexed content in Supabase so multiple bots share one retrieval layer.

This setup is simple enough to ship quickly and solid enough for regulated workflows. LlamaIndex handles retrieval logic; Supabase gives you durable storage and operational control; together they form a clean RAG backbone for insurance agents.


By Cyprian Aarons, AI Consultant at Topiax.