How to Integrate Azure OpenAI for pension funds with CosmosDB for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: azure-openai-for-pension-funds, cosmosdb, rag

Why this integration matters

Pension fund teams need answers that are grounded in policy documents, member communications, actuarial notes, and regulatory guidance. Azure OpenAI gives you the reasoning layer, while CosmosDB gives you a durable retrieval layer for RAG so your agent can answer with context pulled from approved internal data instead of guessing.

Prerequisites

  • Azure subscription with:
    • Azure OpenAI resource deployed
    • Cosmos DB account provisioned
  • An Azure OpenAI model deployment name, for example gpt-4o-mini
  • A Cosmos DB for NoSQL database and container for document chunks
  • Python 3.10+
  • Installed packages:
    • openai
    • azure-cosmos
    • python-dotenv
  • Environment variables configured:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_API_VERSION
    • AZURE_OPENAI_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE_NAME
    • COSMOS_CONTAINER_NAME
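A minimal `.env` sketch matching the variables above (all values are placeholders; use an API version your Azure OpenAI resource actually supports, and never commit real keys):

```shell
# .env — placeholder values only
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-azure-openai-key>
AZURE_OPENAI_API_VERSION=2024-06-01
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
COSMOS_ENDPOINT=https://your-account.documents.azure.com:443/
COSMOS_KEY=<your-cosmos-primary-key>
COSMOS_DATABASE_NAME=pension-rag
COSMOS_CONTAINER_NAME=policy-chunks
```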

Integration Steps

  1. Set up the clients

Start by wiring both SDKs into the same Python service. Keep credentials in environment variables and fail fast if anything is missing.

import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

azure_openai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"]
)

database = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE_NAME"])
container = database.get_container_client(os.environ["COSMOS_CONTAINER_NAME"])
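The os.environ[...] lookups above already raise KeyError on a missing variable, but an up-front check gives one clear error instead of failing on whichever lookup happens first. A sketch (the helper names are mine, not part of either SDK):

```python
import os
from collections.abc import Mapping

# Every variable the service needs before it can start.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION", "AZURE_OPENAI_DEPLOYMENT",
    "COSMOS_ENDPOINT", "COSMOS_KEY",
    "COSMOS_DATABASE_NAME", "COSMOS_CONTAINER_NAME",
]

def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

def require_env() -> None:
    """Fail fast with a single, readable error listing every missing variable."""
    missing = missing_vars()
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Call require_env() right after load_dotenv() so a misconfigured deployment fails at startup rather than mid-request.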

  2. Create a chunk schema in Cosmos DB

For RAG, store chunks as small retrievable units with metadata. For pension funds, that metadata matters: document type, jurisdiction, effective date, and policy version.

from datetime import datetime, timezone

def upsert_chunk(doc_id: str, chunk_id: str, text: str, embedding: list[float], metadata: dict):
    item = {
        "id": chunk_id,
        "docId": doc_id,
        "text": text,
        "embedding": embedding,
        "metadata": metadata,
        # datetime.utcnow() is deprecated in Python 3.12+; use an aware UTC timestamp
        "createdAt": datetime.now(timezone.utc).isoformat()
    }
    container.upsert_item(item)

sample_metadata = {
    "source": "member_benefits_policy.pdf",
    "documentType": "policy",
    "jurisdiction": "ZA",
    "effectiveDate": "2025-01-01",
    "version": "3.2"
}

  3. Generate embeddings with Azure OpenAI and store them in Cosmos DB

Use the embeddings endpoint to turn each chunk into a vector. With the Azure client, the model argument is your embedding deployment name (here assumed to be a deployment of text-embedding-3-large). In production, chunk before embedding; don’t send entire policy manuals as one request.

def embed_text(text: str) -> list[float]:
    response = azure_openai_client.embeddings.create(
        model="text-embedding-3-large",  # on Azure, pass your embedding *deployment* name
        input=text
    )
    return response.data[0].embedding

chunk_text = (
    "Members may request pension benefit statements quarterly. "
    "Early withdrawal is subject to board approval and statutory limits."
)

embedding = embed_text(chunk_text)
upsert_chunk(
    doc_id="policy-001",
    chunk_id="policy-001-chunk-01",
    text=chunk_text,
    embedding=embedding,
    metadata=sample_metadata
)
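The advice above is to chunk before embedding. A minimal fixed-size chunker with overlap, so context isn’t lost at chunk boundaries (the sizes are illustrative defaults, not tuned values; tune them against your own documents):

```python
def chunk_text(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        # step forward by less than the window size so consecutive chunks overlap
        start += max_chars - overlap
    return chunks
```

Pair it with the upsert loop: for each policy document, embed and upsert every chunk_text(...) result with its own chunk id.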

  4. Retrieve relevant context from Cosmos DB for RAG

Cosmos DB for NoSQL supports vector search through the VectorDistance system function, but only on containers created with a vector embedding policy and a vector index on the embedding path. Pull the top matches first, then pass only those chunks into the model.

def retrieve_chunks(query_embedding: list[float], top_k: int = 3):
    query = """
    SELECT TOP @top_k c.id, c.text, c.metadata
    FROM c
    ORDER BY VectorDistance(c.embedding, @query_embedding)
    """

    params = [
        {"name": "@top_k", "value": top_k},
        {"name": "@query_embedding", "value": query_embedding},
    ]

    items = list(container.query_items(
        query=query,
        parameters=params,
        enable_cross_partition_query=True
    ))
    return items

user_question = "Can a member request a statement every quarter?"
question_embedding = embed_text(user_question)
chunks = retrieve_chunks(question_embedding)
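For the VectorDistance query above to work, the container must be created with a vector embedding policy and a vector index on /embedding. A sketch of those policies as passed at container creation (field names follow the Azure Cosmos DB vector search docs, and the dimensions must match your embedding model, 3072 for text-embedding-3-large; this also assumes the vector search feature is enabled on your account):

```python
# Passed to database.create_container(...) as vector_embedding_policy and
# indexing_policy in recent azure-cosmos SDK versions.
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 3072,  # output size of text-embedding-3-large
        }
    ]
}

indexing_policy = {
    "vectorIndexes": [
        # quantizedFlat trades a little accuracy for cheaper storage/RU;
        # "flat" and "diskANN" are the other documented index types
        {"path": "/embedding", "type": "quantizedFlat"}
    ]
}
```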

  5. Call Azure OpenAI with retrieved context

Now build the prompt from the retrieved chunks and ask Azure OpenAI to answer only from that context. This is the part that makes it usable for pension operations and compliance workflows.

def answer_with_rag(question: str, retrieved_chunks: list[dict]) -> str:
    context = "\n\n".join(
        f"[Source: {c['metadata']['source']}]\n{c['text']}"
        for c in retrieved_chunks
    )

    messages = [
        {
            "role": "system",
            "content": (
                "You are a pension fund assistant. "
                "Answer only using the provided context. "
                "If the context is insufficient, say so."
            )
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}"
        }
    ]

    response = azure_openai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=messages,
        temperature=0.1
    )
    return response.choices[0].message.content

answer = answer_with_rag(user_question, chunks)
print(answer)

Testing the Integration

Use one known policy question and verify that retrieval returns a relevant chunk before you trust the generation step.

test_question = "How often can members request pension benefit statements?"
test_embedding = embed_text(test_question)
results = retrieve_chunks(test_embedding, top_k=2)

print("Retrieved chunks:", len(results))
for r in results:
    print(r["id"], r["metadata"]["source"])

final_answer = answer_with_rag(test_question, results)
print("\nAnswer:\n", final_answer)

Expected output (with only the single sample chunk stored above, TOP 2 returns one result):

Retrieved chunks: 1
policy-001-chunk-01 member_benefits_policy.pdf

Answer:
Members may request pension benefit statements quarterly...

If retrieval returns zero relevant chunks or the answer starts inventing policy details, fix your chunking strategy or tighten your system prompt.
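One cheap guard for the zero-chunk case: refuse to call the model at all when retrieval comes back empty, so the agent can never answer without grounding. A sketch; answer_fn is the answer_with_rag function from step 5, passed in so the guard stays testable:

```python
NO_CONTEXT_REPLY = (
    "I couldn't find this in the approved fund documents. "
    "Please contact the fund administrator."
)

def guarded_answer(question: str, retrieved_chunks: list[dict], answer_fn) -> str:
    """Only invoke the model when retrieval produced grounding context."""
    if not retrieved_chunks:
        # No context retrieved: return a fixed fallback instead of letting
        # the model improvise policy details
        return NO_CONTEXT_REPLY
    return answer_fn(question, retrieved_chunks)
```

Wire it in as guarded_answer(user_question, chunks, answer_with_rag).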

Real-World Use Cases

  • Member self-service assistant: answer questions about contribution rules, vesting periods, retirement options, and statement frequency using approved fund documents.
  • Compliance support bot: help internal teams find regulator-aligned language across policies, circulars, and board resolutions without manually searching PDFs.
  • Advisor knowledge assistant: surface product rules, eligibility criteria, and fund-specific exceptions so advisors can respond faster with fewer escalations.

By Cyprian Aarons, AI Consultant at Topiax.