How to Integrate Azure OpenAI for insurance with CosmosDB for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: azure-openai-for-insurance, cosmosdb, rag

Azure OpenAI gives you the generation layer; Cosmos DB gives you the retrieval layer. For insurance agents, that combination is what turns a generic chatbot into a policy-aware assistant that can answer from claims docs, endorsements, underwriting guidelines, and FAQ content with traceable context.

If you are building an AI agent system for insurance, RAG is the right pattern. You store approved knowledge in Cosmos DB, retrieve the relevant chunks at query time, and pass them into Azure OpenAI to generate grounded answers.

Prerequisites

  • An Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB account
  • A deployed Azure OpenAI model:
    • Chat model like gpt-4o-mini or gpt-4.1-mini
    • Embedding model like text-embedding-3-small
  • A Cosmos DB database and container for your policy/claims documents
  • Python 3.10+
  • Installed packages:
    • openai (the official OpenAI package, which provides the AzureOpenAI client)
    • azure-cosmos
    • azure-identity
    • python-dotenv
  • Environment variables set:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_CHAT_DEPLOYMENT
    • AZURE_OPENAI_EMBEDDING_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE_NAME
    • COSMOS_CONTAINER_NAME

Integration Steps

  1. Set up your clients and configuration.

Use the official SDK clients directly so your agent can call both services from one runtime.

import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

azure_openai_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # use a current GA API version
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"]
)

database = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE_NAME"])
container = database.get_container_client(os.environ["COSMOS_CONTAINER_NAME"])
  2. Create embeddings for your insurance content.

For RAG, each document chunk needs an embedding before you store it in Cosmos DB. In insurance, chunk at the clause or paragraph level, not full PDFs.

def embed_text(text: str) -> list[float]:
    response = azure_openai_client.embeddings.create(
        model=os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"],
        input=text
    )
    return response.data[0].embedding


policy_chunk = {
    "id": "policy_123_clause_7",
    "docType": "policy",
    "title": "Water Damage Exclusion",
    "content": "This policy excludes damage caused by gradual seepage over time.",
}

policy_chunk["embedding"] = embed_text(policy_chunk["content"])
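The clause-level chunking suggested above can be sketched as a plain helper. This is a minimal illustration, not a production splitter: it assumes paragraphs are separated by blank lines and simply merges short paragraphs up to a size budget. Real policy wordings usually call for clause-number-aware splitting.

```python
import re

def chunk_policy_text(text: str, max_chars: int = 800) -> list[str]:
    """Split policy wording into clause-sized chunks.

    Splits on blank lines (paragraph boundaries), then merges short
    paragraphs until a chunk approaches max_chars.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk then goes through embed_text and is upserted as its own Cosmos DB item.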
  3. Store documents in Cosmos DB with vector-friendly metadata.

Cosmos DB becomes your retrieval store. Keep the original text, metadata for filtering, and the embedding vector in the same item.

container.upsert_item({
    "id": policy_chunk["id"],
    "docType": policy_chunk["docType"],
    "title": policy_chunk["title"],
    "content": policy_chunk["content"],
    "embedding": policy_chunk["embedding"],
})

If you are using Cosmos DB vector search, make sure your container is configured with a vector index and a vector policy that matches your embedding dimensions.
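As a rough sketch of that configuration, assuming azure-cosmos 4.7+ (which added vector policy support on container creation) and a 1536-dimension model such as text-embedding-3-small, the policies and a hypothetical `ensure_vector_container` helper might look like this. Adjust `dimensions` to whatever your embedding deployment actually returns.

```python
# Assumes 1536 dimensions (text-embedding-3-small's default output size).
VECTOR_EMBEDDING_POLICY = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536,
        }
    ]
}

INDEXING_POLICY = {
    "vectorIndexes": [
        {"path": "/embedding", "type": "quantizedFlat"}
    ]
}

def ensure_vector_container(database, name: str):
    """Create (or fetch) a container with a vector index on /embedding."""
    # azure-cosmos >= 4.7 provides PartitionKey and the vector policy kwargs.
    from azure.cosmos import PartitionKey

    return database.create_container_if_not_exists(
        id=name,
        partition_key=PartitionKey(path="/docType"),
        indexing_policy=INDEXING_POLICY,
        vector_embedding_policy=VECTOR_EMBEDDING_POLICY,
    )
```

The distance function here (cosine) should match whatever you pass to VectorDistance at query time; mismatched dimensions between the policy and your embedding model will surface as ingestion or query errors.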

  4. Retrieve relevant chunks from Cosmos DB at query time.

At query time, embed the user question, then run a similarity search against Cosmos DB. In production, add filters like line of business, jurisdiction, or effective date.

def retrieve_context(query: str, top_k: int = 3) -> list[str]:
    query_embedding = embed_text(query)

    sql_query = """
    SELECT TOP @top_k c.id, c.title, c.content
    FROM c
    ORDER BY VectorDistance(c.embedding, @query_embedding)
    """

    params = [
        {"name": "@top_k", "value": top_k},
        {"name": "@query_embedding", "value": query_embedding}
    ]

    items = list(container.query_items(
        query=sql_query,
        parameters=params,
        enable_cross_partition_query=True
    ))

    return [f"{item['title']}: {item['content']}" for item in items]
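The production filters mentioned above slot into the same query as a WHERE clause. As a sketch, this hypothetical helper builds the filtered SQL and parameters; pass its output to container.query_items exactly as in retrieve_context. The docType filter is an assumption based on the field stored earlier; swap in your own line-of-business, jurisdiction, or effective-date fields.

```python
def build_filtered_query(doc_type: str, top_k: int,
                         query_embedding: list[float]) -> tuple[str, list[dict]]:
    """Build a vector query that filters on docType before ranking."""
    sql_query = """
    SELECT TOP @top_k c.id, c.title, c.content
    FROM c
    WHERE c.docType = @doc_type
    ORDER BY VectorDistance(c.embedding, @query_embedding)
    """
    params = [
        {"name": "@top_k", "value": top_k},
        {"name": "@doc_type", "value": doc_type},
        {"name": "@query_embedding", "value": query_embedding},
    ]
    return sql_query, params
```

Filtering before ranking matters in insurance: without it, a homeowners question can surface commercial-lines wording that happens to be semantically close.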
  5. Send retrieved context to Azure OpenAI for grounded answers.

Now build the prompt with retrieved evidence and ask Azure OpenAI to answer only from that context. This is where your insurance agent becomes useful instead of hallucinating.

def answer_question(question: str) -> str:
    context_chunks = retrieve_context(question)
    context_text = "\n\n".join(context_chunks)

    messages = [
        {
            "role": "system",
            "content": (
                "You are an insurance assistant. Answer only from the provided context. "
                "If the context does not contain enough information, say so."
            )
        },
        {
            "role": "user",
            "content": f"""Context:
{context_text}

Question:
{question}"""
        }
    ]

    response = azure_openai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
        messages=messages,
        temperature=0.2,
        max_tokens=300
    )

    return response.choices[0].message.content

Testing the Integration

Run a direct question against your RAG pipeline and verify that the answer comes from stored insurance content.

if __name__ == "__main__":
    question = "Does this policy cover gradual water seepage?"
    answer = answer_question(question)

    print("QUESTION:", question)
    print("ANSWER:", answer)

Expected output:

QUESTION: Does this policy cover gradual water seepage?
ANSWER: The provided policy context excludes damage caused by gradual seepage over time.

If you get a vague answer or no mention of exclusion language, check these first:

  • Your chunking is too large or too small
  • The embedding field is missing in Cosmos DB
  • Vector search is not enabled on the container
  • The system prompt allows unsupported speculation
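The second check above (a missing embedding field) is easy to automate. This small sketch audits a sample of items, say the result of a plain SELECT * FROM c over the container, and flags any whose embedding is absent or the wrong length. The 1536 default assumes text-embedding-3-small; change it to match your deployment.

```python
def audit_embeddings(items: list[dict], expected_dim: int = 1536) -> list[str]:
    """Return ids of items whose embedding field is missing or mis-sized."""
    bad_ids = []
    for item in items:
        embedding = item.get("embedding")
        if not isinstance(embedding, list) or len(embedding) != expected_dim:
            bad_ids.append(item["id"])
    return bad_ids
```

Any ids it returns point to items that silently fall out of VectorDistance ranking, which is a common cause of vague answers.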

Real-World Use Cases

  • Claims intake assistant that checks claim notes against policy wording before routing to adjusters.
  • Underwriting copilot that retrieves product rules, appetite guides, and jurisdiction-specific exclusions.
  • Customer service agent that answers coverage questions using approved policy documents instead of free-form model memory.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

