How to Integrate Azure OpenAI for wealth management with CosmosDB for RAG
Azure OpenAI gives you the reasoning layer for wealth management assistants: summarizing portfolios, answering policy questions, and drafting client-ready responses. CosmosDB gives you the retrieval layer: low-latency storage for client documents, market notes, suitability rules, and conversation memory that your agent can pull into context before generating an answer.
Prerequisites
- Python 3.10+
- An Azure subscription with:
  - An Azure OpenAI resource
  - An Azure Cosmos DB account
- Deployed Azure OpenAI models:
  - A chat model deployment, for example `gpt-4o-mini`
  - An embedding model deployment, for example `text-embedding-3-small`
- A Cosmos DB database and container for RAG documents
- Environment variables set: `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `AZURE_COSMOS_ENDPOINT`, `AZURE_COSMOS_KEY`, `AZURE_COSMOS_DATABASE`, `AZURE_COSMOS_CONTAINER`
- Python packages: `openai`, `azure-cosmos`, `python-dotenv`
Install them:
pip install openai azure-cosmos python-dotenv
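For reference, a `.env` file consumed by `python-dotenv` would look like the fragment below. The variable names match the prerequisites; every value is a placeholder you replace with your own resource details, and the database and container names are just examples.

```
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_COSMOS_ENDPOINT=https://your-account.documents.azure.com:443/
AZURE_COSMOS_KEY=your-cosmos-primary-key
AZURE_COSMOS_DATABASE=wealth-rag
AZURE_COSMOS_CONTAINER=documents
```

Keep this file out of version control; keys for wealth management systems are exactly the kind of secret that should live in a vault in production.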
Integration Steps
1) Initialize Azure OpenAI and CosmosDB clients
Keep the clients separate. Azure OpenAI handles generation and embeddings; CosmosDB handles document storage and retrieval.
import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient
load_dotenv()
aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)
cosmos_client = CosmosClient(
    url=os.environ["AZURE_COSMOS_ENDPOINT"],
    credential=os.environ["AZURE_COSMOS_KEY"],
)
database_name = os.environ["AZURE_COSMOS_DATABASE"]
container_name = os.environ["AZURE_COSMOS_CONTAINER"]
db = cosmos_client.get_database_client(database_name)
container = db.get_container_client(container_name)
2) Create embeddings for wealth management content
For RAG, you need embeddings for documents like IPS statements, product sheets, and compliance guidance. Store the vectors alongside the source text in CosmosDB.
def embed_text(text: str) -> list[float]:
    # For Azure OpenAI, "model" is your deployment name,
    # which here is assumed to match the base model name.
    response = aoai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding
sample_doc = {
    "id": "wm-policy-001",
    "clientId": "client-123",
    "docType": "policy",
    "title": "Discretionary Portfolio Risk Policy",
    "content": (
        "Client portfolios must maintain a maximum equity allocation of 60% "
        "for conservative profiles unless explicitly approved by the investment committee."
    ),
}
sample_doc["embedding"] = embed_text(sample_doc["content"])
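The sample policy above fits in one embedding, but real wealth management documents (full IPS statements, product disclosures) usually do not. A minimal word-based chunker is sketched below; the 200-word chunk size and 50-word overlap are illustrative assumptions, not Azure requirements.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-based chunks for embedding."""
    words = text.split()
    if len(words) <= chunk_size:
        return [" ".join(words)]
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and stored as its own CosmosDB item (for example with ids like "wm-policy-001-chunk-0"), sharing the parent document's clientId so retrieval still filters correctly.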
3) Write documents into CosmosDB
Use CosmosDB as your retrieval store. In production, include partitioning by tenant or client ID so reads stay efficient.
# If your container does not exist yet, create it once during setup
# (requires: from azure.cosmos import PartitionKey):
# db.create_container_if_not_exists(
#     id=container_name,
#     partition_key=PartitionKey(path="/clientId"),
# )
container.upsert_item(sample_doc)
print(f"Stored document: {sample_doc['id']}")
4) Retrieve relevant context from CosmosDB
CosmosDB is your source of truth for metadata and document text. For a basic RAG pipeline, fetch candidate docs by client ID or doc type, then rank them in Python using cosine similarity.
import math
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_context(query: str, client_id: str, top_k: int = 3):
    query_embedding = embed_text(query)
    # Use a parameterized query rather than interpolating client_id
    # into the SQL string, which would be an injection risk.
    docs = list(container.query_items(
        query="SELECT * FROM c WHERE c.clientId = @clientId",
        parameters=[{"name": "@clientId", "value": client_id}],
        enable_cross_partition_query=True,
    ))
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(query_embedding, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]
query = "Can this client increase equity exposure in their portfolio?"
context_docs = retrieve_context(query, client_id="client-123")
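Before wiring this to live data, you can sanity-check the ranking logic offline with toy vectors; no Azure calls are involved, and the vectors and doc ids below are made up purely for the test.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings": doc-a points roughly the same way as the
# query, doc-b is orthogonal, doc-c points the opposite way.
query_vec = [1.0, 0.0, 0.0]
docs = [
    {"id": "doc-b", "embedding": [0.0, 1.0, 0.0]},
    {"id": "doc-c", "embedding": [-1.0, 0.0, 0.0]},
    {"id": "doc-a", "embedding": [0.9, 0.1, 0.0]},
]
ranked = sorted(
    docs,
    key=lambda d: cosine_similarity(query_vec, d["embedding"]),
    reverse=True,
)
print([d["id"] for d in ranked])  # doc-a first, doc-c last
```

If the ordering here is wrong, no amount of prompt engineering downstream will save the answer quality.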
5) Send retrieved context to Azure OpenAI for the final answer
This is the actual RAG step. Inject retrieved passages into the prompt so the model answers from your firm’s data instead of guessing.
def build_prompt(question: str, docs: list[dict]) -> str:
    context_blocks = []
    for doc in docs:
        context_blocks.append(
            f"Title: {doc['title']}\nType: {doc['docType']}\nContent: {doc['content']}"
        )
    context_text = "\n\n".join(context_blocks)
    return f"""
You are a wealth management assistant.
Answer only using the provided context.
If the context is insufficient, say you do not have enough information.

Context:
{context_text}

Question:
{question}
""".strip()
prompt = build_prompt(query, context_docs)
response = aoai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You assist with wealth management operations."},
        {"role": "user", "content": prompt},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
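One practical gap in the flow above: with larger top_k values or long documents, the retrieved passages can blow past the model's context window. A crude character-budget trim is sketched below; characters are a rough proxy for tokens, and the 4000-character budget is an arbitrary assumption, not an Azure limit.

```python
def trim_context(docs: list[dict], max_chars: int = 4000) -> list[dict]:
    """Keep docs in ranked order until combined content length exceeds the budget.

    Always keeps at least the top-ranked doc, even if it alone exceeds the budget.
    """
    kept, used = [], 0
    for doc in docs:
        length = len(doc["content"])
        if used + length > max_chars and kept:
            break
        kept.append(doc)
        used += length
    return kept
```

You would call this on the ranked docs before build_prompt, so the highest-similarity passages survive and the tail gets dropped.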
Testing the Integration
Run a simple end-to-end check:
test_question = "What is the maximum equity allocation allowed for this conservative profile?"
docs = retrieve_context(test_question, client_id="client-123")
answer_prompt = build_prompt(test_question, docs)
result = aoai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": answer_prompt},
    ],
)
print("Answer:", result.choices[0].message.content)
Expected output:
Answer: The maximum equity allocation allowed is 60% unless explicitly approved by the investment committee.
If you get an answer grounded in your stored policy text, the integration is working. If the model starts inventing details, your retrieval layer is too weak or your prompt is missing grounding instructions.
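One crude way to catch invented details in automated tests is to check how much of the answer's vocabulary actually appears in the retrieved context. This is a rough heuristic of my own, not an Azure feature; real deployments should use a proper groundedness evaluation, but it catches obvious hallucinations cheaply.

```python
def grounding_score(answer: str, context: str) -> float:
    """Fraction of distinct answer words (4+ chars) that also appear in the context."""
    answer_words = {w.strip(".,%").lower() for w in answer.split() if len(w) >= 4}
    context_words = {w.strip(".,%").lower() for w in context.split()}
    if not answer_words:
        return 1.0
    return len(answer_words & context_words) / len(answer_words)

context = ("Client portfolios must maintain a maximum equity allocation of 60% "
           "for conservative profiles unless explicitly approved by the investment committee.")
grounded = "The maximum equity allocation is 60% unless approved by the investment committee."
invented = "The client should rotate into emerging-market bonds and crypto immediately."
```

A grounded answer like the first example scores near 1.0 against the stored policy text, while the invented recommendation scores far lower, which makes a usable regression-test threshold.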
Real-World Use Cases
- Client-facing portfolio assistants that answer questions using approved investment policy documents and account notes.
- Advisor copilots that retrieve suitability rules, fee schedules, and product disclosures before drafting recommendations.
- Compliance review agents that compare generated advice against stored policy excerpts in CosmosDB before sending anything to a client.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.