How to Integrate Azure OpenAI with Cosmos DB for RAG in Investment Banking
Combining Azure OpenAI with Cosmos DB gives you a practical RAG stack for investment banking workflows. You can ground responses in deal docs, pitch books, research notes, policy memos, and client-specific data without dumping everything into the model context window.
The pattern is straightforward: Azure OpenAI handles embeddings and generation, while Cosmos DB stores and retrieves the vectorized knowledge base. That lets you build assistants that answer questions like “What were the key covenants in the latest credit memo?” or “Summarize comparable company risks from our internal research library.”
Prerequisites
- An Azure subscription with:
  - An Azure OpenAI resource
  - An Azure Cosmos DB for NoSQL account with vector search enabled
- Python 3.10+
- Installed packages: `openai`, `azure-cosmos`, `python-dotenv`
- Azure OpenAI deployment names for:
  - A chat model
  - An embedding model
- A Cosmos DB database and container created with:
  - A partition key
  - A vector embedding policy
  - A vector index
- Environment variables set:
  - `AZURE_OPENAI_ENDPOINT`
  - `AZURE_OPENAI_API_KEY`
  - `AZURE_OPENAI_API_VERSION`
  - `AZURE_OPENAI_CHAT_DEPLOYMENT`
  - `AZURE_OPENAI_EMBEDDING_DEPLOYMENT`
  - `COSMOS_ENDPOINT`
  - `COSMOS_KEY`
  - `COSMOS_DATABASE_NAME`
  - `COSMOS_CONTAINER_NAME`

Install dependencies:

```shell
pip install openai azure-cosmos python-dotenv
```
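If you have not created the container yet, the vector embedding policy and vector index from the prerequisites look roughly like this. This is a sketch: the `/embedding` path, `clients`/`chunks` names, and the 1536-dimension assumption (which matches text-embedding-3-small and text-embedding-ada-002) are mine, and creating containers with these policies requires a recent azure-cosmos SDK (4.7+) with vector search enabled on the account.

```python
# Vector embedding policy: tells Cosmos DB which document path holds
# vectors, their element type, distance function, and dimensionality.
# NOTE: 1536 dims is an assumption; match it to your embedding model.
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536,
        }
    ]
}

# Indexing policy: exclude the raw vector from the regular index and
# add a dedicated vector index over the same path.
indexing_policy = {
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/embedding/*"}],
    "vectorIndexes": [{"path": "/embedding", "type": "quantizedFlat"}],
}

# Container creation sketch (requires azure-cosmos >= 4.7), e.g.:
# from azure.cosmos import PartitionKey
# db.create_container_if_not_exists(
#     id="chunks",
#     partition_key=PartitionKey(path="/clientId"),
#     indexing_policy=indexing_policy,
#     vector_embedding_policy=vector_embedding_policy,
# )
```

The partition key choice (here `/clientId`) matters: partitioning by client keeps each client's chunks together, which pairs naturally with the metadata filtering discussed later.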
Integration Steps
1. Set up clients for Azure OpenAI and Cosmos DB.

Use the Azure OpenAI SDK for chat and embeddings, and the Cosmos SDK for storage and retrieval.

```python
import os

from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

# Client for embeddings and chat completions
aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

# Client for the vectorized knowledge base
cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"],
)
db = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE_NAME"])
container = db.get_container_client(os.environ["COSMOS_CONTAINER_NAME"])
```
2. Generate embeddings for banking documents before storing them.

For RAG, every chunk needs a vector. Keep chunks small enough to preserve relevance, usually one section of a memo or one page of a deck.

```python
def embed_text(text: str) -> list[float]:
    response = aoai_client.embeddings.create(
        model=os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"],
        input=text,
    )
    return response.data[0].embedding

chunk = {
    "id": "credit-memo-001",
    "docType": "credit_memo",
    "clientId": "client-123",
    "title": "Acquisition Financing Memo",
    "content": "The borrower maintains a minimum EBITDA covenant of 3.0x...",
}
chunk["embedding"] = embed_text(chunk["content"])
```
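The "keep chunks small" advice can be implemented with a simple sliding-window splitter. This is a generic sketch, not from the article: `chunk_text`, the 2,000-character window, and the 200-character overlap are illustrative defaults you should tune to your documents.

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping windows so a covenant clause that
    straddles a boundary still appears intact in at least one chunk."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` so consecutive chunks share context
        start = end - overlap
    return chunks

# Each piece would then be embedded and stored as its own item, e.g.:
# for i, piece in enumerate(chunk_text(memo_text)):
#     item = {"id": f"credit-memo-001-{i}", "content": piece}
#     item["embedding"] = embed_text(piece)
```

For real memos and decks you would usually split on section headings or page boundaries first and only fall back to character windows for oversized sections.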
3. Store document chunks in Cosmos DB with metadata for filtering.

In investment banking, metadata matters as much as vectors. You need filters like client, desk, deal type, region, or document class.

```python
container.upsert_item({
    "id": chunk["id"],
    "docType": chunk["docType"],
    "clientId": chunk["clientId"],
    "title": chunk["title"],
    "content": chunk["content"],
    "embedding": chunk["embedding"],
})
```
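One way to put that metadata to work is to add a WHERE clause ahead of the vector ranking, so a question about one client can never surface another client's documents. The helper below is illustrative; `build_filtered_query` and the `clientId` filter are my naming, not part of the SDK.

```python
def build_filtered_query(
    client_id: str, top_k: int, query_embedding: list[float]
) -> tuple[str, list[dict]]:
    """Build a Cosmos DB SQL query that combines a metadata filter
    with vector ranking. Returns the query text and its parameters."""
    sql = """
    SELECT TOP @top_k c.id, c.title, c.content
    FROM c
    WHERE c.clientId = @clientId
    ORDER BY VectorDistance(c.embedding, @query_embedding)
    """
    params = [
        {"name": "@top_k", "value": top_k},
        {"name": "@clientId", "value": client_id},
        {"name": "@query_embedding", "value": query_embedding},
    ]
    return sql, params

# Execution sketch:
# sql, params = build_filtered_query("client-123", 3, embed_text(question))
# items = list(container.query_items(query=sql, parameters=params,
#                                    enable_cross_partition_query=True))
```

If the container is partitioned by `/clientId`, this filter also keeps the query inside a single partition, which is cheaper than a cross-partition scan.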
4. Retrieve the most relevant chunks using vector search.

Cosmos DB exposes vector search through its SQL query language via the `VectorDistance` function, with parameters passed alongside the query text. Use the query embedding to fetch top matches before calling the chat model.

```python
def retrieve_context(query: str, top_k: int = 3) -> list[dict]:
    query_embedding = embed_text(query)
    sql = """
    SELECT TOP @top_k c.id, c.title, c.content
    FROM c
    ORDER BY VectorDistance(c.embedding, @query_embedding)
    """
    params = [
        {"name": "@top_k", "value": top_k},
        {"name": "@query_embedding", "value": query_embedding},
    ]
    items = list(container.query_items(
        query=sql,
        parameters=params,
        enable_cross_partition_query=True,
    ))
    return items

results = retrieve_context("What are the EBITDA covenant thresholds in the latest memo?")
```
5. Send retrieved context to Azure OpenAI for grounded answers.

Keep the prompt strict. In banking workflows, you want answers tied to source material only.

```python
def answer_question(question: str) -> str:
    context_docs = retrieve_context(question)
    context_text = "\n\n".join(
        f"Title: {doc['title']}\nContent: {doc['content']}"
        for doc in context_docs
    )
    messages = [
        {
            "role": "system",
            "content": (
                "You are an assistant for investment banking analysts. "
                "Answer only using the provided context. "
                "If the context is insufficient, say so."
            ),
        },
        {
            "role": "user",
            "content": f"Context:\n{context_text}\n\nQuestion: {question}",
        },
    ]
    response = aoai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
        messages=messages,
        temperature=0.2,
    )
    return response.choices[0].message.content

print(answer_question("Summarize the covenant requirements for this deal"))
```
Testing the Integration
Run a simple end-to-end test with one known document and one question that should clearly match it.
```python
test_question = "What is the minimum EBITDA covenant?"
answer = answer_question(test_question)
print("QUESTION:", test_question)
print("ANSWER:", answer)
```
Expected output:

```
QUESTION: What is the minimum EBITDA covenant?
ANSWER: The context states that the borrower must maintain a minimum EBITDA covenant of 3.0x.
```
If you get an empty or vague answer, check these first:
- The embedding deployment name is correct
- The Cosmos container actually contains vectors
- Your query uses the same embedding model as ingestion
- The prompt instructs the model to stay grounded in retrieved context
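The second item on that checklist is easy to automate: before blaming the prompt, spot-check that stored items actually carry vectors of the expected dimensionality. The helper below is a sketch; `validate_chunk` is my naming, and the 1536-dimension default is an assumption that must match your embedding deployment.

```python
def validate_chunk(item: dict, expected_dims: int = 1536) -> list[str]:
    """Return a list of problems with a stored chunk; empty means OK.
    expected_dims assumes a 1536-dim embedding model
    (e.g. text-embedding-3-small); adjust for your deployment."""
    problems = []
    embedding = item.get("embedding")
    if embedding is None:
        problems.append("missing 'embedding' field")
    elif not isinstance(embedding, list) or len(embedding) != expected_dims:
        problems.append(
            f"embedding has unexpected shape, expected {expected_dims} dims"
        )
    if not item.get("content"):
        problems.append("empty 'content' field")
    return problems

# Spot-check a handful of stored items, e.g.:
# sample = container.query_items(query="SELECT TOP 10 * FROM c",
#                                enable_cross_partition_query=True)
# for item in sample:
#     issues = validate_chunk(item)
#     if issues:
#         print(item["id"], issues)
```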
Real-World Use Cases
- Deal room assistant: let bankers ask questions across CIMs, credit memos, and diligence packs, with citations from stored chunks.
- Research copilot: summarize internal equity research and surface comparable company risks tied to specific sectors or clients.
- Policy-aware drafting: generate first-pass emails, memos, or investment committee notes using firm-approved language stored in Cosmos DB.
This setup is production-friendly because it separates concerns cleanly. Cosmos DB handles persistence and retrieval; Azure OpenAI handles semantic reasoning and generation; your agent orchestrates both with tight control over what gets answered and where it came from.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.