How to Integrate Azure OpenAI for Retail Banking with Cosmos DB for RAG

By Cyprian Aarons · Updated 2026-04-21

Azure OpenAI gives you the generation and reasoning layer. Cosmos DB gives you the durable retrieval layer for customer policies, product docs, KYC rules, and support transcripts.

For retail banking, that combination is what turns a chat model into a usable RAG agent: grounded answers, lower hallucination risk, and traceable responses against internal knowledge.

Prerequisites

  • Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB for NoSQL account
  • An Azure OpenAI deployment created for:
    • chat/completions model
    • embedding model
  • Cosmos DB database and container created for your knowledge base
  • Python 3.10+
  • Installed packages:
    • openai
    • azure-cosmos
    • python-dotenv
  • Environment variables set:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_API_VERSION
    • AZURE_OPENAI_CHAT_DEPLOYMENT
    • AZURE_OPENAI_EMBEDDING_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE
    • COSMOS_CONTAINER
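
For local development, a .env file picked up by python-dotenv might look like this (every value below is a placeholder; substitute your own resource names, keys, and a current API version):

```shell
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-key>
AZURE_OPENAI_API_VERSION=2024-06-01
AZURE_OPENAI_CHAT_DEPLOYMENT=your-chat-deployment
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=your-embedding-deployment
COSMOS_ENDPOINT=https://your-account.documents.azure.com:443/
COSMOS_KEY=<your-key>
COSMOS_DATABASE=banking-kb
COSMOS_CONTAINER=chunks
```

Keep this file out of version control; in production, prefer managed identity or a secrets store over raw keys.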

Integration Steps

1) Install dependencies and initialize clients

Start with the SDKs you’ll actually use in production. Azure OpenAI handles embeddings and chat completions; Cosmos DB stores chunks plus vector embeddings for retrieval.

pip install openai azure-cosmos python-dotenv

import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"],
)

db = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE"])
container = db.get_container_client(os.environ["COSMOS_CONTAINER"])

2) Create embeddings for banking content

For RAG, you need a vector representation of policy text, product brochures, fee schedules, or call-center scripts. Use Azure OpenAI embeddings and store the result with metadata in Cosmos DB.

def embed_text(text: str) -> list[float]:
    response = aoai_client.embeddings.create(
        model=os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"],
        input=text,
    )
    return response.data[0].embedding


doc = {
    "id": "mortgage-fees-001",
    "doc_type": "policy",
    "title": "Mortgage Fee Policy",
    "content": (
        "Early repayment charges apply during the fixed-rate period. "
        "Customers can make overpayments up to 10% per year without penalty."
    ),
}

doc["embedding"] = embed_text(doc["content"])
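
Real policy documents are far longer than one paragraph, so you will usually chunk before embedding. A minimal word-window chunker with overlap (a sketch; production pipelines typically chunk by tokens, with sizes tuned to the embedding model):

```python
def chunk_text(text: str, max_words: int = 120, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-window chunks for embedding."""
    words = text.split()
    if len(words) <= max_words:
        return [" ".join(words)]
    chunks = []
    step = max_words - overlap  # slide forward, repeating `overlap` words
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk then becomes its own Cosmos DB document with its own id, embedding, and metadata, so retrieval can return the precise clause rather than a whole policy.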

3) Store chunks in Cosmos DB

Cosmos DB is your retrieval store. In retail banking, keep each chunk small enough to retrieve precisely and attach business metadata like product line, jurisdiction, and effective date.

stored = container.upsert_item(doc)
print(stored["id"])

If you are using vector search in Cosmos DB, make sure your container is configured with a vector policy and index. The exact schema depends on your account setup, but the document shape above is what your app should persist.
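
For reference, a container-level vector embedding policy and index for the document shape above might look like the following (a sketch: the dimensions must match your embedding model's output size, and the available index types depend on your account's vector search rollout, so check the current Cosmos DB docs before relying on this):

```json
{
  "vectorEmbeddingPolicy": {
    "vectorEmbeddings": [
      {
        "path": "/embedding",
        "dataType": "float32",
        "distanceFunction": "cosine",
        "dimensions": 1536
      }
    ]
  },
  "indexingPolicy": {
    "vectorIndexes": [
      { "path": "/embedding", "type": "quantizedFlat" }
    ]
  }
}
```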

4) Retrieve relevant context from Cosmos DB

At query time, embed the user question and compare it against stored vectors. If your Cosmos DB setup supports vector search, use it directly; otherwise fetch candidate documents by metadata and rank them in Python.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b + 1e-9)


query = "What are the early repayment rules on a mortgage?"
query_embedding = embed_text(query)

# Demo-scale scan: read_all_items pages through every document in the container
# (max_item_count is the page size, not a total cap). At scale, use vector
# search or a filtered query instead.
items = list(container.read_all_items(max_item_count=50))

ranked = sorted(
    items,
    key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
    reverse=True,
)

top_context = "\n\n".join(
    f"Title: {item['title']}\nContent: {item['content']}"
    for item in ranked[:3]
)
print(top_context)
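
If your container has a vector index, you can push ranking into Cosmos DB with a VectorDistance query instead of scanning and sorting in Python. A small helper that builds the parameterized query (a sketch; the SQL shape follows Cosmos DB's vector search syntax, so verify it against your API version before shipping):

```python
def build_vector_query(query_embedding: list[float], top_k: int = 3):
    """Build a Cosmos DB SQL vector-search query and its parameters."""
    sql = (
        f"SELECT TOP {int(top_k)} c.id, c.title, c.content, "
        "VectorDistance(c.embedding, @queryVector) AS score "
        "FROM c ORDER BY VectorDistance(c.embedding, @queryVector)"
    )
    parameters = [{"name": "@queryVector", "value": query_embedding}]
    return sql, parameters

# Usage against a live container:
# sql, params = build_vector_query(query_embedding)
# results = list(container.query_items(
#     sql, parameters=params, enable_cross_partition_query=True))
```

This keeps ranking server-side, so you transfer only the top matches instead of every document.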

5) Generate a grounded answer with Azure OpenAI

Now feed the retrieved context into the chat model. This is where the agent becomes useful: it answers from bank-approved material instead of free-form guessing.

messages = [
    {
        "role": "system",
        "content": (
            "You are a retail banking assistant. Answer only from the provided context. "
            "If the answer is not in context, say you don't have enough information."
        ),
    },
    {
        "role": "user",
        "content": f"Context:\n{top_context}\n\nQuestion:\n{query}",
    },
]

response = aoai_client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    messages=messages,
    temperature=0.2,
)

answer = response.choices[0].message.content
print(answer)
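
To keep answers traceable, which bank compliance teams will ask for, label each retrieved chunk with its document id when you build the context and instruct the model to cite those ids. A minimal formatting helper (a sketch; the citation-instruction wording is illustrative, not a fixed API):

```python
def build_cited_context(items: list[dict]) -> str:
    """Format retrieved chunks with [source: id] labels the model can cite."""
    return "\n\n".join(
        f"[source: {item['id']}]\n"
        f"Title: {item['title']}\n"
        f"Content: {item['content']}"
        for item in items
    )

# Then extend the system prompt with something like:
# "Cite the [source: ...] id of every document you rely on."
```

With ids echoed back in the answer, you can log exactly which policy version grounded each response.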

Testing the Integration

Run an end-to-end check with one banking policy chunk and one user question. The goal is simple: confirm retrieval returns the right document and the model answers using that content.

test_query = "Can I overpay my mortgage without penalty?"
test_embedding = embed_text(test_query)

items = list(container.read_all_items(max_item_count=50))
best_match = max(items, key=lambda item: cosine_similarity(test_embedding, item["embedding"]))

print("Matched document:", best_match["title"])

test_messages = [
    {
        "role": "system",
        "content": "Answer only from context.",
    },
    {
        "role": "user",
        "content": f"Context:\n{best_match['content']}\n\nQuestion:\n{test_query}",
    },
]

test_response = aoai_client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    messages=test_messages,
)

print("Answer:", test_response.choices[0].message.content)

Expected output:

Matched document: Mortgage Fee Policy
Answer: Yes. Customers can make overpayments up to 10% per year without penalty.

Real-World Use Cases

  • Retail banking assistant for product FAQs
    • Answer questions about overdrafts, savings rates, mortgage fees, card limits, and branch services using approved policy docs.
  • Internal advisor copilot
    • Help relationship managers find product eligibility rules, pricing guidance, and compliance-approved talking points during customer calls.
  • Customer support deflection
    • Reduce ticket volume by grounding chatbot responses in Cosmos-stored knowledge articles and operational procedures.

About the author

By Cyprian Aarons, AI Consultant at Topiax.