How to Integrate Azure OpenAI for Banking with CosmosDB for Production AI

By Cyprian Aarons
Updated 2026-04-21
Tags: azure-openai-for-banking, cosmosdb, production-ai

Why this integration matters

If you’re building banking AI, Azure OpenAI gives you the language layer, but CosmosDB gives you the state layer. That combination lets you build agents that can answer customer questions, retrieve account context, store conversation history, and keep audit-friendly metadata in a database that scales.

For production systems, this is the difference between a demo chatbot and an agent that can safely handle KYC support, dispute triage, loan prequalification, or internal banker copilots.

Prerequisites

  • An Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB for NoSQL account
  • A deployed Azure OpenAI model, such as:
    • gpt-4o-mini
    • gpt-4.1-mini
  • Cosmos DB database and container created
  • Python 3.10+
  • Installed packages:
    • openai
    • azure-cosmos
    • python-dotenv
  • Environment variables configured:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE
    • COSMOS_CONTAINER

Install dependencies:

pip install openai azure-cosmos python-dotenv
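A minimal .env sketch matching the variables above (all values are placeholders; the endpoint hostnames follow the usual Azure patterns, and the database/container names are illustrative):

```shell
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-key
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
COSMOS_ENDPOINT=https://your-account.documents.azure.com:443/
COSMOS_KEY=your-cosmos-primary-key
COSMOS_DATABASE=bankingdb
COSMOS_CONTAINER=conversations
```

Keep this file out of version control; in production, prefer a managed identity or Key Vault over keys in files.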

Integration Steps

1) Configure your clients

Start by loading secrets from environment variables and creating both SDK clients. Keep this in one module so your agent runtime can reuse it.

import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

# Azure OpenAI client
aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

# Cosmos DB client
cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"],
)

database_name = os.environ["COSMOS_DATABASE"]
container_name = os.environ["COSMOS_CONTAINER"]

database = cosmos_client.get_database_client(database_name)
container = database.get_container_client(container_name)
deployment_name = os.environ["AZURE_OPENAI_DEPLOYMENT"]

2) Store bank-safe conversation state in CosmosDB

For banking workloads, don’t dump raw prompts into logs. Persist only what you need: session ID, user ID hash, intent, risk flags, and a trimmed summary of the exchange.

from datetime import datetime, timezone
import uuid

def save_conversation(session_id: str, customer_id: str, user_message: str, assistant_message: str):
    item = {
        "id": str(uuid.uuid4()),
        "sessionId": session_id,
        "customerId": customer_id,
        "userMessage": user_message[:500],
        "assistantMessage": assistant_message[:1000],
        "createdAt": datetime.now(timezone.utc).isoformat(),
        "type": "conversation_turn"
    }
    container.upsert_item(item)
    return item

# Example write
save_conversation(
    session_id="sess_123",
    customer_id="cust_456",
    user_message="What is my card balance?",
    assistant_message="I can help with that after verifying your identity."
)
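The item above stores customerId verbatim; to follow the hashed-user-ID advice, apply a one-way hash before writing. A minimal sketch, assuming a salt you would load from a secret store (the salt value below is a placeholder):

```python
import hashlib

def hash_customer_id(customer_id: str, salt: str) -> str:
    """One-way hash so raw customer identifiers never land in stored items."""
    return hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).hexdigest()

# Pass the hash, not the raw ID, into save_conversation
hashed = hash_customer_id("cust_456", salt="load-me-from-key-vault")
print(len(hashed))  # SHA-256 hex digest is always 64 characters
```

Note that hashing is pseudonymization, not anonymization: with the salt and the raw ID you can recompute the hash, which is usually what you want for audit lookups.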

3) Call Azure OpenAI with banking-specific instructions

Use a system prompt that constrains the assistant to banking-safe behavior. In production, this is where you enforce policy: no hallucinated balances, no unsupported financial advice, and escalation when needed.

def generate_bank_response(user_message: str):
    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a banking assistant. "
                    "Do not invent account data. "
                    "If identity verification is required, ask for it. "
                    "If the request involves sensitive account actions, escalate to a human agent."
                )
            },
            {"role": "user", "content": user_message}
        ],
        temperature=0.2,
        max_tokens=300,
    )
    return response.choices[0].message.content

answer = generate_bank_response("Can you help me dispute a card charge?")
print(answer)
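The system prompt asks the model to escalate, but in production you also want a deterministic check on the reply so your orchestration layer can actually route the case to a human queue. A minimal keyword-based sketch (the marker list is illustrative, not exhaustive):

```python
# Phrases that signal the assistant wants a human in the loop.
ESCALATION_MARKERS = ("escalate", "human agent", "verify your identity")

def needs_escalation(assistant_message: str) -> bool:
    """Return True when the reply should be routed to a human queue."""
    text = assistant_message.lower()
    return any(marker in text for marker in ESCALATION_MARKERS)

print(needs_escalation("I'll escalate this to a human agent."))   # True
print(needs_escalation("Your card statement is available online."))  # False
```

A classifier call or a structured-output flag from the model would be more robust, but a keyword gate is a cheap first safety net.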

4) Retrieve context from CosmosDB before prompting the model

Production AI agents need memory. Pull the latest turns or customer profile facts from CosmosDB before sending the prompt to Azure OpenAI.

def get_recent_session_turns(session_id: str, limit: int = 5):
    query = """
    SELECT TOP @limit c.userMessage, c.assistantMessage, c.createdAt
    FROM c
    WHERE c.sessionId = @sessionId AND c.type = 'conversation_turn'
    ORDER BY c.createdAt DESC
    """
    params = [
        {"name": "@sessionId", "value": session_id},
        {"name": "@limit", "value": limit},
    ]
    items = list(container.query_items(
        query=query,
        parameters=params,
        enable_cross_partition_query=True
    ))
    return items

def answer_with_memory(session_id: str, user_message: str):
    history = get_recent_session_turns(session_id)
    memory_text = "\n".join(
        f"User: {x['userMessage']}\nAssistant: {x['assistantMessage']}"
        for x in reversed(history)
    )

    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a banking assistant with strict policy controls."},
            {"role": "user", "content": f"Conversation history:\n{memory_text}\n\nCurrent question:\n{user_message}"}
        ],
        temperature=0.2,
        max_tokens=300,
    )
    return response.choices[0].message.content

print(answer_with_memory("sess_123", "What did we discuss earlier?"))
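Concatenating history without a budget will eventually blow past the model's context window. A minimal character-budget sketch, assuming turns arrive newest-first as get_recent_session_turns returns them (the 2,000-character default is an arbitrary placeholder, not a recommendation):

```python
def trim_history(turns: list[dict], max_chars: int = 2000) -> str:
    """Keep the newest turns that fit the budget, rendered oldest-first."""
    kept, used = [], 0
    for turn in turns:  # newest-first input
        line = f"User: {turn['userMessage']}\nAssistant: {turn['assistantMessage']}"
        if used + len(line) > max_chars:
            break  # budget exhausted; drop the older remainder
        kept.append(line)
        used += len(line)
    return "\n".join(reversed(kept))  # oldest-first for the prompt

sample = [
    {"userMessage": "And my savings?", "assistantMessage": "Let me check."},
    {"userMessage": "What is my balance?", "assistantMessage": "Verifying identity."},
]
print(trim_history(sample, max_chars=120))
```

A token-based budget (via a tokenizer for your deployed model) is more precise, but characters are a serviceable proxy that needs no extra dependency.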

5) Build an end-to-end agent turn

This is the pattern you want in production: read context from CosmosDB, call Azure OpenAI, then persist the result back to CosmosDB for traceability.

def handle_agent_turn(session_id: str, customer_id: str, user_message: str):
    history = get_recent_session_turns(session_id)

    context_lines = []
    for turn in reversed(history):
        context_lines.append(f"User: {turn['userMessage']}")
        context_lines.append(f"Assistant: {turn['assistantMessage']}")

    prompt = "\n".join(context_lines + [f"User: {user_message}"])

    completion = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a compliant banking assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.1,
        max_tokens=250,
    )

    assistant_message = completion.choices[0].message.content
    save_conversation(session_id, customer_id, user_message, assistant_message)
    return assistant_message

result = handle_agent_turn(
    session_id="sess_123",
    customer_id="cust_456",
    user_message="I think I was charged twice for the same card transaction."
)
print(result)
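Both the Cosmos DB and Azure OpenAI calls can fail transiently (throttling, timeouts), so production turn handling usually wraps them in retries with exponential backoff. A minimal stdlib sketch (attempt counts and delays are illustrative; the Azure SDKs also ship their own retry policies, which you may prefer):

```python
import random
import time

def call_with_retries(fn, attempts: int = 3, base_delay: float = 0.5):
    """Run fn(), retrying with exponential backoff plus jitter on failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error to the caller
            # 0.5s, 1s, 2s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Usage with the agent turn, e.g.:
# result = call_with_retries(
#     lambda: handle_agent_turn("sess_123", "cust_456", "Hi")
# )
```

In a real deployment you would catch only retryable exception types (e.g. 429/503 responses) rather than bare Exception.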

Testing the Integration

Run a simple smoke test that writes to CosmosDB and gets a response from Azure OpenAI.

def test_integration():
    session_id = f"test_{uuid.uuid4()}"

    reply = handle_agent_turn(
        session_id=session_id,
        customer_id="customer_test",
        user_message="Explain how I can raise a chargeback request."
    )

    assert reply, "expected a non-empty model response"
    assert len(get_recent_session_turns(session_id)) == 1
    print("Smoke test passed:", reply[:80])

test_integration()
Expected output:

  • A clear banking-safe response explaining how to raise a chargeback request.
  • A new document stored in CosmosDB with sessionId=test_<uuid>.

If you want a cleaner verification script:

session_id = f"test_{uuid.uuid4()}"
response_text = handle_agent_turn(
    session_id=session_id,
    customer_id="customer_test",
    user_message="How do I reset my online banking password?"
)

print("Model response:", response_text)

saved_items = get_recent_session_turns(session_id)
print("Saved turns:", len(saved_items))

Expected output:

Model response: a banking-safe answer, for example directing the user to the online banking password reset flow or to support if MFA fails.
Saved turns: 1

Real-World Use Cases

  • Customer service copilot

    • Answer balance-related or product questions using controlled prompts.
    • Persist conversation state and escalation metadata in CosmosDB.
  • Dispute and fraud triage

    • Classify incoming complaints with Azure OpenAI.
    • Store case summaries, risk tags, and next actions in CosmosDB.
  • Relationship manager assistant

    • Pull client interaction history from CosmosDB.
    • Generate meeting prep notes and follow-up drafts with Azure OpenAI.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
