How to Integrate Azure OpenAI with CosmosDB for Production Banking AI
Why this integration matters
If you’re building banking AI, Azure OpenAI gives you the language layer, and CosmosDB gives you the state layer. That combination lets you build agents that can answer customer questions, retrieve account context, store conversation history, and keep audit-friendly metadata in a database that scales.
For production systems, this is the difference between a demo chatbot and an agent that can safely handle KYC support, dispute triage, loan prequalification, or internal banker copilots.
Prerequisites
- An Azure subscription with:
  - An Azure OpenAI resource
  - An Azure Cosmos DB for NoSQL account
- A deployed Azure OpenAI model, such as gpt-4o-mini or gpt-4.1-mini
- A Cosmos DB database and container created
- Python 3.10+
- Installed packages: openai, azure-cosmos, python-dotenv
- Environment variables configured:
  - AZURE_OPENAI_ENDPOINT
  - AZURE_OPENAI_API_KEY
  - AZURE_OPENAI_DEPLOYMENT
  - COSMOS_ENDPOINT
  - COSMOS_KEY
  - COSMOS_DATABASE
  - COSMOS_CONTAINER
Install dependencies:
pip install openai azure-cosmos python-dotenv
Integration Steps
1) Configure your clients
Start by loading secrets from environment variables and creating both SDK clients. Keep this in one module so your agent runtime can reuse it.
import os

from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

# Azure OpenAI client
aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

# Cosmos DB client
cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"],
)

database_name = os.environ["COSMOS_DATABASE"]
container_name = os.environ["COSMOS_CONTAINER"]

database = cosmos_client.get_database_client(database_name)
container = database.get_container_client(container_name)

deployment_name = os.environ["AZURE_OPENAI_DEPLOYMENT"]
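For reference, a local .env file that satisfies the variables above might look like this. All values are placeholders (the database and container names in particular are illustrative, not required names):

```
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-key>
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
COSMOS_ENDPOINT=https://your-account.documents.azure.com:443/
COSMOS_KEY=<your-cosmos-key>
COSMOS_DATABASE=bankingdb
COSMOS_CONTAINER=conversations
```

Never commit this file; keep it in .gitignore and use a managed secret store in production.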
2) Store bank-safe conversation state in CosmosDB
For banking workloads, don’t dump raw prompts into logs. Persist only what you need: session ID, user ID hash, intent, risk flags, and a trimmed summary of the exchange.
from datetime import datetime, timezone
import uuid

def save_conversation(session_id: str, customer_id: str, user_message: str, assistant_message: str):
    item = {
        "id": str(uuid.uuid4()),
        "sessionId": session_id,
        "customerId": customer_id,
        "userMessage": user_message[:500],
        "assistantMessage": assistant_message[:1000],
        "createdAt": datetime.now(timezone.utc).isoformat(),
        "type": "conversation_turn",
    }
    container.upsert_item(item)
    return item

# Example write
save_conversation(
    session_id="sess_123",
    customer_id="cust_456",
    user_message="What is my card balance?",
    assistant_message="I can help with that after verifying your identity."
)
3) Call Azure OpenAI with banking-specific instructions
Use a system prompt that constrains the assistant to banking-safe behavior. In production, this is where you enforce policy: no hallucinated balances, no unsupported financial advice, and escalation when needed.
def generate_bank_response(user_message: str):
    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a banking assistant. "
                    "Do not invent account data. "
                    "If identity verification is required, ask for it. "
                    "If the request involves sensitive account actions, escalate to a human agent."
                ),
            },
            {"role": "user", "content": user_message},
        ],
        temperature=0.2,
        max_tokens=300,
    )
    return response.choices[0].message.content
answer = generate_bank_response("Can you help me dispute a card charge?")
print(answer)
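In production, calls to Azure OpenAI should also tolerate transient failures such as rate limits and timeouts. One sketch is a small generic retry wrapper with exponential backoff; this helper is my own, not an official SDK feature:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(); on failure, sleep base_delay * 2**i seconds and retry."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** i))
```

Usage: `answer = with_retries(lambda: generate_bank_response("Can you help me dispute a card charge?"))`. A real deployment would catch only the SDK's transient error types rather than bare Exception.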
4) Retrieve context from CosmosDB before prompting the model
Production AI agents need memory. Pull the latest turns or customer profile facts from CosmosDB before sending the prompt to Azure OpenAI.
def get_recent_session_turns(session_id: str, limit: int = 5):
    query = """
        SELECT TOP @limit c.userMessage, c.assistantMessage, c.createdAt
        FROM c
        WHERE c.sessionId = @sessionId AND c.type = 'conversation_turn'
        ORDER BY c.createdAt DESC
    """
    params = [
        {"name": "@sessionId", "value": session_id},
        {"name": "@limit", "value": limit},
    ]
    items = list(container.query_items(
        query=query,
        parameters=params,
        enable_cross_partition_query=True,
    ))
    return items

def answer_with_memory(session_id: str, user_message: str):
    history = get_recent_session_turns(session_id)
    memory_text = "\n".join(
        f"User: {x['userMessage']}\nAssistant: {x['assistantMessage']}"
        for x in reversed(history)
    )
    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a banking assistant with strict policy controls."},
            {"role": "user", "content": f"Conversation history:\n{memory_text}\n\nCurrent question:\n{user_message}"},
        ],
        temperature=0.2,
        max_tokens=300,
    )
    return response.choices[0].message.content
print(answer_with_memory("sess_123", "What did we discuss earlier?"))
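Long sessions will eventually blow past your prompt budget. A crude but workable sketch is to trim the rendered history to a fixed character budget, keeping the most recent turns; the budget value here is illustrative:

```python
def trim_history(turns, max_chars: int = 2000):
    """Keep the most recent turns whose rendered text fits within max_chars.

    turns is a list of dicts with userMessage/assistantMessage keys, oldest first.
    """
    kept = []
    total = 0
    for turn in reversed(turns):  # walk newest-first
        line = f"User: {turn['userMessage']}\nAssistant: {turn['assistantMessage']}"
        if total + len(line) > max_chars:
            break  # stop once the next (older) turn would exceed the budget
        kept.append(line)
        total += len(line)
    return "\n".join(reversed(kept))  # restore chronological order
```

You could call trim_history on the result of get_recent_session_turns before building memory_text. A production system would count tokens with a tokenizer rather than characters.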
5) Build an end-to-end agent turn
This is the pattern you want in production: read context from CosmosDB, call Azure OpenAI, then persist the result back to CosmosDB for traceability.
def handle_agent_turn(session_id: str, customer_id: str, user_message: str):
    history = get_recent_session_turns(session_id)
    context_lines = []
    for turn in reversed(history):
        context_lines.append(f"User: {turn['userMessage']}")
        context_lines.append(f"Assistant: {turn['assistantMessage']}")
    prompt = "\n".join(context_lines + [f"User: {user_message}"])
    completion = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "You are a compliant banking assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.1,
        max_tokens=250,
    )
    assistant_message = completion.choices[0].message.content
    save_conversation(session_id, customer_id, user_message, assistant_message)
    return assistant_message

result = handle_agent_turn(
    session_id="sess_123",
    customer_id="cust_456",
    user_message="I think I was charged twice for the same card transaction."
)
print(result)
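Step 2 mentioned persisting risk flags with each turn. A simple keyword tagger is enough to start; the flag names and keyword lists below are illustrative, and a real system would use a trained classifier:

```python
RISK_KEYWORDS = {
    "fraud": ["fraud", "unauthorized", "stolen"],
    "dispute": ["dispute", "chargeback", "charged twice"],
    "account_access": ["locked out", "reset my password", "can't log in"],
}

def tag_risk_flags(user_message: str) -> list:
    """Return the risk flags whose keywords appear in the message."""
    text = user_message.lower()
    return [flag for flag, words in RISK_KEYWORDS.items()
            if any(w in text for w in words)]
```

The resulting flags can be added to the item written by save_conversation, e.g. `item["riskFlags"] = tag_risk_flags(user_message)`, which makes audit queries like "all turns flagged for fraud this week" trivial in CosmosDB.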
Testing the Integration
Run a simple smoke test that writes to CosmosDB and gets a response from Azure OpenAI.
def test_integration():
    session_id = f"test_{uuid.uuid4()}"
    reply = handle_agent_turn(
        session_id=session_id,
        customer_id="customer_test",
        user_message="Explain how I can raise a chargeback request."
    )
    assert reply, "expected a non-empty model response"
    assert len(get_recent_session_turns(session_id)) == 1, "expected exactly one stored turn"
    print("Smoke test passed:", reply[:80])
Expected output:
- A clear, banking-safe response explaining how to raise a chargeback request.
- A new document stored in CosmosDB with sessionId=test_<uuid>.
If you want a cleaner verification script:

session_id = f"test_{uuid.uuid4()}"

response_text = handle_agent_turn(
    session_id=session_id,
    customer_id="customer_test",
    user_message="How do I reset my online banking password?"
)
print("Model response:", response_text)

saved_items = get_recent_session_turns(session_id)
print("Saved turns:", len(saved_items))
Expected output (exact model wording will vary):

Model response: Please use the password reset flow in online banking or contact support if MFA fails.
Saved turns: 1
Real-World Use Cases
Customer service copilot
- Answer balance-related or product questions using controlled prompts.
- Persist conversation state and escalation metadata in CosmosDB.

Dispute and fraud triage
- Classify incoming complaints with Azure OpenAI.
- Store case summaries, risk tags, and next actions in CosmosDB.

Relationship manager assistant
- Pull client interaction history from CosmosDB.
- Generate meeting prep notes and follow-up drafts with Azure OpenAI.
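For the dispute and fraud triage use case, a common pattern is to ask the model for a strict JSON verdict and parse it defensively. The categories and helper below are illustrative, not an official schema:

```python
import json

TRIAGE_CATEGORIES = {"duplicate_charge", "unrecognized_merchant", "fraud_suspected", "other"}

def parse_triage_verdict(raw: str) -> dict:
    """Parse a model reply expected to look like {"category": ..., "summary": ...}.

    Falls back to "other" so one malformed reply never breaks the pipeline.
    """
    try:
        verdict = json.loads(raw)
        category = verdict.get("category")
        if category not in TRIAGE_CATEGORIES:
            category = "other"
        return {"category": category, "summary": str(verdict.get("summary", ""))}
    except (json.JSONDecodeError, AttributeError, TypeError):
        return {"category": "other", "summary": raw[:200]}
```

The normalized verdict, plus risk tags and next actions, is what you would store as the case document in CosmosDB rather than the raw model text.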
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit