How to Integrate Azure OpenAI for Banking with CosmosDB for AI Agents
Azure OpenAI for banking gives you the model layer for secure, policy-aware assistant behavior. CosmosDB gives you the persistence layer for customer context, conversation state, and audit-friendly memory. Put them together and you can build AI agents that answer banking questions, retain session context, and retrieve prior interactions without stuffing everything into prompts.
Prerequisites
- Python 3.10+
- An Azure subscription with:
  - an Azure OpenAI resource
  - an Azure Cosmos DB account
- A deployed Azure OpenAI chat model, for example `gpt-4o` or `gpt-4.1`
- A CosmosDB database and container created
- Azure CLI installed and logged in, or environment variables set manually
- Python packages: `openai`, `azure-cosmos`, `python-dotenv`

Install dependencies:

```bash
pip install openai azure-cosmos python-dotenv
```
Set environment variables:

```bash
export AZURE_OPENAI_ENDPOINT="https://<your-openai-resource>.openai.azure.com/"
export AZURE_OPENAI_API_KEY="<your-openai-key>"
export AZURE_OPENAI_DEPLOYMENT="banking-chat-model"
export COSMOS_ENDPOINT="https://<your-cosmos-account>.documents.azure.com:443/"
export COSMOS_KEY="<your-cosmos-key>"
export COSMOS_DATABASE="agentdb"
export COSMOS_CONTAINER="sessions"
```
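If you prefer keeping these in a `.env` file (which is why `python-dotenv` is listed above), it helps to fail fast when a variable is missing rather than crash later inside a client call. The `require_env` helper below is illustrative, not part of any SDK:

```python
import os

def require_env(names):
    """Return the named environment variables, raising early if any are missing."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {n: os.environ[n] for n in names}

# Optionally load a local .env file first, if python-dotenv is installed.
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

# Example: config = require_env(["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY"])
```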
Integration Steps
Step 1: Create a CosmosDB client and container handle

Start by connecting to CosmosDB. For an AI agent, use a partition key like `/sessionId` so each conversation stays isolated and easy to query.
```python
import os
from azure.cosmos import CosmosClient, PartitionKey

COSMOS_ENDPOINT = os.environ["COSMOS_ENDPOINT"]
COSMOS_KEY = os.environ["COSMOS_KEY"]
DATABASE_NAME = os.environ["COSMOS_DATABASE"]
CONTAINER_NAME = os.environ["COSMOS_CONTAINER"]

cosmos_client = CosmosClient(COSMOS_ENDPOINT, credential=COSMOS_KEY)
database = cosmos_client.create_database_if_not_exists(id=DATABASE_NAME)
container = database.create_container_if_not_exists(
    id=CONTAINER_NAME,
    partition_key=PartitionKey(path="/sessionId"),
    offer_throughput=400,
)
```
Step 2: Initialize the Azure OpenAI client

Use the Azure OpenAI SDK with your endpoint, API key, and deployment name. In banking workflows, keep the system prompt strict: no advice beyond policy, no fabricated account details, and always cite stored context when available.
```python
import os
from openai import AzureOpenAI

AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_API_KEY"]
AZURE_OPENAI_DEPLOYMENT = os.environ["AZURE_OPENAI_DEPLOYMENT"]

aoai_client = AzureOpenAI(
    api_key=AZURE_OPENAI_API_KEY,
    api_version="2024-02-15-preview",
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
)
```
Step 3: Load conversation memory from CosmosDB

Before calling the model, fetch prior messages for the session. Keep the payload small; store only what the agent needs: user input, assistant output, timestamps, and optional metadata like intent or customer tier.
```python
from typing import Dict, List

def load_session_messages(session_id: str) -> List[Dict]:
    query = """
        SELECT * FROM c
        WHERE c.sessionId = @sessionId
        ORDER BY c.createdAt ASC
    """
    # Scope the query to a single partition; no cross-partition fan-out needed.
    items = list(container.query_items(
        query=query,
        parameters=[{"name": "@sessionId", "value": session_id}],
        partition_key=session_id,
    ))
    return [{"role": item["role"], "content": item["content"]} for item in items]
```
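The "keep the payload small" advice above can be enforced with a trimming helper applied to the loaded history before it reaches the model. This is a sketch; the turn limit is an arbitrary choice, not a constraint imposed by CosmosDB:

```python
from typing import Dict, List

def trim_history(messages: List[Dict], max_turns: int = 10) -> List[Dict]:
    """Keep only the most recent turns (one turn = a user message plus an assistant reply).

    Assumes messages are sorted oldest-first, as load_session_messages returns them.
    """
    return messages[-(max_turns * 2):]
```

Calling `trim_history(load_session_messages(session_id))` keeps prompt size bounded even for long-running sessions, while the full history stays intact in CosmosDB for audit purposes.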
Step 4: Call Azure OpenAI with session context

Build the message list from stored memory plus the current user prompt. This is where the agent becomes stateful: it can answer follow-ups like "show me that again" or "what was my last payment date?" without re-entering all context.
```python
def get_banking_response(session_id: str, user_text: str) -> str:
    history = load_session_messages(session_id)
    messages = [
        {
            "role": "system",
            "content": (
                "You are a banking assistant. "
                "Use only provided context. "
                "Do not invent account balances or transaction details. "
                "If data is missing, ask for clarification."
            ),
        }
    ]
    messages.extend(history)
    messages.append({"role": "user", "content": user_text})
    response = aoai_client.chat.completions.create(
        model=AZURE_OPENAI_DEPLOYMENT,
        messages=messages,
        temperature=0.2,
        max_tokens=400,
    )
    return response.choices[0].message.content
```
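The prompt assembly inside `get_banking_response` can be factored into a pure function, which lets you unit-test the stateful part without calling Azure OpenAI at all. A sketch (the `build_messages` name and `SYSTEM_PROMPT` constant are my own, reusing the prompt text from the step above):

```python
from typing import Dict, List

SYSTEM_PROMPT = (
    "You are a banking assistant. Use only provided context. "
    "Do not invent account balances or transaction details. "
    "If data is missing, ask for clarification."
)

def build_messages(history: List[Dict], user_text: str,
                   system_prompt: str = SYSTEM_PROMPT) -> List[Dict]:
    """Assemble the chat payload: system prompt, stored history, then the new user turn."""
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_text}]
    )
```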
Step 5: Persist the new turn back into CosmosDB

After generating a response, store both sides of the exchange. This gives you durable memory for later retrieval, audit trails for compliance review, and a clean way to resume sessions across app restarts.
```python
from datetime import datetime, timezone

def save_message(session_id: str, role: str, content: str):
    item = {
        "id": f"{session_id}-{role}-{datetime.now(timezone.utc).timestamp()}",
        "sessionId": session_id,
        "role": role,
        "content": content,
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }
    container.upsert_item(item)

def chat(session_id: str, user_text: str) -> str:
    # Generate first, then persist both turns, so the current user message
    # is not duplicated when get_banking_response reloads the history.
    assistant_text = get_banking_response(session_id, user_text)
    save_message(session_id, "user", user_text)
    save_message(session_id, "assistant", assistant_text)
    return assistant_text
```
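One caveat with the `id` scheme above: two saves for the same session and role can collide if they land on the same timestamp. A random UUID suffix sidesteps that; this variant is a suggested hardening, not part of the original design:

```python
import uuid

def make_message_id(session_id: str, role: str) -> str:
    """Build a collision-free document id: session, role, and a random UUID suffix."""
    return f"{session_id}-{role}-{uuid.uuid4().hex}"
```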
Testing the Integration
Run a simple round trip with a fixed session ID. The first call should create memory; the second call should reuse it.
```python
if __name__ == "__main__":
    session_id = "acct-10001"
    first_reply = chat(session_id, "I need help understanding my mortgage payment schedule.")
    print("First reply:", first_reply)
    second_reply = chat(session_id, "Can you summarize what we discussed?")
    print("Second reply:", second_reply)
```
Example output (the exact wording will vary from run to run and by model):

```
First reply: I can help with that. Please share your mortgage type or loan reference so I can explain the payment schedule accurately.
Second reply: We discussed your mortgage payment schedule and I asked for your loan reference to provide an accurate summary.
```
If you want to verify CosmosDB persistence directly:
```python
stored_items = list(container.query_items(
    query="SELECT * FROM c WHERE c.sessionId = @sessionId",
    parameters=[{"name": "@sessionId", "value": "acct-10001"}],
    partition_key="acct-10001",  # scope to one partition so the SDK does not require cross-partition querying
))
print(f"Stored messages: {len(stored_items)}")
```
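Beyond counting documents, a quick structural check is that stored roles alternate user/assistant within a session. A small helper for that (hypothetical, operating on the `role` field defined in Step 5):

```python
from typing import Dict, List

def roles_alternate(items: List[Dict]) -> bool:
    """True if messages alternate between 'user' and 'assistant', starting with 'user'."""
    expected = "user"
    for item in items:
        if item["role"] != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True
```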
Real-World Use Cases
Customer service banking agents

- Answer balance-related questions only when backed by retrieved account data.
- Keep conversation history in CosmosDB so customers can continue across channels.

Loan onboarding assistants

- Guide applicants through document collection and eligibility checks.
- Store application state per session and let Azure OpenAI generate next-step instructions.

Compliance-aware support workflows

- Persist every user-assistant turn for audit review.
- Use CosmosDB as an evidence trail while Azure OpenAI handles summarization and triage.
The pattern is straightforward: Azure OpenAI generates responses; CosmosDB stores durable agent state. That separation keeps your banking agent maintainable, testable, and ready for compliance-heavy environments.
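That separation also makes local testing practical: since the agent only touches CosmosDB through `upsert_item` and a session-scoped query, an in-memory stand-in can replace the container in unit tests. A minimal sketch (the class and its method names are illustrative, mirroring only the calls used in this article):

```python
class InMemorySessionStore:
    """Test double for the CosmosDB container: stores items, filters by sessionId."""

    def __init__(self):
        self._items = {}

    def upsert_item(self, item):
        # Same id overwrites, mirroring Cosmos upsert semantics.
        self._items[item["id"]] = item

    def query_by_session(self, session_id):
        # Equivalent of: SELECT * FROM c WHERE c.sessionId = @sessionId ORDER BY c.createdAt
        matches = [i for i in self._items.values() if i["sessionId"] == session_id]
        return sorted(matches, key=lambda i: i["createdAt"])
```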
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.