How to Integrate Azure OpenAI with CosmosDB for Retail Banking Startups

By Cyprian Aarons · Updated 2026-04-21

Azure OpenAI gives you the language layer for customer support, document extraction, and policy-aware reasoning. CosmosDB gives you the durable state layer for customer profiles, conversation history, and transaction context. Together, they let a startup build banking agents that answer questions with memory, traceability, and low-latency retrieval.

Prerequisites

  • Python 3.10+
  • An Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB account
  • A deployed Azure OpenAI model:
    • A chat model such as gpt-4o-mini or equivalent
  • Cosmos DB database and container created
  • Environment variables set:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE
    • COSMOS_CONTAINER
  • Install packages:
    pip install openai azure-cosmos python-dotenv
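The prerequisites install python-dotenv, but the snippets below read os.environ directly. A minimal fail-fast configuration check is worth adding up front (the helper name and error shape here are our own, not from any SDK):

```python
import os

REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY", "AZURE_OPENAI_DEPLOYMENT",
    "COSMOS_ENDPOINT", "COSMOS_KEY", "COSMOS_DATABASE", "COSMOS_CONTAINER",
]

def check_config(env=os.environ):
    # If you keep settings in a .env file, call load_dotenv() from
    # python-dotenv before this runs so os.environ is populated.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling check_config() at startup surfaces every missing variable at once, instead of failing one KeyError at a time deep inside a request.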
    

Integration Steps

1) Initialize both clients

Start by wiring up the two SDKs in the same service. Keep credentials in environment variables and avoid hardcoding anything.

import os
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

azure_openai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"]
)

database = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE"])
container = database.get_container_client(os.environ["COSMOS_CONTAINER"])

This gives you one client for generation and one for persistence. In a banking agent, that usually means the model handles interpretation while CosmosDB stores the conversation state and user profile data.

2) Write customer context into CosmosDB

For retail banking, you want the agent to remember account type, preferred channel, last interaction, and risk flags. Store this as a document keyed by customer ID.

from datetime import datetime, timezone

def upsert_customer_context(customer_id: str, full_name: str, segment: str):
    doc = {
        "id": customer_id,
        "customer_id": customer_id,
        "full_name": full_name,
        "segment": segment,
        "last_interaction_at": datetime.now(timezone.utc).isoformat(),
        "preferences": {
            "channel": "chat",
            "language": "en"
        }
    }
    return container.upsert_item(doc)

result = upsert_customer_context(
    customer_id="cust_10021",
    full_name="Amina Patel",
    segment="retail-banking"
)
print(result["id"])

Use upsert_item() instead of insert-only writes. That keeps your agent idempotent when a session gets retried or when the same user returns later.
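One caveat: upsert_item() replaces the whole document, so if another service also writes to the profile (risk flags, for example), a full-document upsert can silently drop its fields. A read-merge-write sketch (our own pattern, not an SDK feature) avoids that:

```python
def merge_profile(existing, updates):
    # Overlay updates onto the stored document (if any) so fields written
    # by other services -- e.g. risk flags -- survive a profile refresh.
    doc = dict(existing or {})
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(doc.get(key), dict):
            doc[key] = {**doc[key], **value}  # shallow-merge nested objects
        else:
            doc[key] = value
    return doc
```

You would call it as container.upsert_item(merge_profile(existing_doc, new_fields)). Note this is last-writer-wins; if two writers can race, Cosmos DB's etag-based optimistic concurrency is the stricter option.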

3) Retrieve context and send it to Azure OpenAI

Pull the stored customer record from CosmosDB and inject it into the chat prompt. This is where the banking-specific behavior comes from.

def get_customer_context(customer_id: str):
    # The profile document uses the customer ID as its "id" (see step 2),
    # so filter on c.id rather than c.customer_id -- otherwise this query
    # also matches the conversation-turn documents saved in step 4, which
    # share the same customer_id.
    query = "SELECT * FROM c WHERE c.id = @customer_id"
    params = [{"name": "@customer_id", "value": customer_id}]
    items = list(container.query_items(
        query=query,
        parameters=params,
        enable_cross_partition_query=True
    ))
    return items[0] if items else None

customer = get_customer_context("cust_10021")

messages = [
    {
        "role": "system",
        "content": (
            "You are a retail banking assistant. "
            "Do not provide financial advice. "
            "Use only the provided customer context."
        )
    },
    {
        "role": "user",
        "content": f"""
Customer context:
{customer}

Question:
What debit card options should I consider if I travel often?
"""
    }
]

response = azure_openai_client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
    messages=messages,
    temperature=0.2
)

print(response.choices[0].message.content)

Keep temperature low for banking flows. You want consistent responses, not creative ones.
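A second caveat with injecting f"{customer}" directly: documents read back from CosmosDB carry system properties (_rid, _etag, _ts, _self) that waste tokens and leak storage internals into the prompt. A small whitelist sketch keeps the context clean (the field list mirrors the schema from step 2; adjust it to yours):

```python
import json

# Fields we deliberately expose to the model; everything else --
# including Cosmos system properties like _rid/_etag/_ts -- is dropped.
PROMPT_FIELDS = ("full_name", "segment", "preferences", "last_interaction_at")

def context_for_prompt(customer: dict) -> str:
    safe = {k: customer[k] for k in PROMPT_FIELDS if k in customer}
    return json.dumps(safe, ensure_ascii=False)
```

Then use context_for_prompt(customer) in place of the raw dict when building the user message.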

4) Store the agent response back into CosmosDB

Every interaction should be persisted. That gives you auditability, session continuity, and a clean source of truth for later retrieval.

def save_conversation_turn(customer_id: str, user_text: str, assistant_text: str):
    turn = {
        "id": f"{customer_id}-{datetime.now(timezone.utc).timestamp()}",
        "customer_id": customer_id,
        "user_text": user_text,
        "assistant_text": assistant_text,
        "created_at": datetime.now(timezone.utc).isoformat()
    }
    return container.upsert_item(turn)

assistant_text = response.choices[0].message.content

save_conversation_turn(
    customer_id="cust_10021",
    user_text="What debit card options should I consider if I travel often?",
    assistant_text=assistant_text
)

This pattern is useful when compliance teams need a replayable transcript of what the agent said and why it said it.
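One refinement worth considering: the timestamp-based id above creates a second document if the same turn is saved twice, say after a retried request. Deriving the id from the turn's content (a sketch, not part of the article's flow) makes the write idempotent:

```python
import hashlib

def turn_id(customer_id: str, user_text: str, assistant_text: str) -> str:
    # Same inputs -> same id, so a retried save_conversation_turn upserts
    # the existing document instead of duplicating the transcript entry.
    digest = hashlib.sha256(
        f"{customer_id}|{user_text}|{assistant_text}".encode("utf-8")
    ).hexdigest()[:16]
    return f"{customer_id}-{digest}"
```

Swap it in for the f-string id in save_conversation_turn if duplicate transcript entries would be a problem for your audit trail.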

5) Build a simple end-to-end agent function

Now combine retrieval, generation, and persistence into one callable unit your app can use.

def banking_agent(customer_id: str, question: str):
    context = get_customer_context(customer_id)

    messages = [
        {
            "role": "system",
            "content": (
                "You are a retail banking support agent for a startup. "
                "Be concise. Avoid making claims about products unless present in context."
            )
        },
        {
            "role": "user",
            "content": f"Customer context: {context}\n\nQuestion: {question}"
        }
    ]

    completion = azure_openai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=messages,
        temperature=0.2
    )

    answer = completion.choices[0].message.content

    save_conversation_turn(customer_id, question, answer)
    return answer

At this point you have a basic production shape: state in CosmosDB, reasoning in Azure OpenAI, and an explicit trail of every request/response pair.
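The agent above is stateless per call; for session continuity you can feed recent turns back into the prompt. A sketch of a history builder (pure Python; assumes turn documents shaped as in step 4) that caps the window so prompts don't grow without bound:

```python
def history_messages(turns, max_turns=5):
    # Keep only the most recent turns (by created_at) and expand each into
    # the user/assistant message pair the chat API expects.
    recent = sorted(turns, key=lambda t: t["created_at"])[-max_turns:]
    messages = []
    for t in recent:
        messages.append({"role": "user", "content": t["user_text"]})
        messages.append({"role": "assistant", "content": t["assistant_text"]})
    return messages
```

Inside banking_agent, you would splice the result between the system message and the new user message.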

Testing the Integration

Run a smoke test that verifies both reads and writes work end to end.

if __name__ == "__main__":
    test_customer_id = "cust_10021"

    # ensure profile exists
    upsert_customer_context(test_customer_id, "Amina Patel", "retail-banking")

    reply = banking_agent(
        test_customer_id,
        "Can I use my debit card internationally without extra fees?"
    )

    print("Agent reply:")
    print(reply)

Expected output:

Agent reply:
Based on your profile and general retail banking guidance, check whether your debit card supports international transactions and whether foreign usage fees apply...

If you want to validate persistence too, query CosmosDB after the run:

turns = list(container.query_items(
    query="SELECT * FROM c WHERE c.customer_id = @customer_id ORDER BY c.created_at DESC",
    parameters=[{"name": "@customer_id", "value": test_customer_id}],
    enable_cross_partition_query=True
))

print(f"Stored turns: {len(turns)}")
print(turns[0]["assistant_text"][:120])

Real-World Use Cases

  • Retail banking support agents
    • Answer card activation, fee questions, branch lookup requests, and product eligibility checks using stored customer context.
  • KYC-aware onboarding assistants
    • Collect onboarding data step by step and persist progress in CosmosDB so users can resume later without starting over.
  • Personalized financial service chat
    • Combine conversation memory with profile data to tailor responses for students, salaried workers, or high-value customers without mixing sessions.
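For the onboarding case, resumability mostly reduces to persisting a step pointer alongside the profile document. A hypothetical sketch (the step names are illustrative, not a KYC requirement):

```python
ONBOARDING_STEPS = ["identity", "address", "income", "review"]

def next_onboarding_step(doc):
    # Return the first step not yet marked complete, or None when done,
    # so a returning user resumes exactly where they left off.
    completed = set(doc.get("onboarding_completed", []))
    for step in ONBOARDING_STEPS:
        if step not in completed:
            return step
    return None
```

Each finished step appends its name to onboarding_completed and upserts the document, so the agent can greet a returning user with the correct next question.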

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
