How to Integrate Azure OpenAI for Lending with Cosmos DB for RAG

By Cyprian Aarons. Updated 2026-04-21.
Tags: azure-openai-for-lending, cosmosdb, rag

Azure OpenAI plus Cosmos DB is a solid stack for lending workflows that need grounded answers, document retrieval, and auditability. You use Azure OpenAI to reason over borrower data, policy docs, and underwriting rules, while Cosmos DB stores the chunked knowledge base and retrieval metadata that powers RAG.

Prerequisites

  • An Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB for NoSQL account
  • Deployed Azure OpenAI models:
    • Chat model, such as gpt-4o-mini
    • Embedding model, such as text-embedding-3-small
  • A Cosmos DB database and container for your lending knowledge base
  • Python 3.10+
  • Installed packages:
    • openai
    • azure-cosmos
    • python-dotenv
  • Environment variables configured (a sample .env file is sketched after this list):
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_API_VERSION
    • AZURE_OPENAI_CHAT_DEPLOYMENT
    • AZURE_OPENAI_EMBEDDING_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE_NAME
    • COSMOS_CONTAINER_NAME
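
Since python-dotenv is in the package list, one option is to keep these settings in a local .env file and load it before creating the clients. This is a minimal sketch; every value below is a placeholder, so substitute your own resource names, keys, and API version.

# .env (placeholder values)
# AZURE_OPENAI_ENDPOINT=https://<your-openai-resource>.openai.azure.com/
# AZURE_OPENAI_API_KEY=<key>
# AZURE_OPENAI_API_VERSION=<api-version>
# AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-4o-mini
# AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small
# COSMOS_ENDPOINT=https://<your-cosmos-account>.documents.azure.com:443/
# COSMOS_KEY=<key>
# COSMOS_DATABASE_NAME=<database-name>
# COSMOS_CONTAINER_NAME=<container-name>

from dotenv import load_dotenv

load_dotenv()  # reads .env into os.environ before the clients below are created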

Integration Steps

  1. Set up the clients for Azure OpenAI and Cosmos DB.

Use the Azure OpenAI SDK for chat and embeddings, and the azure-cosmos SDK for storing your source document chunks and their embedding vectors.

import os
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

azure_openai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"],
)

database = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE_NAME"])
container = database.get_container_client(os.environ["COSMOS_CONTAINER_NAME"])
  2. Create embeddings for lending documents before storing them in Cosmos DB.

For RAG, you need chunks plus embeddings. In lending, these chunks usually come from credit policy PDFs, product terms, KYC rules, servicing procedures, or collections scripts.

def embed_text(text: str) -> list[float]:
    response = azure_openai_client.embeddings.create(
        model=os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"],
        input=text,
    )
    return response.data[0].embedding

chunk = {
    "id": "policy_001_chunk_01",
    "docType": "lending_policy",
    "title": "Underwriting Policy",
    "chunkText": "Debt-to-income ratio must not exceed 45% for standard personal loans.",
}

chunk["embedding"] = embed_text(chunk["chunkText"])
  3. Store the chunk in Cosmos DB with metadata needed for retrieval.

Cosmos DB does two jobs here: durable document storage and the source of truth for retrieval. Keep the schema simple so your agent can filter by product type, jurisdiction, or document version.

item = {
    "id": chunk["id"],
    "docType": chunk["docType"],
    "title": chunk["title"],
    "chunkText": chunk["chunkText"],
    "embedding": chunk["embedding"],
    "product": "personal_loan",
    "jurisdiction": "US",
    "version": "2025-01",
}

container.upsert_item(item)
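
The prerequisites assume the database and container already exist. If you still need to provision them from code, the azure-cosmos SDK can create both idempotently; in this sketch the /docType partition key is an assumption, so pick a path that matches your own query filters and data distribution.

from azure.cosmos import PartitionKey

# Create the database and container only if they do not exist yet.
database = cosmos_client.create_database_if_not_exists(
    id=os.environ["COSMOS_DATABASE_NAME"],
)
container = database.create_container_if_not_exists(
    id=os.environ["COSMOS_CONTAINER_NAME"],
    partition_key=PartitionKey(path="/docType"),
)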
  4. Retrieve relevant chunks from Cosmos DB and send them to Azure OpenAI.

Cosmos DB does not magically become a RAG engine on its own. In a production setup you either use its vector search features where available, or apply a filtered retrieval strategy over preselected chunks, then pass the retrieved chunks into the chat prompt.

def retrieve_relevant_chunks(query: str):
    query_embedding = embed_text(query)

    # Simplified example: fetch candidate chunks by metadata only.
    # query_embedding is not used in this filter; see the rerank sketch after
    # this block, or use vector search / semantic ranking in production.
    query_text = """
        SELECT TOP 5 c.id, c.title, c.chunkText, c.product, c.jurisdiction
        FROM c
        WHERE c.docType = @docType AND c.product = @product
    """

    params = [
        {"name": "@docType", "value": "lending_policy"},
        {"name": "@product", "value": "personal_loan"},
    ]

    candidates = list(container.query_items(
        query=query_text,
        parameters=params,
        enable_cross_partition_query=True,
    ))

    return candidates[:5]
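
One way to actually use query_embedding without native vector search is to rerank the metadata-filtered candidates locally. This is a minimal sketch, assuming the SELECT above also returns c.embedding and fetches more than five candidates so the rerank has something to choose from.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Pure-Python cosine similarity; fine for small candidate sets.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

def rerank_by_similarity(query: str, candidates: list[dict], top_k: int = 5) -> list[dict]:
    # Score every candidate against the query embedding and keep the closest ones.
    query_embedding = embed_text(query)
    scored = sorted(
        candidates,
        key=lambda c: cosine_similarity(query_embedding, c["embedding"]),
        reverse=True,
    )
    return scored[:top_k]

Where your Cosmos DB account has vector indexing enabled, the same ranking can be pushed into the query itself instead of being done in Python.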
  5. Generate an answer grounded in retrieved context.

Keep the prompt strict. For lending workflows, you want concise answers with citations back to the retrieved chunks so reviewers can trace the decision path.

def answer_lending_question(question: str) -> str:
    chunks = retrieve_relevant_chunks(question)
    context = "\n\n".join(
        f"[{c['id']}] {c['title']}: {c['chunkText']}"
        for c in chunks
    )

    messages = [
        {
            "role": "system",
            "content": (
                "You are a lending assistant. Answer only using the provided context. "
                "If the context is insufficient, say what is missing."
            ),
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        },
    ]

    response = azure_openai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
        messages=messages,
        temperature=0.2,
    )

    return response.choices[0].message.content

print(answer_lending_question("What is the maximum DTI allowed for a personal loan?"))

Testing the Integration

Run a basic end-to-end test with a known policy chunk and a question that should be answered directly from it.

test_question = "What is the maximum debt-to-income ratio for standard personal loans?"
result = answer_lending_question(test_question)
print(result)

Expected output (exact wording may vary):

The maximum debt-to-income ratio for standard personal loans is 45%.
Source: [policy_001_chunk_01] Underwriting Policy
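
Model wording varies between runs, so a simple automated check should assert on the facts that must appear rather than the exact sentence. A minimal smoke-check sketch against the chunk stored earlier:

# Smoke check: the answer should be grounded in the stored policy chunk.
assert "45%" in result, "Answer does not state the 45% DTI limit"
assert "policy_001_chunk_01" in result, "Answer does not cite the source chunk"
print("Smoke check passed")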

If you get an answer without grounding in the stored policy text, your retrieval layer is too loose. If you get “I don’t know,” check that your chunk was stored correctly and that your filters match the document metadata.

Real-World Use Cases

  • Loan officer copilot
    • Answer questions about underwriting rules, exceptions, required documents, and product eligibility with citations back to policy chunks.
  • Borrower support agent
    • Help customers understand repayment schedules, fee policies, refinance options, and missing-document requests using approved content only.
  • Credit operations assistant
    • Summarize case notes, pull relevant policy snippets during manual review, and generate consistent decision explanations for audit trails.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
