How to Integrate Azure OpenAI for healthcare with CosmosDB for production AI

By Cyprian Aarons, updated 2026-04-21
Tags: azure-openai-for-healthcare, cosmosdb, production-ai

Azure OpenAI for healthcare gives you clinically grounded language generation and extraction. CosmosDB gives you low-latency, globally distributed state for patient context, conversation memory, and audit trails.

Together, they let you build production AI agents that can summarize encounters, answer care-navigation questions, triage intake data, and persist structured clinical context without bolting on a separate store for agent state.

Prerequisites

  • An Azure subscription with:
    • Azure OpenAI resource deployed
    • A healthcare-capable model deployment name
    • Azure Cosmos DB account provisioned
  • Python 3.10+
  • Installed packages:
    • openai (1.x or later, which provides the AzureOpenAI client used below)
    • azure-cosmos
    • python-dotenv
  • Environment variables set:
    • AZURE_OPENAI_ENDPOINT
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_DEPLOYMENT
    • COSMOS_ENDPOINT
    • COSMOS_KEY
    • COSMOS_DATABASE
    • COSMOS_CONTAINER
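
A small startup check can catch a missing variable before the first request does. The sketch below is one way to do that, using only the standard library; the helper name `missing_env_vars` is hypothetical, and the variable list simply mirrors the one above.

```python
import os
from typing import Iterable, List, Mapping

# Names taken from the prerequisites list above.
REQUIRED_ENV_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT",
    "COSMOS_ENDPOINT",
    "COSMOS_KEY",
    "COSMOS_DATABASE",
    "COSMOS_CONTAINER",
]


def missing_env_vars(
    names: Iterable[str] = REQUIRED_ENV_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names that are unset or blank in `env` (hypothetical helper)."""
    return [name for name in names if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after `load_dotenv()` and exiting when the returned list is non-empty turns a confusing `KeyError` deep in a request into a clear startup failure.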

Install dependencies:

pip install openai azure-cosmos python-dotenv

Integration Steps

1) Initialize both clients

Use Azure OpenAI for generation and CosmosDB for persistence. In production, keep secrets in Key Vault or your platform secret store, not in code.

import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.cosmos import CosmosClient

load_dotenv()

aoai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"]
)

database = cosmos_client.get_database_client(os.environ["COSMOS_DATABASE"])
container = database.get_container_client(os.environ["COSMOS_CONTAINER"])
deployment_name = os.environ["AZURE_OPENAI_DEPLOYMENT"]
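
One way to follow the Key Vault advice is to resolve each setting through a small reader function. The sketch below assumes the `azure-identity` and `azure-keyvault-secrets` packages are installed and that secrets are stored under dash-separated names such as `COSMOS-KEY`, since Key Vault secret names cannot contain underscores. The names `key_vault_reader` and `resolve_setting` are illustrative and not part of the original setup.

```python
import os
from typing import Callable, Mapping, Optional


def key_vault_reader(vault_url: str) -> Callable[[str], Optional[str]]:
    """Build a function that reads secrets from Azure Key Vault.

    The Azure imports are local so this module still loads where those
    packages are not installed (for example in unit tests).
    """
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())

    def read(name: str) -> Optional[str]:
        # Assumption: secrets are stored with dashes in place of underscores.
        return client.get_secret(name.replace("_", "-")).value

    return read


def resolve_setting(
    name: str,
    read_secret: Callable[[str], Optional[str]],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Use the environment value if present (local development), else the secret store."""
    value = env.get(name) or read_secret(name)
    if not value:
        raise KeyError(f"Setting {name} not found in environment or secret store")
    return value
```

With a reader built from your own vault URL, `resolve_setting("COSMOS_KEY", reader)` could replace the direct `os.environ["COSMOS_KEY"]` lookup. Whether the environment or the vault should win when both are set is a design choice; this sketch lets the environment override for local runs.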

2) Create a patient context record in CosmosDB

Store only the fields your agent needs. For healthcare workloads, keep PHI scope tight and separate identifiers from free-text notes where possible.

patient_context = {
    "id": "patient-10021",
    "patientId": "10021",
    "partitionKey": "10021",
    "age": 67,
    "sex": "female",
    "conditions": ["type_2_diabetes", "hypertension"],
    "recentSymptoms": ["fatigue", "increased thirst"],
    "lastUpdated": "2026-04-21T10:00:00Z"
}

container.upsert_item(patient_context)
print("Saved patient context to CosmosDB")

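If your container is partitioned on /partitionKey, every document must include that field. Passing the same value as partition_key when you read or query keeps those operations on a single partition instead of fanning out across all of them during agent execution.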
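
One way to keep PHI scope tight and guarantee the partition key is always present is to build every context document through a single function. This is a sketch under assumptions: the allowlist contents and the name `build_context_document` are made up for illustration, and the container is assumed to be partitioned on `/partitionKey` as in the example above.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Mapping

# Hypothetical allowlist: only these clinical fields reach the agent's context.
ALLOWED_FIELDS = ("age", "sex", "conditions", "recentSymptoms")


def build_context_document(
    patient_id: str,
    raw: Mapping[str, Any],
    allowed: Iterable[str] = ALLOWED_FIELDS,
) -> Dict[str, Any]:
    """Copy allowlisted fields from `raw`, then add the id and partition key."""
    doc: Dict[str, Any] = {key: raw[key] for key in allowed if key in raw}
    doc.update({
        "id": f"patient-{patient_id}",
        "patientId": str(patient_id),
        "partitionKey": str(patient_id),  # must match the container's partition key path
        "lastUpdated": datetime.now(timezone.utc).isoformat(),
    })
    return doc
```

Anything not on the allowlist, such as a free-text note or a national identifier in the raw intake record, is dropped before `container.upsert_item(...)` ever sees it.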

3) Pull context from CosmosDB and send it to Azure OpenAI

This is the core pattern: retrieve structured state first, then ask the model to reason over it. Do not let the model invent patient details.

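# Audit records share the same patientId, so also require the top-level
# "conditions" field, which only patient context documents carry.
query = """
SELECT * FROM c
WHERE c.patientId = @patientId AND IS_DEFINED(c.conditions)
"""

items = list(container.query_items(
    query=query,
    parameters=[{"name": "@patientId", "value": "10021"}],
    partition_key="10021",  # single-partition query; assumes the container is partitioned on /partitionKey
))

if not items:
    raise LookupError("No patient context found for patientId 10021")

patient = items[0]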

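import json

# Send real JSON rather than a Python dict repr, and drop Cosmos DB
# system properties (_rid, _etag, _ts, ...) that the model does not need.
patient_json = json.dumps(
    {k: v for k, v in patient.items() if not k.startswith("_")},
    indent=2,
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a healthcare assistant. "
            "Use only the provided patient context. "
            "Do not diagnose. Recommend safe next steps and escalation when needed."
        )
    },
    {
        "role": "user",
        "content": f"""
Patient context:
{patient_json}

Task:
Summarize likely concerns in one paragraph and list safe follow-up questions.
"""
    }
]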

response = aoai_client.chat.completions.create(
    model=deployment_name,
    messages=messages,
    temperature=0.2,
)

summary = response.choices[0].message.content
print(summary)
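
Before using the reply, it can help to check why generation stopped. The sketch below assumes the response shape returned by `chat.completions.create` in the `openai` 1.x SDK (`choices[0].finish_reason` and `choices[0].message.content`); the helper name `extract_text` is hypothetical, and the stand-in object at the end exists only so the example runs offline.

```python
from types import SimpleNamespace  # used only for the offline demonstration
from typing import Any, Tuple


def extract_text(response: Any) -> Tuple[str, bool]:
    """Return (text, truncated) or raise if the model produced nothing usable."""
    choice = response.choices[0]
    content = choice.message.content
    if choice.finish_reason == "content_filter":
        raise RuntimeError("Response was blocked by the content filter")
    if not content:
        raise RuntimeError(f"Model returned no text (finish_reason={choice.finish_reason})")
    # "length" means the token limit cut the output short.
    return content, choice.finish_reason == "length"


# Offline demonstration with a stand-in shaped like the SDK response.
fake = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="length",
    message=SimpleNamespace(content="Partial summary"),
)])
print(extract_text(fake))  # ('Partial summary', True)
```

Returning the truncation flag lets the caller decide whether to retry with a higher token limit or store the summary marked as incomplete, rather than silently persisting a cut-off clinical summary.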

4) Write the AI output back to CosmosDB as an audit record

For production AI, every model response should be traceable. Persist the prompt version, output, timestamp, and correlation ID so you can replay incidents later.

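import uuid
from datetime import datetime, timezone

audit_record = {
    # Unique per response, so a rerun never overwrites an earlier audit record.
    "id": f"audit-{patient['patientId']}-{uuid.uuid4()}",
    # In a real service, reuse the correlation ID of the incoming request.
    "correlationId": str(uuid.uuid4()),
    "patientId": patient["patientId"],
    "partitionKey": patient["partitionKey"],
    "model": deployment_name,
    "promptVersion": "v1",
    "inputSnapshot": {
        "age": patient["age"],
        "conditions": patient["conditions"],
        "recentSymptoms": patient["recentSymptoms"]
    },
    "outputSummary": summary,
    "createdAt": datetime.now(timezone.utc).isoformat()
}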

container.upsert_item(audit_record)
print("Saved audit record to CosmosDB")
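
Replaying an incident is easier when the audit record can show which exact prompt produced the output. One possible approach, sketched below, stores a SHA-256 fingerprint of the message list alongside the record. The field name `promptHash` and the helper `prompt_fingerprint` are assumptions for illustration and not part of the schema above.

```python
import hashlib
import json
from typing import Any, Dict, List


def prompt_fingerprint(messages: List[Dict[str, Any]]) -> str:
    """Return a stable SHA-256 hex digest of a chat message list."""
    # Sorted keys and fixed separators make the serialization deterministic,
    # so the same messages always hash to the same value.
    canonical = json.dumps(
        messages, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

For example, setting `audit_record["promptHash"] = prompt_fingerprint(messages)` before the upsert would let you later confirm that a replayed prompt matches the original byte for byte without storing the full prompt text a second time.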

5) Add a reusable agent function for production flows

Wrap retrieval, generation, and persistence into one function. This is what you call from your API layer or orchestrator.

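import json
import uuid
from datetime import datetime, timezone


def generate_care_summary(patient_id: str) -> str:
    # Single-partition query. IS_DEFINED(c.conditions) keeps audit records,
    # which share the same patientId, out of the result set.
    items = list(container.query_items(
        query="SELECT * FROM c WHERE c.patientId = @patientId AND IS_DEFINED(c.conditions)",
        parameters=[{"name": "@patientId", "value": patient_id}],
        partition_key=patient_id,
    ))

    if not items:
        raise ValueError(f"No patient found for {patient_id}")

    # Strip Cosmos DB system properties (_rid, _etag, _ts, ...) before prompting.
    patient = {k: v for k, v in items[0].items() if not k.startswith("_")}

    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": (
                "You are a healthcare assistant. "
                "Summarize the case using only provided data."
            )},
            {"role": "user", "content": f"Patient JSON:\n{json.dumps(patient)}\n\nReturn a short care summary."}
        ],
        temperature=0.2,
        max_tokens=250,
    )

    result = response.choices[0].message.content

    container.upsert_item({
        # A UUID keeps each audit record unique, so reruns never overwrite earlier ones.
        "id": f"audit-{patient_id}-{uuid.uuid4()}",
        "patientId": patient_id,
        "partitionKey": patient_id,
        "result": result,
        "createdAt": datetime.now(timezone.utc).isoformat(),
    })

    return result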


print(generate_care_summary("10021"))

Testing the Integration

Run a simple end-to-end check: write a test record, generate a response, then confirm the audit row exists in CosmosDB.

test_patient_id = "90001"

container.upsert_item({
    "id": test_patient_id,
    "patientId": test_patient_id,
    "partitionKey": test_patient_id,
    "age": 54,
    "conditions": ["asthma"],
    "recentSymptoms": ["shortness of breath after exertion"]
})

result = generate_care_summary(test_patient_id)
print("MODEL OUTPUT:", result)

saved = list(container.query_items(
    query="SELECT * FROM c WHERE c.patientId = @patientId AND IS_DEFINED(c.result)",
    parameters=[{"name": "@patientId", "value": test_patient_id}],
    partition_key=test_patient_id,  # scope the query to the test patient's partition
))
print("AUDIT COUNT:", len(saved))

Expected output on a first run (the summary text will vary, and the audit count grows by one on each rerun):

MODEL OUTPUT: Patient reports exertional shortness of breath with asthma history...
AUDIT COUNT: 1
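
The check above needs live Azure resources. For CI, a similar flow can be exercised offline by passing the clients in as parameters and substituting `unittest.mock.MagicMock` objects. The sketch below assumes a variant of the function, here called `generate_care_summary_with`, that takes the container and client explicitly and scopes its query to one partition; that variant and `run_offline_check` are hypothetical.

```python
import json
import uuid
from unittest.mock import MagicMock


def generate_care_summary_with(container, aoai_client, deployment_name, patient_id):
    """Retrieve, generate, persist, with dependencies injected (hypothetical variant)."""
    items = list(container.query_items(
        query="SELECT * FROM c WHERE c.patientId = @patientId AND IS_DEFINED(c.conditions)",
        parameters=[{"name": "@patientId", "value": patient_id}],
        partition_key=patient_id,
    ))
    if not items:
        raise ValueError(f"No patient found for {patient_id}")

    response = aoai_client.chat.completions.create(
        model=deployment_name,
        messages=[
            {"role": "system", "content": "Summarize the case using only provided data."},
            {"role": "user", "content": f"Patient JSON:\n{json.dumps(items[0])}"},
        ],
        temperature=0.2,
    )
    result = response.choices[0].message.content
    container.upsert_item({
        "id": f"audit-{patient_id}-{uuid.uuid4()}",
        "patientId": patient_id,
        "partitionKey": patient_id,
        "result": result,
    })
    return result


def run_offline_check() -> str:
    # Stand-in container that returns one patient document.
    container = MagicMock()
    container.query_items.return_value = [
        {"id": "90001", "patientId": "90001", "partitionKey": "90001", "conditions": ["asthma"]}
    ]
    # Stand-in client whose completion carries a fixed reply.
    aoai_client = MagicMock()
    aoai_client.chat.completions.create.return_value.choices = [
        MagicMock(message=MagicMock(content="stub summary"))
    ]

    result = generate_care_summary_with(container, aoai_client, "test-deployment", "90001")

    # Inspect what would have been written to Cosmos DB.
    saved = container.upsert_item.call_args.args[0]
    assert saved["result"] == "stub summary"
    assert saved["partitionKey"] == "90001"
    return result
```

This only verifies the wiring (the right query, one model call, one audit write). It says nothing about model quality, so it complements the live end-to-end check rather than replacing it.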

Real-World Use Cases

  • Clinical intake assistants
    • Collect symptoms from patients, store structured answers in CosmosDB, and use Azure OpenAI to generate a concise intake summary for staff review.
  • Care navigation agents
    • Retrieve appointment history, medication lists, and prior notes from CosmosDB.
    • Use Azure OpenAI to answer “what should I do next?” style questions with guarded language.
  • Chart summarization pipelines
    • Persist encounter fragments in CosmosDB.
    • Generate visit summaries, discharge drafts, or prior-auth support text with full auditability.

The production pattern here is simple: CosmosDB owns state, Azure OpenAI owns language reasoning, and your service layer enforces policy. Keep retrieval explicit, keep prompts narrow, and persist every generated artifact you might need to inspect later.


By Cyprian Aarons, AI Consultant at Topiax.
