How to Integrate Azure OpenAI for healthcare with Cosmos DB for AI agents

By Cyprian Aarons · Updated 2026-04-21

Combining Azure OpenAI for healthcare with Cosmos DB gives you a practical pattern for building regulated AI agents that can reason over clinical context and persist state safely. The model handles summarization, extraction, and next-step recommendations, while Cosmos DB stores patient conversation history, retrieved documents, audit metadata, and agent memory.

Prerequisites

•Python 3.10+
•An Azure subscription
•An Azure OpenAI resource with a deployed model
•Azure AI Health Insights or healthcare-related content you want to process
•An Azure Cosmos DB account
•A Cosmos DB database and container created
•AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, COSMOS_ENDPOINT, COSMOS_KEY set as environment variables
•These Python packages installed:
  - openai
  - azure-cosmos
  - python-dotenv

pip install openai azure-cosmos python-dotenv

Integration Steps

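•Initialize both clients

Start by wiring up the Azure OpenAI client and the Cosmos DB client in the same service layer. Keep credentials out of code and read them from environment variables; during local development, python-dotenv's load_dotenv() can populate them from a .env file.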

import os
from openai import AzureOpenAI
from azure.cosmos import CosmosClient, PartitionKey

azure_openai_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
)

cosmos_client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=os.environ["COSMOS_KEY"]
)

database_name = "agentdb"
container_name = "patient_memory"

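# Creates the database and container on first run; later runs reuse them.
database = cosmos_client.create_database_if_not_exists(id=database_name)
container = database.create_container_if_not_exists(
    id=container_name,
    partition_key=PartitionKey(path="/patient_id"),
    # Provisioned throughput in RU/s; omit this argument on serverless accounts,
    # which reject an explicit throughput setting.
    offer_throughput=400,
)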
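
Because the client setup above reads four environment variables with os.environ[...], a missing one surfaces as a bare KeyError for whichever is read first. As a minimal sketch, you could check all of them up front and report every missing name at once. The helper names missing_settings and require_settings are hypothetical, not part of either SDK.

```python
import os

# Names taken from the prerequisites list; adjust if your service uses others.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "COSMOS_ENDPOINT",
    "COSMOS_KEY",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_settings(env=None):
    """Raise a single error that lists every missing variable."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling require_settings() before constructing either client keeps configuration failures separate from network or authentication errors.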

•Create a healthcare-aware agent prompt

In healthcare, your prompt needs hard constraints: no diagnosis claims, cite uncertainty, and prefer structured output. Use the chat completions API so you can control roles and maintain conversation state.

def generate_clinical_summary(patient_note: str) -> str:
    response = azure_openai_client.chat.completions.create(
        model="gpt-4o-mini",  # your Azure OpenAI deployment name, which may differ from the model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a healthcare assistant for clinical documentation. "
                    "Summarize the note into structured sections: symptoms, meds, risks, follow-up. "
                    "Do not diagnose. If data is missing, say so."
                ),
            },
            {"role": "user", "content": patient_note},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content
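
The system prompt asks for four sections, but nothing guarantees the model returns all of them. A minimal sketch of a post-check, assuming the model writes each section as a line starting with its name (for example "Symptoms: ..."); the function name missing_sections and the alias table are illustrative assumptions, and real responses may need looser matching.

```python
import re

# Section names from the system prompt, each with spellings the model might use.
SECTION_ALIASES = {
    "symptoms": ("symptoms",),
    "meds": ("meds", "medications"),
    "risks": ("risks",),
    "follow-up": ("follow-up", "follow up", "followup"),
}


def missing_sections(summary: str) -> list[str]:
    """Return the required sections that have no matching heading line."""
    found = set()
    for line in summary.splitlines():
        # Drop leading list or markdown decoration such as '- ' or '**'.
        cleaned = re.sub(r"^[\s\-\*#]+", "", line).lower()
        for section, aliases in SECTION_ALIASES.items():
            if cleaned.startswith(aliases):
                found.add(section)
    return [name for name in SECTION_ALIASES if name not in found]
```

An empty list means every section was found; anything else is a signal to retry the call or flag the record for review before storing it.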

•Persist agent memory in Cosmos DB

Store each interaction as a document keyed by patient or case ID. This gives your agent durable memory across sessions and lets you build audit trails later.

from datetime import datetime, timezone

def save_agent_memory(patient_id: str, user_input: str, summary: str):
    # Take one timestamp so the id and created_at always agree.
    now = datetime.now(timezone.utc).isoformat()
    item = {
        "id": f"{patient_id}-{now}",
        "patient_id": patient_id,
        "user_input": user_input,
        "summary": summary,
        "created_at": now,
        "source": "azure-openai-healthcare-agent",
    }
    # upsert_item inserts the document, or replaces one with the same id and partition key.
    container.upsert_item(item)
    return item

def load_recent_memory(patient_id: str):
    # Newest five interactions for this patient.
    query = """
    SELECT TOP 5 * FROM c
    WHERE c.patient_id = @patient_id
    ORDER BY c.created_at DESC
    """
    items = list(container.query_items(
        query=query,
        parameters=[{"name": "@patient_id", "value": patient_id}],
        # /patient_id is the partition key, so scope the query to that one partition.
        partition_key=patient_id,
    ))
    return items
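
If you plan to use these documents as an audit trail, it can help to give each one a collision-free id and a fingerprint of the input it was generated from. The sketch below is one possible shape, not a required schema: build_memory_item and the input_sha256 field are hypothetical additions, and the optional now argument exists only to make the function easy to test.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def build_memory_item(patient_id, user_input, summary, now=None):
    """Build a memory document with a random id and a hash of the input."""
    now = now or datetime.now(timezone.utc)
    return {
        "id": str(uuid.uuid4()),  # random, so concurrent writes cannot collide
        "patient_id": patient_id,  # must match the container's partition key path
        "user_input": user_input,
        "summary": summary,
        "created_at": now.isoformat(),
        # Hypothetical audit field: lets a reviewer confirm which input produced the summary.
        "input_sha256": hashlib.sha256(user_input.encode("utf-8")).hexdigest(),
        "source": "azure-openai-healthcare-agent",
    }
```

The returned dictionary can be passed to container.upsert_item in the same way as the item built in save_agent_memory.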

•Combine retrieval from Cosmos with generation from Azure OpenAI

This is the core integration pattern: pull prior context from Cosmos DB, feed it into the model, then write back the result. That turns stateless chat into an agent with working memory.

def run_healthcare_agent(patient_id: str, new_note: str):
    history = load_recent_memory(patient_id)

    context_block = "\n\n".join(
        [f"- {item['summary']}" for item in history]
    ) if history else "No prior history available."

    response = azure_openai_client.chat.completions.create(
        model="gpt-4o-mini",  # your Azure OpenAI deployment name
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a healthcare AI agent assisting clinicians. "
                    "Use prior context carefully and keep outputs concise. "
                    "Do not diagnose. If data is missing, say so."
                ),
            },
            {
                "role": "user",
                "content": (
                    f"Prior context:\n{context_block}\n\n"
                    f"New note:\n{new_note}\n\n"
                    "Return a structured summary and recommended next action."
                ),
            },
        ],
        temperature=0.2,
    )

    result = response.choices[0].message.content
    save_agent_memory(patient_id=patient_id, user_input=new_note, summary=result)
    return result
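
Stored summaries can be long, and five of them may crowd out the new note in the prompt. One option is to cap the prior-context block at a size budget. This is a rough sketch that counts characters rather than tokens; build_context_block and the max_chars default are assumptions for illustration, and it expects history ordered newest first, as load_recent_memory returns it.

```python
SEPARATOR = "\n\n"
NO_HISTORY = "No prior history available."


def build_context_block(history, max_chars=2000):
    """Join summaries newest-first until the character budget is used up."""
    if not history:
        return NO_HISTORY
    lines, used = [], 0
    for item in history:
        line = f"- {item['summary']}"
        # Every line after the first also costs one separator.
        cost = len(line) + (len(SEPARATOR) if lines else 0)
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    if not lines:
        # Even the newest summary is over budget: keep its beginning rather than nothing.
        return f"- {history[0]['summary']}"[:max_chars]
    return SEPARATOR.join(lines)
```

Older summaries are dropped first, which matches the idea that the most recent encounters matter most for the next action.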

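•Add retrieval filtering for safer context injection

Don’t dump everything into the prompt. Filter by recency or document type so your agent only sees relevant medical context. The query below assumes each stored item has a doc_type field; save_agent_memory above does not write one, so add it when saving if you want to filter on it.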

def get_context_by_type(patient_id: str, doc_type: str):
    query = """
    SELECT * FROM c
    WHERE c.patient_id = @patient_id AND c.doc_type = @doc_type
    ORDER BY c.created_at DESC
    """
    return list(container.query_items(
        query=query,
        parameters=[
            {"name": "@patient_id", "value": patient_id},
            {"name": "@doc_type", "value": doc_type},
        ],
        # Single-partition query: /patient_id is the container's partition key.
        partition_key=patient_id,
    ))
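
The query above covers filtering by document type. For the recency side, a minimal sketch is to drop items older than a cutoff after they come back from Cosmos DB. It assumes created_at holds a timezone-aware ISO 8601 string, as written by save_agent_memory; filter_recent and the 30-day default are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def filter_recent(items, max_age_days=30, now=None):
    """Keep only items whose created_at falls within the last max_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        item
        for item in items
        # fromisoformat parses the offset-aware strings produced by isoformat().
        if datetime.fromisoformat(item["created_at"]) >= cutoff
    ]
```

Filtering in Python keeps the example simple; for large histories you would likely want to push the cutoff into the query itself so fewer documents are read.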

Testing the Integration

Use one sample clinical note, generate a summary, store it in Cosmos DB, then read it back.

if __name__ == "__main__":
    patient_id = "patient-123"
    note = (
        "Patient reports mild shortness of breath on exertion for 3 days. "
        "Current meds include lisinopril 10mg daily. No chest pain reported."
    )

    summary = run_healthcare_agent(patient_id, note)
    print("MODEL OUTPUT:\n", summary)

    recent_items = load_recent_memory(patient_id)
    print("\nCOSMOS RECORDS:", len(recent_items))

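Example output (the model's wording will vary between runs, and the record count grows with each run, up to the five items load_recent_memory returns):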

MODEL OUTPUT:
Symptoms: mild shortness of breath on exertion for 3 days.
Medications: lisinopril 10mg daily.
Risks: missing vitals and oxygen saturation; chest pain denied.
Follow-up: review vitals and consider clinician assessment if symptoms worsen.

COSMOS RECORDS: 1
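
The run above needs live Azure credentials and produces non-deterministic text. To check the retrieve, generate, persist flow offline, one approach is to pass the three dependencies in as plain callables and substitute fakes. This is a sketch of that idea, not a refactor the article prescribes: run_agent, load_history, generate, and save are all hypothetical names.

```python
def run_agent(note, load_history, generate, save):
    """Same retrieve -> generate -> persist flow, with dependencies passed in.

    load_history, generate, and save are hypothetical callables standing in
    for the Cosmos DB query, the chat completion call, and the upsert.
    """
    history = load_history()
    context = "\n\n".join(f"- {h['summary']}" for h in history)
    if not context:
        context = "No prior history available."
    result = generate(context, note)
    save(note, result)
    return result


# Offline check with fakes: no network, no credentials.
saved = []
result = run_agent(
    "new note",
    load_history=lambda: [{"summary": "earlier visit"}],
    generate=lambda context, note: f"SUMMARY[{context} | {note}]",
    save=lambda note, output: saved.append((note, output)),
)
assert result == "SUMMARY[- earlier visit | new note]"
assert saved == [("new note", result)]
```

The fake generate function makes the prompt inputs visible in the output, so the assertions confirm that prior context reached the model call and that the result was written back.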

Real-World Use Cases

•Clinical intake agents that summarize patient-reported symptoms and store them as structured memory for later review.
•Care coordination assistants that retrieve prior encounters from Cosmos DB and draft follow-up actions for nurses or case managers.
•Insurance triage workflows that extract medical facts from notes, persist claim context, and generate consistent reviewer summaries.

By Cyprian Aarons, AI Consultant at Topiax.
