How to Integrate LangGraph for healthcare with Kubernetes for AI agents

By Cyprian Aarons, updated 2026-04-21


Healthcare agent systems need two things at the same time: structured clinical reasoning and reliable runtime isolation. LangGraph for healthcare gives you the orchestration layer for stateful agent workflows, while Kubernetes gives you the control plane to run those workflows safely, scale them, and recover from failures.

Prerequisites

•Python 3.10+
•A Kubernetes cluster
- •local: kind, minikube, or k3d
- •remote: EKS, GKE, AKS, or on-prem
•kubectl configured with access to your cluster
•A container registry for pushing your agent image
•Python packages:
- •langgraph
- •kubernetes
- •pydantic
•A healthcare-safe data boundary in place
- •de-identified payloads for non-production testing
- •secrets stored in Kubernetes Secrets or an external vault

Install the SDKs, plus FastAPI and Uvicorn for the API layer used later in this guide:

pip install langgraph kubernetes pydantic fastapi uvicorn

Integration Steps

•Build a LangGraph workflow for a healthcare task.

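Use LangGraph to model a deterministic flow, not a loose chain of prompts. For healthcare agents, a common pattern is triage -> summarize -> route to human review. The graph below implements the triage and summarize stages with simple keyword rules, so the same input always produces the same risk level.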

from typing import TypedDict
from langgraph.graph import StateGraph, END

class ClinicalState(TypedDict):
    symptoms: str
    risk_level: str
    summary: str

def triage_node(state: ClinicalState) -> ClinicalState:
    symptoms = state["symptoms"].lower()
    if "chest pain" in symptoms or "shortness of breath" in symptoms:
        return {**state, "risk_level": "high"}
    if "fever" in symptoms or "cough" in symptoms:
        return {**state, "risk_level": "medium"}
    return {**state, "risk_level": "low"}

def summarize_node(state: ClinicalState) -> ClinicalState:
    summary = f"Symptoms: {state['symptoms']}. Risk: {state['risk_level']}."
    return {**state, "summary": summary}

graph = StateGraph(ClinicalState)
graph.add_node("triage", triage_node)
graph.add_node("summarize", summarize_node)
graph.set_entry_point("triage")
graph.add_edge("triage", "summarize")
graph.add_edge("summarize", END)

app = graph.compile()
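The graph above stops after summarize, so the "route to human review" stage is not wired in yet. A minimal sketch of that routing decision follows. The names `route_after_summary`, `human_review_node`, `HUMAN_REVIEW`, and `DONE` are hypothetical and not part of the original graph; the wiring shown in the comments assumes LangGraph's `add_conditional_edges`.

```python
from typing import TypedDict


class ClinicalState(TypedDict):
    symptoms: str
    risk_level: str
    summary: str


# Hypothetical labels; the original graph defines only "triage" and "summarize".
HUMAN_REVIEW = "human_review"
DONE = "done"


def route_after_summary(state: ClinicalState) -> str:
    """Return the label of the next step based on the triage result."""
    # Only an explicit "low" skips review; anything else goes to a person.
    if state.get("risk_level") == "low":
        return DONE
    return HUMAN_REVIEW


def human_review_node(state: ClinicalState) -> ClinicalState:
    """Placeholder node: mark the case for a clinician instead of closing it."""
    return {**state, "summary": state["summary"] + " Flagged for human review."}


# Wiring sketch (would replace graph.add_edge("summarize", END)):
#   graph.add_node(HUMAN_REVIEW, human_review_node)
#   graph.add_conditional_edges(
#       "summarize", route_after_summary, {HUMAN_REVIEW: HUMAN_REVIEW, DONE: END}
#   )
#   graph.add_edge(HUMAN_REVIEW, END)
```

Routing on "not low" rather than "is high" is a deliberate choice in this sketch: an unexpected or empty risk level ends up in front of a human instead of being silently closed.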

•Package the graph as a service that Kubernetes can run.

Your graph should be callable from an API layer. Keep the runtime stateless and pass state in the request body.

from fastapi import FastAPI
from pydantic import BaseModel

class PatientRequest(BaseModel):
    symptoms: str

app_api = FastAPI()

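@app_api.post("/assess")
def assess_patient(req: PatientRequest):
    # `app` is the compiled LangGraph from the previous step.
    # All state arrives in the request, so any replica can serve any call.
    result = app.invoke({"symptoms": req.symptoms, "risk_level": "", "summary": ""})
    return result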
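The endpoint builds the initial state inline with empty placeholder fields. One possible refinement is to centralize that in a small helper that also rejects empty or oversized input before the graph runs. This is a sketch only: `initial_state` and `MAX_SYMPTOM_CHARS` are hypothetical names, and the length limit is an assumed value.

```python
from typing import TypedDict


class ClinicalState(TypedDict):
    symptoms: str
    risk_level: str
    summary: str


MAX_SYMPTOM_CHARS = 2000  # assumed limit; tune it for your intake forms


def initial_state(symptoms: str) -> ClinicalState:
    """Build the state dict the graph expects from raw request text."""
    # Collapse runs of whitespace so downstream keyword checks see clean text.
    cleaned = " ".join(symptoms.split())
    if not cleaned:
        raise ValueError("symptoms must not be empty")
    if len(cleaned) > MAX_SYMPTOM_CHARS:
        raise ValueError("symptoms text is too long")
    return {"symptoms": cleaned, "risk_level": "", "summary": ""}
```

Inside the handler, the call would then read `app.invoke(initial_state(req.symptoms))`, with the `ValueError` translated into an HTTP 4xx response by whatever error handling your API layer uses.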

•Create a Kubernetes client from inside the cluster or from your workstation.

Use the official Kubernetes Python client to talk to the API server. This is how your agent system can create jobs for heavy workloads or inspect pod health before routing requests.

from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside a pod

v1 = client.CoreV1Api()
pods = v1.list_namespaced_pod(namespace="healthcare-agents")

for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
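The comment in the snippet above mentions switching between `load_kube_config()` and `load_incluster_config()`. One way to avoid editing code between environments is to detect where the process is running. The sketch below relies on the `KUBERNETES_SERVICE_HOST` environment variable that Kubernetes sets inside pods; `running_in_cluster` and `load_k8s_config` are hypothetical helper names.

```python
import os
from typing import Mapping, Optional


def running_in_cluster(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the process appears to be running inside a pod."""
    env = os.environ if env is None else env
    # Kubernetes injects this variable into every container it starts.
    return bool(env.get("KUBERNETES_SERVICE_HOST"))


def load_k8s_config() -> str:
    """Load the matching client configuration and report which one was used."""
    # Imported lazily so the detection helper works without the client installed.
    from kubernetes import config

    if running_in_cluster():
        config.load_incluster_config()  # uses the pod's service account
        return "in-cluster"
    config.load_kube_config()  # uses your local kubeconfig file
    return "kubeconfig"
```

Calling `load_k8s_config()` once at startup, rather than in every function, keeps the configuration choice in one place.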

•Launch LangGraph-backed workloads on Kubernetes Jobs.

A good production pattern is to use Kubernetes Jobs for long-running clinical batch tasks like chart summarization or claim pre-checks. The agent decides what to run; Kubernetes handles scheduling and retries.

from kubernetes import client, config

config.load_kube_config()
batch_v1 = client.BatchV1Api()

job_manifest = client.V1Job(
    metadata=client.V1ObjectMeta(name="clinical-summary-job"),
    spec=client.V1JobSpec(
        backoff_limit=2,
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "clinical-summary"}),
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="worker",
                        image="registry.example.com/healthcare-agent:latest",
                        env=[
                            client.V1EnvVar(name="SYMPTOMS", value="fever and cough"),
                        ],
                    )
                ],
            ),
        ),
    ),
)

batch_v1.create_namespaced_job(namespace="healthcare-agents", body=job_manifest)
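The manifest above uses a fixed Job name, so submitting it a second time in the same namespace is rejected while the first Job still exists. A minimal sketch of deriving a per-request name follows. `job_name` and the `request_id` parameter are hypothetical; the 63-character cap follows the DNS label rules that Kubernetes recommends for Job names.

```python
import hashlib
import re

MAX_NAME_LEN = 63  # DNS label limit, the safe upper bound for Job names


def job_name(prefix: str, request_id: str) -> str:
    """Build a deterministic, DNS-safe Job name from a prefix and a request ID."""
    # Short hash of the request ID: same request -> same name.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()[:10]
    # Lowercase and replace anything outside [a-z0-9-] with a dash.
    slug = re.sub(r"[^a-z0-9-]+", "-", prefix.lower()).strip("-") or "job"
    # Leave room for "-" plus the digest, and avoid a trailing dash.
    slug = slug[: MAX_NAME_LEN - len(digest) - 1].rstrip("-")
    return f"{slug}-{digest}"


if __name__ == "__main__":
    print(job_name("Clinical Summary", "req-123"))
```

Because the name is derived from the request ID, a retried submission of the same request collides with the existing Job instead of creating a duplicate, which is often the behavior you want for batch work. Hashing an opaque ID, not patient data, also keeps identifying information out of object names.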

•Wire the agent runtime to query Kubernetes before execution.

This lets your graph make deployment-aware decisions. For example, route high-risk workloads to a dedicated node pool or fail closed if the namespace is unhealthy.

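from kubernetes import client, config

def namespace_ready(namespace: str) -> bool:
    # Prefer in-cluster credentials; fall back to the local kubeconfig.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod(namespace=namespace)
    # Ready means at least one pod in the namespace is Running.
    return any(p.status.phase == "Running" for p in pods.items)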

def guarded_triage(state):
    if not namespace_ready("healthcare-agents"):
        return {**state, "risk_level": "degraded"}
    return triage_node(state)
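The guard above fails closed when no pods are running, but an exception from the Kubernetes API (network error, expired credentials) would still propagate out of the node. A minimal sketch of a wrapper that treats any failed health check as "not ready" follows. `guard` is a hypothetical helper, and the readiness check is passed in as a plain callable so it can be stubbed in tests.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def guard(
    node: Callable[[State], State],
    is_ready: Callable[[], bool],
) -> Callable[[State], State]:
    """Wrap a graph node so it only runs when the readiness check passes."""

    def guarded(state: State) -> State:
        try:
            ready = is_ready()
        except Exception:
            # Fail closed: an unreachable API server counts as unhealthy.
            ready = False
        if not ready:
            return {**state, "risk_level": "degraded"}
        return node(state)

    return guarded
```

With the functions from this step, usage might look like `guarded_triage = guard(triage_node, lambda: namespace_ready("healthcare-agents"))`, and the result is registered with `graph.add_node` in place of the unguarded node.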

Testing the Integration

Run the graph locally first, then verify Kubernetes connectivity from the same codebase.

from kubernetes import client, config

if __name__ == "__main__":
    output = app.invoke({
        "symptoms": "shortness of breath and chest pain",
        "risk_level": "",
        "summary": ""
    })
    print(output)

    config.load_kube_config()
    v1 = client.CoreV1Api()
    namespaces = [ns.metadata.name for ns in v1.list_namespace().items]
    print("healthcare-agents" in namespaces)

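Expected output, assuming the healthcare-agents namespace already exists in your cluster (otherwise the second line is False):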

{'symptoms': 'shortness of breath and chest pain', 'risk_level': 'high', 'summary': 'Symptoms: shortness of breath and chest pain. Risk: high.'}
True

Real-World Use Cases

•Clinical intake routing
- •LangGraph handles symptom collection and escalation logic.
- •Kubernetes runs each intake worker in isolated pods with strict resource limits.
•Chart summarization at scale
- •Use LangGraph to break chart review into extraction, summarization, and validation steps.
- •Use Kubernetes Jobs to process batches overnight with retry control.
•Prior authorization assistants
- •The graph can gather missing documentation and decide whether a case needs human review.
- •Kubernetes lets you deploy separate workers for different payer rulesets and scale them independently.
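For the batch-oriented use cases above, the caller eventually needs to know whether a Job finished. The sketch below classifies a Job status object into a simple phase. It assumes the object has the shape of the `status` field returned by `BatchV1Api.read_namespaced_job_status(name, namespace)`, with `conditions` and `active` attributes; `job_phase` is a hypothetical helper, and the demo uses stand-in objects instead of a live cluster.

```python
from types import SimpleNamespace


def job_phase(status) -> str:
    """Map a Job status object to 'succeeded', 'failed', 'running', or 'pending'."""
    # Terminal conditions win: Kubernetes sets Complete or Failed when done.
    for cond in getattr(status, "conditions", None) or []:
        if cond.status == "True" and cond.type == "Complete":
            return "succeeded"
        if cond.status == "True" and cond.type == "Failed":
            return "failed"
    # `active` counts pods that are currently running for this Job.
    if getattr(status, "active", None):
        return "running"
    return "pending"


if __name__ == "__main__":
    done = SimpleNamespace(
        conditions=[SimpleNamespace(type="Complete", status="True")], active=None
    )
    print(job_phase(done))
```

A graph node could poll this between batch steps and route to human review when the phase is "failed", that is, once retries under `backoff_limit` are exhausted.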


By Cyprian Aarons, AI Consultant at Topiax.
