How to Integrate LangGraph for healthcare with Kubernetes for startups

By Cyprian Aarons · Updated 2026-04-21
Tags: langgraph-for-healthcare, kubernetes, startups

Healthcare startups need agent systems that can handle patient workflows, triage, scheduling, and compliance without falling over under load. LangGraph gives you the control flow for multi-step clinical agents, while Kubernetes gives you the runtime to scale, isolate, and recover those agents in production.

Prerequisites

  • Python 3.10+
  • A Kubernetes cluster:
    • local: kind, minikube, or k3d
    • cloud: EKS, GKE, or AKS
  • kubectl configured and pointing at your cluster
  • Access to a healthcare-focused LangGraph setup:
    • langgraph
    • your healthcare tools wrapped as LangChain tools or plain Python callables
  • A container registry for pushing images
  • Basic familiarity with:
    • Kubernetes Deployments
    • ConfigMaps and Secrets
    • Python async code

Install the core packages:

pip install langgraph kubernetes pydantic fastapi uvicorn

Integration Steps

  1. Build the healthcare workflow in LangGraph

Use LangGraph to model the clinical flow as a state machine. For healthcare, keep the graph explicit: intake, triage, escalation, and handoff.

from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END

class PatientState(TypedDict):
    symptoms: str
    risk_level: Literal["low", "medium", "high"]
    next_action: str

def intake_node(state: PatientState) -> PatientState:
    text = state["symptoms"].lower()
    if any(term in text for term in ["chest pain", "shortness of breath", "fainting"]):
        return {**state, "risk_level": "high", "next_action": "escalate_to_clinician"}
    if any(term in text for term in ["fever", "pain", "cough"]):
        return {**state, "risk_level": "medium", "next_action": "schedule_follow_up"}
    return {**state, "risk_level": "low", "next_action": "self_care_guidance"}

graph = StateGraph(PatientState)
graph.add_node("intake", intake_node)
graph.add_edge(START, "intake")
graph.add_edge("intake", END)

app = graph.compile()
result = app.invoke({"symptoms": "patient reports chest pain and dizziness"})
print(result)
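
The single intake node keeps the example small. A minimal sketch of how the explicit triage and escalation branches could hang off it with conditional edges; the escalate_node and follow_up_node below are illustrative placeholders for your own clinical handoff and scheduling tools, not part of the original graph:

def escalate_node(state: PatientState) -> PatientState:
    # Placeholder: page an on-call clinician or open an urgent task.
    return {**state, "next_action": "clinician_paged"}

def follow_up_node(state: PatientState) -> PatientState:
    # Placeholder: book a follow-up slot through your scheduling tool.
    return {**state, "next_action": "follow_up_scheduled"}

def route_by_risk(state: PatientState) -> str:
    # Route on the risk level set during intake.
    return state["risk_level"]

graph = StateGraph(PatientState)
graph.add_node("intake", intake_node)
graph.add_node("escalate", escalate_node)
graph.add_node("follow_up", follow_up_node)

graph.add_edge(START, "intake")
graph.add_conditional_edges(
    "intake",
    route_by_risk,
    {"high": "escalate", "medium": "follow_up", "low": END},
)
graph.add_edge("escalate", END)
graph.add_edge("follow_up", END)

app = graph.compile()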

  2. Wrap the graph behind a service that Kubernetes can run

Expose the graph through FastAPI so Kubernetes can manage it like any other microservice. Putting the graph behind a standard HTTP interface lets the cluster schedule, probe, and scale LangGraph execution the same way it handles the rest of your services.

from fastapi import FastAPI
from pydantic import BaseModel

# "app" is the compiled LangGraph graph from step 1; both live in the same main.py.
app_api = FastAPI()

class TriageRequest(BaseModel):
    symptoms: str

@app_api.post("/triage")
async def triage(req: TriageRequest):
    output = app.invoke({
        "symptoms": req.symptoms,
        "risk_level": "low",
        "next_action": ""
    })
    return output
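
Step 5 will wire up Kubernetes liveness and readiness probes, and those need a cheap endpoint to hit. A minimal sketch, assuming you expose it at /health in the same service (the path itself is an assumption, not something Kubernetes mandates):

@app_api.get("/health")
async def health():
    # Lightweight check for Kubernetes liveness/readiness probes.
    return {"status": "ok"}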

Run it locally first:

uvicorn main:app_api --host 0.0.0.0 --port 8000

  3. Containerize the agent service for Kubernetes

Kubernetes runs containers, not Python files. Package the API into an image so your startup can deploy versioned agent workloads. The requirements.txt referenced below should pin the same packages as the pip install command from the prerequisites.

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .

EXPOSE 8000
CMD ["uvicorn", "main:app_api", "--host", "0.0.0.0", "--port", "8000"]

A minimal deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: healthcare-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: healthcare-agent
  template:
    metadata:
      labels:
        app: healthcare-agent
    spec:
      containers:
        - name: api
          image: your-registry/healthcare-agent:v1
          ports:
            - containerPort: 8000

  4. Deploy and manage it from Python using the Kubernetes client

If your startup needs automation, use the official Kubernetes Python client to create or update resources from a deployment pipeline or internal ops tool.

from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="healthcare-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "healthcare-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "healthcare-agent"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="api",
                    image="your-registry/healthcare-agent:v1",
                    ports=[client.V1ContainerPort(container_port=8000)]
                )
            ])
        )
    )
)

apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
print("Deployment created")

  5. Connect runtime health checks and scaling signals

For healthcare workloads, you want liveness checks and autoscaling tied to traffic spikes like appointment surges or post-discharge follow-ups. The snippet below creates the autoscaler; probes are attached to the Deployment right after it.

from kubernetes import client, config

config.load_kube_config()
autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    api_version="autoscaling/v2",
    kind="HorizontalPodAutoscaler",
    metadata=client.V1ObjectMeta(name="healthcare-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="healthcare-agent"
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=60)
                )
            )
        ]
    )
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
print("HPA created")

Testing the Integration

Send a request to the running service and confirm the graph response matches the expected triage logic. If the service is running in the cluster rather than locally, forward the port first with kubectl port-forward deployment/healthcare-agent 8000:8000.

import requests

resp = requests.post(
    "http://localhost:8000/triage",
    json={"symptoms": "patient has chest pain and shortness of breath"}
)

print(resp.status_code)
print(resp.json())

Expected output:

200
{'symptoms': 'patient has chest pain and shortness of breath', 'risk_level': 'high', 'next_action': 'escalate_to_clinician'}

If you want to verify Kubernetes sees the workload:

kubectl get deployments,pods,hpa

Real-World Use Cases

  • Clinical intake routing
    • Use LangGraph to classify symptoms and route patients to self-care content, nurse callbacks, or clinician escalation.
  • Prior authorization assistants
    • Run document-checking and policy-validation agents in Kubernetes with separate pods per tenant or insurer.
  • Post-discharge follow-up automation
    • Trigger reminder flows, medication adherence checks, and escalation paths based on patient responses.

The pattern here is simple: LangGraph handles decision logic; Kubernetes handles execution boundaries. For startups building healthcare agents, that split keeps the workflow understandable while it is still cheap to change.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

