How to Integrate LangGraph for Insurance with Kubernetes for Startups
Combining LangGraph for insurance with Kubernetes gives you a clean way to run policy, claims, and underwriting agents as stateful workflows while keeping the infrastructure elastic and isolated. For startups, that means you can ship an insurance assistant that handles document intake, triage, and decision routing without hardwiring the whole system into one monolith.
Prerequisites
- Python 3.10+
- A Kubernetes cluster
  - Local: kind, minikube, or Docker Desktop Kubernetes
  - Cloud: GKE, EKS, or AKS
- kubectl configured and pointing at your cluster
- A container registry your cluster can pull from
- LangGraph installed in your app environment
- Kubernetes Python client installed
- Access to your insurance workflow dependencies:
  - LLM provider API key
  - Document store or object storage
  - Any policy/claims rules service you already use
Install the Python packages:
pip install langgraph kubernetes pydantic
Integration Steps
1) Define the insurance workflow in LangGraph
Start with a stateful graph that handles intake, classification, and routing. In insurance systems, this is where you decide whether a submission goes to claims, underwriting, or human review.
from typing import TypedDict

from langgraph.graph import StateGraph, END

class InsuranceState(TypedDict):
    message: str
    route: str
    result: str

def classify_intent(state: InsuranceState) -> InsuranceState:
    text = state["message"].lower()
    if "claim" in text:
        state["route"] = "claims"
    elif "quote" in text or "underwrite" in text:
        state["route"] = "underwriting"
    else:
        state["route"] = "review"
    return state

def process_claims(state: InsuranceState) -> InsuranceState:
    state["result"] = f"Claims workflow started for: {state['message']}"
    return state

def process_underwriting(state: InsuranceState) -> InsuranceState:
    state["result"] = f"Underwriting workflow started for: {state['message']}"
    return state

def human_review(state: InsuranceState) -> InsuranceState:
    state["result"] = f"Sent to manual review: {state['message']}"
    return state

graph = StateGraph(InsuranceState)
graph.add_node("classify_intent", classify_intent)
graph.add_node("claims", process_claims)
graph.add_node("underwriting", process_underwriting)
graph.add_node("review", human_review)

graph.set_entry_point("classify_intent")
graph.add_conditional_edges(
    "classify_intent",
    lambda s: s["route"],
    {
        "claims": "claims",
        "underwriting": "underwriting",
        "review": "review",
    },
)
graph.add_edge("claims", END)
graph.add_edge("underwriting", END)
graph.add_edge("review", END)

app = graph.compile()
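The routing rule above is plain keyword matching, so you can sanity-check it before wiring up LangGraph at all. A minimal standalone sketch of the same logic (the `route_for` helper is ours, for illustration; it mirrors `classify_intent` without touching graph state):

```python
def route_for(message: str) -> str:
    """Mirror of classify_intent's keyword routing, for quick local checks."""
    text = message.lower()
    if "claim" in text:
        return "claims"
    if "quote" in text or "underwrite" in text:
        return "underwriting"
    return "review"

print(route_for("New claim submitted for water damage"))   # claims
print(route_for("Requesting a quote for auto coverage"))   # underwriting
print(route_for("General question about my coverage docs"))  # review
```

If these three cases route the way you expect, the graph's conditional edges will too, since they dispatch on the same `route` value.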
2) Package the graph into a service that Kubernetes can run
For startups, don’t try to run the graph inside a notebook or a one-off script. Put it behind a small API so pods can scale independently.
from fastapi import FastAPI
from pydantic import BaseModel

app_api = FastAPI()

class IntakeRequest(BaseModel):
    message: str

@app_api.post("/intake")
def intake(req: IntakeRequest):
    initial_state = {
        "message": req.message,
        "route": "",
        "result": "",
    }
    output = app.invoke(initial_state)
    return output
Run this as a container image with Uvicorn. Your Kubernetes deployment will point at this API and let LangGraph manage the workflow logic while K8s manages scheduling and scaling.
3) Create the Kubernetes deployment and service from Python
Use the Kubernetes Python client when you want your agent system to provision itself dynamically. This is useful when each customer or environment needs isolated workloads.
from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="insurance-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(
            match_labels={"app": "insurance-agent"}
        ),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "insurance-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="insurance-agent",
                        image="registry.example.com/insurance-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="insurance-agent-service"),
    spec=client.V1ServiceSpec(
        selector={"app": "insurance-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
core_v1.create_namespaced_service(namespace="default", body=service)
This pattern gives you a repeatable way to deploy the agent backend without manually applying manifests every time.
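When each customer gets isolated workloads, it helps to derive the resource names and labels deterministically per tenant instead of hand-picking them. A minimal sketch of one way to do that (the naming scheme and `tenant_resources` helper are our assumptions, not part of the article's setup); the returned values would feed the `V1ObjectMeta` and selector fields above:

```python
import re

def tenant_resources(tenant: str, image_tag: str = "latest") -> dict:
    """Derive Kubernetes-safe names for one tenant's agent stack.

    Lowercases the tenant, replaces anything outside [a-z0-9-] with '-',
    and truncates so the names stay within label/name length limits.
    """
    slug = re.sub(r"[^a-z0-9-]", "-", tenant.lower()).strip("-")[:40]
    return {
        "namespace": f"tenant-{slug}",
        "deployment": f"insurance-agent-{slug}",
        "service": f"insurance-agent-service-{slug}",
        "labels": {"app": "insurance-agent", "tenant": slug},
        "image": f"registry.example.com/insurance-agent:{image_tag}",
    }

print(tenant_resources("Acme Mutual")["namespace"])  # tenant-acme-mutual
```

Deterministic names make the provisioning call idempotent-friendly: you can check whether `tenant-acme-mutual` already has a deployment before creating one, rather than tracking state elsewhere.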
4) Call the LangGraph-backed service from inside the cluster
Once deployed, other services can call your insurance agent over HTTP. That lets claims intake apps, CRM tools, or internal ops dashboards trigger workflows without knowing anything about graph internals.
import requests

response = requests.post(
    "http://insurance-agent-service.default.svc.cluster.local/intake",
    json={"message": "New claim submitted for water damage"},
    timeout=10,
)
print(response.json())
If you need tighter integration inside Kubernetes jobs or worker pods, keep this HTTP boundary. It gives you clear retries, observability hooks, and easier rollout control.
5) Add autoscaling for traffic spikes
Insurance workloads spike during storms, outages, open enrollment windows, and claims surges. Horizontal Pod Autoscaling keeps your agent available without overprovisioning.
from kubernetes import client, config

config.load_kube_config()

autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="insurance-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="insurance-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(
                        type="Utilization",
                        average_utilization=70,
                    ),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(
    namespace="default",
    body=hpa,
)
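To reason about what this HPA will do during a surge, apply the scaling rule Kubernetes uses: desiredReplicas = ceil(currentReplicas × currentUtilization ÷ targetUtilization), clamped to the min/max bounds. A small sketch with the settings above (70% CPU target, 2–10 replicas):

```python
import math

def desired_replicas(current, utilization, target=70, lo=2, hi=10):
    """Kubernetes HPA rule: ceil(current * utilization / target), clamped."""
    want = math.ceil(current * utilization / target)
    return max(lo, min(hi, want))

print(desired_replicas(2, 140))  # 4: CPU at double the target doubles the pods
print(desired_replicas(4, 35))   # 2: light load scales back toward min_replicas
```

So a storm-day spike that pushes average CPU to twice the target roughly doubles the pod count on the next reconcile, while `max_replicas=10` caps your spend.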
Testing the Integration
Use a simple end-to-end check: send an insurance request to the graph API and confirm it routes correctly. If you want to test from outside the cluster during development, port-forward the service first.
kubectl port-forward svc/insurance-agent-service 8080:80
Then test it:
import requests

resp = requests.post(
    "http://localhost:8080/intake",
    json={"message": "Please file a claim for roof damage"},
    timeout=10,
)
print(resp.status_code)
print(resp.json()["route"])
print(resp.json()["result"])
Expected output:
200
claims
Claims workflow started for: Please file a claim for roof damage
Real-World Use Cases
- Claims triage agent that classifies incoming submissions, extracts required fields, and routes complex cases to human adjusters.
- Underwriting assistant that checks application completeness, triggers enrichment jobs, and spins up isolated review workers per tenant.
- Policy servicing bot that handles endorsements, coverage questions, and document generation with workload isolation across Kubernetes namespaces.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.