How to Integrate LangGraph with Kubernetes for Insurance AI Agents
LangGraph gives you the orchestration layer for insurance work: claims, underwriting, policy servicing, and document-heavy workflows. Kubernetes gives you the runtime to run those agents reliably across pods, scale them under load, and keep them isolated from the rest of your platform.
Put together, you get an agent system that can route insurance tasks through a graph while Kubernetes handles deployment, retries, and horizontal scaling.
Prerequisites
- Python 3.10+
- A running Kubernetes cluster
- kubectl configured and authenticated
- A namespace created for your agent workloads
- Access to a LangGraph-compatible insurance workflow project
- Installed Python packages:
  - langgraph
  - langchain-core
  - kubernetes
  - pydantic
  - fastapi (used for the API wrapper below)
- A container registry for pushing your agent image
- Basic familiarity with:
  - Kubernetes Deployments and Services
  - Python async code
  - Insurance workflow concepts like FNOL, claims triage, and policy checks
Integration Steps
1. Build the LangGraph insurance workflow
Start by defining the graph that routes insurance requests through deterministic nodes. In production, keep each node small: intake, policy lookup, claims triage, and escalation.
from typing import TypedDict

from langgraph.graph import StateGraph, END

class InsuranceState(TypedDict):
    claim_id: str
    policy_id: str
    status: str
    decision: str

def intake(state: InsuranceState) -> InsuranceState:
    # Mark the claim as received; real intake would also validate the payload.
    state["status"] = "received"
    return state

def policy_check(state: InsuranceState) -> InsuranceState:
    # Stub for a policy lookup; a real node would query the policy system.
    state["status"] = "policy_checked"
    state["decision"] = "approved"
    return state

def escalate(state: InsuranceState) -> InsuranceState:
    state["status"] = "escalated"
    state["decision"] = "manual_review"
    return state

graph = StateGraph(InsuranceState)
graph.add_node("intake", intake)
graph.add_node("policy_check", policy_check)
graph.add_node("escalate", escalate)

graph.set_entry_point("intake")
graph.add_edge("intake", "policy_check")
graph.add_conditional_edges(
    "policy_check",
    lambda s: "escalate" if s["decision"] == "manual_review" else END,
    {"escalate": "escalate", END: END},
)
graph.add_edge("escalate", END)  # without this edge, escalate is a dead-end node

app = graph.compile()
This gives you a runnable agent workflow with StateGraph, add_node, add_edge, and compile().
2. Package the graph as a service your cluster can run
Kubernetes should not run raw notebooks or ad hoc scripts. Wrap the graph in a small API process so each pod can execute insurance workflows on demand.
from typing import Optional

from fastapi import FastAPI
from pydantic import BaseModel

app_api = FastAPI()

class ClaimRequest(BaseModel):
    claim_id: str
    policy_id: str
    decision: Optional[str] = None

@app_api.post("/run")
def run_workflow(req: ClaimRequest):
    # app.invoke() is synchronous, so a plain def lets FastAPI run it in a
    # worker thread instead of blocking the event loop.
    initial_state = {
        "claim_id": req.claim_id,
        "policy_id": req.policy_id,
        "status": "new",
        "decision": req.decision or "",
    }
    result = app.invoke(initial_state)
    return result
The important part is app.invoke(...). That is the execution boundary you will call from inside the pod or from another service.
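To exercise that boundary over HTTP, another service just POSTs a claim to /run. A minimal sketch using the requests package; the Service name, namespace, and port in the URL are illustrative assumptions, not values created anywhere above:

import requests

# Hypothetical in-cluster Service DNS name; adjust to your Service and namespace.
AGENT_URL = "http://insurance-agent.insurance-agents.svc.cluster.local:8000/run"

resp = requests.post(
    AGENT_URL,
    json={"claim_id": "CLM-10021", "policy_id": "POL-77881"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # expect the status and decision fields set by the graph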
3. Create a Kubernetes client inside the agent service
If your agent needs to inspect pods, read job status, or trigger workloads based on claim volume, use the official Kubernetes Python client. This is where the orchestration layer meets infrastructure control.
from kubernetes import client, config

def get_k8s_client():
    # Prefer the in-cluster service account when running in a pod;
    # fall back to the local kubeconfig during development.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()
    return client.CoreV1Api()

v1 = get_k8s_client()
pods = v1.list_namespaced_pod(namespace="insurance-agents")
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
Use load_incluster_config() when running inside the cluster. Use list_namespaced_pod() to inspect workload health before deciding whether to route more claims into the system.
4. Trigger Kubernetes-aware behavior from LangGraph nodes
This is where the integration becomes useful. A node can check cluster capacity before approving a high-cost workflow, or send jobs to a dedicated queue when CPU pressure is high.
from kubernetes import client, config

def cluster_pressure_check(state: InsuranceState) -> InsuranceState:
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod(namespace="insurance-agents")
    running = sum(1 for p in pods.items if p.status.phase == "Running")
    state["status"] = f"cluster_running_pods_{running}"
    # Back off to manual review when the namespace is already busy.
    if running > 20:
        state["decision"] = "manual_review"
        return state
    state["decision"] = "approved"
    return state
You can swap this into your graph as a node before final approval. That pattern lets your insurance logic react to platform conditions instead of blindly processing every request.
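One way to wire it in, reusing the nodes from the first step. The edge layout below is a sketch, not the only valid topology: it routes high-pressure requests straight to escalation before any policy decision is made.

graph = StateGraph(InsuranceState)
graph.add_node("intake", intake)
graph.add_node("pressure_check", cluster_pressure_check)
graph.add_node("policy_check", policy_check)
graph.add_node("escalate", escalate)

graph.set_entry_point("intake")
graph.add_edge("intake", "pressure_check")
# Skip policy_check entirely when the cluster is already under pressure.
graph.add_conditional_edges(
    "pressure_check",
    lambda s: "escalate" if s["decision"] == "manual_review" else "policy_check",
    {"escalate": "escalate", "policy_check": "policy_check"},
)
graph.add_conditional_edges(
    "policy_check",
    lambda s: "escalate" if s["decision"] == "manual_review" else END,
    {"escalate": "escalate", END: END},
)
graph.add_edge("escalate", END)
app = graph.compile()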
5. Deploy with Kubernetes primitives
Use a Deployment for replicas and a Service for internal access. The pod runs your API wrapper around the LangGraph app.
from kubernetes import client

container = client.V1Container(
    name="insurance-agent",
    image="registry.example.com/insurance-agent:1.0.0",
    ports=[client.V1ContainerPort(container_port=8000)],
)
template = client.V1PodTemplateSpec(
    metadata=client.V1ObjectMeta(labels={"app": "insurance-agent"}),
    spec=client.V1PodSpec(containers=[container]),
)
spec = client.V1DeploymentSpec(
    replicas=3,
    selector=client.V1LabelSelector(match_labels={"app": "insurance-agent"}),
    template=template,
)
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="insurance-agent"),
    spec=spec,
)
This is not meant to replace YAML forever, but it shows how your Python code can generate or manage cluster resources using kubernetes.client.
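To actually create these resources, hand the objects to the API clients. A minimal sketch, assuming the insurance-agents namespace exists and your credentials can create Deployments and Services there; the Service provides the internal access mentioned above:

from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running in a pod

# Create the Deployment built above.
apps_v1 = client.AppsV1Api()
apps_v1.create_namespaced_deployment(namespace="insurance-agents", body=deployment)

# A matching Service so other workloads can reach the agent pods on port 8000.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="insurance-agent"),
    spec=client.V1ServiceSpec(
        selector={"app": "insurance-agent"},
        ports=[client.V1ServicePort(port=8000, target_port=8000)],
    ),
)
core_v1 = client.CoreV1Api()
core_v1.create_namespaced_service(namespace="insurance-agents", body=service)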
Testing the Integration
Run the graph locally first, then validate Kubernetes access from the same runtime.
test_state = {
    "claim_id": "CLM-10021",
    "policy_id": "POL-77881",
    "status": "new",
    "decision": "",
}

result = app.invoke(test_state)
print(result)

from kubernetes import config, client

config.load_kube_config()
v1 = client.CoreV1Api()
ns_list = v1.list_namespace()
print([ns.metadata.name for ns in ns_list.items][:5])
Expected output:
{'claim_id': 'CLM-10021', 'policy_id': 'POL-77881', 'status': 'policy_checked', 'decision': 'approved'}
['default', 'kube-system', 'insurance-agents', 'ingress-nginx']
If both calls succeed, your LangGraph workflow executes correctly and your process can talk to the cluster API.
Real-World Use Cases
- Claims triage at scale: Route FNOL intake through LangGraph nodes while Kubernetes autoscaling handles traffic spikes after storms or major events.
- Policy servicing agents: Let agents check document completeness, validate coverage rules, and escalate edge cases while pods stay isolated per tenant or region.
- Operational control loops: Build agents that inspect pod health with list_namespaced_pod() and change workflow behavior when cluster pressure crosses thresholds.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.