How to Integrate LangGraph for Insurance with Kubernetes for Multi-Agent Systems

By Cyprian Aarons · Updated 2026-04-21
Tags: langgraph-for-insurance, kubernetes, multi-agent-systems

Combining LangGraph with Kubernetes gives you a clean way to run multi-agent insurance workflows as real services, not just local scripts. LangGraph handles agent orchestration and stateful decision flow; Kubernetes provides scheduling, scaling, and isolation for claims, underwriting, fraud, and policy-servicing agents.

This setup is useful when one request needs multiple specialist agents to cooperate, with retries, observability, and deployment control across environments.

Prerequisites

  • Python 3.10+
  • A running Kubernetes cluster
    • Local: kind, minikube, or k3d
    • Remote: EKS, GKE, AKS
  • kubectl configured against your cluster
  • Docker installed for building images
  • Access to your LLM provider credentials
  • Python packages:
    • langgraph
    • langchain-openai or your model provider package
    • kubernetes
    • pydantic
  • Basic familiarity with:
    • LangGraph state graphs
    • Kubernetes Deployments, Services, ConfigMaps, and Secrets

Install the Python dependencies:

pip install langgraph langchain-openai kubernetes pydantic

Integration Steps

  1. Define the insurance workflow in LangGraph

Start by modeling the multi-agent flow. For insurance use cases, a common pattern is:

  • intake agent
  • policy lookup agent
  • claims assessment agent
  • escalation agent

Use LangGraph’s StateGraph to route between these nodes.

from typing import TypedDict, Annotated
import operator

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

class InsuranceState(TypedDict):
    messages: Annotated[list, operator.add]
    claim_type: str
    risk_score: float
    decision: str

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def intake_agent(state: InsuranceState):
    prompt = f"Classify this insurance request: {state['messages'][-1]}"
    response = llm.invoke(prompt)
    return {"messages": [response.content], "claim_type": "auto"}

def assessment_agent(state: InsuranceState):
    prompt = f"Assess risk for claim type {state['claim_type']}"
    response = llm.invoke(prompt)
    return {"messages": [response.content], "risk_score": 0.72}

def decision_agent(state: InsuranceState):
    if state["risk_score"] > 0.7:
        return {"decision": "escalate"}
    return {"decision": "approve"}

graph = StateGraph(InsuranceState)
graph.add_node("intake", intake_agent)
graph.add_node("assess", assessment_agent)
graph.add_node("decide", decision_agent)

graph.set_entry_point("intake")
graph.add_edge("intake", "assess")
graph.add_edge("assess", "decide")
graph.add_edge("decide", END)

app = graph.compile()

  2. Package the graph as a service for Kubernetes

In production, each graph worker should run in a container. Expose an HTTP endpoint so Kubernetes can scale it horizontally.

from fastapi import FastAPI
from pydantic import BaseModel

app_api = FastAPI()

class ClaimRequest(BaseModel):
    message: str

@app_api.post("/run")
def run_claim(req: ClaimRequest):
    result = app.invoke({
        "messages": [req.message],
        "claim_type": "",
        "risk_score": 0.0,
        "decision": ""
    })
    return result

Build a Docker image with this app and deploy it to Kubernetes behind a Service.
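The Deployment that follows assumes an image already exists in your registry. A minimal Dockerfile for this service might look like the sketch below (the main.py module name and the uvicorn server are assumptions; adjust them to your project layout):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Serve the FastAPI app, assumed to be exposed as `app_api` in main.py
CMD ["uvicorn", "main:app_api", "--host", "0.0.0.0", "--port", "8000"]
```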

apiVersion: apps/v1
kind: Deployment
metadata:
  name: insurance-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: insurance-agent
  template:
    metadata:
      labels:
        app: insurance-agent
    spec:
      containers:
        - name: app
          image: your-registry/insurance-agent:latest
          ports:
            - containerPort: 8000
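The Deployment needs a Service in front of it; a minimal one might look like this (exposing port 80 and targeting the container's port 8000 is an assumption, but it lets in-cluster callers omit a port suffix from the Service DNS name):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: insurance-agent
  namespace: insurance-ai
spec:
  selector:
    app: insurance-agent
  ports:
    - port: 80
      targetPort: 8000
```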

  3. Use the Kubernetes Python client to manage agent workers

If you want dynamic multi-agent capacity, use the Kubernetes API to inspect pods or trigger jobs for specific workloads like fraud checks or document extraction.

from kubernetes import client, config

config.load_incluster_config()  # use config.load_kube_config() locally

v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace="insurance-ai", label_selector="app=insurance-agent")
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)

You can also create a Job for burst traffic when claim volume spikes.

batch_v1 = client.BatchV1Api()

job_manifest = client.V1Job(
    metadata=client.V1ObjectMeta(name="fraud-check-job"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"job": "fraud-check"}),
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="fraud-check",
                        image="your-registry/fraud-check:latest",
                        args=["python", "worker.py"]  # appended to the image ENTRYPOINT; use command= to override it instead
                    )
                ],
            ),
        ),
        backoff_limit=2,
    ),
)

batch_v1.create_namespaced_job(namespace="insurance-ai", body=job_manifest)

  4. Wire multi-agent routing through Kubernetes service discovery

In a real system, one LangGraph node can call another agent over HTTP using its Kubernetes Service DNS name. That keeps each specialist isolated and independently deployable.

import requests

FRAUD_SERVICE_URL = "http://fraud-agent.insurance-ai.svc.cluster.local/run"

def fraud_agent(state: InsuranceState):
    payload = {
        "claim_id": state["messages"][-1],
        "risk_context": state["claim_type"],
    }
    resp = requests.post(FRAUD_SERVICE_URL, json=payload, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return {
        "messages": [f"fraud_score={data['score']}"],
        "risk_score": data["score"]
    }

This pattern works well when one team owns underwriting logic and another owns fraud logic. Kubernetes gives you stable service names; LangGraph gives you deterministic orchestration.
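If you only want to involve the fraud service for risky claims, the routing decision can stay a plain function that LangGraph's add_conditional_edges would consume. A minimal sketch (the 0.7 threshold and the "fraud" node name are assumptions, not part of the graph built in step 1):

```python
# Routing predicate: LangGraph calls this after the "assess" node and
# follows the edge whose label it returns.
def route_after_assess(state: dict) -> str:
    # Only send high-risk claims to the fraud specialist.
    return "fraud" if state["risk_score"] > 0.7 else "decide"

# Wiring it in would look like (assuming a "fraud" node has been added):
# graph.add_conditional_edges("assess", route_after_assess,
#                             {"fraud": "fraud", "decide": "decide"})
```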

  5. Add config and secrets from Kubernetes into your graph runtime

Keep model keys and environment-specific settings in Secrets and ConfigMaps. Load them into the pod as environment variables so your graph code stays portable.

import os

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
KUBE_NAMESPACE = os.environ.get("KUBE_NAMESPACE", "insurance-ai")
MODEL_NAME = os.environ.get("MODEL_NAME", "gpt-4o-mini")

llm = ChatOpenAI(model=MODEL_NAME, api_key=OPENAI_API_KEY)

A typical pod spec injects those values like this:

env:
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: llm-secrets
        key: openai_api_key
  - name: MODEL_NAME
    valueFrom:
      configMapKeyRef:
        name: app-config
        key: model_name
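The referenced Secret and ConfigMap can be created imperatively; the values below are placeholders:

```shell
kubectl -n insurance-ai create secret generic llm-secrets \
  --from-literal=openai_api_key=sk-...
kubectl -n insurance-ai create configmap app-config \
  --from-literal=model_name=gpt-4o-mini
```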

Testing the Integration

Run the service locally or inside the cluster, then send a claim request through the API.

import requests

resp = requests.post(
    "http://localhost:8000/run",
    json={"message": "Rear-end collision with moderate damage"}
)

print(resp.status_code)
print(resp.json())

Expected output:

200
{
  "messages": [
    "Rear-end collision with moderate damage",
    "...model output...",
    "...assessment output..."
  ],
  "claim_type": "auto",
  "risk_score": 0.72,
  "decision": "escalate"
}
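If the service is running inside the cluster rather than locally, you can tunnel to it first (assuming the Deployment from step 2 lives in the insurance-ai namespace):

```shell
kubectl -n insurance-ai port-forward deploy/insurance-agent 8000:8000
```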

If you want to validate Kubernetes connectivity from Python directly:

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
namespaces = v1.list_namespace()
print([ns.metadata.name for ns in namespaces.items][:5])

Real-World Use Cases

  • Claims triage pipelines where one agent extracts details from documents, another scores risk, and a third decides whether to escalate to a human adjuster.
  • Underwriting workflows that fan out across policy rules agents, external data lookup agents, and compliance agents deployed as separate Kubernetes services.
  • Fraud detection systems that autoscale investigation agents during spikes in FNOL submissions or suspicious claim bursts.
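The autoscaling pattern in the last bullet can be declared with a HorizontalPodAutoscaler. A sketch against the insurance-agent Deployment from step 2 (the replica bounds and CPU target are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: insurance-agent
  namespace: insurance-ai
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: insurance-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```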

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

