How to Integrate LangGraph for insurance with Kubernetes for RAG
Combining LangGraph for insurance with Kubernetes gives you a clean way to run regulated RAG workflows as durable, observable services. In practice, that means claims triage, policy Q&A, and underwriting assistants can retrieve from internal documents while staying deployable, scalable, and isolated inside your cluster.
Prerequisites
- Python 3.10+
- Access to a Kubernetes cluster
- `kubectl` configured for the target cluster
- A container registry for pushing images
- LangGraph installed: `pip install langgraph langchain-openai`
- The Kubernetes Python client installed: `pip install kubernetes`
- An LLM API key set in your environment
- A document store or vector index for RAG data
- A Kubernetes namespace for the agent workload
- RBAC permissions to create `Deployment`, `Service`, `ConfigMap`, and `Secret` resources
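If you still need to grant those RBAC permissions, a minimal namespaced Role can be expressed as a plain manifest. This is an illustrative sketch: the role name, namespace, and verb list are assumptions you should adapt; apply the dict by dumping it to YAML for `kubectl apply -f`, or via the Python client's `RbacAuthorizationV1Api`.

```python
# Minimal RBAC Role manifest as a plain dict (names are illustrative).
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {
        "name": "langgraph-agent-deployer",  # assumed name
        "namespace": "insurance-rag",
    },
    "rules": [
        {
            # Deployments live in the "apps" API group
            "apiGroups": ["apps"],
            "resources": ["deployments"],
            "verbs": ["create", "get", "patch"],
        },
        {
            # Services, ConfigMaps, and Secrets are in the core ("") group
            "apiGroups": [""],
            "resources": ["services", "configmaps", "secrets"],
            "verbs": ["create", "get", "patch"],
        },
    ],
}

print(role["metadata"]["name"])
```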
Integration Steps
1. Build the LangGraph workflow for insurance RAG
Start with a graph that routes insurance questions through retrieval before generating an answer. For production insurance use cases, keep the retrieval step explicit so you can inspect sources and enforce policy controls.
```python
from typing import TypedDict, List

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI


class GraphState(TypedDict):
    question: str
    context: List[str]
    answer: str


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def retrieve_docs(state: GraphState) -> GraphState:
    # Replace with your vector DB lookup
    docs = [
        "Policy A covers fire damage with a 30-day reporting window.",
        "Policy B excludes flood damage unless rider X is active.",
    ]
    return {**state, "context": docs}


def generate_answer(state: GraphState) -> GraphState:
    prompt = f"""
    Answer the insurance question using only the context.

    Question: {state["question"]}

    Context:
    {chr(10).join(state["context"])}
    """
    response = llm.invoke(prompt)
    return {**state, "answer": response.content}


graph = StateGraph(GraphState)
graph.add_node("retrieve", retrieve_docs)
graph.add_node("generate", generate_answer)
graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()
```
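Conceptually, the compiled graph is just `retrieve` feeding `generate`. A dependency-free sketch of that state flow can help when reasoning about what each node contributes; here plain functions stand in for the LangGraph nodes and a canned string stands in for the LLM call:

```python
from typing import List, TypedDict


class GraphState(TypedDict):
    question: str
    context: List[str]
    answer: str


def retrieve_docs(state: GraphState) -> GraphState:
    # Stand-in for the vector DB lookup in the real node
    docs = ["Policy B excludes flood damage unless rider X is active."]
    return {**state, "context": docs}


def generate_answer(state: GraphState) -> GraphState:
    # Stand-in for the LLM call: just echo the retrieved context
    return {**state, "answer": " ".join(state["context"])}


def run_graph(state: GraphState) -> GraphState:
    # retrieve -> generate, in the same edge order the StateGraph defines
    for node in (retrieve_docs, generate_answer):
        state = node(state)
    return state


final = run_graph(
    {"question": "Does Policy B cover flood damage?", "context": [], "answer": ""}
)
print(final["answer"])
```

Each node takes the full state and returns an updated copy, which is why keeping retrieval as its own node makes the sources inspectable before generation.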
2. Package the graph as a service-friendly Python app
Your Kubernetes pod should expose a simple HTTP interface. Keep the graph execution inside the container so each replica can process requests independently.
```python
from fastapi import FastAPI
from pydantic import BaseModel


class QueryRequest(BaseModel):
    question: str


api = FastAPI()


@api.post("/ask")
def ask(req: QueryRequest):
    # `app` is the compiled LangGraph graph from the previous step
    result = app.invoke({"question": req.question, "context": [], "answer": ""})
    return {"answer": result["answer"], "sources": result["context"]}
```
3. Create Kubernetes resources from Python
Use the Kubernetes client to create a deployment and service for the agent. This is where LangGraph becomes a real workload on the cluster instead of a local script.
```python
from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

namespace = "insurance-rag"

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="langgraph-insurance-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(
            match_labels={"app": "langgraph-insurance-agent"}
        ),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(
                labels={"app": "langgraph-insurance-agent"}
            ),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="agent",
                        image="your-registry/langgraph-insurance-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="langgraph-insurance-agent"),
    spec=client.V1ServiceSpec(
        selector={"app": "langgraph-insurance-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

apps_v1.create_namespaced_deployment(namespace=namespace, body=deployment)
core_v1.create_namespaced_service(namespace=namespace, body=service)
```
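One caveat: `create_namespaced_deployment` raises an `ApiException` with status 409 if the object already exists, so re-running the script fails on the second run. A common pattern is to fall back to a patch on conflict. The sketch below uses a stand-in exception class so it runs without a cluster; with the real client, `create_fn` would wrap `apps_v1.create_namespaced_deployment`, `patch_fn` would wrap `apps_v1.patch_namespaced_deployment`, and you would catch `kubernetes.client.exceptions.ApiException` instead:

```python
class ApiError(Exception):
    """Stand-in for kubernetes.client.exceptions.ApiException."""

    def __init__(self, status):
        self.status = status


def create_or_patch(create_fn, patch_fn):
    # Try to create; on a 409 Conflict, patch the existing object instead.
    try:
        create_fn()
        return "created"
    except ApiError as e:
        if e.status != 409:
            raise
        patch_fn()
        return "patched"


def conflict():
    # Simulates the second run, where the object already exists
    raise ApiError(409)


print(create_or_patch(lambda: None, lambda: None))  # first run: created
print(create_or_patch(conflict, lambda: None))      # second run: patched
```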
4. Wire secrets and config into the pod
Keep model keys and retrieval settings out of code. Store them in Kubernetes Secrets and ConfigMaps, then mount them as environment variables.
```python
secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="llm-secrets"),
    string_data={"OPENAI_API_KEY": "replace-me"},
    type="Opaque",
)

config_map = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="rag-config"),
    data={
        "VECTOR_INDEX_NAME": "insurance-policies",
        "TOP_K": "4",
    },
)

core_v1.create_namespaced_secret(namespace=namespace, body=secret)
core_v1.create_namespaced_config_map(namespace=namespace, body=config_map)
```
5. Update the deployment to consume those values
Add environment references so every replica gets the same runtime configuration. This keeps your LangGraph behavior stable across pods.
```python
container_env = [
    client.V1EnvVar(
        name="OPENAI_API_KEY",
        value_from=client.V1EnvVarSource(
            secret_key_ref=client.V1SecretKeySelector(
                name="llm-secrets", key="OPENAI_API_KEY"
            )
        ),
    ),
    client.V1EnvVar(
        name="VECTOR_INDEX_NAME",
        value_from=client.V1EnvVarSource(
            config_map_key_ref=client.V1ConfigMapKeySelector(
                name="rag-config", key="VECTOR_INDEX_NAME"
            )
        ),
    ),
]

# Pass this list as env=container_env on the V1Container in the deployment spec.
```
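Inside the container, the app then reads these values from the environment. A small settings loader with explicit defaults keeps the RAG parameters in one place; the key names match the ConfigMap above, while the defaults (used for local development outside the cluster) are assumptions you can change:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagSettings:
    vector_index_name: str
    top_k: int


def load_settings() -> RagSettings:
    # Keys match the ConfigMap created earlier; defaults are fallbacks
    # for running outside the cluster.
    return RagSettings(
        vector_index_name=os.getenv("VECTOR_INDEX_NAME", "insurance-policies"),
        top_k=int(os.getenv("TOP_K", "4")),
    )


settings = load_settings()
print(settings.vector_index_name, settings.top_k)
```

Loading once at startup also means every replica behaves identically for a given ConfigMap revision.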
Testing the Integration
Run a request against the service after it is deployed. If you are testing locally through port-forwarding or an ingress route, hit /ask with a real insurance query.
```python
import requests

response = requests.post(
    "http://localhost:8000/ask",
    json={"question": "Does Policy B cover flood damage?"},
)

print(response.status_code)
print(response.json())
```
Expected output:
```
200
{
  "answer": "...Policy B excludes flood damage unless rider X is active...",
  "sources": [
    "Policy A covers fire damage with a 30-day reporting window.",
    "Policy B excludes flood damage unless rider X is active."
  ]
}
```
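Right after deployment, the pods may not be ready yet, so the first request can fail. A small dependency-free backoff helper smooths this over; the probe argument is whatever readiness check you like, for example a function wrapping the `requests.post` call above (the simulated probe here just succeeds on its third call):

```python
import time


def wait_until_ready(probe, attempts=5, base_delay=0.1):
    # Call `probe` until it returns True, backing off exponentially.
    for attempt in range(attempts):
        if probe():
            return True
        time.sleep(base_delay * (2 ** attempt))
    return False


# Simulated probe that succeeds on the third call, like a pod that
# becomes ready shortly after the Deployment is created.
calls = {"n": 0}


def fake_probe():
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until_ready(fake_probe)
print(ready)
```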
Real-World Use Cases
- Claims intake assistant: route claimant questions through retrieval over policy documents, then run on Kubernetes so multiple adjusters can query it concurrently.
- Underwriting policy checker: use LangGraph to inspect applicant answers against underwriting rules and deploy it as a namespaced service with strict resource limits.
- Broker support bot: serve broker-facing RAG over coverage guides and endorsements inside your cluster so data access stays inside your network boundary.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit