How to Integrate LangGraph for Wealth Management with Kubernetes for Production AI
Combining LangGraph for wealth management with Kubernetes gives you a clean path from agent logic to production runtime. LangGraph handles the orchestration of multi-step financial workflows like portfolio review, suitability checks, and client follow-ups, while Kubernetes gives you scaling, rollout control, health checks, and isolation for regulated workloads.
For wealth management teams, that means you can run agentic processes that are deterministic enough for compliance, but still flexible enough to handle document intake, policy lookup, and advisor handoff.
Prerequisites
- Python 3.10+
- A Kubernetes cluster
  - Local: kind, minikube, or k3d
  - Production: EKS, GKE, or AKS
- kubectl configured against your cluster
- Docker installed for building container images
- A LangGraph-based app already defined for your wealth management workflow
- Access to an LLM provider key if your graph uses one
- Python packages: langgraph, langchain-core, kubernetes, fastapi, uvicorn
Install the Python dependencies:
pip install langgraph langchain-core kubernetes fastapi uvicorn
Integration Steps
1) Define the wealth management graph
Start by modeling the workflow as a LangGraph state machine. For wealth management, a common pattern is: intake client request, classify intent, fetch account context, generate recommendation draft, then route to human review.
from typing import TypedDict, Annotated

from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage

class WealthState(TypedDict):
    messages: Annotated[list, lambda x, y: x + y]
    intent: str
    recommendation: str

def classify_intent(state: WealthState):
    text = state["messages"][-1].content.lower()
    if "rebalance" in text:
        return {"intent": "portfolio_rebalance"}
    if "withdrawal" in text:
        return {"intent": "cash_withdrawal"}
    return {"intent": "general_advice"}

def draft_recommendation(state: WealthState):
    if state["intent"] == "portfolio_rebalance":
        rec = "Draft rebalance proposal based on risk profile and drift thresholds."
    elif state["intent"] == "cash_withdrawal":
        rec = "Check liquidity impact before approving withdrawal."
    else:
        rec = "Route to advisor for general planning guidance."
    return {"recommendation": rec}

graph = StateGraph(WealthState)
graph.add_node("classify_intent", classify_intent)
graph.add_node("draft_recommendation", draft_recommendation)
graph.add_edge(START, "classify_intent")
graph.add_edge("classify_intent", "draft_recommendation")
graph.add_edge("draft_recommendation", END)
app = graph.compile()
This gives you a deterministic graph you can run locally before packaging it for Kubernetes.
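Because the nodes are plain functions over a dict-shaped state, you can unit-test the routing logic with no framework at all before involving LangGraph or Kubernetes. A minimal sketch, where the hypothetical `FakeMessage` class stands in for langchain_core's `HumanMessage`:

```python
# Exercise the classification logic directly, without LangGraph.
# FakeMessage is a stand-in for HumanMessage (assumption, not a library class).

class FakeMessage:
    def __init__(self, content: str):
        self.content = content

def classify_intent(state: dict) -> dict:
    # Mirrors the classify_intent node defined above.
    text = state["messages"][-1].content.lower()
    if "rebalance" in text:
        return {"intent": "portfolio_rebalance"}
    if "withdrawal" in text:
        return {"intent": "cash_withdrawal"}
    return {"intent": "general_advice"}

state = {"messages": [FakeMessage("Please rebalance my portfolio")]}
print(classify_intent(state))  # {'intent': 'portfolio_rebalance'}
```

Tests like this catch routing regressions long before a rollout does, which matters when the route determines whether a recommendation reaches a human reviewer.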
2) Wrap the graph behind an API service
Kubernetes needs a networked process to manage. The simplest production pattern is FastAPI exposing a /run endpoint that calls app.invoke() on your compiled LangGraph app.
from fastapi import FastAPI
from pydantic import BaseModel
from langchain_core.messages import HumanMessage

api = FastAPI()

class RequestBody(BaseModel):
    message: str

@api.post("/run")
def run_workflow(body: RequestBody):
    result = app.invoke({
        "messages": [HumanMessage(content=body.message)],
        "intent": "",
        "recommendation": ""
    })
    return {
        "intent": result["intent"],
        "recommendation": result["recommendation"]
    }
Run it locally first:
uvicorn main:api --host 0.0.0.0 --port 8080
At this point you have a service boundary that Kubernetes can deploy and scale.
3) Containerize the service
Build a minimal image with your graph code and API server. Keep it small so rollouts are faster and easier to audit.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY main.py .
EXPOSE 8080
CMD ["uvicorn", "main:api", "--host", "0.0.0.0", "--port", "8080"]
Example requirements.txt:
langgraph
langchain-core
fastapi
uvicorn[standard]
Build and push:
docker build -t your-registry/wealth-langgraph:1.0.0 .
docker push your-registry/wealth-langgraph:1.0.0
4) Deploy to Kubernetes with probes and scaling
Use a Deployment for stateless execution and a Service for internal routing. Add readiness and liveness probes so bad pods are removed before they affect advisor workflows.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wealth-langgraph-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: wealth-langgraph-api
  template:
    metadata:
      labels:
        app: wealth-langgraph-api
    spec:
      containers:
      - name: api
        image: your-registry/wealth-langgraph:1.0.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /docs
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /docs
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  name: wealth-langgraph-api
spec:
  selector:
    app: wealth-langgraph-api
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
Apply it:
kubectl apply -f deployment.yaml
If you want autoscaling later, add an HPA based on CPU or custom metrics from queue depth or request latency.
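A CPU-based HPA can be sketched as below. Note that resource-metric autoscaling only works if the container declares CPU requests, which the Deployment above does not yet set; the replica bounds and 70% target here are illustrative, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wealth-langgraph-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wealth-langgraph-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Queue-depth or latency-based scaling requires a custom or external metrics adapter on top of this.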
5) Use the Kubernetes Python client for operational control
For production AI systems, you usually need more than deployment manifests. Use the Kubernetes Python client to inspect pod health or trigger operational workflows from another control service.
from kubernetes import client, config

config.load_incluster_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(
    namespace="default",
    label_selector="app=wealth-langgraph-api"
)
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
You can extend this pattern to watch pod restarts, read logs, or route traffic only after the graph service is healthy.
Testing the Integration
Hit the API locally or through the Kubernetes Service. This verifies both layers: LangGraph execution and container orchestration.
import requests

resp = requests.post(
    "http://localhost:8080/run",
    json={"message": "Please rebalance my portfolio"}
)
print(resp.status_code)
print(resp.json())
Expected output:
200
{'intent': 'portfolio_rebalance', 'recommendation': 'Draft rebalance proposal based on risk profile and drift thresholds.'}
If you want to test inside the cluster, port-forward the service:
kubectl port-forward svc/wealth-langgraph-api 8080:80
Then rerun the same request against localhost:8080.
Real-World Use Cases
- Portfolio servicing agents that classify requests, generate next actions, and escalate exceptions to human advisors.
- Client onboarding workflows that validate submitted documents, check missing fields, and trigger follow-up tasks through internal systems.
- Suitability and compliance assistants that route high-risk recommendations through approval gates before anything reaches a client record.
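The approval-gate pattern in that last use case reduces to a routing decision: anything above a risk threshold is held for human sign-off instead of being written to the client record. A stdlib-only sketch of such a router, where the risk scores and threshold are illustrative stand-ins for a real compliance policy:

```python
# Hypothetical approval gate: route high-risk recommendations to human review.
# Scores and threshold are illustrative assumptions, not policy values.
RISK_SCORES = {
    "portfolio_rebalance": 0.7,
    "cash_withdrawal": 0.9,
    "general_advice": 0.2,
}
APPROVAL_THRESHOLD = 0.5

def route_recommendation(intent: str) -> str:
    # Returns the next node name, mirroring a LangGraph conditional edge.
    # Unknown intents default to maximum risk, so they are never auto-sent.
    if RISK_SCORES.get(intent, 1.0) >= APPROVAL_THRESHOLD:
        return "human_approval"
    return "auto_send"

print(route_recommendation("cash_withdrawal"))  # human_approval
print(route_recommendation("general_advice"))   # auto_send
```

In a LangGraph setup, a function like this would be wired in via a conditional edge so the gate is enforced by the graph itself rather than by caller discipline.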
This stack works because each tool does one job well. LangGraph owns workflow logic. Kubernetes owns runtime reliability. In production wealth management systems, that separation is what keeps agent behavior controllable while still letting you scale under load.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.