How to Integrate LangGraph for wealth management with Kubernetes for production AI

By Cyprian Aarons | Updated 2026-04-21
Tags: langgraph-for-wealth-management, kubernetes, production-ai

Combining LangGraph for wealth management with Kubernetes gives you a clean path from agent logic to production runtime. LangGraph handles the orchestration of multi-step financial workflows like portfolio review, suitability checks, and client follow-ups, while Kubernetes gives you scaling, rollout control, health checks, and isolation for regulated workloads.

For wealth management teams, that means you can run agentic processes that are deterministic enough for compliance, but still flexible enough to handle document intake, policy lookup, and advisor handoff.

Prerequisites

  • Python 3.10+
  • A Kubernetes cluster
    • Local: kind, minikube, or k3d
    • Production: EKS, GKE, or AKS
  • kubectl configured against your cluster
  • Docker installed for building container images
  • A LangGraph-based app already defined for your wealth management workflow
  • Access to an LLM provider key if your graph uses one
  • Python packages:
    • langgraph
    • langchain-core
    • kubernetes
    • fastapi
    • uvicorn

Install the Python dependencies:

pip install langgraph langchain-core kubernetes fastapi uvicorn

Integration Steps

1) Define the wealth management graph

Start by modeling the workflow as a LangGraph state machine. For wealth management, a common pattern is: intake client request, classify intent, fetch account context, generate recommendation draft, then route to human review.

from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage

class WealthState(TypedDict):
    messages: Annotated[list, lambda x, y: x + y]
    intent: str
    recommendation: str

def classify_intent(state: WealthState):
    text = state["messages"][-1].content.lower()
    if "rebalance" in text:
        return {"intent": "portfolio_rebalance"}
    if "withdrawal" in text:
        return {"intent": "cash_withdrawal"}
    return {"intent": "general_advice"}

def draft_recommendation(state: WealthState):
    if state["intent"] == "portfolio_rebalance":
        rec = "Draft rebalance proposal based on risk profile and drift thresholds."
    elif state["intent"] == "cash_withdrawal":
        rec = "Check liquidity impact before approving withdrawal."
    else:
        rec = "Route to advisor for general planning guidance."
    return {"recommendation": rec}

graph = StateGraph(WealthState)
graph.add_node("classify_intent", classify_intent)
graph.add_node("draft_recommendation", draft_recommendation)

graph.add_edge(START, "classify_intent")
graph.add_edge("classify_intent", "draft_recommendation")
graph.add_edge("draft_recommendation", END)

app = graph.compile()

This gives you a deterministic graph you can run locally before packaging it for Kubernetes.
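Before wrapping it in an API, you can sanity-check the graph directly. A minimal smoke test, assuming the graph code above lives in main.py:

from langchain_core.messages import HumanMessage

result = app.invoke({
    "messages": [HumanMessage(content="Please rebalance my portfolio")],
    "intent": "",
    "recommendation": "",
})
print(result["intent"])          # portfolio_rebalance
print(result["recommendation"])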

2) Wrap the graph behind an API service

Kubernetes needs a networked process to manage. The simplest production pattern is FastAPI exposing a /run endpoint that calls app.invoke() on your compiled LangGraph app.

from fastapi import FastAPI
from pydantic import BaseModel
from langchain_core.messages import HumanMessage

api = FastAPI()

class RequestBody(BaseModel):
    message: str

@api.post("/run")
def run_workflow(body: RequestBody):
    result = app.invoke({
        "messages": [HumanMessage(content=body.message)],
        "intent": "",
        "recommendation": ""
    })
    return {
        "intent": result["intent"],
        "recommendation": result["recommendation"]
    }

Run it locally first:

uvicorn main:api --host 0.0.0.0 --port 8080

At this point you have a service boundary that Kubernetes can deploy and scale.
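One small addition worth making before you containerize: a dedicated health endpoint, so the Kubernetes probes in step 4 don't have to piggyback on FastAPI's /docs page. A minimal sketch:

@api.get("/health")
def health():
    # Keep this cheap: probes run frequently, so avoid LLM or network calls here.
    return {"status": "ok"}

If you add this, point the readiness and liveness probes in step 4 at /health instead of /docs.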

3) Containerize the service

Build a minimal image with your graph code and API server. Keep it small so rollouts are faster and easier to audit.

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .

EXPOSE 8080

CMD ["uvicorn", "main:api", "--host", "0.0.0.0", "--port", "8080"]

Example requirements.txt:

langgraph
langchain-core
fastapi
uvicorn[standard]

Build and push:

docker build -t your-registry/wealth-langgraph:1.0.0 .
docker push your-registry/wealth-langgraph:1.0.0
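If you're iterating against a local kind cluster, you can skip the registry push and load the image straight onto the cluster nodes instead (assuming the default cluster name):

kind load docker-image your-registry/wealth-langgraph:1.0.0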

4) Deploy to Kubernetes with probes and scaling

Use a Deployment for stateless execution and a Service for internal routing. Add readiness and liveness probes so bad pods are removed before they affect advisor workflows.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wealth-langgraph-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: wealth-langgraph-api
  template:
    metadata:
      labels:
        app: wealth-langgraph-api
    spec:
      containers:
      - name: api
        image: your-registry/wealth-langgraph:1.0.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /docs
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /docs
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  name: wealth-langgraph-api
spec:
  selector:
    app: wealth-langgraph-api
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP

Apply it:

kubectl apply -f deployment.yaml 
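Then confirm the rollout before sending traffic:

kubectl get pods -l app=wealth-langgraph-api
kubectl get svc wealth-langgraph-api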

If you want autoscaling later, add an HPA based on CPU or custom metrics from queue depth or request latency.
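As a starting point, here is a minimal CPU-based HPA sketch. It assumes metrics-server is installed in the cluster and that the Deployment sets CPU requests, which the utilization target is computed against:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wealth-langgraph-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wealth-langgraph-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70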

5) Use the Kubernetes Python client for operational control

For production AI systems, you usually need more than deployment manifests. Use the Kubernetes Python client to inspect pod health or trigger operational workflows from another control service.

from kubernetes import client, config

# Inside a pod, load_incluster_config() picks up the mounted service account;
# use config.load_kube_config() instead when running outside the cluster.
config.load_incluster_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(
    namespace="default",
    label_selector="app=wealth-langgraph-api",
)

for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)

You can extend this pattern to watch pod restarts, read logs, or route traffic only after the graph service is healthy.
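For example, a sketch of the log-reading case, reusing the client and pod list from above (the namespace and tail_lines values are illustrative):

# Pull recent logs from each pod of the graph service.
for pod in pods.items:
    logs = v1.read_namespaced_pod_log(
        name=pod.metadata.name,
        namespace="default",
        tail_lines=50,
    )
    print(f"--- logs: {pod.metadata.name} ---")
    print(logs)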

Testing the Integration

Hit the API locally or through the Kubernetes Service. This verifies both layers: LangGraph execution and container orchestration.

import requests

resp = requests.post(
    "http://localhost:8080/run",
    json={"message": "Please rebalance my portfolio"},
)

print(resp.status_code)
print(resp.json())

Expected output:

200
{'intent': 'portfolio_rebalance', 'recommendation': 'Draft rebalance proposal based on risk profile and drift thresholds.'}

If you want to test inside the cluster, port-forward the Service:

kubectl port-forward svc/wealth-langgraph-api 8080:80

Then rerun the same request against localhost:8080.

Real-World Use Cases

  • Portfolio servicing agents that classify requests, generate next actions, and escalate exceptions to human advisors.
  • Client onboarding workflows that validate submitted documents, check missing fields, and trigger follow-up tasks through internal systems.
  • Suitability and compliance assistants that route high-risk recommendations through approval gates before anything reaches a client record (see the sketch after this list).
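That last pattern maps directly onto LangGraph's conditional edges. A minimal sketch, reusing the nodes from step 1; the routing rule and the human_review node are hypothetical stand-ins for real suitability logic:

def route_recommendation(state: WealthState) -> str:
    # Hypothetical gate: withdrawals always go to a human before execution.
    if state["intent"] == "cash_withdrawal":
        return "human_review"
    return "auto_approve"

def human_review(state: WealthState):
    # Placeholder: in production this would open a review task, not mutate text.
    return {"recommendation": "PENDING ADVISOR APPROVAL: " + state["recommendation"]}

gated = StateGraph(WealthState)
gated.add_node("classify_intent", classify_intent)
gated.add_node("draft_recommendation", draft_recommendation)
gated.add_node("human_review", human_review)

gated.add_edge(START, "classify_intent")
gated.add_edge("classify_intent", "draft_recommendation")
gated.add_conditional_edges(
    "draft_recommendation",
    route_recommendation,
    {"human_review": "human_review", "auto_approve": END},
)
gated.add_edge("human_review", END)

gated_app = gated.compile()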

This stack works because each tool does one job well. LangGraph owns workflow logic. Kubernetes owns runtime reliability. In production wealth management systems, that separation is what keeps agent behavior controllable while still letting you scale under load.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
