How to Integrate LangGraph with Kubernetes for Retail Banking AI Agents
Banks do not need another chatbot. They need agent systems that can hold state, route work, and survive infrastructure failures while handling retail banking workflows like balance disputes, card replacement, and loan pre-checks. LangGraph gives you the orchestration layer for multi-step banking agents; Kubernetes gives you the runtime to scale, isolate, and recover those agents under load.
Prerequisites

- Python 3.10+
- A Kubernetes cluster:
  - local: `kind`, `minikube`, or `k3d`
  - cloud: EKS, GKE, or AKS
- `kubectl` configured and pointing at your cluster
- Docker installed for building the agent image
- Access to a LangGraph-compatible Python environment
- An LLM provider key configured as an environment variable
- Basic familiarity with:
  - LangGraph graphs, nodes, edges, and state
  - Kubernetes Deployments, Services, and ConfigMaps

Install the Python packages (`fastapi`, `uvicorn`, and `requests` are also required by the code below):

```bash
pip install langgraph langchain-openai kubernetes pydantic fastapi uvicorn requests
```
Integration Steps
1. Define the retail banking workflow in LangGraph.
For retail banking, keep the graph explicit. A customer request should move through classification, policy checks, action execution, and escalation if needed.
```python
from typing import TypedDict, Literal

from langgraph.graph import StateGraph, START, END


class BankingState(TypedDict):
    customer_id: str
    request_text: str
    intent: str
    risk_level: str
    response: str


def classify_request(state: BankingState) -> BankingState:
    text = state["request_text"].lower()
    if "card" in text:
        intent = "card_service"
        risk = "low"
    elif "loan" in text:
        intent = "loan_precheck"
        risk = "medium"
    else:
        intent = "account_support"
        risk = "low"
    return {**state, "intent": intent, "risk_level": risk}


def route_intent(state: BankingState) -> Literal["card", "loan", "support"]:
    if state["intent"] == "card_service":
        return "card"
    if state["intent"] == "loan_precheck":
        return "loan"
    return "support"


graph = StateGraph(BankingState)
graph.add_node("classify_request", classify_request)
graph.add_node("card", lambda s: {**s, "response": "Card replacement workflow started"})
graph.add_node("loan", lambda s: {**s, "response": "Loan pre-check workflow started"})
graph.add_node("support", lambda s: {**s, "response": "General support workflow started"})

graph.add_edge(START, "classify_request")
graph.add_conditional_edges("classify_request", route_intent)
graph.add_edge("card", END)
graph.add_edge("loan", END)
graph.add_edge("support", END)

banking_app = graph.compile()
```
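Because `classify_request` and `route_intent` are plain functions over a dict, the routing logic can be sanity-checked without a LangGraph installation. A quick check that restates the same keyword rules (this is a standalone mirror of the functions above, not part of the graph):

```python
# Standalone restatement of the classification and routing rules above,
# runnable without LangGraph.
def classify(text: str) -> tuple[str, str]:
    """Return (intent, risk_level) using the same keyword rules as the graph."""
    text = text.lower()
    if "card" in text:
        return "card_service", "low"
    if "loan" in text:
        return "loan_precheck", "medium"
    return "account_support", "low"


# Intent -> node mapping; anything unmatched falls through to "support".
ROUTES = {"card_service": "card", "loan_precheck": "loan"}

for request, expected_node in [
    ("I lost my debit card", "card"),
    ("What loan rates do you offer?", "loan"),
    ("Update my mailing address", "support"),
]:
    intent, risk = classify(request)
    node = ROUTES.get(intent, "support")
    assert node == expected_node, (request, node)

print("routing checks passed")
```

Checks like this catch routing regressions early, before a misrouted request ever reaches a banking API.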
2. Add a tool node that can call internal banking APIs.
LangGraph works best when your nodes call real services. In retail banking that usually means core banking APIs behind an internal gateway.
```python
import os

import requests

BANKING_API_BASE = os.environ["BANKING_API_BASE"]


def fetch_customer_profile(customer_id: str) -> dict:
    r = requests.get(
        f"{BANKING_API_BASE}/customers/{customer_id}",
        headers={"Authorization": f"Bearer {os.environ['BANKING_API_TOKEN']}"},
        timeout=10,
    )
    r.raise_for_status()
    return r.json()


def enrich_with_profile(state: BankingState) -> BankingState:
    profile = fetch_customer_profile(state["customer_id"])
    return {
        **state,
        "response": f"Customer segment={profile['segment']}, status={profile['status']}",
    }
```
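Internal gateways time out and rate-limit under load, and a single transient failure should not abort the whole graph run. A minimal retry sketch with exponential backoff (the attempt count and delays are illustrative assumptions, not bank policy):

```python
import time


def with_retries(fn, attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying on exception with exponential backoff."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # in production, catch requests.RequestException
            last_exc = exc
            if attempt < attempts - 1:
                # 0.5s, 1s, 2s, ... between attempts
                time.sleep(base_delay * (2 ** attempt))
    raise last_exc


# Usage with the profile fetch above (hypothetical wiring):
# profile = with_retries(lambda: fetch_customer_profile(state["customer_id"]))
```

Keeping the retry wrapper outside the node function also keeps the graph definition readable and the retry policy testable on its own.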
If you want this node inside the graph, insert it before routing:
```python
graph = StateGraph(BankingState)
graph.add_node("classify_request", classify_request)
graph.add_node("enrich_with_profile", enrich_with_profile)
graph.add_node("card", lambda s: {**s, "response": s["response"] + "; card flow queued"})
graph.add_node("loan", lambda s: {**s, "response": s["response"] + "; loan flow queued"})
graph.add_node("support", lambda s: {**s, "response": s["response"] + "; support flow queued"})

graph.add_edge(START, "enrich_with_profile")
graph.add_edge("enrich_with_profile", "classify_request")
graph.add_conditional_edges("classify_request", route_intent)
graph.add_edge("card", END)
graph.add_edge("loan", END)
graph.add_edge("support", END)

banking_app = graph.compile()
```
3. Package the LangGraph app for Kubernetes.
The clean pattern is one container per agent service. Expose a small HTTP API so your cluster can autoscale it and other services can call it.
```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class RequestIn(BaseModel):
    customer_id: str
    request_text: str


@app.post("/agent")
def run_agent(payload: RequestIn):
    result = banking_app.invoke({
        "customer_id": payload.customer_id,
        "request_text": payload.request_text,
        "intent": "",
        "risk_level": "",
        "response": "",
    })
    return result
```
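The build step below assumes a Dockerfile next to the code. A minimal sketch (the `main.py` module name, `requirements.txt`, and the uvicorn server are assumptions; adjust to your project layout):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```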
Build and push the image:
```bash
docker build -t registry.example.com/retail-banking-agent:1.0 .
docker push registry.example.com/retail-banking-agent:1.0
```
4. Deploy the agent on Kubernetes.
Use a Deployment for replicas and a Service for stable access. Keep secrets out of the image and inject them through Kubernetes primitives.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: retail-banking-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: retail-banking-agent
  template:
    metadata:
      labels:
        app: retail-banking-agent
    spec:
      containers:
        - name: agent
          image: registry.example.com/retail-banking-agent:1.0
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: banking-secrets
---
apiVersion: v1
kind: Service
metadata:
  name: retail-banking-agent-svc
spec:
  selector:
    app: retail-banking-agent
  ports:
    - port: 80
      targetPort: 8000
```
Apply it from Python using the Kubernetes client:
```python
from kubernetes import client, config, utils

config.load_kube_config()
k8s_client = client.ApiClient()

# Creates every object in the manifest (the Deployment and the Service)
utils.create_from_yaml(k8s_client, "deployment.yaml")

# Confirm the Deployment landed
apps = client.AppsV1Api()
print(apps.read_namespaced_deployment("retail-banking-agent", "default").metadata.name)
```
For production deployments you’ll usually apply YAML with `kubectl apply -f deployment.yaml`. Use Python when you need automation in CI/CD or dynamic environment setup.
5. Connect runtime health checks and scaling signals.
Kubernetes should know when your agent is healthy. Add probes so failed pods get restarted before they start dropping customer requests.
```python
from kubernetes import client

# Programmatic equivalent of a livenessProbe entry in a container spec
probe = client.V1Probe(
    http_get=client.V1HTTPGetAction(path="/healthz", port=8000),
    initial_delay_seconds=10,
    period_seconds=5,
)
print(probe.http_get.path)
```
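Wired into the Deployment manifest, the same probe (plus a readiness probe and an autoscaler for the scaling signal) looks like this. The `/healthz` path assumes the FastAPI app exposes a matching route, and the CPU target and replica bounds are illustrative assumptions:

```yaml
# Add under the agent container in the Deployment spec
livenessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 5
readinessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: retail-banking-agent-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: retail-banking-agent
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note that a CPU-based HPA only works if the container declares CPU resource requests, so add a `resources.requests` block to the Deployment alongside the probes.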
Testing the Integration
Run a real invocation against the graph first:
```python
result = banking_app.invoke({
    "customer_id": "CUST-10021",
    "request_text": "I need to replace my debit card",
    "intent": "",
    "risk_level": "",
    "response": "",
})
print(result["intent"])
print(result["risk_level"])
print(result["response"])
```
Expected output (the segment and status values come from whatever your banking API returns for this customer):

```text
card_service
low
Customer segment=premium, status=active; card flow queued
```
Then verify Kubernetes sees your workload:
```bash
kubectl get pods -l app=retail-banking-agent
kubectl get svc retail-banking-agent-svc
```
You should see at least one running pod and a service exposing port 80.
Real-World Use Cases
- Card servicing agents
  - Handle replacement requests, fraud triage, PIN resets, and delivery status lookups.
  - Route low-risk cases automatically and escalate suspicious ones to human ops.
- Loan pre-screening assistants
  - Collect income details, perform policy checks, and call eligibility APIs.
  - Run as multiple replicas in Kubernetes during peak application hours.
- Dispute resolution workflows
  - Orchestrate document collection, transaction lookup, merchant contact attempts, and case creation.
  - Keep each step auditable through LangGraph state transitions while Kubernetes handles reliability.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.