How to Integrate LangGraph for fintech with Kubernetes for production AI
Combining LangGraph for fintech with Kubernetes gives you a clean path from agent logic to production execution. LangGraph handles stateful workflows for things like KYC checks, fraud triage, payment routing, and claims review, while Kubernetes gives you scheduling, isolation, retries, and horizontal scaling for the same workflows under real load.
Prerequisites
- Python 3.10+
- Access to a Kubernetes cluster:
  - kubectl configured
  - Permissions to create Deployments, Services, ConfigMaps, and Secrets
- A working LangGraph project:
  - langgraph
  - langchain-core
  - Your model provider SDK, if your graph calls an LLM
- A container registry for pushing images
- The kubernetes Python client installed: pip install langgraph kubernetes pydantic
- Basic knowledge of:
  - LangGraph: StateGraph, START, END
  - Kubernetes: client.AppsV1Api, client.CoreV1Api
Integration Steps
1. Build the LangGraph workflow for the fintech use case.
Start with a deterministic graph. For fintech, that usually means explicit steps like intake, risk scoring, policy check, and decisioning.
from typing import TypedDict, Literal

from langgraph.graph import StateGraph, START, END

class LoanState(TypedDict):
    customer_id: str
    income: float
    debt_ratio: float
    risk_score: int
    decision: Literal["approve", "review", "reject"]

def assess_risk(state: LoanState) -> LoanState:
    # Toy scoring rule: a debt ratio above 50% is treated as high risk.
    score = 100 if state["debt_ratio"] > 0.5 else 30
    return {**state, "risk_score": score}

def decide(state: LoanState) -> LoanState:
    if state["risk_score"] >= 80:
        decision = "reject"
    elif state["risk_score"] >= 40:
        decision = "review"
    else:
        decision = "approve"
    return {**state, "decision": decision}

graph = StateGraph(LoanState)
graph.add_node("assess_risk", assess_risk)
graph.add_node("decide", decide)
graph.add_edge(START, "assess_risk")
graph.add_edge("assess_risk", "decide")
graph.add_edge("decide", END)

app = graph.compile()
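As a sanity check, the two node functions can be exercised by hand in plain Python before involving the compiled graph; this traces what a single invocation does to the state (the input values here are made up for illustration):

```python
# Run the node logic directly, in the same order the graph wires it.
def assess_risk(state):
    score = 100 if state["debt_ratio"] > 0.5 else 30
    return {**state, "risk_score": score}

def decide(state):
    if state["risk_score"] >= 80:
        decision = "reject"
    elif state["risk_score"] >= 40:
        decision = "review"
    else:
        decision = "approve"
    return {**state, "decision": decision}

state = {
    "customer_id": "cust-0001",  # illustrative input
    "income": 85000.0,
    "debt_ratio": 0.62,          # above 0.5, so the toy rule scores it 100
    "risk_score": 0,
    "decision": "review",
}

result = decide(assess_risk(state))
print(result["risk_score"], result["decision"])  # 100 reject
```

If this hand trace and app.invoke() on the same input disagree, the graph wiring is wrong, not the node logic.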
2. Package the graph behind a Python service that Kubernetes can run.
Kubernetes should not know about graph internals. Expose one API endpoint that accepts a request and runs app.invoke().
from fastapi import FastAPI
from pydantic import BaseModel

api = FastAPI()

class LoanRequest(BaseModel):
    customer_id: str
    income: float
    debt_ratio: float

@api.post("/evaluate")
def evaluate(req: LoanRequest):
    result = app.invoke({
        "customer_id": req.customer_id,
        "income": req.income,
        "debt_ratio": req.debt_ratio,
        "risk_score": 0,
        "decision": "review",
    })
    return result
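To produce the image referenced in the next step, a minimal container build might look like the following. The file layout, requirements file, and the module path agent:api are assumptions for illustration; adjust them to your project.

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Serve the FastAPI instance named "api" in agent.py on the port the
# Deployment below expects (8000).
CMD ["uvicorn", "agent:api", "--host", "0.0.0.0", "--port", "8000"]
```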
3. Create a Kubernetes Deployment from Python using the official client.
This is useful when your CI pipeline or platform service needs to roll out graph workers automatically.
from kubernetes import client, config

config.load_kube_config()
apps_v1 = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "langgraph-fintech-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "langgraph-fintech-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="agent",
                        image="registry.example.com/langgraph-fintech-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                        # CPU requests are required for the CPU-based
                        # autoscaling configured in the next step.
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "250m", "memory": "256Mi"}
                        ),
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="prod-ai", body=deployment)
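create_namespaced_deployment raises a 409 Conflict if the Deployment already exists, which matters when CI re-runs a rollout. One sketch of an idempotent apply (the helper name is an assumption; the client's API exceptions carry the HTTP code in a `status` attribute) injects the API object so it works with the apps_v1 created above:

```python
def apply_deployment(apps_v1, deployment, namespace):
    """Create the Deployment, or replace it if it already exists.

    apps_v1 is expected to behave like kubernetes.client.AppsV1Api.
    """
    try:
        return apps_v1.create_namespaced_deployment(namespace=namespace, body=deployment)
    except Exception as exc:
        # kubernetes.client ApiException exposes the HTTP status code here;
        # anything other than 409 "already exists" is a real failure.
        if getattr(exc, "status", None) != 409:
            raise
        return apps_v1.replace_namespaced_deployment(
            name=deployment.metadata.name, namespace=namespace, body=deployment
        )
```

Call it as apply_deployment(apps_v1, deployment, "prod-ai") in place of the bare create call.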
4. Wire in autoscaling and service discovery.
For production AI systems, you want stable routing and scale based on demand. Use a Service for traffic and HorizontalPodAutoscaler for burst handling.
from kubernetes import client, config

config.load_kube_config()
core_v1 = client.CoreV1Api()
autoscaling_v2 = client.AutoscalingV2Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent"),
    spec=client.V1ServiceSpec(
        selector={"app": "langgraph-fintech-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="prod-ai", body=service)
hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="langgraph-fintech-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="prod-ai", body=hpa)
5. Add runtime configuration through ConfigMaps and Secrets.
Keep model keys, thresholds, and policy values out of code. That lets you update rules without rebuilding the image.
from kubernetes import client, config

config.load_kube_config()
core_v1 = client.CoreV1Api()

config_map = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-config"),
    data={
        "RISK_REJECT_THRESHOLD": "80",
        "RISK_REVIEW_THRESHOLD": "40",
    },
)

secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-secrets"),
    string_data={
        "OPENAI_API_KEY": "replace-me",
    },
)

core_v1.create_namespaced_config_map(namespace="prod-ai", body=config_map)
core_v1.create_namespaced_secret(namespace="prod-ai", body=secret)
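Once the ConfigMap and Secret are exposed to the pod as environment variables (for example via env_from on the container, not shown above), the service can read the thresholds at startup instead of hard-coding them. A minimal sketch, with fallbacks matching the values used earlier in the graph:

```python
import os

# Thresholds injected by the ConfigMap; defaults mirror the hard-coded
# values in the decide node so behavior is unchanged when unset.
REJECT_THRESHOLD = int(os.getenv("RISK_REJECT_THRESHOLD", "80"))
REVIEW_THRESHOLD = int(os.getenv("RISK_REVIEW_THRESHOLD", "40"))

def decide_from_env(risk_score: int) -> str:
    if risk_score >= REJECT_THRESHOLD:
        return "reject"
    if risk_score >= REVIEW_THRESHOLD:
        return "review"
    return "approve"
```

Updating the ConfigMap and restarting the pods then changes decisioning policy without rebuilding the image.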
Testing the Integration
Run the graph locally first, then verify the Kubernetes objects exist and the service responds.
import requests

payload = {
    "customer_id": "cust-1009",
    "income": 120000,
    "debt_ratio": 0.22,
}

response = requests.post("http://localhost:8000/evaluate", json=payload)
print(response.json())
Expected output:
{
    "customer_id": "cust-1009",
    "income": 120000,
    "debt_ratio": 0.22,
    "risk_score": 30,
    "decision": "approve"
}
If you want to validate from inside the cluster:
from kubernetes import client, config
config.load_kube_config()
core_v1 = client.CoreV1Api()
pods = core_v1.list_namespaced_pod(namespace="prod-ai", label_selector="app=langgraph-fintech-agent")
print([p.metadata.name for p in pods.items])
You should see at least one running pod name in the list.
Real-World Use Cases
- •
Fraud triage
- •Route transactions through a LangGraph workflow that scores risk, enriches account context, and escalates suspicious cases.
- •Run each stage as a replicated service on Kubernetes so bursts do not stall decisions.
- •
Loan underwriting
- •Use LangGraph to orchestrate credit checks, policy validation, and exception handling.
- •Use Kubernetes to isolate workloads by tenant or business unit with namespaces and resource limits.
- •
Claims automation
- •Build an agent that extracts claim details, checks policy coverage, flags inconsistencies, and creates a human-review queue.
- •Scale workers independently when claim volume spikes after major events.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.