How to Integrate LangGraph with Kubernetes for Multi-Agent Lending Systems
Combining LangGraph for lending with Kubernetes gives you a clean way to run loan-origination agents as isolated, scalable services. LangGraph handles the decision flow across underwriting, document checks, and exception routing; Kubernetes gives you deployment, autoscaling, and fault isolation for multi-agent workloads.
Prerequisites
- Python 3.10+
- A running Kubernetes cluster (Minikube, kind, EKS, GKE, or AKS)
- `kubectl` configured against the cluster
- Docker installed for building agent images
- A LangGraph project set up for lending workflows
- Access to your model provider credentials via environment variables
- Python packages: `langgraph`, `langchain-core`, `kubernetes`, `pyyaml`

Install the Python dependencies:

```shell
pip install langgraph langchain-core kubernetes pyyaml
```
Integration Steps
1. Define the lending workflow in LangGraph.

A lending system usually needs multiple agents: intake, risk scoring, fraud checks, and final decisioning. LangGraph is a good fit because it lets you wire those nodes into a deterministic state machine instead of a pile of ad hoc prompts.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class LendingState(TypedDict):
    applicant_name: str
    income: float
    debt: float
    fraud_flag: bool
    decision: str

def intake_agent(state: LendingState) -> LendingState:
    # Placeholder for validating and normalizing application data.
    return state

def risk_agent(state: LendingState) -> LendingState:
    # Simple debt-to-income check: above 40% goes to manual review.
    dti = state["debt"] / state["income"]
    state["decision"] = "review" if dti > 0.4 else "approve"
    return state

def fraud_agent(state: LendingState) -> LendingState:
    # A fraud flag overrides any earlier decision.
    if state["fraud_flag"]:
        state["decision"] = "reject"
    return state

graph = StateGraph(LendingState)
graph.add_node("intake", intake_agent)
graph.add_node("risk", risk_agent)
graph.add_node("fraud", fraud_agent)
graph.add_edge(START, "intake")
graph.add_edge("intake", "risk")
graph.add_edge("risk", "fraud")
graph.add_edge("fraud", END)
app = graph.compile()
```
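Because the edges above are strictly sequential, the compiled graph behaves like plain function composition over the state dict. A framework-free sketch of the same flow (with stand-in functions mirroring the node logic above) is an easy way to sanity-check the decision rules before deploying anything:

```python
from functools import reduce

# Stand-ins for the LangGraph nodes: each takes and returns the state dict.
def intake(state):
    return state

def risk(state):
    dti = state["debt"] / state["income"]
    return {**state, "decision": "review" if dti > 0.4 else "approve"}

def fraud(state):
    return {**state, "decision": "reject"} if state["fraud_flag"] else state

def run_pipeline(state, nodes=(intake, risk, fraud)):
    # Equivalent to walking START -> intake -> risk -> fraud -> END.
    return reduce(lambda s, node: node(s), nodes, state)

result = run_pipeline({
    "applicant_name": "Jane Doe",
    "income": 120000.0,
    "debt": 30000.0,
    "fraud_flag": False,
})
print(result["decision"])  # dti = 0.25, no fraud flag -> "approve"
```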
2. Package the graph as an API service that Kubernetes can run.

Kubernetes should not run your graph directly inside a notebook or local process. Wrap it in a small HTTP service so each pod stays stateless and requests can be load-balanced across replicas.

```python
from fastapi import FastAPI
from pydantic import BaseModel

api = FastAPI()

class LendingRequest(BaseModel):
    applicant_name: str
    income: float
    debt: float
    fraud_flag: bool = False

@api.post("/evaluate")
def evaluate(req: LendingRequest):
    # `app` is the compiled LangGraph from the previous step.
    result = app.invoke(req.model_dump())
    return result
```

Save that as main.py, then build a container image with your dependencies and service entrypoint. In production, this is the unit Kubernetes schedules and scales.
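One minimal Dockerfile sketch for that image, assuming main.py sits at the repo root alongside a requirements.txt listing the packages from the prerequisites, and that uvicorn serves the FastAPI instance (named `api` in main.py) on port 8000:

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt fastapi uvicorn
COPY main.py .
EXPOSE 8000
CMD ["uvicorn", "main:api", "--host", "0.0.0.0", "--port", "8000"]
```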
3. Deploy the service to Kubernetes.

Use a Deployment for replicas and a Service for stable networking between agents or upstream systems. If you have separate underwriting and fraud services, Kubernetes lets each one scale independently.

```python
from kubernetes import client, config

config.load_kube_config()
apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="lending-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "lending-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "lending-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="lending-agent",
                        image="your-registry/lending-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)
apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
```

Expose it with a ClusterIP Service so other agents can call it inside the cluster:

```python
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="lending-agent-svc"),
    spec=client.V1ServiceSpec(
        selector={"app": "lending-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)
core_v1.create_namespaced_service(namespace="default", body=service)
```
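If you prefer declarative manifests over the Python client, the same Deployment and Service look like this in YAML (a sketch matching the objects above; the image name and default namespace are the same placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lending-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lending-agent
  template:
    metadata:
      labels:
        app: lending-agent
    spec:
      containers:
        - name: lending-agent
          image: your-registry/lending-agent:latest
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: lending-agent-svc
spec:
  type: ClusterIP
  selector:
    app: lending-agent
  ports:
    - port: 80
      targetPort: 8000
```

Apply it with `kubectl apply -f lending-agent.yaml`.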
4. Connect multiple agents through Kubernetes-hosted services.

In multi-agent systems, one agent often calls another over HTTP. For lending, that usually means an orchestration agent sends applications to underwriting and fraud services deployed in separate pods.

```python
import requests

payload = {
    "applicant_name": "Jane Doe",
    "income": 120000,
    "debt": 30000,
    "fraud_flag": False,
}

resp = requests.post(
    "http://lending-agent-svc.default.svc.cluster.local/evaluate",
    json=payload,
    timeout=10,
)
print(resp.json())
```

This pattern keeps your LangGraph logic focused on workflow while Kubernetes handles service discovery and scaling. If fraud checks spike during business hours, you can scale that deployment separately without touching underwriting.
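Inter-agent calls should also tolerate transient failures such as pod restarts during a rollout. A minimal retry-with-backoff sketch (a hypothetical helper, not part of LangGraph or the Kubernetes client; the `call` parameter is any zero-argument function that performs the HTTP request, shown here with a flaky stand-in):

```python
import time

def call_with_retry(call, attempts=3, base_delay=0.1):
    # Invoke `call`, retrying with exponential backoff; re-raise after the last attempt.
    last_exc = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:  # in real code, catch requests.RequestException
            last_exc = exc
            time.sleep(base_delay * (2 ** attempt))
    raise last_exc

# Flaky stand-in for an inter-agent HTTP call: fails twice, then succeeds.
state = {"count": 0}

def flaky_fraud_call():
    state["count"] += 1
    if state["count"] < 3:
        raise ConnectionError("fraud-service pod restarting")
    return {"decision": "approve"}

result = call_with_retry(flaky_fraud_call)
print(result)  # {'decision': 'approve'} on the third attempt
```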
5. Add autoscaling and rollout controls.

Loan workloads are bursty. A pre-approval campaign or a batch re-evaluation job can spike traffic fast, so use a Horizontal Pod Autoscaler to keep latency predictable.

```python
from kubernetes import client, config

config.load_kube_config()
autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="lending-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="lending-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)
autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
```
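The HPA's scaling decision follows the ratio rule documented by Kubernetes: desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue), clamped to the min/max bounds. A quick sketch of that arithmetic for this deployment's 70% CPU target (the function name is illustrative, not a client API):

```python
import math

def desired_replicas(current_replicas, current_cpu_utilization,
                     target_utilization=70, min_replicas=2, max_replicas=10):
    # Core HPA formula: ceil(current * currentMetric / targetMetric),
    # clamped to the bounds set in the spec above.
    desired = math.ceil(current_replicas * current_cpu_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(2, 140))  # ceil(2 * 140/70) = 4 pods during a spike
print(desired_replicas(4, 20))   # scales back down, floored at min_replicas = 2
```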
Testing the Integration

Run a simple end-to-end check against the service after deployment. The example below assumes you have forwarded a local port to the service, for example with `kubectl port-forward svc/lending-agent-svc 8000:80`:

```python
import requests

test_case = {
    "applicant_name": "Alex Smith",
    "income": 100000,
    "debt": 20000,
    "fraud_flag": False,
}

r = requests.post("http://localhost:8000/evaluate", json=test_case, timeout=10)
print(r.status_code)
print(r.json())
```

Expected output:

```
200
{'applicant_name': 'Alex Smith', 'income': 100000.0, 'debt': 20000.0, 'fraud_flag': False, 'decision': 'approve'}
```
If you want to verify the Kubernetes wiring directly from Python, list the pods before testing traffic:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
pods = v1.list_namespaced_pod(namespace="default", label_selector="app=lending-agent")
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
```
Real-World Use Cases
- Automated loan prequalification, where one agent gathers application data, another scores risk, and a third checks policy exceptions.
- Fraud triage pipelines, where suspicious applications are routed to specialized review agents without blocking normal approvals.
- Document-processing workflows, where OCR, KYC verification, income validation, and decisioning run as separate scalable services under one graph.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.