How to Integrate LangGraph for lending with Kubernetes for production AI
Combining LangGraph for lending with Kubernetes gives you a clean path from agent logic to production runtime. LangGraph handles the lending workflow state machine; Kubernetes handles deployment, scaling, and isolation for the services that execute those workflows.
This is the setup you want when your lending agent needs to route applications, pull bureau data, trigger policy checks, and survive real traffic without turning into a single-node science project.
Prerequisites
- Python 3.10+
- Access to a Kubernetes cluster (Minikube, kind, EKS, GKE, or AKS)
- `kubectl` configured against your cluster
- A container registry you can push to
- LangGraph installed: `pip install langgraph langchain`
- The Kubernetes Python client installed: `pip install kubernetes`
- A lending workflow design ready: intake, document verification, risk scoring, decisioning
- A namespace in Kubernetes for the agent services
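The examples below assume a namespace named `ai-lending`. A minimal manifest for it, applied with `kubectl apply -f namespace.yaml`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ai-lending
```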
Integration Steps
1. Build the lending graph in LangGraph
Start with a simple state model for loan applications. In production, this state usually includes applicant data, document status, score outputs, and final decision.
from typing import TypedDict

from langgraph.graph import StateGraph, END

class LendingState(TypedDict):
    applicant_id: str
    income_verified: bool
    credit_score: int
    decision: str

def verify_income(state: LendingState):
    # Replace with a document/income verification call in production
    return {"income_verified": True}

def score_credit(state: LendingState):
    # Replace with a bureau/API call in production
    return {"credit_score": 742}

def decide_loan(state: LendingState):
    if state["income_verified"] and state["credit_score"] >= 700:
        return {"decision": "approved"}
    return {"decision": "declined"}

graph = StateGraph(LendingState)
graph.add_node("verify_income", verify_income)
graph.add_node("score_credit", score_credit)
graph.add_node("decide_loan", decide_loan)
graph.set_entry_point("verify_income")
graph.add_edge("verify_income", "score_credit")
graph.add_edge("score_credit", "decide_loan")
graph.add_edge("decide_loan", END)

app = graph.compile()
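Each node returns only the keys it changes, and LangGraph merges that partial update into the shared state before the next node runs. For intuition only, here is a stdlib-only simulation of how the linear graph above threads state through its nodes; it is not the LangGraph runtime, just the merge semantics:

```python
# Toy simulation of a linear LangGraph run: each node returns a partial
# state update, which is merged into the shared state dict.

def verify_income(state: dict) -> dict:
    return {"income_verified": True}

def score_credit(state: dict) -> dict:
    return {"credit_score": 742}

def decide_loan(state: dict) -> dict:
    if state["income_verified"] and state["credit_score"] >= 700:
        return {"decision": "approved"}
    return {"decision": "declined"}

def run_linear_graph(initial_state: dict, nodes: list) -> dict:
    state = dict(initial_state)
    for node in nodes:
        state.update(node(state))  # merge the partial update into shared state
    return state

result = run_linear_graph(
    {"applicant_id": "A12345", "income_verified": False,
     "credit_score": 0, "decision": ""},
    [verify_income, score_credit, decide_loan],
)
print(result["decision"])  # approved
```

This is also why node functions in the real graph can stay small and independently testable: each one sees the full state but owns only its own keys.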
2. Wrap the graph in an API service
Kubernetes should run a service that exposes your graph execution over HTTP. FastAPI is a common choice because it is easy to containerize and probe.
from fastapi import FastAPI
from pydantic import BaseModel

class LoanRequest(BaseModel):
    applicant_id: str

api = FastAPI()

@api.post("/evaluate")
def evaluate_loan(req: LoanRequest):
    # `app` is the compiled LangGraph from the previous step
    result = app.invoke(
        {
            "applicant_id": req.applicant_id,
            "income_verified": False,
            "credit_score": 0,
            "decision": "",
        }
    )
    return result
3. Containerize the service for Kubernetes
The graph runs inside your container. Keep the image small and deterministic so rollout behavior is predictable.
# app.py
from fastapi import FastAPI
from pydantic import BaseModel

# assume the graph code from step 1 is imported here as `app`

service = FastAPI()

class LoanRequest(BaseModel):
    applicant_id: str

@service.post("/evaluate")
def evaluate_loan(req: LoanRequest):
    return app.invoke({
        "applicant_id": req.applicant_id,
        "income_verified": False,
        "credit_score": 0,
        "decision": "",
    })
A minimal Dockerfile:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app:service", "--host", "0.0.0.0", "--port", "8000"]
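Build and push the image so the cluster can pull it. `registry.example.com` is a placeholder for your own registry; the tag matches the image referenced in the Deployment below:

```shell
# Build, tag, and push the service image to your registry
docker build -t registry.example.com/lending-agent:1.0.0 .
docker push registry.example.com/lending-agent:1.0.0
```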
4. Deploy to Kubernetes using the Python client or manifests
If you want programmatic deployment from CI/CD or an operator workflow, use the Kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="lending-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "lending-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "lending-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="lending-agent",
                        image="registry.example.com/lending-agent:1.0.0",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="ai-lending", body=deployment)
Create a service so other systems can call it:
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="lending-agent"),
    spec=client.V1ServiceSpec(
        selector={"app": "lending-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="ai-lending", body=service)
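If you prefer declarative manifests over the Python client, the equivalent Deployment and Service look like this; apply with `kubectl apply -n ai-lending -f lending-agent.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lending-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lending-agent
  template:
    metadata:
      labels:
        app: lending-agent
    spec:
      containers:
        - name: lending-agent
          image: registry.example.com/lending-agent:1.0.0
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: lending-agent
spec:
  type: ClusterIP
  selector:
    app: lending-agent
  ports:
    - port: 80
      targetPort: 8000
```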
5. Add autoscaling and operational controls
For lending workloads, traffic spikes happen during campaigns or batch underwriting windows. Use HPA so your graph service scales on CPU or custom metrics.
from kubernetes import client, config

config.load_kube_config()  # skip if your kube config is already loaded

autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="lending-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="lending-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(
    namespace="ai-lending",
    body=hpa,
)
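For intuition on how this HPA reacts to load: the core scaling rule documented for Kubernetes is roughly desiredReplicas = ceil(currentReplicas x currentMetric / targetMetric), clamped to the min/max bounds (and it requires a metrics source such as metrics-server in the cluster). A stdlib sketch of that rule, using the numbers from the HPA above:

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Approximate the HPA rule: scale proportionally to the ratio of
    observed to target utilization, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

# With the HPA above (target 70% CPU, 2-10 replicas):
print(desired_replicas(2, 140, 70, 2, 10))  # 4 (utilization is double the target)
print(desired_replicas(2, 35, 70, 2, 10))   # 2 (clamped to min_replicas)
```

The real controller adds tolerances, stabilization windows, and per-pod readiness handling, but this ratio is the mental model to size `min_replicas` and `max_replicas` against your expected campaign traffic.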
Testing the Integration
Run a quick smoke test against the deployed service. Note that the in-cluster DNS name below only resolves from inside the cluster, so run this from a pod or behind a port-forward.
import requests

resp = requests.post(
    "http://lending-agent.ai-lending.svc.cluster.local/evaluate",
    json={"applicant_id": "A12345"},
    timeout=10,
)
print(resp.status_code)
print(resp.json())
Expected output:
200
{'applicant_id': 'A12345', 'income_verified': True, 'credit_score': 742, 'decision': 'approved'}
If that returns correctly, you know three things are working:
- LangGraph is executing the lending workflow
- The API wrapper is exposing it correctly
- Kubernetes is running and routing traffic to the pods
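When testing from your workstation instead of inside the cluster, a port-forward is the quickest path; the names here assume the `ai-lending` namespace and Service created earlier:

```shell
# Forward local port 8080 to the ClusterIP service, then call it locally
kubectl port-forward svc/lending-agent 8080:80 -n ai-lending

# In another terminal:
curl -s -X POST http://localhost:8080/evaluate \
  -H "Content-Type: application/json" \
  -d '{"applicant_id": "A12345"}'
```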
Real-World Use Cases
- Loan origination triage: route applications through verification, scoring, and policy checks before sending them to underwriters.
- Document-heavy underwriting agents: run OCR extraction, fraud checks, and exception handling as separate graph nodes behind Kubernetes services.
- Batch portfolio review: scale out nightly re-evaluation of existing loans when rates change or risk models are updated.
The pattern here is straightforward: keep business logic in LangGraph and keep runtime concerns in Kubernetes. That separation makes lending agents easier to test, easier to scale, and much less painful to operate under production load.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit