How to Integrate LangGraph for lending with Kubernetes for multi-agent systems

By Cyprian Aarons · Updated 2026-04-21
langgraph-for-lending · kubernetes · multi-agent-systems

Combining LangGraph for lending with Kubernetes gives you a clean way to run loan-origination agents as isolated, scalable services. LangGraph handles the decision flow across underwriting, document checks, and exception routing; Kubernetes gives you deployment, autoscaling, and fault isolation for multi-agent workloads.

Prerequisites

  • Python 3.10+
  • A running Kubernetes cluster
    • Minikube, kind, EKS, GKE, or AKS
  • kubectl configured against the cluster
  • Docker installed for building agent images
  • A LangGraph project set up for lending workflows
  • Access to your model provider credentials via environment variables
  • Python packages:
    • langgraph
    • langchain-core
    • kubernetes
    • pyyaml

Install the Python dependencies:

pip install langgraph langchain-core kubernetes pyyaml

Integration Steps

  1. Define the lending workflow in LangGraph.

A lending system usually needs multiple agents: intake, risk scoring, fraud checks, and final decisioning. LangGraph is a good fit because it lets you wire those nodes into a deterministic state machine instead of a pile of ad hoc prompts.

from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END

class LendingState(TypedDict):
    applicant_name: str
    income: float
    debt: float
    fraud_flag: bool
    decision: str

def intake_agent(state: LendingState) -> LendingState:
    # Placeholder for normalizing and enriching the raw application
    return state

def risk_agent(state: LendingState) -> LendingState:
    # Simple debt-to-income rule: above 40% goes to manual review
    dti = state["debt"] / state["income"]
    state["decision"] = "review" if dti > 0.4 else "approve"
    return state

def fraud_agent(state: LendingState) -> LendingState:
    # A fraud flag overrides whatever the risk agent decided
    if state["fraud_flag"]:
        state["decision"] = "reject"
    return state

graph = StateGraph(LendingState)
graph.add_node("intake", intake_agent)
graph.add_node("risk", risk_agent)
graph.add_node("fraud", fraud_agent)

graph.add_edge(START, "intake")
graph.add_edge("intake", "risk")
graph.add_edge("risk", "fraud")
graph.add_edge("fraud", END)

app = graph.compile()
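
Before containerizing anything, you can sanity-check the compiled graph by invoking it directly with a sample application (the applicant values here are illustrative only):

sample = {
    "applicant_name": "Jane Doe",
    "income": 120000.0,
    "debt": 30000.0,
    "fraud_flag": False,
}

# DTI is 0.25 and no fraud flag, so the graph should return "approve"
result = app.invoke(sample)
print(result["decision"])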

  2. Package the graph as an API service that Kubernetes can run.

Kubernetes should not run your graph directly inside a notebook or local process. Wrap it in a small HTTP service so each pod stays stateless and any replica can serve any request.

from fastapi import FastAPI
from pydantic import BaseModel

api = FastAPI()

class LendingRequest(BaseModel):
    applicant_name: str
    income: float
    debt: float
    fraud_flag: bool = False

@api.post("/evaluate")
def evaluate(req: LendingRequest):
    result = app.invoke(req.model_dump())
    return result

Save that as main.py, then build a container image with your dependencies and service entrypoint. In production, this is the unit Kubernetes schedules and scales.
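
The snippets above never start a server, so add an entrypoint. A minimal sketch, assuming uvicorn as the ASGI server, binds to port 8000 so it matches the container port used in the Deployment below:

# at the bottom of main.py
if __name__ == "__main__":
    import uvicorn

    # Bind to 0.0.0.0 so the port is reachable from outside the container
    uvicorn.run(api, host="0.0.0.0", port=8000)

In the container image, the equivalent command is uvicorn main:api --host 0.0.0.0 --port 8000.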

  3. Deploy the service to Kubernetes.

Use a Deployment for replicas and a Service for stable networking between agents or upstream systems. If you have separate underwriting and fraud services, Kubernetes lets each one scale independently.

from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="lending-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "lending-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "lending-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="lending-agent",
                        image="your-registry/lending-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
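
The prerequisites call for model provider credentials via environment variables, but the container spec above has none. One way to wire them in, assuming a Secret named model-provider-creds already exists in the namespace (a hypothetical name), is to set env on the container before calling create_namespaced_deployment:

deployment.spec.template.spec.containers[0].env = [
    client.V1EnvVar(
        name="MODEL_API_KEY",
        value_from=client.V1EnvVarSource(
            secret_key_ref=client.V1SecretKeySelector(
                name="model-provider-creds",  # hypothetical Secret holding the key
                key="api-key",
            )
        ),
    )
]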

Expose it with a ClusterIP Service so other agents can call it inside the cluster:

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="lending-agent-svc"),
    spec=client.V1ServiceSpec(
        selector={"app": "lending-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="default", body=service)

  4. Connect multiple agents through Kubernetes-hosted services.

In multi-agent systems, one agent often calls another over HTTP. For lending, that usually means an orchestration agent sends applications to underwriting and fraud services deployed in separate pods.

import requests

payload = {
    "applicant_name": "Jane Doe",
    "income": 120000,
    "debt": 30000,
    "fraud_flag": False,
}

resp = requests.post(
    "http://lending-agent-svc.default.svc.cluster.local/evaluate",
    json=payload,
    timeout=10,
)

print(resp.json())

This pattern keeps your LangGraph logic focused on workflow while Kubernetes handles service discovery and scaling. If fraud checks spike during business hours, scale that deployment separately without touching underwriting.
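
As a sketch of that independent scaling, assuming the fraud checks run in their own Deployment named fraud-agent (a hypothetical name), you can patch just its replica count from Python:

from kubernetes import client, config

config.load_kube_config()
apps_v1 = client.AppsV1Api()

# Change only the replica count; the rest of the Deployment is untouched
apps_v1.patch_namespaced_deployment_scale(
    name="fraud-agent",
    namespace="default",
    body={"spec": {"replicas": 5}},
)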

  5. Add autoscaling and rollout controls.

Loan workloads are bursty. A pre-approval campaign or batch re-evaluation job can spike traffic fast, so use a Horizontal Pod Autoscaler to keep latency predictable.

from kubernetes import client

autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="lending-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="lending-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
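
To confirm the autoscaler registered against the Deployment, read it back and inspect its bounds and current replica count (a quick check, assuming the default namespace):

hpa_state = autoscaling_v2.read_namespaced_horizontal_pod_autoscaler(
    name="lending-agent-hpa",
    namespace="default",
)

print(hpa_state.spec.min_replicas, hpa_state.spec.max_replicas)
print(hpa_state.status.current_replicas)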

Testing the Integration

Run a simple end-to-end check against the service after deployment. If the Service is only exposed inside the cluster, port-forward it first with kubectl port-forward svc/lending-agent-svc 8000:80 so localhost:8000 reaches a pod:

import requests

test_case = {
    "applicant_name": "Alex Smith",
    "income": 100000,
    "debt": 20000,
    "fraud_flag": False,
}

r = requests.post("http://localhost:8000/evaluate", json=test_case, timeout=10)
print(r.status_code)
print(r.json())

Expected output:

200
{'applicant_name': 'Alex Smith', 'income': 100000.0, 'debt': 20000.0, 'fraud_flag': False, 'decision': 'approve'}
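
It is worth exercising the rejection path too; with fraud_flag set, the fraud agent overrides whatever the risk agent decided:

fraud_case = {
    "applicant_name": "Alex Smith",
    "income": 100000,
    "debt": 20000,
    "fraud_flag": True,
}

r = requests.post("http://localhost:8000/evaluate", json=fraud_case, timeout=10)
print(r.json()["decision"])  # expected: "reject"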

If you want to verify Kubernetes wiring directly from Python, list pods before testing traffic:

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace="default", label_selector="app=lending-agent")
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
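
You can also confirm the Service actually selected those pods by reading its Endpoints object; an empty address list usually points to a label-selector mismatch:

endpoints = v1.read_namespaced_endpoints(name="lending-agent-svc", namespace="default")
for subset in endpoints.subsets or []:
    for address in subset.addresses or []:
        print(address.ip)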

Real-World Use Cases

  • Automated loan prequalification where one agent gathers application data, another scores risk, and a third checks policy exceptions.
  • Fraud triage pipelines where suspicious applications get routed to specialized review agents without blocking normal approvals.
  • Document-processing workflows where OCR, KYC verification, income validation, and decisioning run as separate scalable services under one graph.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
