How to Integrate LangGraph for retail banking with Kubernetes for startups

By Cyprian Aarons · Updated 2026-04-21

Combining LangGraph for retail banking with Kubernetes gives you a clean way to run regulated agent workflows as production services. You get stateful orchestration for customer journeys like account opening, dispute triage, and loan pre-qualification, while Kubernetes handles scaling, rollout control, and fault isolation.

For startups, that matters because banking agents are not stateless chat apps. They need retries, auditability, controlled execution, and the ability to survive pod restarts without losing workflow state.

Prerequisites

  • Python 3.10+
  • A Kubernetes cluster:
    • local: kind or minikube
    • cloud: EKS, GKE, or AKS
  • kubectl configured and pointing at your cluster
  • A container registry you can push images to
  • LangGraph installed:
    • pip install langgraph langchain-openai
  • Kubernetes Python client installed:
    • pip install kubernetes pydantic
  • Access to your LLM provider credentials in environment variables
  • A persistent store for workflow state:
    • Postgres, Redis, or a managed KV store
  • Basic familiarity with:
    • Python async code
    • Kubernetes Deployments and Services

Integration Steps

  1. Define the banking workflow in LangGraph

Start by modeling the banking process as a graph. For example, a retail banking intake flow can collect customer data, validate KYC fields, and route to either approval or manual review.

from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END

class BankingState(TypedDict):
    customer_id: str
    income_verified: bool
    kyc_passed: bool
    decision: Literal["approve", "review", "reject"]

def verify_income(state: BankingState):
    return {"income_verified": True}

def verify_kyc(state: BankingState):
    return {"kyc_passed": True}

def route_decision(state: BankingState):
    if state["income_verified"] and state["kyc_passed"]:
        return {"decision": "approve"}
    return {"decision": "review"}

builder = StateGraph(BankingState)
builder.add_node("verify_income", verify_income)
builder.add_node("verify_kyc", verify_kyc)
builder.add_node("route_decision", route_decision)

builder.set_entry_point("verify_income")
builder.add_edge("verify_income", "verify_kyc")
builder.add_edge("verify_kyc", "route_decision")
builder.add_edge("route_decision", END)

graph = builder.compile()

This gives you deterministic control over the business flow. In banking systems, that matters more than fancy prompts.

  2. Add persistence so Kubernetes restarts do not kill the workflow

If you run LangGraph inside pods without persistence, any restart loses in-flight agent state. Use a checkpointer backed by Postgres or another durable store.

from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()

app = builder.compile(checkpointer=checkpointer)

initial_state = {
    "customer_id": "cust_123",
    "income_verified": False,
    "kyc_passed": False,
    "decision": "review",
}

# With a checkpointer, every invocation needs a thread_id so LangGraph
# knows which workflow's checkpoints to read and write.
config = {"configurable": {"thread_id": "cust_123"}}

result = app.invoke(initial_state, config=config)
print(result)

For production, swap MemorySaver() with a durable checkpointer implementation. The pattern stays the same: compile the graph with a checkpointer so each node execution can be resumed after failure.

  3. Wrap the graph in an API service for Kubernetes

Kubernetes should run your agent as a normal HTTP service. FastAPI is a practical choice because it gives you health checks and clean request handling.

from typing import TypedDict

from fastapi import FastAPI
from pydantic import BaseModel
from langgraph.graph import StateGraph, END

app = FastAPI()

class IntakeRequest(BaseModel):
    customer_id: str

class BankingState(TypedDict):
    customer_id: str
    income_verified: bool
    kyc_passed: bool
    decision: str

def verify_income(state: BankingState):
    return {"income_verified": True}

def verify_kyc(state: BankingState):
    return {"kyc_passed": True}

def route_decision(state: BankingState):
    return {"decision": "approve" if state["income_verified"] and state["kyc_passed"] else "review"}

builder = StateGraph(BankingState)
builder.add_node("verify_income", verify_income)
builder.add_node("verify_kyc", verify_kyc)
builder.add_node("route_decision", route_decision)
builder.set_entry_point("verify_income")
builder.add_edge("verify_income", "verify_kyc")
builder.add_edge("verify_kyc", "route_decision")
builder.add_edge("route_decision", END)

workflow = builder.compile()

@app.post("/intake")
def intake(req: IntakeRequest):
    state = {
        "customer_id": req.customer_id,
        "income_verified": False,
        "kyc_passed": False,
        "decision": "review",
    }
    return workflow.invoke(state)

@app.get("/healthz")
def healthz():
    return {"ok": True}

This is the layer Kubernetes will manage. Keep the graph logic separate from HTTP concerns so you can test it without spinning up pods.

  4. Containerize and deploy to Kubernetes

Build a container image and deploy it as a standard Deployment with replicas behind a Service.

from kubernetes import client, config

config.load_kube_config()  # inside a pod, use config.load_incluster_config() instead

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="banking-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "banking-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "banking-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="banking-agent",
                        image="registry.example.com/banking-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="default", body=deployment)

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="banking-agent-svc"),
    spec=client.V1ServiceSpec(
        selector={"app": "banking-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="default", body=service)

In real deployments you would apply this YAML through GitOps or CI/CD rather than creating resources from Python. The important part is that your LangGraph app runs as an ordinary microservice under Kubernetes control.

  5. Add autoscaling and safe rollout behavior

Retail banking traffic is bursty around payday cycles, loan campaigns, and support spikes. Use HPA plus readiness probes so Kubernetes only sends traffic to healthy pods.

from kubernetes import client

autoscaling_v2 = client.AutoscalingV2Api()

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="banking-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="banking-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)

Use readiness probes on /healthz, set resource requests/limits, and keep node-level side effects out of the graph nodes. That keeps failures contained when one agent path misbehaves.

Testing the Integration

Run the service locally or inside your cluster, then invoke the workflow through the API.

import requests

resp = requests.post(
    "http://localhost:8000/intake",
    json={"customer_id": "cust_123"}
)

print(resp.status_code)
print(resp.json())

Expected output:

200
{'customer_id': 'cust_123', 'income_verified': True, 'kyc_passed': True, 'decision': 'approve'}

If this works in local Docker and inside Kubernetes with multiple replicas, your integration is sound enough for a startup MVP.

Real-World Use Cases

  • Loan pre-screening agent

    • Collect applicant data with LangGraph nodes.
    • Run it in Kubernetes so multiple loan officers’ requests scale independently during peak hours.
  • Dispute resolution assistant

    • Route card disputes through verification, evidence collection, and escalation steps.
    • Use pod autoscaling when support volume spikes after billing cycles.
  • KYC onboarding pipeline

    • Orchestrate document checks, sanctions screening hooks, and manual review routing.
    • Keep state durable so onboarding can resume after pod rescheduling or rolling updates.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
