How to Integrate LangGraph for fintech with Kubernetes for production AI

By Cyprian Aarons · Updated 2026-04-21
Tags: langgraph-for-fintech, kubernetes, production-ai

Combining LangGraph for fintech with Kubernetes gives you a clean path from agent logic to production execution. LangGraph handles stateful workflows for things like KYC checks, fraud triage, payment routing, and claims review, while Kubernetes gives you scheduling, isolation, retries, and horizontal scaling for the same workflows under real load.

Prerequisites

  • Python 3.10+
  • Access to a Kubernetes cluster
    • kubectl configured
    • permissions to create Deployments, Services, ConfigMaps, and Secrets
  • A working LangGraph project
    • langgraph
    • langchain-core
    • your model provider SDK if your graph calls an LLM
  • A container registry for pushing images
  • The kubernetes Python client and service dependencies installed:
    • pip install langgraph kubernetes pydantic fastapi uvicorn
  • Basic knowledge of:
    • LangGraph StateGraph, START, END
    • Kubernetes client.AppsV1Api, client.CoreV1Api

Integration Steps

  1. Build the LangGraph workflow for the fintech use case.

Start with a deterministic graph. For fintech, that usually means explicit steps like intake, risk scoring, policy check, and decisioning.

from typing import TypedDict, Literal
from langgraph.graph import StateGraph, START, END

class LoanState(TypedDict):
    customer_id: str
    income: float
    debt_ratio: float
    risk_score: int
    decision: Literal["approve", "review", "reject"]

def assess_risk(state: LoanState) -> LoanState:
    score = 100 if state["debt_ratio"] > 0.5 else 30
    return {**state, "risk_score": score}

def decide(state: LoanState) -> LoanState:
    if state["risk_score"] >= 80:
        decision = "reject"
    elif state["risk_score"] >= 40:
        decision = "review"
    else:
        decision = "approve"
    return {**state, "decision": decision}

graph = StateGraph(LoanState)
graph.add_node("assess_risk", assess_risk)
graph.add_node("decide", decide)
graph.add_edge(START, "assess_risk")
graph.add_edge("assess_risk", "decide")
graph.add_edge("decide", END)

app = graph.compile()

  2. Package the graph behind a Python service that Kubernetes can run.

Kubernetes should not know about graph internals. Expose one API endpoint that accepts a request and runs app.invoke().

from fastapi import FastAPI
from pydantic import BaseModel

api = FastAPI()

class LoanRequest(BaseModel):
    customer_id: str
    income: float
    debt_ratio: float

@api.post("/evaluate")
def evaluate(req: LoanRequest):
    result = app.invoke({
        "customer_id": req.customer_id,
        "income": req.income,
        "debt_ratio": req.debt_ratio,
        "risk_score": 0,
        "decision": "review",
    })
    return result

  3. Create a Kubernetes Deployment from Python using the official client.

This is useful when your CI pipeline or platform service needs to roll out graph workers automatically.

from kubernetes import client, config

config.load_kube_config()

apps_v1 = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "langgraph-fintech-agent"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "langgraph-fintech-agent"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="agent",
                        image="registry.example.com/langgraph-fintech-agent:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="prod-ai", body=deployment)
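
One caveat: create_namespaced_deployment raises a 409 Conflict ApiException when the Deployment already exists, so CI rollouts usually want create-or-patch semantics. Here is a minimal sketch of that control flow using stand-in callables rather than the real client; in practice the except clause would catch kubernetes.client.exceptions.ApiException, check status == 409, and call patch_namespaced_deployment:

```python
class ConflictError(Exception):
    """Stand-in for an ApiException with HTTP status 409 (already exists)."""

def apply_deployment(create, patch, name: str) -> str:
    """Try to create; if the object already exists, fall back to a patch."""
    try:
        create(name)
        return "created"
    except ConflictError:
        patch(name)
        return "patched"

# Toy doubles simulating the Deployments already present in the cluster
existing = {"langgraph-fintech-agent"}

def fake_create(name: str) -> None:
    if name in existing:
        raise ConflictError(name)
    existing.add(name)

def fake_patch(name: str) -> None:
    assert name in existing

print(apply_deployment(fake_create, fake_patch, "langgraph-fintech-agent"))  # patched
print(apply_deployment(fake_create, fake_patch, "langgraph-batch-worker"))   # created
```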

  4. Wire in autoscaling and service discovery.

For production AI systems, you want stable routing and demand-driven scaling. Use a Service for traffic and a HorizontalPodAutoscaler for burst handling.

from kubernetes import client, config

config.load_kube_config()

core_v1 = client.CoreV1Api()
autoscaling_v2 = client.AutoscalingV2Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent"),
    spec=client.V1ServiceSpec(
        selector={"app": "langgraph-fintech-agent"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="prod-ai", body=service)

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-agent-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1",
            kind="Deployment",
            name="langgraph-fintech-agent",
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=70),
                ),
            )
        ],
    ),
)

autoscaling_v2.create_namespaced_horizontal_pod_autoscaler(namespace="prod-ai", body=hpa)
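
The autoscaler works off the ratio of observed to target utilization: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the min/max bounds. A quick sketch of that arithmetic for the settings above:

```python
import math

def desired_replicas(current: int, current_util: float, target_util: float,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Kubernetes HPA scaling formula, clamped to the configured bounds."""
    raw = math.ceil(current * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(2, 140, 70))  # CPU at double the 70% target -> scale to 4
print(desired_replicas(4, 20, 70))   # well under target -> scale down to 2
```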

  5. Add runtime configuration through ConfigMaps and Secrets.

Keep model keys, thresholds, and policy values out of code. That lets you update rules without rebuilding the image.

from kubernetes import client, config

config.load_kube_config()

core_v1 = client.CoreV1Api()

config_map = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-config"),
    data={
        "RISK_REJECT_THRESHOLD": "80",
        "RISK_REVIEW_THRESHOLD": "40",
    },
)

secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="langgraph-fintech-secrets"),
    string_data={
        "OPENAI_API_KEY": "replace-me",
    },
)

core_v1.create_namespaced_config_map(namespace="prod-ai", body=config_map)
core_v1.create_namespaced_secret(namespace="prod-ai", body=secret)
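
Inside the pod, these values typically arrive as environment variables (via envFrom or env/valueFrom on the container spec). A sketch of how the service side might read the thresholds, with names matching the ConfigMap keys above and defaults matching the decide node:

```python
import os

# Thresholds injected by the ConfigMap; fall back to the article's defaults
REJECT_THRESHOLD = int(os.environ.get("RISK_REJECT_THRESHOLD", "80"))
REVIEW_THRESHOLD = int(os.environ.get("RISK_REVIEW_THRESHOLD", "40"))

def decide_from_config(risk_score: int) -> str:
    """Same branching as the graph's decide node, driven by runtime config."""
    if risk_score >= REJECT_THRESHOLD:
        return "reject"
    if risk_score >= REVIEW_THRESHOLD:
        return "review"
    return "approve"

print(decide_from_config(30))  # below both default thresholds -> approve
print(decide_from_config(85))  # at or above the default reject threshold -> reject
```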

Testing the Integration

Run the graph locally first, then verify the Kubernetes objects exist and the service responds.

import requests

payload = {
    "customer_id": "cust-1009",
    "income": 120000,
    "debt_ratio": 0.22,
}

response = requests.post("http://localhost:8000/evaluate", json=payload)
print(response.json())

Expected output:

{
  "customer_id": "cust-1009",
  "income": 120000.0,
  "debt_ratio": 0.22,
  "risk_score": 30,
  "decision": "approve"
}

If you want to validate from inside the cluster:

from kubernetes import client, config

config.load_kube_config()
core_v1 = client.CoreV1Api()

pods = core_v1.list_namespaced_pod(namespace="prod-ai", label_selector="app=langgraph-fintech-agent")
print([p.metadata.name for p in pods.items])

You should see at least one running pod name in the list.
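
Note that list_namespaced_pod returns pods in every phase, so a name in the list does not guarantee health; a Pending or crash-looping pod appears too. A small sketch of the phase filter you would apply to pods.items (the sample data is made up; the phase strings are the real Kubernetes pod phases):

```python
def running_pods(pods: list[dict]) -> list[str]:
    """Keep only pods whose phase is Running (mirrors p.status.phase)."""
    return [p["name"] for p in pods if p["phase"] == "Running"]

sample = [
    {"name": "langgraph-fintech-agent-7f9c4-abc12", "phase": "Running"},
    {"name": "langgraph-fintech-agent-7f9c4-def34", "phase": "Pending"},
]
print(running_pods(sample))  # -> ['langgraph-fintech-agent-7f9c4-abc12']
```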

Real-World Use Cases

  • Fraud triage

    • Route transactions through a LangGraph workflow that scores risk, enriches account context, and escalates suspicious cases.
    • Run each stage as a replicated service on Kubernetes so bursts do not stall decisions.
  • Loan underwriting

    • Use LangGraph to orchestrate credit checks, policy validation, and exception handling.
    • Use Kubernetes to isolate workloads by tenant or business unit with namespaces and resource limits.
  • Claims automation

    • Build an agent that extracts claim details, checks policy coverage, flags inconsistencies, and creates a human-review queue.
    • Scale workers independently when claim volume spikes after major events.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
