How to Integrate LangGraph for payments with Kubernetes for RAG
Combining LangGraph for payments with Kubernetes for RAG gives you a clean split between orchestration and infrastructure. LangGraph handles the payment workflow state machine, while Kubernetes gives you repeatable deployment, scaling, and isolation for retrieval-heavy workloads.
That matters when your agent needs to answer questions from private docs, trigger billing events, and keep the whole thing auditable. You want the payment flow to stay deterministic while the RAG layer scales independently under load.
Prerequisites
- Python 3.10+
- A Kubernetes cluster with `kubectl` access
- The `kubernetes` Python client installed
- LangGraph installed and configured
- A payment provider SDK or internal payments service endpoint
- Access to a vector store or retrieval API used by your RAG pipeline
- Environment variables set for:
  - `KUBECONFIG`
  - `PAYMENTS_API_KEY`
  - `LANGCHAIN_API_KEY` if you trace LangGraph runs
Install the core packages:
```bash
pip install langgraph kubernetes requests pydantic
```
Integration Steps
1) Define the payment workflow in LangGraph
Start by modeling payment authorization as a graph node. Keep the graph small and explicit; this is where you want retries, guardrails, and auditability.
```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class PaymentState(TypedDict):
    user_id: str
    amount_cents: int
    invoice_id: str
    payment_status: str
    rag_context: str


def authorize_payment(state: PaymentState):
    # Replace with your real payment provider call,
    # e.g. stripe.PaymentIntent.create(...)
    if state["amount_cents"] > 0:
        return {"payment_status": "authorized"}
    return {"payment_status": "rejected"}


def finalize_payment(state: PaymentState):
    return {"payment_status": "captured"}


graph = StateGraph(PaymentState)
graph.add_node("authorize_payment", authorize_payment)
graph.add_node("finalize_payment", finalize_payment)
graph.add_edge(START, "authorize_payment")
graph.add_edge("authorize_payment", "finalize_payment")
graph.add_edge("finalize_payment", END)

payment_app = graph.compile()
```
This gives you a deterministic payment path. In production, swap the placeholder logic for your PSP SDK or internal billing API.
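Retries deserve the same explicitness. As a sketch (the `with_retries` helper and its parameters are illustrative, not part of LangGraph), you can wrap the authorization node so transient PSP failures retry with exponential backoff while the graph definition stays unchanged:

```python
import time


def with_retries(fn, max_attempts=3, base_delay=0.5):
    """Wrap a graph node so transient failures retry with backoff."""
    def wrapped(state):
        last_err = None
        for attempt in range(max_attempts):
            try:
                return fn(state)
            except Exception as err:  # narrow this to your PSP's transient errors
                last_err = err
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
        # Surface a reviewable status instead of crashing the whole graph run
        return {"payment_status": "error", "error": str(last_err)}
    return wrapped
```

You would then register the wrapped function instead of the bare one, e.g. `graph.add_node("authorize_payment", with_retries(authorize_payment))`. For real payments, pair this with an idempotency key so a retried authorization cannot double-charge.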
2) Use Kubernetes to run the RAG service as an isolated workload
Your retrieval service should not live inside the same process as payment orchestration. Deploy it as a separate pod so you can scale it independently and roll it safely.
```python
from kubernetes import client, config

# Use config.load_incluster_config() instead if this runs inside the cluster
config.load_kube_config()

apps_v1 = client.AppsV1Api()
core_v1 = client.CoreV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="rag-service"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "rag-service"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "rag-service"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="rag-service",
                        image="your-registry/rag-service:latest",
                        ports=[client.V1ContainerPort(container_port=8000)],
                    )
                ]
            ),
        ),
    ),
)

apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
```
Expose it with a service so your LangGraph node can query it over HTTP.
```python
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="rag-service"),
    spec=client.V1ServiceSpec(
        selector={"app": "rag-service"},
        ports=[client.V1ServicePort(port=80, target_port=8000)],
        type="ClusterIP",
    ),
)

core_v1.create_namespaced_service(namespace="default", body=service)
```
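Creating the Deployment returns immediately, before any pod is actually serving. A minimal readiness check can poll the Deployment status until the ready replica count matches the spec. The helper names below are illustrative; the pure `is_ready` predicate is kept separate from the cluster call so you can unit-test the logic offline:

```python
import time


def is_ready(status, desired: int) -> bool:
    # A Deployment is usable once the observed ready replicas match the spec.
    # status.ready_replicas is None until the first pod becomes ready.
    return (getattr(status, "ready_replicas", None) or 0) >= desired


def wait_for_deployment(apps_v1, name: str, namespace: str = "default",
                        desired: int = 2, timeout_s: int = 120) -> bool:
    # apps_v1 is a kubernetes.client.AppsV1Api instance created by the caller
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        dep = apps_v1.read_namespaced_deployment(name, namespace)
        if is_ready(dep.status, desired):
            return True
        time.sleep(2)
    return False
```

Call `wait_for_deployment(apps_v1, "rag-service")` after the create call above, and fail fast if it returns `False` rather than letting the first retrieval request time out.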
3) Connect LangGraph to the RAG endpoint inside a graph node
The graph should call retrieval before making payment decisions that depend on document context. For example, you may need policy text, customer entitlement status, or invoice metadata.
```python
import os

import requests

# Cluster-internal service DNS name; override via env for local testing
RAG_URL = os.environ.get(
    "RAG_URL", "http://rag-service.default.svc.cluster.local/retrieve"
)


def fetch_rag_context(state: PaymentState):
    payload = {
        "query": f"invoice {state['invoice_id']} customer {state['user_id']}"
    }
    resp = requests.post(RAG_URL, json=payload, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return {"rag_context": data["context"]}


def decide_with_context(state: PaymentState):
    context = state.get("rag_context", "")
    if "approved" in context.lower():
        return {"payment_status": "authorized"}
    return {"payment_status": "review_required"}
```
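You can sanity-check the decision logic without a live RAG service or cluster. This snippet repeats `decide_with_context` from the step above so it runs standalone; the sample context strings are invented for illustration:

```python
def decide_with_context(state):
    # Same rule as the graph node above: the approval keyword gates authorization
    context = state.get("rag_context", "")
    if "approved" in context.lower():
        return {"payment_status": "authorized"}
    return {"payment_status": "review_required"}


approved = {"rag_context": "Invoice inv_456 is Approved under policy 4.2"}
missing = {"rag_context": "No matching entitlement found"}

print(decide_with_context(approved))  # {'payment_status': 'authorized'}
print(decide_with_context(missing))   # {'payment_status': 'review_required'}
```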
Now wire that node into the graph, and route to finalization only when the decision is authorized. Without a conditional edge, every run would fall through to finalize_payment and end up captured, which defeats the point of the review path.

```python
def route_after_decision(state: PaymentState):
    # Only authorized payments proceed to capture; everything else stops here
    if state["payment_status"] == "authorized":
        return "finalize_payment"
    return END


graph = StateGraph(PaymentState)
graph.add_node("fetch_rag_context", fetch_rag_context)
graph.add_node("decide_with_context", decide_with_context)
graph.add_node("finalize_payment", finalize_payment)
graph.add_edge(START, "fetch_rag_context")
graph.add_edge("fetch_rag_context", "decide_with_context")
graph.add_conditional_edges("decide_with_context", route_after_decision)
graph.add_edge("finalize_payment", END)

payment_app = graph.compile()
```
4) Trigger Kubernetes jobs from LangGraph when retrieval needs batch processing
If the RAG system needs reindexing or document enrichment before payment approval, launch a Kubernetes Job from a LangGraph node.
```python
from kubernetes import client

batch_v1 = client.BatchV1Api()


def launch_reindex_job(state: PaymentState):
    # Kubernetes object names must be lowercase alphanumerics and dashes,
    # so sanitize IDs like "inv_456" before using them in a Job name
    job_name = f"reindex-{state['invoice_id']}".lower().replace("_", "-")
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=job_name),
        spec=client.V1JobSpec(
            backoff_limit=2,
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"job": job_name}),
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="reindex",
                            image="your-registry/rag-indexer:latest",
                            args=["--invoice-id", state["invoice_id"]],
                        )
                    ],
                ),
            ),
        ),
    )
    batch_v1.create_namespaced_job(namespace="default", body=job)
    return {"payment_status": "pending_reindex"}
```
This pattern works well when document freshness is part of the authorization decision.
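Because Jobs run asynchronously, a later graph node or scheduler needs to know when reindexing finished before re-running the authorization decision. A small polling sketch, with the completion check factored out so it is testable without a cluster (`batch_v1` here means the BatchV1Api client from the step above; the helper names are illustrative):

```python
import time


def job_finished(status):
    # Returns "succeeded", "failed", or None while the Job is still running.
    # status.succeeded / status.failed are pod counts and None until set.
    if (getattr(status, "succeeded", None) or 0) > 0:
        return "succeeded"
    if (getattr(status, "failed", None) or 0) > 0:
        return "failed"
    return None


def wait_for_job(batch_v1, name: str, namespace: str = "default",
                 timeout_s: int = 600) -> str:
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        job = batch_v1.read_namespaced_job(name, namespace)
        outcome = job_finished(job.status)
        if outcome:
            return outcome
        time.sleep(5)
    return "timeout"
```

If `wait_for_job` returns anything other than `"succeeded"`, route the payment to manual review rather than authorizing against stale documents.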
5) Execute the full flow end-to-end
Run the compiled graph with real input. This is where you connect all pieces into one execution path.
```python
result = payment_app.invoke({
    "user_id": "usr_123",
    "amount_cents": 2500,
    "invoice_id": "inv_456",
    "payment_status": "",
    "rag_context": "",
})
print(result)
```
Testing the Integration
Use a simple smoke test that validates both the RAG fetch and payment state transitions.
```python
test_input = {
    "user_id": "usr_123",
    "amount_cents": 2500,
    "invoice_id": "inv_456",
    "payment_status": "",
    "rag_context": "",
}

result = payment_app.invoke(test_input)

assert result["payment_status"] in ["authorized", "captured", "review_required"]
assert isinstance(result["rag_context"], str)

print("Integration OK")
print(result)
```
Expected output:
```
Integration OK
{'user_id': 'usr_123', 'amount_cents': 2500, 'invoice_id': 'inv_456', 'payment_status': 'captured', 'rag_context': '...'}
```
If you get `review_required` instead, that usually means your retrieval layer returned policy text that does not satisfy your approval rule. That is fine; it means the control flow is working.
Real-World Use Cases
- Invoice dispute handling: retrieve contract clauses from RAG, then use LangGraph to route between refund, escalation, or manual review while Kubernetes hosts each service separately.
- Usage-based billing agents: pull entitlement data from a knowledge base, calculate charges in LangGraph, and run batch reconciliation jobs in Kubernetes when documents need reprocessing.
- Insurance claims payout checks: fetch policy terms through RAG, validate payout eligibility in a graph node, and trigger secure payout workflows only after policy conditions are met.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit