How to Integrate LangGraph for retail banking with LangSmith for production AI

By Cyprian Aarons
Updated 2026-04-22

Tags: langgraph-for-retail-banking, langsmith, production-ai

Combining LangGraph for retail banking with LangSmith gives you a production-grade way to build regulated banking agents that are observable, testable, and easier to debug. LangGraph handles the stateful workflow logic for things like account servicing, fraud checks, and customer authentication, while LangSmith gives you tracing, evaluation, and prompt/version visibility across the full agent path.

Prerequisites

  • Python 3.10+
  • A LangChain/LangGraph-compatible project structure
  • langgraph installed
  • langsmith installed
  • langchain-openai or another model provider package
  • API keys configured:
    • OPENAI_API_KEY
    • LANGSMITH_API_KEY
  • LangSmith project created in your account
  • Basic retail banking agent flow defined:
    • customer intent classification
    • policy/risk checks
    • tool execution for account actions
    • human escalation path for sensitive cases

Install the packages:

pip install langgraph langsmith langchain-openai

Set environment variables:

export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="lsv2_..."
export LANGSMITH_PROJECT="retail-banking-agent"
export OPENAI_API_KEY="sk-..."
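Before wiring anything up, it helps to confirm those variables are actually visible to your process. A quick plain-Python sanity check (the variable list mirrors the exports above):

```python
import os

# Variables the tracing setup above expects to find.
REQUIRED_VARS = [
    "LANGSMITH_TRACING",
    "LANGSMITH_API_KEY",
    "LANGSMITH_PROJECT",
    "OPENAI_API_KEY",
]

# Collect any variables that are unset or empty.
missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
print("Missing variables:", missing or "none")
```

If anything shows up as missing, fix it before running the graph; LangSmith silently skips tracing when the key or project is absent.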

Integration Steps

  1. Create the banking graph with LangGraph

Start by defining a state model and a simple graph. In retail banking, your graph should keep customer context, requested action, risk flags, and final decision.

from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, START, END

class BankingState(TypedDict):
    customer_id: str
    intent: str
    risk_flag: bool
    response: str

def classify_intent(state: BankingState) -> BankingState:
    intent = state["intent"].lower()
    return {
        **state,
        "risk_flag": intent in ["wire_transfer", "close_account", "change_phone_number"],
    }

def route_response(state: BankingState) -> BankingState:
    if state["risk_flag"]:
        return {**state, "response": "Escalate to human review."}
    return {**state, "response": "Request approved."}

graph = StateGraph(BankingState)
graph.add_node("classify_intent", classify_intent)
graph.add_node("route_response", route_response)

graph.add_edge(START, "classify_intent")
graph.add_edge("classify_intent", "route_response")
graph.add_edge("route_response", END)

app = graph.compile()
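Because the node functions are plain Python, you can unit-test the routing logic without compiling the graph at all. A minimal sketch, restating the two functions so it runs standalone:

```python
# Standalone restatement of the two node functions for a quick unit check.

def classify_intent(state: dict) -> dict:
    intent = state["intent"].lower()
    return {
        **state,
        "risk_flag": intent in ["wire_transfer", "close_account", "change_phone_number"],
    }

def route_response(state: dict) -> dict:
    if state["risk_flag"]:
        return {**state, "response": "Escalate to human review."}
    return {**state, "response": "Request approved."}

# A high-risk intent should flip the flag and escalate.
state = {"customer_id": "cust_1", "intent": "wire_transfer",
         "risk_flag": False, "response": ""}
final = route_response(classify_intent(state))
print(final["response"])  # Escalate to human review.
```

Keeping the nodes as pure functions like this is what makes them trivially testable in CI, independent of LangGraph's runtime.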

  2. Enable LangSmith tracing for the graph run

LangSmith traces are easiest to capture through environment variables plus explicit run metadata. For production banking flows, tag runs by channel, product line, and risk tier.

import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "retail-banking-agent"

result = app.invoke(
    {
        "customer_id": "cust_10021",
        "intent": "wire_transfer",
        "risk_flag": False,
        "response": "",
    },
    config={
        "tags": ["retail-banking", "production"],
        "metadata": {
            "channel": "mobile",
            "product": "checking",
            "risk_tier": "high",
        },
    },
)

print(result)

  3. Wrap model calls with a traced LangChain runnable

If your graph uses an LLM for intent classification or response drafting, make that call traceable. LangSmith automatically captures supported LangChain runnables when tracing is enabled.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a retail banking assistant. Classify the user's intent."),
    ("user", "{text}")
])

classifier = prompt | llm

def llm_classify(state):
    msg = classifier.invoke(
        {"text": f"Customer {state['customer_id']} requested: {state['intent']}"},
        config={"tags": ["intent-classifier"]}
    )
    return {**state, "response": msg.content}

# Example standalone call for tracing validation
print(classifier.invoke({"text": "I want to wire transfer $15k"}).content)

  4. Add LangSmith evaluation hooks for regression testing

For production AI in banking, you need repeatable checks on routing behavior. Use LangSmith datasets and evaluators to compare expected vs actual outputs before shipping changes.

from langsmith import Client

client = Client()

dataset = client.create_dataset(
    dataset_name="banking-intent-routing"
)

client.create_examples(
    inputs=[
        {"customer_id": "1", "intent": "balance inquiry", "risk_flag": False},
        {"customer_id": "2", "intent": "wire_transfer", "risk_flag": True},
    ],
    outputs=[
        {"response": "Request approved."},
        {"response": "Escalate to human review."},
    ],
    dataset_id=dataset.id,
)

  5. Run the graph with traceable metadata in production-style execution

Pass structured metadata every time you invoke the graph. This makes it easier to filter traces by branch, customer segment, or incident window inside LangSmith.

run_input = {
    "customer_id": "cust_77881",
    "intent": "close_account",
    "risk_flag": False,
    "response": "",
}

output = app.invoke(
    run_input,
    config={
        "tags": ["prod", "banking-agent", "account-servicing"],
        "metadata": {
            "region": "us-east-1",
            "journey": "account_closure",
            "case_id": "CASE-44019",
        },
    },
)

print(output["response"])

Testing the Integration

Use a direct invocation and confirm that both graph logic and tracing are active. If LangSmith is configured correctly, you should see the run in your project dashboard after execution.

test_input = {
    "customer_id": "cust_90001",
    "intent": "wire_transfer",
    "risk_flag": False,
    "response": "",
}

result = app.invoke(
    test_input,
    config={
        "run_name": "banking_flow_test",
        "tags": ["test"],
        "metadata": {"env": "ci"},
    },
)

assert result["risk_flag"] is True
assert result["response"] == "Escalate to human review."
print(result)

Expected output:

{
  'customer_id': 'cust_90001',
  'intent': 'wire_transfer',
  'risk_flag': True,
  'response': 'Escalate to human review.'
}

In LangSmith, verify:

  • a trace exists for the run
  • node-level timing is visible
  • metadata includes your case ID or journey tag
  • prompts and model calls are attached where applicable

Real-World Use Cases

  • Fraud-sensitive request routing

    • Detect high-risk intents like wire transfers or contact detail changes.
    • Route those cases to manual review while letting low-risk servicing requests complete automatically.
  • Customer servicing copilots

    • Build an agent that answers balance questions, card replacement requests, and fee explanations.
    • Use LangSmith traces to inspect failures when the agent picks the wrong branch.
  • Policy-aware loan or collections workflows

    • Chain eligibility checks, document validation, and escalation logic in LangGraph.
    • Use LangSmith evaluations to catch regressions when prompts or routing rules change.

If you’re building AI agents for retail banking, this pairing is what gets you from prototype to something you can actually operate. LangGraph gives you control over state and branching; LangSmith gives you the observability layer you need when auditors, ops teams, and product owners all want answers.


By Cyprian Aarons, AI Consultant at Topiax.