How to Integrate LangGraph for retail banking with LangSmith for multi-agent systems

By Cyprian Aarons · Updated 2026-04-22

Why this integration matters

If you’re building retail banking agents, you need more than a chat loop. You need controlled workflows for things like KYC checks, balance inquiries, dispute intake, and loan pre-qualification, plus observability across every agent hop.

LangGraph gives you the workflow layer for multi-agent banking systems. LangSmith gives you tracing, evaluation, and debugging so you can see exactly where an agent chain failed, which tool was called, and whether the bank policy logic behaved correctly.

Prerequisites

  • Python 3.10+
  • langgraph
  • langchain-core
  • langsmith
  • An OpenAI-compatible model provider or another LLM configured in LangChain (optional for the examples below, which use rule-based nodes)
  • LangSmith account and API key
  • Environment variables set:
    • LANGSMITH_API_KEY
    • LANGSMITH_TRACING=true
    • LANGSMITH_PROJECT=retail-banking-agents
  • A basic understanding of:
    • LangGraph StateGraph
    • LangChain tool calling
    • LangSmith tracing

Install the packages:

pip install langgraph langchain-core langsmith langchain-openai
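
Before running any of the steps, it can help to confirm that the tracing variables listed above are visible to the Python process. The following is a minimal sketch using only the standard library. The helper name `missing_langsmith_vars` is an illustrative assumption and is not part of LangSmith.

```python
import os

# The three variables this guide's prerequisites ask you to set.
REQUIRED_VARS = ("LANGSMITH_API_KEY", "LANGSMITH_TRACING", "LANGSMITH_PROJECT")


def missing_langsmith_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_langsmith_vars()
if missing:
    print("Missing LangSmith settings:", ", ".join(missing))
```

Running this at application startup gives an early warning if traces are not going to be sent, instead of finding an empty project later.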

Integration Steps

  1. Build the banking workflow with LangGraph.

Use LangGraph to model the retail banking flow as explicit states and nodes. For a multi-agent system, that means each specialist agent gets a bounded role: triage, policy check, customer lookup, and escalation. The example below implements the first two as rule-based nodes.

from typing import TypedDict, Annotated
from operator import add

from langgraph.graph import StateGraph, START, END

class BankingState(TypedDict):
    messages: Annotated[list, add]
    intent: str
    risk_flag: bool
    result: str

def triage_node(state: BankingState):
    last_message = state["messages"][-1].content.lower()
    if any(keyword in last_message for keyword in ("chargeback", "fraud", "dispute")):
        return {"intent": "dispute", "risk_flag": True}
    if "balance" in last_message:
        return {"intent": "balance", "risk_flag": False}
    return {"intent": "general", "risk_flag": False}

def policy_node(state: BankingState):
    if state["risk_flag"]:
        return {"result": "Route to fraud operations"}
    return {"result": f"Handle intent: {state['intent']}"}

graph = StateGraph(BankingState)
graph.add_node("triage", triage_node)
graph.add_node("policy", policy_node)

graph.add_edge(START, "triage")
graph.add_edge("triage", "policy")
graph.add_edge("policy", END)

app = graph.compile()

  2. Add LangSmith tracing to the graph runtime.

LangSmith traces each node execution automatically when tracing is enabled. For production banking systems, this is what lets you audit a customer conversation end-to-end without guessing which agent made the decision.

import os
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "retail-banking-agents"
# LANGSMITH_API_KEY should already be set in your environment

from langchain_core.messages import HumanMessage

input_state = {
    "messages": [HumanMessage(content="I think there was fraud on my debit card")],
    "intent": "",
    "risk_flag": False,
    "result": ""
}

output = app.invoke(input_state)
print(output["result"])

If tracing is configured correctly, every node call appears in LangSmith with inputs, outputs, latency, and errors.
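
One way to confirm this without opening the UI is to read the most recent runs back through the LangSmith client. The sketch below assumes the project name used in this guide and a valid `LANGSMITH_API_KEY`. The helper names `fetch_recent_runs` and `summarize_runs` are hypothetical. The LangSmith import is placed inside the function so the summary helper can be tested without network access.

```python
def summarize_runs(runs):
    """Return (name, status) pairs for run-like objects with `name` and `error` attributes."""
    return [(run.name, "error" if run.error else "ok") for run in runs]


def fetch_recent_runs(project_name="retail-banking-agents", limit=10):
    """Fetch the latest runs from a LangSmith project (requires network and an API key)."""
    # Imported lazily so summarize_runs stays usable without the SDK installed.
    from langsmith import Client

    client = Client()
    return list(client.list_runs(project_name=project_name, limit=limit))
```

Calling `summarize_runs(fetch_recent_runs())` after the invocation above should list the graph run and its node runs. Traces are uploaded in the background, so they may take a moment to appear.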

  3. Wrap specialist agents as tools and expose them inside the graph.

In real retail banking systems, you usually want separate agents for account servicing and compliance checks. Here’s a simple pattern using LangChain tools that can be called from graph nodes.

from langchain_core.tools import tool

@tool
def get_account_summary(customer_id: str) -> str:
    """Fetch a customer's account summary."""
    # Replace with your core banking API call
    return f"Customer {customer_id}: checking balance $4,120.55"

@tool
def check_dispute_policy(transaction_type: str) -> str:
    """Check whether a dispute is eligible under policy."""
    allowed_types = ["card_present", "card_not_present"]
    if transaction_type in allowed_types:
        return "Eligible for dispute review"
    return "Not eligible under current policy"

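Then call them from a graph node. This node is not part of the graph compiled in step 1 yet; it still has to be registered with graph.add_node and wired with edges before compiling: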

def servicing_node(state: BankingState):
    msg = state["messages"][-1].content.lower()

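    if "balance" in msg:
        # customer_id is hard-coded for illustration; take it from the authenticated session in practice
        summary = get_account_summary.invoke({"customer_id": "12345"})
        return {"result": summary}

    if state["intent"] == "dispute":
        # transaction_type is hard-coded for illustration; in practice it comes from the transaction record
        policy = check_dispute_policy.invoke({"transaction_type": "card_not_present"})
        return {"result": policy}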

    return {"result": "Transfer to human banker"}
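
To make the servicing node reachable, the graph needs a branch after triage. The following is a minimal sketch of one possible wiring using LangGraph's `add_conditional_edges`. The routing rule (risk-flagged cases go to policy, balance requests go to servicing) and the helper names `route_after_triage` and `build_routed_graph` are assumptions for illustration, not the only reasonable design.

```python
def route_after_triage(state: dict) -> str:
    """Choose the next node name from the fields that triage wrote into the state."""
    if state.get("risk_flag"):
        return "policy"        # high-risk cases always go through the policy node
    if state.get("intent") == "balance":
        return "servicing"     # low-risk servicing requests can call tools directly
    return "policy"            # everything else falls back to the policy node


def build_routed_graph(state_schema, triage_node, policy_node, servicing_node):
    """Compile a graph with a conditional branch after triage."""
    # Imported lazily so the router above can be unit-tested without LangGraph installed.
    from langgraph.graph import StateGraph, START, END

    graph = StateGraph(state_schema)
    graph.add_node("triage", triage_node)
    graph.add_node("policy", policy_node)
    graph.add_node("servicing", servicing_node)

    graph.add_edge(START, "triage")
    # The mapping ties each value returned by the router to a registered node.
    graph.add_conditional_edges(
        "triage",
        route_after_triage,
        {"policy": "policy", "servicing": "servicing"},
    )
    graph.add_edge("policy", END)
    graph.add_edge("servicing", END)
    return graph.compile()
```

With the definitions from the earlier steps, `build_routed_graph(BankingState, triage_node, policy_node, servicing_node)` would return a compiled graph in which each branch shows up as its own node in the LangSmith trace.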

  4. Attach run metadata and tags for better filtering and debugging in LangSmith.

Banking teams need trace filters by customer segment, product line, or escalation path. Use metadata so every trace carries business context that ops teams can search later.

run_metadata = {
    "product": "retail_banking",
    "channel": "mobile_app",
    "region": "us-east",
    "workflow": "dispute_triage"
}

# When invoking the app through your application layer,
# pass metadata through your tracing wrapper or runnable config.
config = {
    "metadata": run_metadata,
    "tags": ["banking", "multi-agent", "fraud-triage"]
}

output = app.invoke(
    {
        "messages": [HumanMessage(content="Need help with a disputed card charge")],
        "intent": "",
        "risk_flag": False,
        "result": ""
    },
    config=config,
)

print(output["result"])
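
If several entry points invoke the graph, building the config in one place keeps metadata keys consistent across traces. This is a small sketch. The function name `build_trace_config` and its parameters are hypothetical, and the fixed `product` value simply mirrors the dictionary shown above.

```python
def build_trace_config(workflow: str, channel: str, region: str, extra_tags=()) -> dict:
    """Build a runnable config carrying business context for filtering in LangSmith."""
    metadata = {
        "product": "retail_banking",
        "channel": channel,
        "region": region,
        "workflow": workflow,
    }
    # Base tags are shared by every trace; callers can append workflow-specific ones.
    tags = ["banking", "multi-agent", *extra_tags]
    return {"metadata": metadata, "tags": tags}
```

A call such as `app.invoke(state, config=build_trace_config("dispute_triage", "mobile_app", "us-east", ["fraud-triage"]))` would then produce the same metadata and tags as the hand-written config.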

  5. Evaluate the workflow with LangSmith datasets and runs.

Once the graph is wired up, test it against realistic banking scenarios. LangSmith datasets let you store representative cases like balance requests, disputes, fee reversals, and account lockouts.

from langsmith import Client

client = Client()

dataset_name = "retail-banking-routing-cases"

# Create the dataset once; re-running this against an existing dataset name is expected to fail
dataset = client.create_dataset(dataset_name=dataset_name)

client.create_example(
    inputs={"message": "What is my checking balance?"},
    outputs={"expected_intent": "balance"},
    dataset_id=dataset.id,
)

client.create_example(
    inputs={"message": "There is fraud on my card"},
    outputs={"expected_intent": "dispute"},
    dataset_id=dataset.id,
)

Then run evaluations against your graph output and inspect failures in LangSmith traces.
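
A sketch of what such an evaluation run could look like is shown here. It assumes the dataset created above, the compiled `app` from step 1, and the `evaluate` helper from `langsmith.evaluation`. The names `make_target`, `intent_match`, and `run_routing_experiment`, and the `routing-baseline` prefix, are illustrative choices.

```python
def make_target(app):
    """Wrap the compiled graph so it accepts dataset inputs of the form {"message": ...}."""
    from langchain_core.messages import HumanMessage  # lazy import, see note below

    def target(inputs: dict) -> dict:
        state = app.invoke({
            "messages": [HumanMessage(content=inputs["message"])],
            "intent": "",
            "risk_flag": False,
            "result": "",
        })
        return {"intent": state["intent"]}

    return target


def intent_match(run, example) -> dict:
    """Score 1 when the predicted intent equals the dataset's expected_intent, else 0."""
    predicted = (run.outputs or {}).get("intent")
    expected = (example.outputs or {}).get("expected_intent")
    return {"key": "intent_match", "score": int(predicted == expected)}


def run_routing_experiment(app, dataset_name="retail-banking-routing-cases"):
    """Run the graph over every example in the dataset and record scores in LangSmith."""
    from langsmith.evaluation import evaluate  # requires network and an API key

    return evaluate(
        make_target(app),
        data=dataset_name,
        evaluators=[intent_match],
        experiment_prefix="routing-baseline",
    )
```

The third-party imports sit inside the functions so the evaluator can be unit-tested on its own. Calling `run_routing_experiment(app)` would create an experiment in LangSmith where each failing example links back to its trace.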

Testing the Integration

Use one request that should route to dispute handling and verify both graph output and trace creation.

from langchain_core.messages import HumanMessage

test_input = {
    "messages": [HumanMessage(content="I need to report fraud on my debit card")],
    "intent": "",
    "risk_flag": False,
    "result": ""
}

result = app.invoke(test_input)

print("Intent:", result["intent"])
print("Risk flag:", result["risk_flag"])
print("Result:", result["result"])

Expected output:

Intent: dispute
Risk flag: True
Result: Route to fraud operations

In LangSmith, you should see a trace with at least two steps:

  • triage
  • policy
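
The same node order can be checked locally, before looking at the trace, by streaming the graph's per-node updates. This is a minimal sketch that assumes the compiled graph supports `stream(..., stream_mode="updates")`, where each chunk is a dict keyed by node name. The helper name `executed_nodes` is hypothetical.

```python
def executed_nodes(app, state: dict) -> list:
    """Return node names in the order they emitted state updates."""
    order = []
    # With stream_mode="updates", each chunk maps a node name to that node's update.
    for chunk in app.stream(state, stream_mode="updates"):
        order.extend(chunk.keys())
    return order
```

For the two-node graph in this guide, `executed_nodes(app, test_input)` should list `triage` followed by `policy`, matching the steps expected in the LangSmith trace.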

Real-World Use Cases

  • Fraud intake routing
    • Triage customer reports from mobile chat.
    • Route high-risk cases to fraud ops while logging every decision path in LangSmith.
  • Account servicing assistant
    • Handle balance checks, card replacement requests, and fee explanations.
    • Use LangGraph to keep each action deterministic and auditable.
  • Loan pre-screening workflow
    • Split work across eligibility checks, document collection, and affordability review.
    • Use LangSmith traces to compare outcomes across prompt versions and policy changes.

By Cyprian Aarons, AI Consultant at Topiax.