How to Build a Customer Support Agent Using LangGraph in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21

Tags: customer-support, langgraph, python, investment-banking

A customer support agent for investment banking handles client questions about account access, trade confirmations, settlement status, fee schedules, document requests, and escalation routing. This matters because the support layer sits in the path of regulated client communication, so every response must be accurate, traceable, and constrained by policy.

Architecture

  • Ingress layer

    • Accepts chat or ticket text from a CRM, portal, or internal support desk.
    • Normalizes the request into a structured state object.
  • Intent router

    • Classifies the request into support categories like onboarding, statements, trade ops, billing, or escalation.
    • Sends low-risk requests to retrieval and high-risk requests to human review.
  • Policy and compliance gate

    • Blocks unsafe outputs.
    • Enforces rules around MNPI, suitability advice, account-specific disclosures, and jurisdiction-specific handling.
  • Retrieval layer

    • Pulls answers from approved sources only: SOPs, product docs, fee schedules, SLAs, and internal runbooks.
    • Avoids free-form generation for factual banking content.
  • Response composer

    • Drafts the final answer using retrieved context.
    • Adds citations or source references for auditability.
  • Escalation handler

    • Routes ambiguous or regulated cases to a human agent.
    • Captures reason codes for audit logs and QA review.

Implementation

  1. Define the graph state

    Keep the state small and explicit. For banking support, you want fields for the user message, intent, retrieved context, final response, and an escalation flag.

from typing import TypedDict

class SupportState(TypedDict):
    message: str
    intent: str
    context: str
    response: str
    escalate: bool
  2. Build the LangGraph workflow

    This example uses StateGraph, START, END, add_node, add_conditional_edges, and compile(). The router sends high-risk requests to escalation instead of answering directly.

from langgraph.graph import StateGraph, START, END

def classify_intent(state: SupportState) -> SupportState:
    msg = state["message"].lower()
    if any(x in msg for x in ["trade", "execution", "order"]):
        state["intent"] = "trading"
        state["escalate"] = True
    elif any(x in msg for x in ["statement", "fee", "invoice"]):
        state["intent"] = "billing"
        state["escalate"] = False
    else:
        state["intent"] = "general"
        state["escalate"] = False
    return state

def retrieve_policy_context(state: SupportState) -> SupportState:
    if state["intent"] == "billing":
        state["context"] = (
            "Billing policy: fee disputes must reference the latest approved schedule "
            "and include ticket ID. Do not speculate on waived fees."
        )
    else:
        state["context"] = (
            "General support policy: answer only from approved documentation. "
            "If request involves trading advice or account-specific decisions, escalate."
        )
    return state

def draft_response(state: SupportState) -> SupportState:
    if state["escalate"]:
        state["response"] = (
            "I’m routing this to a human specialist because it may involve regulated "
            "trading activity or account-specific handling."
        )
    else:
        state["response"] = f"{state['context']} Based on your request: {state['message']}"
    return state

def route(state: SupportState):
    return "escalate" if state["escalate"] else "respond"

def escalate_case(state: SupportState) -> SupportState:
    state["response"] = (
        f"Escalated case logged under intent={state['intent']}. "
        f"Human review required for message: {state['message']}"
    )
    return state

graph = StateGraph(SupportState)

graph.add_node("classify_intent", classify_intent)
graph.add_node("retrieve_policy_context", retrieve_policy_context)
graph.add_node("draft_response", draft_response)
graph.add_node("escalate_case", escalate_case)

graph.add_edge(START, "classify_intent")
graph.add_edge("classify_intent", "retrieve_policy_context")
graph.add_edge("retrieve_policy_context", "draft_response")
graph.add_conditional_edges(
    "draft_response",
    route,
    {
        "escalate": "escalate_case",
        "respond": END,
    },
)
graph.add_edge("escalate_case", END)

app = graph.compile()

result = app.invoke({
    "message": "Can you confirm whether my trade settled yet?",
    "intent": "",
    "context": "",
    "response": "",
    "escalate": False,
})

print(result["response"])
  3. Add retrieval from approved sources

    In production, replace the hardcoded context with a retriever backed by approved documents. For investment banking support, that usually means a controlled index over policy docs stored in-region.

def retrieve_policy_context(state: SupportState) -> SupportState:
    approved_docs = {
        "billing": "Approved fee schedule v12.0. Fee disputes require ticket ID and client legal entity.",
        "general": "Support SOP v8.1. Never provide trading recommendations or confidential client data."
    }
    key = state["intent"] if state["intent"] in approved_docs else "general"
    state["context"] = approved_docs[key]
    return state
  4. Wire in guardrails before response delivery

    The main pattern is simple: classify first, retrieve second, generate last. If the request touches trade execution status beyond allowed disclosure boundaries, KYC/AML issues, MNPI concerns, or legal interpretations, force escalation instead of answering.
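As a minimal sketch of that forcing logic, a deterministic guardrail node can scan the request and draft response for blocked patterns before delivery. The patterns below are illustrative placeholders, not a real compliance ruleset:

```python
import re

# Illustrative blocked patterns; a real deployment would load a
# compliance-approved ruleset, not hardcode keywords.
BLOCKED_PATTERNS = [
    r"\bMNPI\b",
    r"\bguaranteed? returns?\b",
    r"\bshould i (buy|sell)\b",
]

def guardrail_check(state: dict) -> dict:
    """Force escalation if the request or draft response hits a blocked pattern."""
    text = f"{state.get('message', '')} {state.get('response', '')}"
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            state["escalate"] = True
            state["response"] = (
                "This request requires human review before a reply can be sent."
            )
            break
    return state
```

Because the check is deterministic, it can be added as a node between draft_response and delivery without depending on model behavior.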

Production Considerations

  • Deployment

    • Run the graph behind an authenticated API gateway with per-client tenancy.
    • Keep model calls and document retrieval inside your approved cloud region to satisfy data residency rules.
    • Separate dev/test/prod indexes so no non-production content leaks into client responses.
  • Monitoring

    • Log every node transition with a correlation ID.
    • Track escalation rate by intent; spikes often mean your classifier is too permissive or your docs are incomplete.
    • Store prompt inputs and retrieved sources for audit review under your retention policy.
  • Guardrails

    • Block responses that mention performance predictions, investment advice, or unapproved product claims.
    • Add deterministic checks for PII leakage before returning output.
    • Require human approval for complaints involving fraud allegations, sanctions exposure, trade disputes above threshold value, or legal discovery requests.
  • Auditability

    • Persist the final answer plus the exact policy snippets used to produce it.
    • Version every prompt template and retrieval corpus.
    • Make sure compliance can reconstruct why a case was answered versus escalated.
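The deterministic PII check mentioned above can start as a simple regex pass over outbound text. The patterns here are illustrative examples only; a production system would use a vetted detection library and region-specific identifiers:

```python
import re

# Example patterns only: real PII detection needs a vetted, jurisdiction-aware
# library rather than ad-hoc regexes.
PII_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "account_number": re.compile(r"\b\d{10,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def find_pii(text: str) -> list[str]:
    """Return the names of any PII patterns detected in the outbound text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
```

If find_pii returns anything, block the response and route the case to escalation rather than attempting automatic redaction.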

Common Pitfalls

  • Letting the LLM answer before classification

    • This is how regulated questions slip through.
    • Always route first; generation should happen after intent is known.
  • Using generic web search as a knowledge source

    • Investment banking support needs approved internal sources only.
    • Unvetted content creates compliance risk and inconsistent answers.
  • Treating escalation as an exception path without logging

    • Escalations are part of the control plane.
    • Log reason codes like “trade-related,” “account-specific,” or “policy ambiguity” so compliance can review patterns.
  • Ignoring jurisdiction and residency constraints

    • A client in one region may have different disclosure requirements than another.
    • Keep storage and processing aligned with legal entity boundaries and local regulations.
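A structured escalation log with reason codes can be as simple as emitting one JSON record per escalated case. The reason-code mapping below is a hypothetical taxonomy you would align with your compliance team:

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("support_agent")

# Hypothetical intent-to-reason-code mapping; align with your compliance taxonomy.
REASON_CODES = {
    "trading": "trade-related",
    "billing": "account-specific",
    "general": "policy-ambiguity",
}

def log_escalation(state: dict, correlation_id: str) -> dict:
    """Emit a structured audit record for an escalated case and return it."""
    record = {
        "correlation_id": correlation_id,
        "intent": state.get("intent", "unknown"),
        "reason_code": REASON_CODES.get(state.get("intent"), "policy-ambiguity"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    logger.info(json.dumps(record))
    return record
```

Keeping the record structured (rather than free text) is what lets compliance aggregate escalations by reason code later.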

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
