How to Build a Policy Q&A Agent Using LangGraph in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21

Tags: policy-qa, langgraph, python, investment-banking

A policy Q&A agent for investment banking answers questions like “Can I share this deck externally?”, “Is this client data allowed in this workflow?”, or “Which retention rule applies to this trade record?” It matters because these questions sit at the intersection of speed, compliance, and auditability. If the agent is wrong, you do not just get a bad answer — you get a control failure.

Architecture

  • User query router

    • Classifies the question into policy lookup, escalation, or unsupported request.
    • Keeps the agent from hallucinating answers when the question is outside policy scope.
  • Policy retrieval layer

    • Pulls from controlled sources: internal compliance docs, desk procedures, retention schedules, and approved regulatory interpretations.
    • In practice, use vector search plus metadata filters like jurisdiction, business line, and document version.
  • LangGraph orchestration

    • Models the flow as nodes: retrieve, answer, validate, escalate.
    • This gives you deterministic control over when the model can respond and when it must stop.
  • Compliance guardrail node

    • Checks whether the answer cites policy evidence and stays within approved language.
    • Blocks unsupported responses and routes ambiguous cases to human review.
  • Audit logging

    • Stores question, retrieved policy IDs, answer text, model version, timestamps, and escalation outcome.
    • Required for internal controls and post-incident review.
  • Human escalation path

    • Sends high-risk queries to Legal, Compliance, or Operations.
    • Needed for edge cases like cross-border sharing, sanctions screening references, or client confidentiality exceptions.
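
The metadata filtering described above can be sketched as a plain pre-filter over document metadata. The `jurisdiction`, `business_line`, and `status` keys here are illustrative, not a fixed schema; in production this logic would run as a metadata filter inside the vector store query rather than as a post-hoc scan:

```python
from typing import Dict, List


def filter_policies(
    docs: List[Dict], jurisdiction: str, business_line: str
) -> List[Dict]:
    """Keep only policy documents approved for the caller's
    jurisdiction and business line."""
    return [
        d
        for d in docs
        if d["jurisdiction"] == jurisdiction
        and d["business_line"] == business_line
        and d["status"] == "approved"
    ]


docs = [
    {"id": "12.4", "jurisdiction": "UK", "business_line": "M&A", "status": "approved"},
    {"id": "8.1", "jurisdiction": "US", "business_line": "M&A", "status": "approved"},
    {"id": "3.2", "jurisdiction": "UK", "business_line": "M&A", "status": "draft"},
]
print([d["id"] for d in filter_policies(docs, "UK", "M&A")])  # → ['12.4']
```

Note that the draft UK policy is excluded even though jurisdiction and business line match: approval status is part of the filter, so unapproved documents never reach the model.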

Implementation

1. Define state and graph nodes

Use a typed state so every node has a clear contract. For a banking policy assistant, keep both the user input and evidence objects in state so you can audit what drove the final response.

from typing import TypedDict, List
from langgraph.graph import StateGraph, END

class PolicyState(TypedDict):
    question: str
    retrieved_docs: List[str]
    answer: str
    needs_escalation: bool

def route_question(state: PolicyState) -> PolicyState:
    q = state["question"].lower()
    risky_terms = ["client data", "external", "sanctions", "cross-border", "mnpi"]
    state["needs_escalation"] = any(term in q for term in risky_terms)
    return state

def retrieve_policy(state: PolicyState) -> PolicyState:
    # Replace with vector DB + metadata filters in production
    docs = [
        "Policy 12.4: Client materials may not be shared externally without approval.",
        "Policy 8.1: MNPI handling requires restricted access and logging."
    ]
    state["retrieved_docs"] = docs
    return state

def draft_answer(state: PolicyState) -> PolicyState:
    if state["needs_escalation"]:
        state["answer"] = "Escalate to Compliance before responding."
    else:
        state["answer"] = f"Based on policy: {state['retrieved_docs'][0]}"
    return state

2. Build the LangGraph workflow

This is the core pattern: route first, retrieve second, then either answer or escalate. StateGraph gives you explicit control over branching instead of hiding it inside a prompt chain.

graph = StateGraph(PolicyState)

graph.add_node("route_question", route_question)
graph.add_node("retrieve_policy", retrieve_policy)
graph.add_node("draft_answer", draft_answer)

graph.set_entry_point("route_question")
graph.add_edge("route_question", "retrieve_policy")
graph.add_edge("retrieve_policy", "draft_answer")
graph.add_edge("draft_answer", END)

app = graph.compile()

At this point you have a runnable graph with deterministic flow. For a real deployment, replace retrieve_policy() with your retriever and add a validation node before END.
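
A minimal validation node can be a plain function over the same state shape. The evidence check below is a sketch: it substring-matches policy IDs, whereas a real deployment would verify citation IDs against the approved corpus. Wire it in with graph.add_node("validate", validate_answer) and route draft_answer → validate → END:

```python
def validate_answer(state: dict) -> dict:
    """Block any answer that does not cite retrieved policy evidence.
    Sketch only: checks that the answer mentions the ID of at least
    one retrieved policy (e.g. "Policy 12.4")."""
    cites_evidence = any(
        doc.split(":")[0] in state["answer"] for doc in state["retrieved_docs"]
    )
    if not cites_evidence:
        state["needs_escalation"] = True
        state["answer"] = "No supporting policy found. Escalating to Compliance."
    return state
```

Because the node rewrites the answer when evidence is missing, an unsupported draft can never reach the user as-is; it is converted into an escalation instead.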

3. Add an escalation branch for high-risk queries

Investment banking needs hard stops for sensitive topics. If the query touches MNPI, sanctions, client confidentiality, or cross-border data movement, do not let the model improvise.

def escalate(state: PolicyState) -> PolicyState:
    state["answer"] = (
        "This request requires Compliance review due to potential regulatory impact."
    )
    return state

graph = StateGraph(PolicyState)
graph.add_node("route_question", route_question)
graph.add_node("retrieve_policy", retrieve_policy)
graph.add_node("draft_answer", draft_answer)
graph.add_node("escalate", escalate)

def choose_next(state: PolicyState):
    return "escalate" if state["needs_escalation"] else "retrieve_policy"

graph.set_entry_point("route_question")
graph.add_conditional_edges(
    "route_question",
    choose_next,
    {
        "retrieve_policy": "retrieve_policy",
        "escalate": "escalate",
    },
)
graph.add_edge("retrieve_policy", "draft_answer")
graph.add_edge("draft_answer", END)
graph.add_edge("escalate", END)

app = graph.compile()

result = app.invoke({
    "question": "Can I send client data to an external vendor?",
    "retrieved_docs": [],
    "answer": "",
    "needs_escalation": False,
})
print(result["answer"])

4. Return grounded answers only

For production policy QA, make the answer depend on retrieved evidence. If retrieval returns nothing relevant enough, force escalation instead of letting the model guess.

A practical pattern is:

  • retrieve top-k documents
  • filter by jurisdiction/business unit/version
  • require citation IDs in output
  • reject answers without supporting evidence
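
The steps above can be sketched as a single guard. The relevance threshold (0.75) and the `(doc_id, score, text)` tuple shape are assumptions, not a fixed API; adapt them to whatever your retriever returns:

```python
from typing import List, Tuple

MIN_SCORE = 0.75  # illustrative relevance threshold, tune per corpus


def grounded_answer(hits: List[Tuple[str, float, str]]) -> str:
    """Answer only from sufficiently relevant evidence; otherwise escalate.
    Each hit is (doc_id, relevance_score, text) — an assumed shape."""
    evidence = [(doc_id, text) for doc_id, score, text in hits if score >= MIN_SCORE]
    if not evidence:
        return "ESCALATE: no sufficiently relevant policy found."
    doc_id, text = evidence[0]
    return f"[{doc_id}] {text}"  # citation ID is mandatory in the output


print(grounded_answer([("12.4", 0.91, "External sharing requires approval.")]))
# → [12.4] External sharing requires approval.
print(grounded_answer([("8.1", 0.40, "MNPI handling rules.")]))
# → ESCALATE: no sufficiently relevant policy found.
```

The key property is that the escalation path is the default: the function can only produce an answer when evidence clears the bar, so "no evidence" can never silently degrade into a guess.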

If you later add an LLM node through langchain_core, keep it downstream of retrieval and upstream of validation only. The graph should never allow free-form generation before evidence exists.

Production Considerations

  • Compliance controls

    • Require citations from approved sources in every final answer.
    • Block responses that mention legal interpretation unless they come from an approved policy corpus or are explicitly escalated.
  • Auditability

    • Log question, retrieved document IDs, graph path taken, model name/version, and final response.
    • Store immutable records so internal audit can reconstruct how a decision was made.
  • Data residency

    • Keep retrieval indexes and logs inside approved regions if you handle EMEA or APAC banking data.
    • Do not send client-sensitive prompts to unmanaged external endpoints.
  • Monitoring

    • Track escalation rate, unanswered rate, retrieval hit rate, and false-positive routing to Compliance.
    • Alert when the agent starts answering too many questions without citations or when one desk sees abnormal traffic patterns.
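
An audit record covering the fields above can be sketched as a frozen dataclass; the field names and the model version string are illustrative. Freezing the object prevents in-process mutation, but true immutability requires append-only storage (e.g. WORM-compliant object storage):

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditRecord:
    """Immutable record of one agent interaction, written after every run."""
    question: str
    retrieved_doc_ids: List[str]
    graph_path: List[str]        # nodes visited, in order
    model_version: str
    answer: str
    escalated: bool
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


rec = AuditRecord(
    question="Can I share this deck externally?",
    retrieved_doc_ids=["12.4"],
    graph_path=["route_question", "retrieve_policy", "draft_answer"],
    model_version="policy-agent-v1",  # placeholder identifier
    answer="Policy 12.4: external sharing requires approval.",
    escalated=False,
)
print(asdict(rec))  # serialize for the audit log sink
```

Logging the graph path alongside the document IDs lets internal audit reconstruct not just what was answered, but which branch of the graph produced it.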

Common Pitfalls

  1. Letting the LLM answer before retrieval

    • This causes confident but unsupported responses.
    • Fix it by making retrieval a required upstream node and validating that evidence exists before drafting an answer.
  2. Using generic embeddings without metadata filters

    • A policy from one jurisdiction can be wrong for another desk or entity.
    • Fix it by filtering on region, legal entity name, business line, document version date, and approval status.
  3. Skipping escalation logic for sensitive topics

    • Queries about MNPI, sanctions, external sharing, or records retention often need human review.
    • Fix it by using conditional edges in StateGraph that route high-risk requests directly to Compliance instead of generating an answer.

A good investment banking policy agent is not clever first; it is controlled first. LangGraph works well here because it makes control flow explicit enough for engineering teams and defensible enough for compliance teams.


By Cyprian Aarons, AI Consultant at Topiax.
