How to Integrate LangGraph with LangSmith for Investment Banking RAG

By Cyprian Aarons · Updated 2026-04-22
Tags: langgraph-for-investment-banking, langsmith, rag

Combining LangGraph with LangSmith gives you a clean way to build regulated RAG agents for investment banking that are traceable end to end. You use LangGraph to control the workflow (retries, branching, and human checkpoints), then use LangSmith to inspect retrieval quality, prompt behavior, and failure points across every run.

For investment banking workflows, that matters because you are not just answering questions. You are pulling from deal documents, filings, internal memos, and market research while keeping a full audit trail for compliance and model debugging.

Prerequisites

  • Python 3.10+
  • langgraph
  • langchain
  • langsmith
  • A vector store for RAG, such as FAISS, Pinecone, or pgvector
  • Access to your document corpus:
    • pitch decks
    • CIMs
    • 10-Ks / 10-Qs
    • internal research notes
  • API keys configured:
    • LANGSMITH_API_KEY
    • OPENAI_API_KEY or another LLM provider key
  • Environment variables set for tracing:
    • LANGSMITH_TRACING=true
    • LANGSMITH_PROJECT=investment-banking-rag
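If you prefer to configure tracing outside your code, the same variables can be exported in the shell before launching Python (step 1 below sets them via os.environ instead). The key values here are placeholders:

```shell
# Enable LangSmith tracing and route runs to the project used in this guide.
export LANGSMITH_TRACING=true
export LANGSMITH_PROJECT=investment-banking-rag
# Placeholder keys; substitute your real credentials.
export LANGSMITH_API_KEY="lsv2_..."
export OPENAI_API_KEY="sk-..."
```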

Integration Steps

  1. Install the packages and enable tracing.
pip install langgraph langchain langsmith langchain-openai faiss-cpu
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "investment-banking-rag"
os.environ["LANGSMITH_API_KEY"] = "lsv2_..."
os.environ["OPENAI_API_KEY"] = "sk-..."
  2. Build the retriever and wire it into a LangGraph state machine.

This example uses a simple FAISS-backed retriever. In production, swap in your real document index and chunking pipeline.

from typing import TypedDict, List
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

docs = [
    Document(page_content="Company A reported EBITDA of $240M in FY24."),
    Document(page_content="Company B is exposed to refinancing risk in Q3."),
]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
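To make "chunking pipeline" concrete, here is a minimal pure-Python sketch of the idea: split long filings into overlapping windows so facts that straddle a boundary still land in some chunk. This `chunk_text` helper is illustrative only; in practice you would more likely use a library splitter such as LangChain's RecursiveCharacterTextSplitter.

```python
# Hypothetical sketch of a chunking step a production pipeline would run
# before indexing: fixed-size character windows with overlap so a fact
# spanning a chunk boundary is not lost.
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

# Toy "filing" built from the sample sentence used in this guide.
filing = "Company A reported EBITDA of $240M in FY24. " * 20
chunks = chunk_text(filing, chunk_size=100, overlap=20)
print(len(chunks), len(chunks[0]))
```

Each chunk (rather than each whole document) would then become a `Document` fed to `FAISS.from_documents`, which is what keeps retrieval granular enough for deal-document Q&A.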
  3. Define the graph state and nodes for retrieve → answer.

LangGraph gives you explicit control over the flow. That is what makes it useful for banking: you can insert review gates, fallback logic, or policy checks before anything is returned to an analyst.

from langgraph.graph import StateGraph, END

class RAGState(TypedDict):
    question: str
    context: List[str]
    answer: str

def retrieve(state: RAGState):
    docs = retriever.invoke(state["question"])
    return {"context": [d.page_content for d in docs]}

def generate(state: RAGState):
    context_text = "\n".join(state["context"])
    prompt = f"""
You are an investment banking assistant.
Answer only from the provided context.

Question: {state['question']}

Context:
{context_text}
"""
    response = llm.invoke(prompt)
    return {"answer": response.content}

graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)

graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()
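The policy checks mentioned above can be sketched as one more node. The example below is a hypothetical compliance gate: a plain function over the state dict (so it runs without LangGraph installed), with comments showing where it would be wired into the graph. The restricted-term list is an assumption standing in for your real policy rules.

```python
# Hypothetical compliance gate screening a draft answer before it
# reaches an analyst. In the graph above it would sit between
# "generate" and END:
#   graph.add_node("policy_check", policy_check)
#   graph.add_edge("generate", "policy_check")
#   graph.add_edge("policy_check", END)
RESTRICTED_TERMS = {"mnpi", "insider", "wall-crossed"}  # assumption: your policy list

def policy_check(state: dict) -> dict:
    """Withhold the answer if it trips a restricted-term screen."""
    answer = state["answer"]
    if any(term in answer.lower() for term in RESTRICTED_TERMS):
        return {"answer": "[withheld pending compliance review]"}
    # Returning an empty dict leaves the state unchanged, since LangGraph
    # merges partial state updates from each node.
    return {}

print(policy_check({"answer": "Company A EBITDA was $240M."}))
print(policy_check({"answer": "This is MNPI from the wall-crossed team."}))
```

A real gate would more likely route to a human-review interrupt than silently redact, but the shape (a node that inspects state and overrides the answer) is the same.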
  4. Add LangSmith tracing so every node execution is observable.

LangSmith traces the graph execution automatically when tracing is enabled through environment variables. If you want explicit run metadata, attach tags and metadata at invocation time.

result = app.invoke(
    {"question": "What is Company A's FY24 EBITDA?"},
    config={
        "tags": ["ib-rag", "earnings-analysis"],
        "metadata": {
            "desk": "M&A",
            "region": "US",
            "doc_type": "financials"
        }
    }
)

print(result["answer"])

If you want more control over spans inside custom functions, use LangSmith’s tracing utilities directly.

from langsmith import traceable

@traceable(name="banking_retrieve")
def traced_retrieve(question: str):
    return retriever.invoke(question)

@traceable(name="banking_generate")
def traced_generate(question: str, context: list[str]):
    context_text = "\n".join(context)
    prompt = f"Question: {question}\n\nContext:\n{context_text}"
    return llm.invoke(prompt).content
  5. Register a dataset in LangSmith for repeatable RAG evaluation.

This is how you stop guessing whether your banking agent is actually improving. Create test cases around known questions from deal teams and analysts, then run them against your graph.

from langsmith import Client

client = Client()

dataset = client.create_dataset(
    dataset_name="investment-banking-rag-eval",
    description="RAG evaluation set for banking QA"
)

client.create_examples(
    inputs=[
        {"question": "What was Company A's FY24 EBITDA?"},
        {"question": "Does Company B have refinancing risk?"},
    ],
    outputs=[
        {"expected_answer": "$240M"},
        {"expected_answer": "Yes"},
    ],
    dataset_id=dataset.id,
)
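To score runs against this dataset you also need an evaluator. A simple sketch is exact-containment scoring: 1.0 when the expected answer appears in the model output. A function of roughly this shape can be passed to LangSmith's `evaluate()` as a custom evaluator; the exact keyword-argument interface varies by SDK version, so treat the signature here as an assumption and check the docs for yours.

```python
# Hypothetical containment evaluator for the dataset above: returns a
# named score of 1.0 if the expected answer string appears (case-
# insensitively) in the generated answer, else 0.0.
def contains_expected(outputs: dict, reference_outputs: dict) -> dict:
    expected = reference_outputs["expected_answer"].lower()
    actual = outputs.get("answer", "").lower()
    return {"key": "contains_expected", "score": float(expected in actual)}

score = contains_expected(
    {"answer": "Company A reported EBITDA of $240M in FY24."},
    {"expected_answer": "$240M"},
)
print(score)
```

Containment is a deliberately blunt metric; for answers like "Yes" you would typically add an LLM-as-judge or semantic-similarity evaluator alongside it, but a deterministic check like this is a useful regression floor.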

Testing the Integration

Run a single end-to-end query and confirm both retrieval and generation execute under LangSmith tracing.

result = app.invoke(
    {"question": "Does Company B have refinancing risk?"},
    config={"tags": ["smoke-test"]}
)

print("ANSWER:", result["answer"])
print("CONTEXT:", result["context"])

Expected output (exact answer wording will vary with the model):

ANSWER: Company B is exposed to refinancing risk in Q3.
CONTEXT: ['Company B is exposed to refinancing risk in Q3.', ...]

If LangSmith is configured correctly, you should also see a new trace in your project with separate spans for retrieval and generation.

Real-World Use Cases

  • Deal team Q&A assistant
    • Answer questions from CIMs, financial models, diligence notes, and earnings transcripts with full traceability.
  • Credit memo drafting
    • Pull relevant facts from filings and internal research, then generate structured memo sections with review checkpoints.
  • Comparable company analysis copilot
    • Retrieve valuation metrics from approved sources and produce analyst-ready summaries with logged evidence chains.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
