How to Integrate LangGraph with LangSmith for Investment Banking RAG
Combining LangGraph with LangSmith gives you a clean way to build regulated RAG agents for investment banking that are traceable end to end. You use LangGraph to control the workflow: retries, branching, and human checkpoints. You then use LangSmith to inspect retrieval quality, prompt behavior, and failure points across every run.
For investment banking workflows, that matters because you are not just answering questions. You are pulling from deal documents, filings, internal memos, and market research while keeping a full audit trail for compliance and model debugging.
Prerequisites

- Python 3.10+
- langgraph
- langchain
- langsmith
- A vector store for RAG, such as FAISS, Pinecone, or pgvector
- Access to your document corpus:
  - pitch decks
  - CIMs
  - 10-Ks / 10-Qs
  - internal research notes
- API keys configured:
  - LANGSMITH_API_KEY
  - OPENAI_API_KEY or another LLM provider key
- Environment variables set for tracing:
  - LANGSMITH_TRACING=true
  - LANGSMITH_PROJECT=investment-banking-rag
Integration Steps

1. Install the packages and enable tracing.

```bash
pip install langgraph langchain langsmith langchain-openai faiss-cpu
```

```python
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "investment-banking-rag"
os.environ["LANGSMITH_API_KEY"] = "lsv2_..."
os.environ["OPENAI_API_KEY"] = "sk-..."
```
2. Build the retriever and wire it into a LangGraph state machine.

This example uses a simple FAISS-backed retriever. In production, swap in your real document index and chunking pipeline.

```python
from typing import TypedDict, List

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

docs = [
    Document(page_content="Company A reported EBITDA of $240M in FY24."),
    Document(page_content="Company B is exposed to refinancing risk in Q3."),
]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
```
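The production chunking pipeline mentioned above can be approximated with no extra dependencies. LangChain's `RecursiveCharacterTextSplitter` does this with smarter boundary handling, but the core mechanic is fixed-size character windows with overlap; the sizes below are illustrative assumptions, not recommendations:

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap must be smaller than size so the window actually advances.
    """
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break
    return chunks

# A 1000-character document with size=800/overlap=100 yields two chunks:
# text[0:800] and text[700:1000], sharing a 100-character overlap.
```

Each chunk would then become a `Document(page_content=chunk)` before indexing, so retrieval hits land on passages small enough to fit the prompt.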
3. Define the graph state and nodes for retrieve → generate.

LangGraph gives you explicit control over the flow. That is what makes it useful for banking: you can insert review gates, fallback logic, or policy checks before anything is returned to an analyst.

```python
from langgraph.graph import StateGraph, END

class RAGState(TypedDict):
    question: str
    context: List[str]
    answer: str

def retrieve(state: RAGState):
    docs = retriever.invoke(state["question"])
    return {"context": [d.page_content for d in docs]}

def generate(state: RAGState):
    context_text = "\n".join(state["context"])
    prompt = f"""
You are an investment banking assistant.
Answer only from the provided context.

Question: {state['question']}

Context:
{context_text}
"""
    response = llm.invoke(prompt)
    return {"answer": response.content}

graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()
```
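The review gates mentioned above are built with conditional edges. Here is a minimal sketch of the routing logic, assuming a hypothetical `RESTRICTED_TERMS` policy list; the routing function has the shape `graph.add_conditional_edges` expects, reading state and returning the name of the next node:

```python
# Hypothetical policy list; a real desk would source this from compliance.
RESTRICTED_TERMS = {"mnpi", "wall-crossed", "insider"}

def policy_check(state: dict) -> dict:
    """Node: flag draft answers that mention restricted terms."""
    answer = state.get("answer", "").lower()
    return {**state, "flagged": any(t in answer for t in RESTRICTED_TERMS)}

def route_after_check(state: dict) -> str:
    """Routing function for add_conditional_edges: flagged answers go
    to a human review node, clean answers go straight out."""
    return "human_review" if state["flagged"] else "end"

clean = policy_check({"answer": "Company A reported EBITDA of $240M."})
print(route_after_check(clean))  # -> end
```

Wired into the graph, this sits between `generate` and `END`: add `policy_check` and a `human_review` node, then route with `graph.add_conditional_edges("policy_check", route_after_check, {"human_review": "human_review", "end": END})`.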
4. Add LangSmith tracing so every node execution is observable.

LangSmith traces the graph execution automatically when tracing is enabled through environment variables. If you want explicit run metadata, attach tags and metadata at invocation time.

```python
result = app.invoke(
    {"question": "What is Company A's FY24 EBITDA?"},
    config={
        "tags": ["ib-rag", "earnings-analysis"],
        "metadata": {
            "desk": "M&A",
            "region": "US",
            "doc_type": "financials",
        },
    },
)
print(result["answer"])
```

If you want more control over spans inside custom functions, use LangSmith's tracing utilities directly.

```python
from langsmith import traceable

@traceable(name="banking_retrieve")
def traced_retrieve(question: str):
    return retriever.invoke(question)

@traceable(name="banking_generate")
def traced_generate(question: str, context: list[str]):
    context_text = "\n".join(context)
    prompt = f"Question: {question}\n\nContext:\n{context_text}"
    return llm.invoke(prompt).content
```
5. Register a dataset in LangSmith for repeatable RAG evaluation.

This is how you stop guessing whether your banking agent is actually improving. Create test cases around known questions from deal teams and analysts, then run them against your graph.

```python
from langsmith import Client

client = Client()

dataset = client.create_dataset(
    dataset_name="investment-banking-rag-eval",
    description="RAG evaluation set for banking QA",
)

client.create_examples(
    inputs=[
        {"question": "What was Company A's FY24 EBITDA?"},
        {"question": "Does Company B have refinancing risk?"},
    ],
    outputs=[
        {"expected_answer": "$240M"},
        {"expected_answer": "Yes"},
    ],
    dataset_id=dataset.id,
)
```
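To score runs against this dataset, LangSmith's `evaluate` helper accepts plain functions as evaluators. A minimal containment scorer, assuming the `outputs`/`reference_outputs` parameter names from LangSmith's evaluator convention and the `expected_answer` field defined above:

```python
def contains_expected(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1 if the expected answer string appears in the generated
    answer, else 0. Case-insensitive substring match."""
    answer = outputs.get("answer", "").lower()
    expected = reference_outputs.get("expected_answer", "").lower()
    return {"key": "contains_expected", "score": int(expected in answer)}

print(contains_expected(
    {"answer": "Company A reported EBITDA of $240M in FY24."},
    {"expected_answer": "$240M"},
))  # -> {'key': 'contains_expected', 'score': 1}
```

Substring matching is deliberately crude; for anything beyond smoke checks you would typically swap in an LLM-as-judge evaluator, but the function shape stays the same.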
Testing the Integration

Run a single end-to-end query and confirm both retrieval and generation execute under LangSmith tracing.

```python
result = app.invoke(
    {"question": "Does Company B have refinancing risk?"},
    config={"tags": ["smoke-test"]},
)
print("ANSWER:", result["answer"])
print("CONTEXT:", result["context"])
```

Expected output:

```
ANSWER: Company B is exposed to refinancing risk in Q3.
CONTEXT: ['Company B is exposed to refinancing risk in Q3.', ...]
```
If LangSmith is configured correctly, you should also see a new trace in your project with separate spans for retrieval and generation.
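If you want to script that smoke test rather than eyeball it, a small checker over the result dict keeps the assertions in one place. This is a sketch assuming only the `answer` and `context` keys from the `RAGState` defined earlier:

```python
def check_rag_result(result: dict) -> list[str]:
    """Return a list of problems with a RAG result; empty means pass."""
    problems = []
    if not result.get("answer", "").strip():
        problems.append("empty answer")
    if not result.get("context"):
        problems.append("no retrieved context")
    return problems

ok = {
    "answer": "Company B is exposed to refinancing risk in Q3.",
    "context": ["Company B is exposed to refinancing risk in Q3."],
}
print(check_rag_result(ok))  # -> []
```

Failing checks can then gate a CI job, so a broken retriever or empty prompt never reaches the desk silently.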
Real-World Use Cases

- Deal team Q&A assistant: answer questions from CIMs, financial models, diligence notes, and earnings transcripts with full traceability.
- Credit memo drafting: pull relevant facts from filings and internal research, then generate structured memo sections with review checkpoints.
- Comparable company analysis copilot: retrieve valuation metrics from approved sources and produce analyst-ready summaries with logged evidence chains.
Keep learning

- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist plus starter code
- Work with me: I build AI for banks and insurance companies

By Cyprian Aarons, AI Consultant at Topiax.