How to Integrate FastAPI with LangChain for Investment Banking RAG
Why this integration matters
If you’re building an AI agent for investment banking, FastAPI gives you the service layer, and LangChain gives you the retrieval and orchestration layer. Put them together and you can expose RAG-powered workflows that answer questions from deal docs, research notes, policy memos, or client materials without hardcoding business logic into the model layer.
The practical win is simple: FastAPI handles request validation, auth hooks, and API boundaries; LangChain handles document loading, chunking, embedding, retrieval, and response generation. That’s the right split for production systems in banking.
Prerequisites
- Python 3.10+
- A FastAPI project with `uvicorn`
- LangChain installed with your chosen LLM provider
- An embedding provider such as OpenAI or Azure OpenAI
- A vector store such as FAISS, Pinecone, or Chroma
- Access to your internal investment banking documents in PDF, text, or HTML
- Basic familiarity with async Python and REST APIs
Install the core packages:
```shell
pip install fastapi uvicorn langchain langchain-openai langchain-community langchain-text-splitters faiss-cpu pydantic
```
Integration Steps
1. Set up the FastAPI service boundary.

You want a clean API contract before wiring in retrieval. Define a request model for bank-grade inputs like query text and optional deal context.
```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Investment Banking RAG API")

class RAGRequest(BaseModel):
    query: str
    deal_id: str | None = None
    user_role: str | None = "analyst"

@app.get("/health")
def health():
    return {"status": "ok"}
```
2. Load documents and build the retriever with LangChain.

For RAG, your source data needs to be chunked and indexed. In banking use cases, that usually means CIMs, pitch decks, earnings notes, or policy docs.
```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Load and chunk the source document.
loader = TextLoader("data/investment_banking_policy.txt")
docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)

# Embed the chunks and index them for similarity search.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```
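To build intuition for the `chunk_size`/`chunk_overlap` parameters, here is a stdlib-only approximation of what a fixed-window splitter does. The real `RecursiveCharacterTextSplitter` is smarter (it prefers to break on paragraph and sentence boundaries); this sketch only illustrates the size and overlap mechanics:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    """Naive fixed-window chunker: each chunk starts (chunk_size - overlap)
    characters after the previous one, so neighbours share `overlap` chars."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 2500
chunks = chunk_text(doc)
print(len(chunks))      # 3 chunks for 2500 characters
print(len(chunks[0]))   # 1000
```

The overlap matters in banking docs: a policy clause split mid-sentence across two chunks is still fully recoverable from at least one of them.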
3. Build the LangChain RAG chain.

Use a prompt that forces grounded answers and avoids hallucination. In banking workflows, "I don't know" is better than fabricated confidence.
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an investment banking assistant. Answer only from the provided context."),
    ("human", "Question: {input}\n\nContext:\n{context}"),
])

document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
rag_chain = create_retrieval_chain(retriever=retriever, combine_docs_chain=document_chain)
```
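Under the hood, the "stuff" strategy is simple: concatenate the retrieved chunks into the `{context}` slot of the prompt and make a single LLM call. A framework-free sketch of that step (the chunk texts and `stuff_prompt` helper are illustrative, not LangChain APIs):

```python
PROMPT = "Question: {input}\n\nContext:\n{context}"

def stuff_prompt(question: str, docs: list[str]) -> str:
    # "Stuffing": join all retrieved chunks into one context string,
    # then render the final prompt sent to the model.
    context = "\n\n".join(docs)
    return PROMPT.format(input=question, context=context)

rendered = stuff_prompt(
    "What is the disclosure policy?",
    ["Chunk A: confidential data needs approval.",
     "Chunk B: approvals route through compliance."],
)
print(rendered)
```

This is why `k` and `chunk_size` matter together: everything retrieved must fit in one context window, or the stuffing step overflows the model's limit.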
4. Expose the RAG chain through a FastAPI endpoint.

This is where the two systems meet. The endpoint receives the request, calls LangChain's retrieval pipeline, then returns a structured response.
```python
from fastapi import HTTPException

@app.post("/rag/query")
async def rag_query(req: RAGRequest):
    if not req.query.strip():
        raise HTTPException(status_code=400, detail="query cannot be empty")
    # Use ainvoke so the LLM call doesn't block the event loop inside an async route.
    result = await rag_chain.ainvoke({"input": req.query})
    return {
        "query": req.query,
        "answer": result["answer"],
        "sources": [
            doc.metadata.get("source", "unknown")
            for doc in result.get("context", [])
        ],
    }
```
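Because several retrieved chunks usually come from the same file, the `sources` list above can contain duplicates. One small refinement is an order-preserving dedupe with `dict.fromkeys` before returning the response (the second filename is a hypothetical example):

```python
# What the list comprehension might produce: one entry per retrieved chunk.
raw_sources = [
    "data/investment_banking_policy.txt",
    "data/investment_banking_policy.txt",
    "data/pitch_guidelines.txt",  # hypothetical second source file
]

# dict.fromkeys keeps first-seen order and drops repeats (unlike set()).
sources = list(dict.fromkeys(raw_sources))
print(sources)
```

Deduplicated sources make the response easier for a reviewer to audit, which matters when answers feed into compliance workflows.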
5. Add async startup wiring for production readiness.

In production you usually don't want to rebuild embeddings on every request. Load your vector store at startup and keep it in app state.
```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Build or load the index once at startup, not on every request.
    app.state.vectorstore = vectorstore
    app.state.retriever = app.state.vectorstore.as_retriever(search_kwargs={"k": 4})
    yield

# Register the handler when constructing the app:
# app = FastAPI(title="Investment Banking RAG API", lifespan=lifespan)
```

FastAPI expects the lifespan handler to be passed to the `FastAPI()` constructor, so define it before creating the app rather than patching the router afterwards.
If you need stronger structure for large teams, move this into a dependency function and inject the retriever into your route handlers.
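The dependency pattern boils down to this: route handlers declare what they need, and a provider function resolves it from app state. A framework-free sketch of the idea, with a placeholder object standing in for the FAISS retriever (FastAPI's `Depends` automates exactly this wiring against `request.app.state`):

```python
class AppState:
    """Stand-in for FastAPI's app.state, populated once at startup."""
    def __init__(self):
        self.retriever = None

def get_retriever(state: AppState):
    """Provider function: the single place that knows where the retriever lives."""
    if state.retriever is None:
        raise RuntimeError("retriever not initialised at startup")
    return state.retriever

# Startup wiring happens once; every handler then resolves via the provider,
# which makes it trivial to swap in a fake retriever for tests.
state = AppState()
state.retriever = object()  # placeholder for the real FAISS retriever
assert get_retriever(state) is state.retriever
```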
Testing the Integration
Run the API:
```shell
uvicorn main:app --reload --port 8000
```
Then test it with curl or a small Python client:
```python
import requests

resp = requests.post(
    "http://localhost:8000/rag/query",
    json={"query": "What is our policy on confidential information in pitch materials?"},
)
print(resp.status_code)
print(resp.json())
```
Expected output:
```
200
{
  "query": "What is our policy on confidential information in pitch materials?",
  "answer": "Pitch materials must not include confidential client data unless approved under internal disclosure rules...",
  "sources": ["data/investment_banking_policy.txt"]
}
```
If you get an empty answer or missing sources:
- Check that your document loader actually loaded content.
- Confirm embeddings are created with the right API key.
- Verify your retriever returns relevant chunks with `k` set correctly.
Real-World Use Cases
- Analyst copilot for deal teams that answers questions from internal policy docs, prior pitch decks, and transaction notes.
- Compliance assistant that checks whether a draft memo violates disclosure rules before it goes to a banker or legal reviewer.
- Client coverage search bot that retrieves relevant precedent transactions, sector notes, and meeting summaries from indexed internal content.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.