How to Integrate FastAPI for banking with LangChain for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: fastapi-for-banking, langchain, rag

FastAPI for banking gives you a clean, typed API layer for account data, transactions, KYC, and internal workflows. LangChain adds retrieval, tool orchestration, and LLM-driven reasoning, which is what you need when an agent has to answer questions from policy docs, product manuals, or customer history without hardcoding every path.

Prerequisites

  • Python 3.10+
  • A working FastAPI app for your banking domain
  • Access to your banking backend or sandbox APIs
  • LangChain installed with a model provider:
    • langchain
    • langchain-openai or another chat model package
    • langchain-community
  • A vector store for RAG:
    • FAISS, Chroma, Pinecone, or similar
  • Banking API credentials stored in environment variables
  • Basic familiarity with:
    • FastAPI dependency injection
    • Pydantic models
    • LangChain retrievers and chains
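
The credential prerequisite is worth wiring up before anything else. A minimal fail-fast sketch of loading credentials from the environment (the variable names BANKING_API_BASE_URL and BANKING_API_KEY are placeholders; use whatever your backend actually expects):

```python
import os

def load_banking_credentials() -> dict:
    """Read banking API credentials from the environment, failing fast if any are missing."""
    required = ["BANKING_API_BASE_URL", "BANKING_API_KEY"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    return {name: os.environ[name] for name in required}

# Demo only: in practice these come from your shell, container, or secrets manager.
os.environ["BANKING_API_BASE_URL"] = "http://localhost:8000"
os.environ["BANKING_API_KEY"] = "sandbox-key"
creds = load_banking_credentials()
```

Failing at startup beats discovering a missing key mid-request inside an agent loop.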

Integration Steps

  1. Expose banking data through FastAPI endpoints

Start by making the banking system queryable through stable endpoints. Keep the response shapes strict; agents work better when the schema does not drift.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List

app = FastAPI(title="Banking API")

class Transaction(BaseModel):
    id: str
    amount: float
    currency: str
    description: str

class AccountSummary(BaseModel):
    account_id: str
    balance: float
    currency: str

@app.get("/accounts/{account_id}/summary", response_model=AccountSummary)
def get_account_summary(account_id: str):
    if account_id != "acc_123":
        raise HTTPException(status_code=404, detail="Account not found")
    return AccountSummary(account_id="acc_123", balance=12500.75, currency="USD")

@app.get("/accounts/{account_id}/transactions", response_model=List[Transaction])
def get_transactions(account_id: str):
    return [
        Transaction(id="tx_1", amount=-120.0, currency="USD", description="ATM withdrawal"),
        Transaction(id="tx_2", amount=2500.0, currency="USD", description="Salary deposit"),
    ]

This is the contract your agent will call later via HTTP or an internal client.
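
When the agent consumes these endpoints over HTTP, it helps to validate the contract on the client side as well. A stdlib-only sketch that mirrors the AccountSummary schema above (the parse_account_summary helper is illustrative, not part of FastAPI or LangChain):

```python
from dataclasses import dataclass

@dataclass
class AccountSummary:
    account_id: str
    balance: float
    currency: str

def parse_account_summary(payload: dict) -> AccountSummary:
    """Validate a /summary response shape before handing it to an agent."""
    expected = {"account_id": str, "balance": (int, float), "currency": str}
    for field, typ in expected.items():
        if field not in payload:
            raise ValueError(f"Missing field: {field}")
        if not isinstance(payload[field], typ):
            raise TypeError(f"Field {field} has the wrong type")
    return AccountSummary(
        account_id=payload["account_id"],
        balance=float(payload["balance"]),
        currency=payload["currency"],
    )

summary = parse_account_summary(
    {"account_id": "acc_123", "balance": 12500.75, "currency": "USD"}
)
```

Rejecting a drifted schema early is cheaper than letting a malformed payload reach the LLM.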

  2. Create a LangChain retriever for bank policies and product docs

RAG should answer policy and product questions from indexed documents, not from the model’s memory. Load your documents into a vector store and expose a retriever.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

loader = TextLoader("bank_policy.txt")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=120)
chunks = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

This gives LangChain a retrieval layer that can pull the right policy passages before generating an answer.
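
If chunk_size=800 and chunk_overlap=120 look arbitrary, this simplified splitter shows what the two knobs mean. It is a naive fixed-window version for illustration, not LangChain's recursive, separator-aware algorithm:

```python
def split_text(text: str, chunk_size: int = 800, chunk_overlap: int = 120) -> list[str]:
    """Naive fixed-window splitter illustrating chunk_size/chunk_overlap semantics."""
    chunks = []
    step = chunk_size - chunk_overlap  # each chunk starts this many chars after the last
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the rest of the text is already covered by this chunk
    return chunks

policy_text = "Overdraft protection requires a risk review. " * 100
chunks = split_text(policy_text)
```

The overlap means the tail of each chunk repeats at the head of the next, so a policy sentence that straddles a boundary is still retrievable whole.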

  3. Wrap the FastAPI banking endpoint as a LangChain tool

Use a tool so the agent can call live banking APIs when it needs fresh account data. In production I prefer StructuredTool because it keeps inputs explicit.

import requests
from pydantic import BaseModel, Field
from langchain_core.tools import StructuredTool

class AccountQuery(BaseModel):
    account_id: str = Field(..., description="Bank account identifier")

def fetch_account_summary(account_id: str) -> dict:
    resp = requests.get(f"http://localhost:8000/accounts/{account_id}/summary", timeout=10)
    resp.raise_for_status()
    return resp.json()

account_summary_tool = StructuredTool.from_function(
    func=fetch_account_summary,
    name="fetch_account_summary",
    description="Fetch live banking account summary from the FastAPI service",
    args_schema=AccountQuery,
)

Now LangChain can decide when to use live API data versus retrieved policy context.
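
Live banking calls fail transiently, so in practice you will want retries around the tool function before registering it. A minimal stdlib sketch (the with_retries wrapper and its backoff values are illustrative; a production service would likely reach for a dedicated retry library):

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.5):
    """Wrap a flaky API call with simple exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the real error
                time.sleep(base_delay * (2 ** attempt))
    return wrapper

# Demo: a call that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient upstream error")
    return {"account_id": "acc_123", "balance": 12500.75}

result = with_retries(flaky_fetch, attempts=3, base_delay=0)()
```

Applied to the code above, you would wrap fetch_account_summary with with_retries before passing it to StructuredTool.from_function, so the agent never sees a one-off network blip as a hard failure.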

  4. Build a RAG chain that combines retrieved docs and live banking data

The pattern here is simple: retrieve policy context first, then call the banking tool if needed. Use a chat model plus a prompt that tells the agent what source to trust for each type of question.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a banking assistant. Use policy docs for rules and tools for live account data."),
    ("human", "Question: {question}\n\nPolicy context:\n{context}\n\nLive data:\n{live_data}")
])

def answer_banking_question(question: str, account_id: str):
    docs = retriever.invoke(question)
    context = "\n\n".join(d.page_content for d in docs)

    live_data = account_summary_tool.invoke({"account_id": account_id})

    chain_input = prompt.format_messages(
        question=question,
        context=context,
        live_data=str(live_data),
    )
    return llm.invoke(chain_input).content

This is enough to support questions like “Can this customer qualify for overdraft protection?” where policy text and current balances both matter.
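
Before interpolating live data into the prompt, consider masking identifiers you do not want leaving your service boundary. A small illustrative sketch (the SENSITIVE_FIELDS set is a placeholder; your compliance team defines the real list):

```python
SENSITIVE_FIELDS = {"account_id", "ssn", "card_number"}  # illustrative, not exhaustive

def redact(live_data: dict) -> dict:
    """Mask sensitive identifiers before live data is formatted into a prompt."""
    return {k: ("***" if k in SENSITIVE_FIELDS else v) for k, v in live_data.items()}

safe = redact({"account_id": "acc_123", "balance": 12500.75, "currency": "USD"})
```

In answer_banking_question above, you would pass redact(live_data) instead of the raw tool result; the model rarely needs the literal account number to reason about balances and policy.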

  5. Expose the RAG workflow as another FastAPI endpoint

Once the chain works locally, wrap it in an API so your agent platform or frontend can call it directly.

class AgentRequest(BaseModel):
    question: str
    account_id: str

class AgentResponse(BaseModel):
    answer: str

@app.post("/agent/answer", response_model=AgentResponse)
def agent_answer(payload: AgentRequest):
    answer = answer_banking_question(payload.question, payload.account_id)
    return AgentResponse(answer=answer)

That gives you one entry point for your AI agent system while keeping banking logic behind typed endpoints.
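
Banking deployments usually need an audit trail for every agent answer. A minimal structured-logging sketch (the field names, and the choice to log answer length rather than content to limit PII exposure, are assumptions, not a standard):

```python
import json
import logging
import time

audit_logger = logging.getLogger("agent.audit")

def audit_record(account_id: str, question: str, answer: str) -> str:
    """Emit one structured JSON audit entry per agent answer and return it."""
    entry = {
        "ts": time.time(),
        "account_id": account_id,
        "question": question,
        "answer_chars": len(answer),  # length only, to keep answer content out of logs
    }
    line = json.dumps(entry)
    audit_logger.info(line)
    return line

line = audit_record("acc_123", "What is my balance?", "USD 12,500.75")
```

A natural place to call this is inside the /agent/answer handler, right before returning the response.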

Testing the Integration

Run FastAPI first:

uvicorn main:app --reload --port 8000

Then test the end-to-end flow with Python:

import requests

payload = {
    "question": "What is this customer's current balance and can they request overdraft support?",
    "account_id": "acc_123"
}

resp = requests.post("http://localhost:8000/agent/answer", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["answer"])

Expected output will vary by model wording, but it should include both live account details and policy-based guidance:

The customer’s current balance is USD 12,500.75. Based on the policy context provided, overdraft support may require additional eligibility checks such as minimum activity or risk review.

Real-World Use Cases

  • Customer support copilot
    • Answer balance questions, transaction disputes, fee explanations, and policy questions from one agent interface.
  • Ops assistant for bank staff
    • Pull live account summaries through FastAPI while using RAG over internal SOPs for escalation paths and compliance steps.
  • KYC and onboarding helper
    • Combine document retrieval with workflow APIs to guide staff through missing-document checks and approval rules.

The production pattern is straightforward: FastAPI owns trusted banking operations, LangChain owns retrieval and orchestration. Keep them separate at the service boundary, then combine them at the agent layer where you can control prompts, tools, retries, and audit logging.


By Cyprian Aarons, AI Consultant at Topiax.
