How to Integrate CrewAI for retail banking with FastAPI for RAG

By Cyprian Aarons · Updated 2026-04-22
Tags: crewai-for-retail-banking, fastapi, rag

Combining CrewAI for retail banking with FastAPI gives you a clean split between orchestration and serving. CrewAI handles the multi-agent workflow for tasks like KYC checks, product matching, and policy lookup, while FastAPI exposes those capabilities as a RAG-backed API your internal apps, chat surfaces, or banker tools can call.

Prerequisites

  • Python 3.10+
  • A working FastAPI app
  • CrewAI installed with your retail banking agent setup
  • An LLM provider configured for CrewAI, such as OpenAI or Azure OpenAI
  • A vector store or retrieval backend for RAG, such as Chroma, pgvector, or Pinecone
  • Bank policy documents, product guides, or FAQ content indexed for retrieval
  • uvicorn for running the API locally

Install the core packages:

pip install fastapi uvicorn crewai crewai-tools langchain-openai langchain-chroma chromadb pydantic

Integration Steps

  1. Define the retail banking RAG tool

    Your CrewAI agents need a retriever they can call during task execution. In retail banking, this usually means policy docs, fee schedules, account opening rules, and product eligibility criteria.

from crewai_tools import tool
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="retail_banking_kb",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)

@tool("bank_policy_search")
def bank_policy_search(query: str) -> str:
    """Search retail banking knowledge base for policy and product answers."""
    docs = vectorstore.similarity_search(query, k=3)
    return "\n\n".join([doc.page_content for doc in docs])
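The tool above assumes the retail_banking_kb collection is already populated. A minimal sketch of that indexing step, using a hypothetical chunk_text helper (plain character windows with overlap; the sizes are illustrative, and the actual add_texts call needs an OpenAI key for embeddings, so it is shown commented out):

```python
# Simple character-based chunker with overlap, so each indexed passage
# stays within an embedding-friendly size. chunk_text is a hypothetical
# helper, not part of CrewAI or LangChain.
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# Indexing a policy document would then look like:
# passages = chunk_text(open("policies/account_opening.txt").read())
# vectorstore.add_texts(
#     passages,
#     metadatas=[{"source": "account_opening"} for _ in passages],
# )
```

Attaching a source field in the metadata makes it possible to trace an answer back to the policy document it came from.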
  2. Create CrewAI agents and a task

    Use one agent to answer the customer question and another to validate the result against banking policy. This pattern works better than stuffing everything into one prompt.

from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

policy_agent = Agent(
    role="Retail Banking Policy Analyst",
    goal="Answer customer questions using approved bank policy and product documentation.",
    backstory="You work in a regulated retail banking environment.",
    llm=llm,
    tools=[bank_policy_search],
)

review_agent = Agent(
    role="Compliance Reviewer",
    goal="Check that responses are consistent with bank policy and do not overpromise.",
    backstory="You review customer-facing responses for accuracy and compliance.",
    llm=llm,
)

answer_task = Task(
    description=(
        "Answer the customer's question using bank_policy_search: {question}. "
        "Return a concise response suitable for a banker assistant."
    ),
    expected_output="A grounded answer with no unsupported claims.",
    agent=policy_agent,
)

review_task = Task(
    description="Review the drafted answer for policy risk and factual issues.",
    expected_output="Approved or revised answer with brief rationale.",
    agent=review_agent,
)

crew = Crew(
    agents=[policy_agent, review_agent],
    tasks=[answer_task, review_task],
    process=Process.sequential,
)
  3. Wrap the CrewAI workflow in a FastAPI endpoint

    FastAPI becomes the thin API layer around your agent system. Keep request/response models strict so downstream consumers get predictable outputs.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Retail Banking RAG API")

class QueryRequest(BaseModel):
    question: str

class QueryResponse(BaseModel):
    answer: str

@app.post("/rag/query", response_model=QueryResponse)
def rag_query(payload: QueryRequest):
    result = crew.kickoff(inputs={"question": payload.question})
    return QueryResponse(answer=str(result))
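Before handing the question to the crew, it is worth rejecting empty or oversized input so malformed prompts never reach the LLM. A minimal sketch, where validate_question is a hypothetical helper and the length limit is illustrative; the endpoint can catch the ValueError and return an HTTP 422:

```python
# Lightweight input guard to run before crew.kickoff. Collapses
# whitespace, rejects empty questions, and caps prompt length.
MAX_QUESTION_CHARS = 2000

def validate_question(question: str) -> str:
    cleaned = " ".join(question.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("question must not be empty")
    if len(cleaned) > MAX_QUESTION_CHARS:
        raise ValueError("question too long")
    return cleaned
```

The same constraints could instead be expressed declaratively on the Pydantic model with Field(min_length=..., max_length=...), which lets FastAPI produce the 422 automatically.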
  4. Add retrieval-friendly input handling

    In production banking flows, you usually need context like customer segment, country, or product type. Pass those into the crew so retrieval is narrower and answers are safer.

@app.post("/rag/query/with-context", response_model=QueryResponse)
def rag_query_with_context(payload: QueryRequest):
    inputs = {
        "question": payload.question,
        "customer_type": "retail",
        "market": "UK",
        "channel": "branch_assistant",
    }
    result = crew.kickoff(inputs=inputs)
    return QueryResponse(answer=str(result))
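Note that extra inputs only influence the run if a task description actually references them: CrewAI interpolates kickoff inputs into {placeholder} slots in task descriptions. A sketch of how that substitution behaves, where ANSWER_TEMPLATE is an illustrative description and plain str.format stands in for the interpolation:

```python
# Task descriptions act as templates: CrewAI substitutes kickoff inputs
# into matching {placeholders}. Inputs with no matching placeholder are
# simply ignored, so context fields must appear in the description to
# narrow retrieval.
ANSWER_TEMPLATE = (
    "Answer the {customer_type} customer's question for the {market} market, "
    "via the {channel} channel, using bank_policy_search: {question}"
)

inputs = {
    "question": "Can a student open a current account?",
    "customer_type": "retail",
    "market": "UK",
    "channel": "branch_assistant",
}
rendered = ANSWER_TEMPLATE.format(**inputs)
```

If a context field like market never appears in any task description, passing it to kickoff has no effect on the answer.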
  5. Run FastAPI and expose it to your internal systems

    Start the service with Uvicorn and wire it into your chatbot, CRM plugin, or banker workstation.

# main.py
# run with: uvicorn main:app --reload --port 8000

Testing the Integration

Hit the endpoint with a realistic retail banking question:

import requests

response = requests.post(
    "http://localhost:8000/rag/query",
    json={"question": "Can a student open a current account without proof of income?"}
)

print(response.status_code)
print(response.json())

Expected output:

200
{'answer': '...grounded response based on retrieved bank policy...'}

If retrieval is wired correctly, the answer should reference approved account-opening rules and avoid inventing eligibility criteria.
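One way to automate part of that check is a crude smoke test over the returned answer, flagging promissory language the compliance reviewer should have removed. The banned-phrase list below is illustrative, not taken from any bank policy:

```python
# Crude grounding smoke check: flag answers containing language that
# overpromises eligibility or skips checks. A real deployment would use
# a curated list maintained with compliance.
BANNED_PHRASES = ["guaranteed approval", "no checks required", "always eligible"]

def flags_overpromise(answer: str) -> bool:
    lowered = answer.lower()
    return any(phrase in lowered for phrase in BANNED_PHRASES)
```

A test suite can run this over a batch of canned questions and fail the build if any response trips the filter.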

Real-World Use Cases

  • Branch assistant for account opening

    • Staff ask questions like “What documents are required for joint accounts?” and get grounded answers from policy docs.
  • Customer support copilot

    • Call center agents use the API to retrieve fee schedules, overdraft rules, card replacement steps, and complaint handling guidance.
  • Product eligibility checker

    • A front-end app sends customer context to FastAPI, which routes it through CrewAI agents that validate product fit against bank policy.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
