How to Integrate CrewAI for banking with FastAPI for RAG

By Cyprian AaronsUpdated 2026-04-22
crewai-for-bankingfastapirag

Combining CrewAI for banking with FastAPI gives you a clean way to expose agent-driven RAG workflows as production APIs. The pattern is simple: CrewAI handles task orchestration and tool use, while FastAPI gives you a stable HTTP layer for retrieval, validation, auth, and downstream integration.

This is useful when you need banking assistants that answer policy questions, summarize customer documents, or route cases using internal knowledge bases. Instead of embedding agent logic inside a monolith, you keep the agent pipeline isolated and serve it through an API your web app, CRM, or case management system can call.

Prerequisites

  • Python 3.10+
  • fastapi
  • uvicorn
  • crewai
  • crewai-tools
  • A vector store or retrieval backend:
    • chromadb, faiss-cpu, or your managed vector DB
  • An LLM provider configured through environment variables
  • Banking knowledge sources ready for ingestion:
    • policy PDFs
    • product manuals
    • compliance docs
    • FAQ articles
  • Basic familiarity with:
    • FastAPI routes and Pydantic models
    • CrewAI agents, tasks, and crews
    • Retrieval-Augmented Generation patterns

Install the core packages:

pip install fastapi uvicorn crewai crewai-tools chromadb pydantic

Integration Steps

1. Build a retrieval tool for banking documents

CrewAI works best when the agent has a narrow tool surface. For RAG, expose one tool that retrieves relevant banking context from your vector store.

from crewai_tools import tool

@tool("banking_rag_search")
def banking_rag_search(query: str) -> str:
    """
    Retrieve relevant banking policy snippets for a user query.
    Replace this stub with your vector DB search.
    """
    results = [
        "Mortgage eligibility requires 6 months of bank statements.",
        "KYC refresh is required every 12 months for retail customers.",
    ]
    return "\n".join(results)

In production, replace the stub with your retriever call. The important part is that CrewAI gets a deterministic function it can call during task execution.

2. Define the CrewAI agent and task

Create an agent that acts like a banking support analyst. Keep the instructions specific so it uses retrieved context instead of inventing answers.

from crewai import Agent, Task, Crew, Process

banking_agent = Agent(
    role="Banking Support Analyst",
    goal="Answer banking policy questions using retrieved internal knowledge only",
    backstory=(
        "You work in a regulated banking environment and must rely on provided context."
    ),
    tools=[banking_rag_search],
    verbose=True,
)

answer_task = Task(
    description=(
        "Answer the user's question using only the retrieved context. "
        "If the answer is not in context, say you do not have enough information."
    ),
    expected_output="A concise banking answer grounded in retrieved context.",
    agent=banking_agent,
)

crew = Crew(
    agents=[banking_agent],
    tasks=[answer_task],
    process=Process.sequential,
)

For RAG-heavy banking flows, this setup keeps the model from freewheeling. The tool provides evidence; the task instruction controls response behavior.

3. Wrap the crew in a FastAPI endpoint

Now expose the workflow over HTTP. Use Pydantic to validate requests and make the contract explicit.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Banking RAG API")

class QueryRequest(BaseModel):
    question: str

class QueryResponse(BaseModel):
    answer: str

@app.post("/rag/query", response_model=QueryResponse)
def rag_query(payload: QueryRequest):
    result = crew.kickoff(inputs={"question": payload.question})
    return QueryResponse(answer=str(result))

This is where FastAPI earns its keep:

  • request validation before the agent runs
  • clean response schema for consumers
  • easy extension for auth, logging, rate limiting, and tracing

4. Add retrieval-aware input handling

In real systems, you usually pass metadata into the crew so it can tailor retrieval by product line, region, or customer segment. You can inject those fields through FastAPI and forward them into kickoff().

from pydantic import BaseModel

class BankingQueryRequest(BaseModel):
    question: str
    product: str | None = None
    region: str | None = None

@app.post("/rag/banking-query")
def banking_query(payload: BankingQueryRequest):
    inputs = {
        "question": payload.question,
        "product": payload.product or "retail_banking",
        "region": payload.region or "default",
    }
    result = crew.kickoff(inputs=inputs)
    return {"answer": str(result)}

That pattern matters when your bank has different rules by jurisdiction or product type. You can route those fields into retrieval filters inside banking_rag_search.

5. Run and wire it into your app stack

Start FastAPI with Uvicorn and point your frontend or internal services at the endpoint.

uvicorn main:app --reload --host 0.0.0.0 --port 8000

Typical production additions:

  • JWT auth on /rag/*
  • audit logging of every prompt and response
  • request IDs propagated to tracing systems
  • caching for repeated policy lookups

Testing the Integration

Use FastAPI’s test client to verify that the endpoint returns a grounded answer path.

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

response = client.post(
    "/rag/query",
    json={"question": "What documents are needed for mortgage approval?"}
)

print(response.status_code)
print(response.json())

Expected output:

200
{"answer":"...retrieved banking answer..."}

If you want stricter validation in tests, assert on status code and key phrases from your known document set:

assert response.status_code == 200
assert "mortgage" in response.json()["answer"].lower()

Real-World Use Cases

  • Internal policy assistant

    • Staff ask questions about KYC, AML, loan servicing, or account opening rules.
    • FastAPI exposes the endpoint; CrewAI retrieves from approved policy docs and answers with citations if you add them.
  • Customer support triage

    • A support portal sends customer questions to /rag/query.
    • The agent classifies intent, pulls relevant product guidance, and returns a draft response for review.
  • Compliance document helper

    • Analysts query procedures across multiple document sets.
    • The API can filter by region or business unit before running retrieval so answers stay scoped correctly.

The main design rule is simple: keep FastAPI thin and keep CrewAI focused on reasoning over retrieved context. That separation makes the system easier to test, safer to audit, and much easier to evolve when your bank adds more products or jurisdictions.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit

Related Guides