How to Integrate CrewAI for fintech with FastAPI for RAG
Combining CrewAI for fintech with FastAPI gives you a clean way to expose a multi-agent RAG system over HTTP without turning your app into a mess of orchestration code. The pattern is simple: use CrewAI to coordinate finance-specific agents, and FastAPI to serve retrieval, inference, and workflow endpoints that your product can call from web apps, back offices, or internal tools.
Prerequisites
Before you wire this up, make sure you have:
- Python 3.10+
- `crewai` installed
- `fastapi` and `uvicorn` installed
- A vector store or retriever for your RAG layer
- API keys configured for your LLM provider
- A finance corpus ready for ingestion: policy docs, product sheets, compliance notes, and a customer support knowledge base
Install the core packages:
```bash
pip install crewai fastapi uvicorn pydantic
```
If you are using a local retriever, also install your embedding/vector stack:
```bash
pip install chromadb sentence-transformers
```
Integration Steps
1. **Define the retrieval layer first**

Your FastAPI app should own document retrieval. That keeps CrewAI focused on reasoning and task execution, not storage concerns.
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List

app = FastAPI(title="Fintech RAG API")

class QueryRequest(BaseModel):
    question: str

class RetrievalResult(BaseModel):
    chunks: List[str]

# Replace this with Chroma, Pinecone, FAISS, etc.
FINANCE_KB = {
    "loan_policy": "Loan approval requires KYC verification, income validation, and credit score review.",
    "fraud_policy": "Suspicious transfers above threshold require manual review and case creation.",
}

def retrieve_chunks(question: str) -> list[str]:
    matches = []
    q = question.lower()
    for text in FINANCE_KB.values():
        if any(token in text.lower() for token in q.split()):
            matches.append(text)
    return matches[:5]
```
This is intentionally simple. In production, swap `retrieve_chunks()` for a real vector search call and keep the endpoint contract the same.
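One way to keep that contract stable is to hide the retriever behind a small interface. The `Retriever` protocol and `KeywordRetriever` class below are illustrative names of my own, not part of CrewAI or FastAPI; this is a minimal sketch of the idea, not a production retriever:

```python
from typing import List, Protocol

class Retriever(Protocol):
    """The contract the endpoint depends on; swap implementations freely."""
    def retrieve(self, question: str, k: int = 5) -> List[str]: ...

class KeywordRetriever:
    """Stand-in retriever that scores documents by overlapping terms.
    A vector-store-backed class would implement the same signature."""

    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    def retrieve(self, question: str, k: int = 5) -> List[str]:
        terms = set(question.lower().split())
        scored = []
        for text in self.docs.values():
            # Count how many query terms appear in the document.
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

A Chroma- or Pinecone-backed class would implement the same `retrieve()` signature, so the endpoints never need to change when you upgrade the storage layer.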
2. **Create CrewAI agents for fintech reasoning**

CrewAI gives you agent/task abstractions. For fintech RAG, a good split is:

- one agent to interpret the request,
- one agent to ground the answer in retrieved context,
- one agent to validate compliance language.

The example below merges the first two roles into a single analyst agent to keep things compact.
```python
from crewai import Agent, Task, Crew, Process

rag_analyst = Agent(
    role="Fintech RAG Analyst",
    goal="Answer user questions using retrieved finance context only",
    backstory="You work on banking support and must stay grounded in policy documents.",
    verbose=True,
)

compliance_reviewer = Agent(
    role="Compliance Reviewer",
    goal="Check responses for risky or unsupported claims",
    backstory="You review financial answers for accuracy and regulatory risk.",
    verbose=True,
)

def build_crew(question: str, chunks: list[str]) -> Crew:
    context = "\n\n".join(chunks) if chunks else "No relevant context found."

    answer_task = Task(
        description=f"""
        Use only the provided context to answer the user question.

        Question:
        {question}

        Context:
        {context}
        """,
        expected_output="A concise answer grounded in the retrieved context.",
        agent=rag_analyst,
    )

    review_task = Task(
        description="""
        Review the drafted answer for unsupported financial claims,
        missing caveats, or compliance issues.
        """,
        expected_output="A reviewed answer with corrections if needed.",
        agent=compliance_reviewer,
        context=[answer_task],
    )

    return Crew(
        agents=[rag_analyst, compliance_reviewer],
        tasks=[answer_task, review_task],
        process=Process.sequential,
        verbose=True,
    )
```
This structure works well when your fintech team wants traceability. You can log retrieved chunks, task output, and reviewer output separately.
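To get that separation in your logs, you can keep a per-request trace object and record each stage as it completes. `RagTrace` here is a hypothetical helper of my own, not a CrewAI feature; a minimal sketch:

```python
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class RagTrace:
    """Collects per-stage outputs for one RAG request so each layer
    (retrieval, analyst draft, compliance review) can be inspected alone."""
    question: str
    chunks: list = field(default_factory=list)
    stages: list = field(default_factory=list)

    def record(self, stage: str, output: str) -> None:
        # Timestamped entry per stage, in execution order.
        self.stages.append({"stage": stage, "output": output, "ts": time.time()})

    def to_json(self) -> str:
        return json.dumps(asdict(self))
```

Inside the endpoint you would call `trace.record("analyst", ...)` after each task output and ship `trace.to_json()` to your logging pipeline.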
3. **Expose a FastAPI endpoint that runs retrieval + CrewAI**

Now connect both layers in one endpoint. FastAPI handles the request lifecycle; CrewAI handles orchestration.
```python
@app.post("/rag/answer")
def rag_answer(request: QueryRequest):
    # Declared sync so FastAPI runs it in a worker thread: crew.kickoff()
    # blocks, and would stall the event loop inside an async handler.
    chunks = retrieve_chunks(request.question)
    if not chunks:
        raise HTTPException(status_code=404, detail="No relevant finance context found")

    crew = build_crew(request.question, chunks)
    result = crew.kickoff()

    return {
        "question": request.question,
        "retrieved_chunks": chunks,
        "answer": str(result),
    }
```
The key method here is `crew.kickoff()`. That is where CrewAI executes the tasks in order and returns the final output.
4. **Add dedicated health check and retrieval endpoints**

In real systems you want separate endpoints for observability and debugging. Keep one endpoint for raw retrieval so you can inspect what the RAG layer is doing before blaming the model.
```python
@app.get("/health")
async def health():
    return {"status": "ok"}

@app.post("/rag/retrieve", response_model=RetrievalResult)
async def rag_retrieve(request: QueryRequest):
    chunks = retrieve_chunks(request.question)
    return RetrievalResult(chunks=chunks)
```
This makes it easier to test whether failures come from retrieval or from CrewAI orchestration.
5. **Run FastAPI locally and wire it into your app**

Start the service with Uvicorn:

```bash
uvicorn main:app --reload --port 8000
```
From another service or frontend client, call `/rag/answer` with a user question. If you are building an internal banking assistant, this endpoint becomes your orchestration boundary.
Testing the Integration
Use `curl` or `httpx` to verify that retrieval and crew execution both work.

```python
import httpx

payload = {"question": "What checks are required before loan approval?"}
response = httpx.post("http://localhost:8000/rag/answer", json=payload)

print(response.status_code)
print(response.json())
```
Expected output:

```json
{
  "question": "What checks are required before loan approval?",
  "retrieved_chunks": [
    "Loan approval requires KYC verification, income validation, and credit score review."
  ],
  "answer": "...final reviewed response..."
}
```
If `retrieved_chunks` is empty but your question is valid, fix retrieval first. If retrieval looks right but the answer is off-topic, inspect your CrewAI task instructions and agent constraints.
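That triage rule can be captured as a tiny helper so failures route to the right layer automatically. The function name and labels below are my own, not from CrewAI or FastAPI, and this only catches empty outputs; off-topic answers still need manual inspection:

```python
def triage_failure(retrieved_chunks: list, answer: str) -> str:
    """First-pass guess at which layer a bad response came from."""
    if not retrieved_chunks:
        # Nothing came back from the KB: fix indexing or the query first.
        return "retrieval"
    if not answer.strip():
        # Context existed but the crew produced nothing: inspect task
        # descriptions and agent constraints.
        return "orchestration"
    return "ok"
```

Wiring this into your test client turns a vague "the bot is wrong" report into a pointer at either the retrieval endpoint or the crew configuration.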
Real-World Use Cases
- **Customer support assistant for lending products.** Answer questions about eligibility, repayment rules, fees, and document requirements using policy-backed RAG.
- **Internal compliance copilot.** Let analysts query procedures while a reviewer agent checks responses against approved policy language.
- **Fraud operations assistant.** Retrieve case playbooks and run an agent workflow that suggests next steps for suspicious transaction reviews.
The clean pattern here is: FastAPI owns serving and retrieval APIs; CrewAI owns multi-step reasoning over retrieved context. Keep those responsibilities separate and your fintech agent system stays testable, debuggable, and easier to ship.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist plus starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.