How to Integrate CrewAI for payments with FastAPI for RAG
Combining CrewAI for payments with FastAPI gives you a clean way to expose payment-aware agent workflows as HTTP endpoints. In practice, this lets you build RAG systems that can answer questions, trigger payment checks, and route premium actions through an API your app can call directly.
Prerequisites
- Python 3.10+
- fastapi
- uvicorn
- crewai
- A CrewAI Payments API key or configured payment provider credentials
- A working RAG backend:
  - a vector store like Pinecone, Chroma, or FAISS
  - a configured embedding model
- Basic familiarity with:
  - FastAPI routers and request models
  - CrewAI Agent, Task, and Crew
  - async Python for API handlers
Install the core packages:
```shell
pip install fastapi uvicorn crewai pydantic
```
Integration Steps
1. Define the payment-aware agent flow
Start by creating a CrewAI agent that can handle payment-related decisions inside your RAG pipeline. The key pattern is to separate retrieval from payment validation so your agent can decide whether a user is allowed to access premium context.
```python
from crewai import Agent, Task, Crew, Process

payment_agent = Agent(
    role="Payment Validation Agent",
    goal="Validate whether the user can access paid RAG responses",
    backstory="You check entitlement before premium retrieval is returned.",
    verbose=True,
)

payment_task = Task(
    description=(
        "Given a user_id and plan status, determine whether premium RAG access "
        "should be granted."
    ),
    expected_output="A short approval or denial decision with reason.",
    agent=payment_agent,
)

crew = Crew(
    agents=[payment_agent],
    tasks=[payment_task],
    process=Process.sequential,
)
```
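Before wiring this into an API, it helps to see what `Process.sequential` implies: tasks run in order, and each task's output becomes context for the next. The sketch below is an illustrative stdlib-only stand-in, not CrewAI's actual implementation; the `run_sequential` helper and the toy `validate_access` task are both assumptions introduced here.

```python
# Illustrative stand-in for a sequential crew: tasks run in order and each
# task's output is added to the shared context for the next task.
def run_sequential(tasks, inputs: dict) -> str:
    context = dict(inputs)
    output = ""
    for task in tasks:
        output = task(context)                # run one task against the context
        context["previous_output"] = output   # later tasks can read it
    return output

# A toy "payment validation" task mirroring payment_task's description.
def validate_access(context: dict) -> str:
    if context.get("plan") in {"pro", "enterprise"}:
        return "approved: plan grants premium RAG access"
    return "denied: free plan has no premium access"

decision = run_sequential([validate_access], {"user_id": "u1", "plan": "pro"})
print(decision)  # approved: plan grants premium RAG access
```

With a single task the sequential process reduces to one call, but the same shape holds when you later append a retrieval or summarization task after the payment check.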
2. Build the FastAPI contract
Expose a /rag/query endpoint that accepts the user query plus billing metadata. FastAPI gives you typed request validation, which matters when you are passing payment flags into an agent workflow.
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Payment-Aware RAG API")

class RAGRequest(BaseModel):
    user_id: str
    query: str
    plan: str  # free | pro | enterprise

class RAGResponse(BaseModel):
    answer: str
    payment_status: str

@app.post("/rag/query", response_model=RAGResponse)
async def rag_query(payload: RAGRequest):
    if payload.plan not in {"free", "pro", "enterprise"}:
        raise HTTPException(status_code=400, detail="Invalid plan")
    return {"answer": "pending", "payment_status": "pending"}
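The manual membership check works, but the allowed plan values can also be encoded as a type, so invalid plans are rejected at parse time instead of inside the handler. A small stdlib sketch (the `Plan` enum and `parse_plan` helper are illustrative additions; Pydantic applies the same coercion automatically if you declare `plan: Plan` on the request model):

```python
from enum import Enum

class Plan(str, Enum):
    free = "free"
    pro = "pro"
    enterprise = "enterprise"

def parse_plan(value: str) -> Plan:
    # Raises ValueError for anything outside free/pro/enterprise,
    # mirroring the endpoint's 400 response for an invalid plan.
    return Plan(value)

print(parse_plan("pro").value)  # pro
```

Declaring the field as an enum also gets the allowed values into the generated OpenAPI schema for free.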
3. Connect CrewAI execution inside the endpoint
Now wire the agent into the route handler. In real systems, this is where you call your entitlement logic first, then run retrieval only if the user is eligible.
```python
from fastapi.concurrency import run_in_threadpool

def check_entitlement(user_id: str, plan: str) -> bool:
    # Replace with a real entitlement lookup against your payment provider.
    return plan in {"pro", "enterprise"}

@app.post("/rag/query", response_model=RAGResponse)
async def rag_query(payload: RAGRequest):
    entitled = check_entitlement(payload.user_id, payload.plan)
    if not entitled:
        return {
            "answer": "Upgrade required to access this knowledge base.",
            "payment_status": "denied",
        }
    # crew.kickoff() is synchronous; run it in a thread pool so it does not
    # block the event loop.
    result = await run_in_threadpool(crew.kickoff, inputs={
        "user_id": payload.user_id,
        "plan": payload.plan,
        "query": payload.query,
    })
    return {
        "answer": str(result),
        "payment_status": "approved",
    }
```
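The boolean check above treats `pro` and `enterprise` identically. If tiers ever gate different content, a rank map is a natural generalization; the `PLAN_RANK` table and the `required` parameter below are illustrative additions, not part of the original flow:

```python
# Hypothetical tier ranks: a higher rank unlocks everything below it.
PLAN_RANK = {"free": 0, "pro": 1, "enterprise": 2}

def check_entitlement(user_id: str, plan: str, required: str = "pro") -> bool:
    # Unknown plan strings rank below every tier and are denied.
    return PLAN_RANK.get(plan, -1) >= PLAN_RANK[required]

print(check_entitlement("user_123", "enterprise"))  # True
print(check_entitlement("user_123", "free"))        # False
```

This keeps the endpoint code unchanged while letting individual routes demand different tiers (for example, `required="enterprise"` on an audit endpoint).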
4. Add retrieval augmentation before returning the answer
For actual RAG, pull context from your vector store and pass it into the task input. The important part is that the retrieved chunks are conditioned on payment status before they reach the final response.
```python
from fastapi.concurrency import run_in_threadpool

def retrieve_context(query: str) -> list[str]:
    # Replace with a Chroma/Pinecone/FAISS lookup.
    return [
        "Policy A covers premium invoice disputes.",
        "Policy B allows chargeback review within 30 days.",
    ]

@app.post("/rag/query", response_model=RAGResponse)
async def rag_query(payload: RAGRequest):
    entitled = check_entitlement(payload.user_id, payload.plan)
    if not entitled:
        return {
            "answer": "This answer requires a paid plan.",
            "payment_status": "denied",
        }
    context = retrieve_context(payload.query)
    result = await run_in_threadpool(crew.kickoff, inputs={
        "user_id": payload.user_id,
        "plan": payload.plan,
        "query": payload.query,
        "context": context,
    })
    return {
        "answer": str(result),
        "payment_status": "approved",
    }
```
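Until a real vector store is wired in, you can make the `retrieve_context` stub slightly more realistic with a keyword-overlap ranking, so the full request path can be exercised end to end. This is a deliberately naive stand-in (the `score` helper and the extra `corpus`/`k` parameters are assumptions introduced here); swap in embedding search for production:

```python
def score(query: str, chunk: str) -> int:
    # Count shared lowercase tokens between the query and a chunk.
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve_context(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Return the k chunks with the highest token overlap with the query.
    return sorted(corpus, key=lambda c: score(query, c), reverse=True)[:k]

corpus = [
    "Policy A covers premium invoice disputes.",
    "Policy B allows chargeback review within 30 days.",
]
top = retrieve_context("chargeback review window", corpus, k=1)
print(top[0])  # Policy B allows chargeback review within 30 days.
```

Because the function signature stays close to the stub, replacing it later with a vector-store query only changes the body, not the endpoint.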
5. Run FastAPI and expose the integration
Use Uvicorn to serve the endpoint locally. This gives you a testable service boundary between your app layer and the agent workflow. Assuming the code above lives in main.py (which already defines `app` and the /rag/query route), start the server with:

```shell
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
Testing the Integration
Use curl or Python requests to verify both entitlement paths.
```python
import requests

url = "http://localhost:8000/rag/query"
payload = {
    "user_id": "user_123",
    "query": "What does policy B say about chargebacks?",
    "plan": "pro",
}

response = requests.post(url, json=payload)
print(response.status_code)
print(response.json())
```
Expected output:

```json
{
  "answer": "...CrewAI-generated response using retrieved context...",
  "payment_status": "approved"
}
```

For a free-plan user:

```json
{
  "answer": "This answer requires a paid plan.",
  "payment_status": "denied"
}
```
Real-World Use Cases
- Premium support copilots
  - Let users ask product or policy questions through RAG.
  - Gate deeper answers behind subscription checks before calling the agent.
- Insurance claims assistants
  - Combine document retrieval with payment status checks for claim eligibility workflows.
  - Route only approved users into detailed claim reasoning.
- Banking advisory bots
  - Expose account policy answers through FastAPI.
  - Use CrewAI to decide whether a user can access paid advisory content or enhanced summaries.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.