How to Integrate CrewAI for healthcare with FastAPI for RAG
Combining CrewAI for healthcare with FastAPI gives you a clean split between orchestration and delivery. CrewAI handles the multi-agent workflow for tasks like triage, summarization, and evidence gathering, while FastAPI exposes that workflow as a production API for retrieval-augmented generation (RAG).
That matters in healthcare because your RAG pipeline usually needs more than one step: fetch clinical context, rank relevant documents, synthesize an answer, and return structured output fast enough for downstream systems.
Prerequisites
- Python 3.10+
- A virtual environment set up with venv, poetry, or uv
- Installed packages:
  - crewai
  - fastapi
  - uvicorn
  - pydantic
  - your LLM provider SDK, such as openai or langchain-openai
- Access to your healthcare knowledge sources:
  - PDFs, policy docs, clinical guidelines, internal SOPs
  - vector store or retrieval backend such as Chroma, Pinecone, or FAISS
- Environment variables configured:
  - OPENAI_API_KEY or equivalent model key
  - any vector DB credentials
- A basic understanding of:
  - FastAPI request/response models
  - CrewAI Agent, Task, and Crew objects
Integration Steps
- Set up your RAG retrieval function

Keep retrieval outside the agent layer. The agent should reason over retrieved context, not own the search implementation.

    from typing import List


    def retrieve_medical_context(query: str) -> List[str]:
        # Replace this with your vector DB lookup
        docs = [
            "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
            "Diabetes care note: assess HbA1c quarterly for uncontrolled patients.",
        ]
        return docs[:2]
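The stub above returns the same two documents for every query. As a minimal sketch of query-dependent retrieval, the block below ranks an in-memory corpus by token overlap. The names `CORPUS` and `retrieve_by_overlap` are hypothetical, and keyword overlap is only a stand-in for a real vector store lookup, not a recommendation for production search.

```python
import re
from typing import List

# Hypothetical in-memory corpus; a real system would query a vector store instead.
CORPUS = [
    "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
    "Diabetes care note: assess HbA1c quarterly for uncontrolled patients.",
    "Asthma pathway: review inhaler technique at every visit.",
]


def _tokens(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_by_overlap(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by tokens shared with the query; keep up to top_k with any overlap."""
    query_tokens = _tokens(query)
    scored = [(len(query_tokens & _tokens(doc)), doc) for doc in corpus]
    scored = [(score, doc) for score, doc in scored if score > 0]
    # sorted() is stable, so documents with equal scores keep their corpus order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

Because the function keeps the same "query in, list of strings out" shape as `retrieve_medical_context`, swapping in a vector database later should not require changes to the agent layer.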
- Create CrewAI agents for healthcare RAG

Use one agent to interpret the question and another to synthesize the answer from retrieved context. In CrewAI, this is typically done with Agent, Task, and Crew.

    from crewai import Agent, Task, Crew, Process

    clinical_researcher = Agent(
        role="Clinical Researcher",
        goal="Find relevant clinical evidence in retrieved context",
        backstory="You analyze healthcare documentation and extract only supported facts.",
        verbose=True,
    )

    medical_writer = Agent(
        role="Medical Writer",
        goal="Produce a concise answer grounded in retrieved context",
        backstory="You write clear responses for clinicians and care operations teams.",
        verbose=True,
    )


    def build_crew(query: str, context_docs: list[str]) -> Crew:
        research_task = Task(
            description=(
                f"Review the following healthcare context and identify facts relevant to: {query}\n\n"
                + "\n".join(f"- {doc}" for doc in context_docs)
            ),
            expected_output="A bullet list of relevant clinical facts with no speculation.",
            agent=clinical_researcher,
        )

        synthesis_task = Task(
            description=(
                f"Using the research notes above, answer this question: {query}"
            ),
            expected_output="A short grounded answer suitable for a healthcare RAG response.",
            agent=medical_writer,
            context=[research_task],
        )

        return Crew(
            agents=[clinical_researcher, medical_writer],
            tasks=[research_task, synthesis_task],
            process=Process.sequential,
            verbose=True,
        )
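The research task joins the retrieved documents into its description as plain bullets. One possible refinement, sketched here under assumptions, is to number each document and cap the total size so the prompt stays bounded and the writer agent can refer to sources by index. `format_context` and `max_chars` are hypothetical names, and the default budget is arbitrary.

```python
from typing import List


def format_context(context_docs: List[str], max_chars: int = 2000) -> str:
    """Number each document and stop adding documents once max_chars would be exceeded."""
    lines: List[str] = []
    used = 0
    for index, doc in enumerate(context_docs, start=1):
        line = f"[{index}] {doc.strip()}"
        if used + len(line) > max_chars:
            break  # drop the remaining documents rather than cutting one mid-sentence
        lines.append(line)
        used += len(line) + 1  # +1 accounts for the joining newline
    return "\n".join(lines)
```

The returned string could replace the `"\n".join(...)` expression in the research task description. Dropping whole documents instead of truncating text keeps each cited passage intact, at the cost of sometimes using less of the budget.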
- Expose the RAG workflow through FastAPI

FastAPI becomes the interface layer for your AI system. Define a request schema, call retrieval, then run the crew inside the endpoint.

    from fastapi import FastAPI
    from pydantic import BaseModel

    # retrieve_medical_context and build_crew come from the previous steps
    # and are assumed to live in the same module (main.py).

    app = FastAPI(title="Healthcare RAG API")


    class RagRequest(BaseModel):
        query: str


    class RagResponse(BaseModel):
        answer: str
        sources: list[str]


    @app.post("/rag", response_model=RagResponse)
    async def rag_endpoint(payload: RagRequest):
        context_docs = retrieve_medical_context(payload.query)
        crew = build_crew(payload.query, context_docs)
        result = crew.kickoff()
        return RagResponse(
            answer=str(result),
            sources=context_docs,
        )
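The request schema accepts any string, including empty or very long ones, and each request triggers retrieval plus LLM calls. A small guard like the following sketch could run before retrieval. `normalize_query` and `MAX_QUERY_CHARS` are hypothetical names, and the 500-character limit is an arbitrary assumption.

```python
MAX_QUERY_CHARS = 500  # hypothetical limit; tune for your use case


def normalize_query(raw: str, max_chars: int = MAX_QUERY_CHARS) -> str:
    """Collapse runs of whitespace and reject empty or oversized queries with ValueError."""
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("query must not be empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"query exceeds {max_chars} characters")
    return cleaned
```

Inside the endpoint, a `ValueError` could be translated into `fastapi.HTTPException(status_code=422, detail=str(exc))` so clients get a clear validation error instead of a wasted crew run.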
- Add async-safe execution if your model calls are blocking

CrewAI workflows often involve blocking LLM calls. If you run them directly inside an async endpoint under load, you can stall the event loop. Push execution to a thread pool.

    from fastapi.concurrency import run_in_threadpool


    @app.post("/rag-safe", response_model=RagResponse)
    async def rag_safe_endpoint(payload: RagRequest):
        context_docs = retrieve_medical_context(payload.query)
        crew = build_crew(payload.query, context_docs)
        # kickoff() blocks, so run it in a worker thread and await the result
        result = await run_in_threadpool(crew.kickoff)
        return RagResponse(answer=str(result), sources=context_docs)
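Moving the call to a thread keeps the event loop free, but it does not limit how many crews run at once or how long a request waits. The following standard-library sketch shows one way to add both, using `asyncio.to_thread`, a semaphore, and `asyncio.wait_for`. `run_blocking` and `fake_kickoff` are hypothetical names; `fake_kickoff` only stands in for `crew.kickoff`, and the limits are illustrative.

```python
import asyncio
import time


async def run_blocking(func, *, timeout_s: float, limiter: asyncio.Semaphore):
    """Run a blocking callable in a worker thread, capped by limiter and bounded by timeout_s."""
    async with limiter:
        # wait_for stops awaiting after timeout_s; the worker thread itself
        # cannot be interrupted and still runs to completion in the background.
        return await asyncio.wait_for(asyncio.to_thread(func), timeout=timeout_s)


def fake_kickoff() -> str:
    """Stand-in for crew.kickoff(): blocks briefly, then returns a string."""
    time.sleep(0.05)
    return "grounded answer"


async def main() -> list:
    limiter = asyncio.Semaphore(2)  # at most two simulated crews run at once
    calls = [run_blocking(fake_kickoff, timeout_s=1.0, limiter=limiter) for _ in range(4)]
    return await asyncio.gather(*calls)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note the limitation in the comment: a timeout frees the request, not the thread, so a model-side timeout on the LLM client is still worth configuring.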
- Run the API locally

Start FastAPI with Uvicorn and point your client at /rag. The command assumes the code from the previous steps is saved as main.py.

    uvicorn main:app --reload --host 0.0.0.0 --port 8000
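Once the server is running, a client call could look like the sketch below, which uses only the standard library. `build_rag_request` and `ask` are hypothetical helper names, and the base URL assumes the host and port from the Uvicorn command above.

```python
import json
import urllib.request


def build_rag_request(query: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /rag endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/rag",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, base_url: str = "http://localhost:8000", timeout_s: float = 60.0) -> dict:
    """Send the request and decode the JSON response; requires the server to be running."""
    request = build_rag_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `ask("How often should HbA1c be checked?")` should return a dictionary with `answer` and `sources` keys, matching the `RagResponse` model. The generous timeout reflects that a multi-agent run can take much longer than a typical API call.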
Testing the Integration
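Use FastAPI’s test client to verify that the endpoint responds successfully and returns an answer together with the retrieved sources. This test runs the real crew, so it needs a valid model API key, and TestClient requires the httpx package to be installed.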
    from fastapi.testclient import TestClient

    from main import app

    client = TestClient(app)


    def test_rag_endpoint():
        response = client.post("/rag", json={"query": "What is first-line treatment for hypertension?"})
        assert response.status_code == 200
        data = response.json()
        assert "answer" in data
        assert "sources" in data
        assert len(data["sources"]) > 0


    if __name__ == "__main__":
        resp = client.post("/rag", json={"query": "How often should HbA1c be checked?"})
        print(resp.json())
Expected output:
    {
      "answer": "Based on the retrieved clinical context...",
      "sources": [
        "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
        "Diabetes care note: assess HbA1c quarterly for uncontrolled patients."
      ]
    }
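The endpoint test above calls a live model, which makes it slow and non-deterministic. One way to test the orchestration logic without an LLM, sketched here under assumptions, is to move the retrieve-then-run sequence into a plain function that takes its dependencies as arguments. `answer_query`, `RagResult`, and `FakeCrew` are hypothetical names; the only thing assumed about a real crew is that it exposes `kickoff()`, as used earlier.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


class SupportsKickoff(Protocol):
    """Anything with a kickoff() method, such as a CrewAI Crew or a test double."""

    def kickoff(self) -> object: ...


@dataclass
class RagResult:
    answer: str
    sources: List[str]


def answer_query(
    query: str,
    retrieve: Callable[[str], List[str]],
    build_crew: Callable[[str, List[str]], SupportsKickoff],
) -> RagResult:
    """Retrieve context, run the crew, and package the answer with its sources."""
    context_docs = retrieve(query)
    crew = build_crew(query, context_docs)
    return RagResult(answer=str(crew.kickoff()), sources=context_docs)


class FakeCrew:
    """Test double that reports how many documents it received instead of calling an LLM."""

    def __init__(self, query: str, docs: List[str]) -> None:
        self.query = query
        self.docs = docs

    def kickoff(self) -> str:
        return f"answer to {self.query!r} from {len(self.docs)} docs"


def test_answer_query_uses_retrieved_docs() -> None:
    result = answer_query("q", lambda _query: ["doc A", "doc B"], FakeCrew)
    assert result.sources == ["doc A", "doc B"]
    assert result.answer == "answer to 'q' from 2 docs"
```

With this split, the FastAPI endpoint would only call `answer_query(payload.query, retrieve_medical_context, build_crew)` and map the result to `RagResponse`, while the live-model test stays as a separate smoke test.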
Real-World Use Cases
- Clinical policy assistant
  Answer questions from internal hospital policies, formularies, and care pathways using retrieved documents plus an agent that summarizes only supported guidance.
- Prior authorization support
  Retrieve payer rules and clinical documentation requirements, then have CrewAI generate a structured checklist or draft justification.
- Care team knowledge assistant
  Build an API that helps nurses or case managers query SOPs, escalation rules, and patient education content through a single endpoint.
By Cyprian Aarons, AI Consultant at Topiax.