How to Integrate CrewAI for healthcare with FastAPI for RAG

By Cyprian Aarons · Updated 2026-04-22
Tags: crewai-for-healthcare, fastapi, rag

Combining CrewAI for healthcare with FastAPI gives you a clean split between orchestration and delivery. CrewAI handles the multi-agent workflow for tasks like triage, summarization, and evidence gathering, while FastAPI exposes that workflow as a production API for retrieval-augmented generation (RAG).

That matters in healthcare because your RAG pipeline usually needs more than one step: fetch clinical context, rank relevant documents, synthesize an answer, and return structured output fast enough for downstream systems.

Prerequisites

  • Python 3.10+
  • A virtual environment set up with venv, poetry, or uv
  • Installed packages:
    • crewai
    • fastapi
    • uvicorn
    • pydantic
    • your LLM provider SDK, such as openai or langchain-openai
  • Access to your healthcare knowledge sources:
    • PDFs, policy docs, clinical guidelines, internal SOPs
    • vector store or retrieval backend such as Chroma, Pinecone, or FAISS
  • Environment variables configured:
    • OPENAI_API_KEY or equivalent model key
    • any vector DB credentials
  • A basic understanding of:
    • FastAPI request/response models
    • CrewAI Agent, Task, and Crew objects
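Assuming pip inside your virtual environment, the packages above install in one step (`openai` shown as the example provider SDK; substitute your own):

```shell
pip install crewai fastapi uvicorn pydantic openai
```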

Integration Steps

  1. Set up your RAG retrieval function

    Keep retrieval outside the agent layer. The agent should reason over retrieved context, not own the search implementation.

    from typing import List
    
    def retrieve_medical_context(query: str) -> List[str]:
        # Replace this with your vector DB lookup
        docs = [
            "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
            "Diabetes care note: assess HbA1c quarterly for uncontrolled patients.",
        ]
        return docs[:2]
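In production the lookup above becomes a vector-store query, but the contract stays the same: query in, ranked snippets out. As a minimal stand-in, the hypothetical `rank_by_overlap` helper below scores documents by word overlap with the query; you can swap it for Chroma, Pinecone, or FAISS later without touching the endpoint.

```python
from typing import List

def rank_by_overlap(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    query_terms = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
    "Diabetes care note: assess HbA1c quarterly for uncontrolled patients.",
]
print(rank_by_overlap("first-line treatment for hypertension", docs, k=1))
```

Whatever backend you choose, keeping this function's signature stable means the agent layer never learns about the retrieval implementation.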
    
  2. Create CrewAI agents for healthcare RAG

    Use one agent to interpret the question and another to synthesize the answer from retrieved context. In CrewAI, this is typically done with Agent, Task, and Crew.

    from crewai import Agent, Task, Crew, Process
    
    clinical_researcher = Agent(
        role="Clinical Researcher",
        goal="Find relevant clinical evidence in retrieved context",
        backstory="You analyze healthcare documentation and extract only supported facts.",
        verbose=True,
    )
    
    medical_writer = Agent(
        role="Medical Writer",
        goal="Produce a concise answer grounded in retrieved context",
        backstory="You write clear responses for clinicians and care operations teams.",
        verbose=True,
    )
    
    def build_crew(query: str, context_docs: list[str]) -> Crew:
        research_task = Task(
            description=(
                f"Review the following healthcare context and identify facts relevant to: {query}\n\n"
                + "\n".join(f"- {doc}" for doc in context_docs)
            ),
            expected_output="A bullet list of relevant clinical facts with no speculation.",
            agent=clinical_researcher,
        )
    
        synthesis_task = Task(
            description=(
                f"Using the research notes above, answer this question: {query}"
            ),
            expected_output="A short grounded answer suitable for a healthcare RAG response.",
            agent=medical_writer,
            context=[research_task],
        )
    
        return Crew(
            agents=[clinical_researcher, medical_writer],
            tasks=[research_task, synthesis_task],
            process=Process.sequential,
            verbose=True,
        )
    
  3. Expose the RAG workflow through FastAPI

    FastAPI becomes the interface layer for your AI system. Define a request schema, call retrieval, then run the crew inside the endpoint.

    from fastapi import FastAPI
    from pydantic import BaseModel
    
    app = FastAPI(title="Healthcare RAG API")
    
    class RagRequest(BaseModel):
        query: str
    
    class RagResponse(BaseModel):
        answer: str
        sources: list[str]
    
    @app.post("/rag", response_model=RagResponse)
    async def rag_endpoint(payload: RagRequest):
        context_docs = retrieve_medical_context(payload.query)
        crew = build_crew(payload.query, context_docs)
    
        result = crew.kickoff()
        return RagResponse(
            answer=str(result),
            sources=context_docs,
        )
    
  4. Add async-safe execution if your model calls are blocking

    CrewAI workflows often involve blocking LLM calls. If you run them directly inside an async endpoint under load, you can stall the event loop. Push execution to a thread pool.

    from fastapi.concurrency import run_in_threadpool
    
    @app.post("/rag-safe", response_model=RagResponse)
    async def rag_safe_endpoint(payload: RagRequest):
        context_docs = retrieve_medical_context(payload.query)
        crew = build_crew(payload.query, context_docs)
    
        result = await run_in_threadpool(crew.kickoff)
        return RagResponse(answer=str(result), sources=context_docs)
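`run_in_threadpool` offloads the blocking call, but every in-flight request still occupies a worker thread. If you want an explicit ceiling on concurrent crew runs, a plain `asyncio.Semaphore` works. The sketch below is stdlib-only; the limit of 4 and the sleeping stand-in for `kickoff` are both assumptions to tune:

```python
import asyncio
import time

async def run_bounded(semaphore: asyncio.Semaphore, blocking_fn):
    """Run a blocking callable in a worker thread, gated by the semaphore."""
    async with semaphore:
        return await asyncio.to_thread(blocking_fn)

async def main() -> list:
    # At most 4 "kickoff" calls run at once; the rest queue on the semaphore.
    semaphore = asyncio.Semaphore(4)

    def fake_kickoff(i: int) -> int:  # stand-in for a blocking crew.kickoff
        time.sleep(0.05)
        return i

    return await asyncio.gather(
        *(run_bounded(semaphore, lambda i=i: fake_kickoff(i)) for i in range(8))
    )

results = asyncio.run(main())
print(results)
```

In the endpoint, the semaphore would live at module scope and wrap the `crew.kickoff` call the same way.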
    
  5. Run the API locally

    Start FastAPI with Uvicorn and point your client at /rag.

    uvicorn main:app --reload --host 0.0.0.0 --port 8000
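With the server up, a quick smoke test from another terminal (port 8000 as above):

```shell
curl -X POST http://localhost:8000/rag \
  -H "Content-Type: application/json" \
  -d '{"query": "What is first-line treatment for hypertension?"}'
```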
    

Testing the Integration

Use FastAPI’s test client to verify that retrieval happens first and CrewAI produces a grounded response.

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

def test_rag_endpoint():
    response = client.post("/rag", json={"query": "What is first-line treatment for hypertension?"})
    assert response.status_code == 200

    data = response.json()
    assert "answer" in data
    assert "sources" in data
    assert len(data["sources"]) > 0

if __name__ == "__main__":
    resp = client.post("/rag", json={"query": "How often should HbA1c be checked?"})
    print(resp.json())

Expected output:

{
  "answer": "Based on the retrieved clinical context...",
  "sources": [
    "Hypertension guideline: start with lifestyle changes and first-line antihypertensives.",
    "Diabetes care note: assess HbA1c quarterly for uncontrolled patients."
  ]
}
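A real crew run calls an LLM, which makes tests slow, costly, and non-deterministic. One common pattern is to stub out `kickoff` with a canned result. The sketch below demonstrates the stubbing mechanics with `unittest.mock` against a stand-in class so it runs without crewai installed; in your suite you would patch `crewai.Crew.kickoff` (or the `build_crew` factory) instead.

```python
from unittest.mock import patch

class Crew:
    """Stand-in for crewai.Crew so the stubbing pattern runs without an LLM."""
    def kickoff(self):
        raise RuntimeError("would call the model")

def build_answer(crew: Crew) -> str:
    # Mirrors the endpoint logic: stringify whatever kickoff returns.
    return str(crew.kickoff())

# Patching kickoff makes the test deterministic and free of API calls.
with patch.object(Crew, "kickoff", return_value="Start with lifestyle changes."):
    answer = build_answer(Crew())

print(answer)
```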

Real-World Use Cases

  • Clinical policy assistant
    Answer questions from internal hospital policies, formularies, and care pathways using retrieved documents plus an agent that summarizes only supported guidance.

  • Prior authorization support
    Retrieve payer rules and clinical documentation requirements, then have CrewAI generate a structured checklist or draft justification.

  • Care team knowledge assistant
    Build an API that helps nurses or case managers query SOPs, escalation rules, and patient education content through a single endpoint.
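For the prior authorization case in particular, structured output is the point. A hedged sketch of a response schema (field names here are illustrative assumptions, not a payer standard) that could replace the free-text `answer` field:

```python
from pydantic import BaseModel

class PriorAuthChecklist(BaseModel):
    """Illustrative structured output for a prior-authorization response."""
    procedure: str
    required_documents: list[str]
    criteria_met: bool
    draft_justification: str

checklist = PriorAuthChecklist(
    procedure="MRI lumbar spine",
    required_documents=["Physician notes", "Prior imaging report"],
    criteria_met=True,
    draft_justification="Six weeks of conservative therapy failed.",
)
print(checklist.model_dump())
```

Setting this model as the endpoint's `response_model` lets FastAPI validate the agent's output before it reaches a downstream system.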


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
