How to Build a Policy Q&A Agent Using CrewAI in Python for Healthcare
A healthcare policy Q&A agent answers staff questions against approved policy sources: benefits, prior auth, privacy, claims handling, care pathways, and internal SOPs. It matters because the wrong answer in healthcare can create compliance exposure, delay care, or send a patient down the wrong operational path.
Architecture
- Policy ingestion layer
  - Pulls from approved PDFs, HTML pages, SharePoint exports, or internal knowledge bases.
  - Normalizes content into chunks with source metadata: document name, version, effective date, owner.
- Retrieval layer
  - Finds the most relevant policy passages for a user question.
  - Must preserve citations so the agent can justify every answer.
- Answering agent
  - Uses an LLM to synthesize an answer from retrieved policy text.
  - Restricts itself to policy-grounded responses and refuses unsupported claims.
- Compliance guardrail
  - Blocks PHI leakage, disallowed medical advice, and answers outside approved scope.
  - Enforces “answer from policy only” behavior.
- Audit logging
  - Stores question, retrieved sources, final answer, model version, timestamps, and reviewer actions (a minimal record sketch follows this list).
  - Required for traceability in regulated environments.
- Deployment boundary
  - Runs in a controlled network segment with data residency constraints.
  - Keeps sensitive content inside approved cloud regions or on-prem infrastructure.
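To make the audit logging layer concrete, here is a minimal sketch of a per-interaction record. The schema and the JSON-lines file it writes to are assumptions for illustration; adapt both to whatever log store your compliance team already uses.

import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    # Illustrative schema, not a CrewAI API; extend to match your review workflow.
    question: str
    retrieved_sources: list  # source name + version for each excerpt shown to the model
    answer: str
    model_version: str
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    reviewer_action: str = "pending"  # e.g. "approved", "flagged", "escalated"

def log_interaction(record: AuditRecord, path: str = "./audit_log.jsonl") -> None:
    # Append-only JSON lines keep every answer reconstructable for reviewers.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")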
Implementation
1) Install CrewAI and define your dependencies
Use CrewAI with a real LLM provider and a retrieval tool. For healthcare policy Q&A, keep the retrieval source controlled; don’t point this at open web search.
pip install crewai crewai-tools langchain-openai chromadb pypdf
Set your model key and make sure your environment is locked down:
export OPENAI_API_KEY="your-key"
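A cheap way to enforce part of that is to fail fast at startup when the key is missing, rather than letting a misconfigured process limp along. A minimal sketch:

import os

# Fail fast on missing configuration; avoids silent fallbacks at request time.
if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; refusing to start.")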
2) Build a simple policy retriever tool
CrewAI tools are the cleanest way to expose internal policy search to an agent. Below is a practical pattern using Chroma as the local vector store and BaseTool from crewai.tools.
from typing import Type

from pydantic import BaseModel, Field
from crewai.tools import BaseTool
import chromadb

# Persistent local store for approved policy chunks.
client = chromadb.PersistentClient(path="./chroma_health_policy")
collection = client.get_or_create_collection(name="healthcare_policies")

class PolicySearchInput(BaseModel):
    query: str = Field(..., description="Healthcare policy question to search for")

class HealthcarePolicySearchTool(BaseTool):
    name: str = "healthcare_policy_search"
    description: str = (
        "Search approved healthcare policies and return relevant excerpts with citations."
    )
    args_schema: Type[BaseModel] = PolicySearchInput

    def _run(self, query: str) -> str:
        results = collection.query(
            query_texts=[query],
            n_results=3,
            include=["documents", "metadatas"],
        )
        # results["documents"] is a list of lists: one inner list per query text.
        hits = []
        for doc_list, meta_list in zip(results["documents"], results["metadatas"]):
            for doc, meta in zip(doc_list, meta_list):
                hits.append(
                    f"Source: {meta.get('source', 'unknown')} | "
                    f"Version: {meta.get('version', 'unknown')} | "
                    f"Date: {meta.get('effective_date', 'unknown')}\n"
                    f"Excerpt: {doc}"
                )
        return "\n\n".join(hits) if hits else "No approved policy matches found."
This tool gives the agent one job: retrieve approved text. That keeps the model from inventing policy language.
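Before wiring the tool into an agent, a quick standalone check confirms retrieval works. This sketch calls _run directly to bypass the agent entirely, and assumes you have already ingested at least one document (step 4 below):

# Smoke test the retriever in isolation (assumes ingestion has already run).
tool = HealthcarePolicySearchTool()
print(tool._run("prior authorization review timelines"))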
3) Create the agent and task with strict instructions
Use Agent, Task, and Crew directly. The key is to tell the model what it may and may not do.
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

policy_agent = Agent(
    role="Healthcare Policy Q&A Specialist",
    goal="Answer staff questions using only approved healthcare policies with citations.",
    backstory=(
        "You support healthcare operations. You must not provide medical advice. "
        "You must not infer beyond the supplied policy excerpts. "
        "If evidence is missing, say so clearly."
    ),
    tools=[HealthcarePolicySearchTool()],
    llm=llm,
    verbose=True,
)

question = "Can a member request a retroactive prior authorization review?"

# The {question} placeholder is filled in by crew.kickoff(inputs=...).
task = Task(
    description=(
        "Answer the following healthcare policy question using only retrieved "
        "policy excerpts: {question} "
        "Include citations by source name and version. If the policies do not "
        "support an answer, say that explicitly."
    ),
    expected_output="A concise answer with citations and a short note on any missing information.",
    agent=policy_agent,
)

crew = Crew(
    agents=[policy_agent],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff(inputs={"question": question})
print(result)
The important pattern here is not just calling kickoff(). It’s pairing a narrow tool with strict instructions and zero-temperature generation so answers stay deterministic enough for review.
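You can push that further with a lightweight post-generation guardrail between kickoff() and the user. The sketch below is an assumption layered on top, not a CrewAI feature: it treats the "Source:" prefix emitted by HealthcarePolicySearchTool as a crude citation signal and refuses any answer that lacks one.

def enforce_citation_guardrail(answer: str) -> str:
    # Heuristic check: approved answers must reference at least one retrieved
    # source. "Source:" matches the excerpt format the tool produces above.
    if "Source:" not in answer:
        return (
            "I can't answer this from the approved policies available to me. "
            "Please escalate to the policy team."
        )
    return answer

safe_answer = enforce_citation_guardrail(str(result))
print(safe_answer)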
4) Add ingestion before runtime
Your Q&A quality depends on how well you index policies. A basic ingestion script should extract text from PDFs or HTML and store metadata alongside each chunk.
from pypdf import PdfReader

def ingest_pdf(path: str, source_name: str, version: str, effective_date: str):
    reader = PdfReader(path)
    for page_num, page in enumerate(reader.pages):
        text = page.extract_text() or ""
        if not text.strip():
            continue
        collection.add(
            documents=[text],
            metadatas=[{
                "source": source_name,
                "version": version,
                "effective_date": effective_date,
                "page": page_num + 1,
            }],
            ids=[f"{source_name}-p{page_num + 1}"],
        )

# Example:
# ingest_pdf("./policies/prior_auth_policy.pdf", "Prior Auth Policy", "v3.2", "2025-01-15")
For healthcare teams, this ingestion step should run only against approved documents. Don’t mix draft policies with production content.
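One refinement worth considering: the ingestion above indexes whole pages, which is a coarse retrieval unit. Splitting each page into overlapping chunks usually improves retrieval hit rate. A minimal sketch, with arbitrary size and overlap values rather than tuned recommendations:

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    # Fixed-size character windows with overlap, so sentences that straddle a
    # boundary still appear intact in at least one chunk.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

Each chunk would then go to collection.add() with the same metadata plus a chunk index in its id, so citations still resolve to source, version, and page.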
Production Considerations
- Compliance controls
  - Block PHI from entering prompts unless you have an explicit legal basis and controls in place.
  - Log all prompts and responses for audit trails.
  - Keep citations tied to document versions so reviewers can reconstruct exactly what was answered.
- Data residency
  - Store embeddings and logs in the same jurisdiction as your regulated data.
  - If you use managed LLM APIs, confirm regional processing guarantees and retention terms.
  - Avoid routing sensitive content through consumer-grade endpoints.
- Monitoring
  - Track refusal rate, citation coverage, retrieval hit rate, and escalation rate.
  - Alert when answers are generated without supporting excerpts.
  - Sample outputs weekly for clinical safety review and policy drift detection.
- Guardrails
  - Enforce “policy only” responses when asked about coverage or process.
  - Escalate medical judgment questions to human staff or clinical workflows.
  - Add filters for PHI patterns like member IDs, diagnosis codes paired with identifiers, or free-text notes (see the sketch after this list).
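As a starting point for those PHI filters, here is a minimal pattern screen. Both regexes are illustrative assumptions (the member ID format is invented); real identifier formats should come from your compliance team.

import re

# Illustrative patterns only; tune to your organization's identifier formats.
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-shaped numbers
    re.compile(r"\bMBR\d{8}\b"),           # hypothetical member ID format
]

def contains_phi(text: str) -> bool:
    return any(p.search(text) for p in PHI_PATTERNS)

def screen_question(question: str) -> str:
    # Refuse to forward anything PHI-shaped to the model.
    if contains_phi(question):
        raise ValueError("Possible PHI detected; question blocked before the LLM call.")
    return question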
Common Pitfalls
- Using general web search instead of approved sources
  - Healthcare policy changes often. Public web results will drift from internal SOPs.
  - Fix it by restricting retrieval to curated documents with version metadata.
- Letting the model answer beyond evidence
  - If retrieval returns weak matches, models will fill gaps with plausible nonsense.
  - Fix it by forcing explicit refusal when no supporting excerpt is found.
- Ignoring auditability
  - A useful answer that cannot be traced back to source docs is a liability in regulated operations.
  - Fix it by storing question text, retrieved chunks, final output, timestamp, model name, and document versions together.
If you build this pattern correctly, CrewAI gives you a clean orchestration layer over retrieval plus constrained generation. For healthcare policy Q&A that means fewer hallucinations, better auditability, and a system compliance teams can actually approve.
By Cyprian Aarons, AI Consultant at Topiax.