How to Build a Compliance-Checking Agent Using CrewAI in Python for Healthcare
A healthcare compliance-checking agent reviews patient-facing content, clinical workflows, and operational documents against policy rules before they go live. It matters because the cost of a missed HIPAA issue, PHI exposure, or incorrect retention rule is not just technical debt — it is regulatory risk, audit pain, and loss of patient trust.
Architecture
A production healthcare compliance agent built with CrewAI needs these components:
- Policy intake layer
  - Loads HIPAA, internal security policy, retention rules, and approved terminology.
  - Keeps policy versions explicit so every decision can be traced.
- Document analyzer
  - Extracts text from emails, chat transcripts, care summaries, forms, and support tickets.
  - Normalizes content before it reaches the LLM.
- Compliance reasoning agent
  - Checks for PHI leakage, consent issues, minimum-necessary violations, and disallowed claims.
  - Produces structured findings with severity and rationale.
- Evidence collector
  - Captures source snippets, timestamps, policy references, and reviewer notes.
  - This is what you need when legal or audit asks “why was this flagged?”
- Escalation workflow
  - Routes high-risk cases to a human compliance officer.
  - Keeps the agent in assistive mode, not autonomous approval mode.
- Audit logger
  - Stores immutable decision logs in a compliant system with retention controls.
  - Supports data residency requirements by keeping logs in-region.
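These components hand work off in a straight line. A minimal skeleton of that flow, sketched with plain Python dataclasses (the names here are illustrative, not CrewAI APIs):

```python
from dataclasses import dataclass, field

@dataclass
class ReviewContext:
    """State passed between components; field names are illustrative."""
    document_id: str
    text: str
    policy_version: str
    findings: list[dict] = field(default_factory=list)
    needs_human_review: bool = False

def run_review(ctx: ReviewContext) -> ReviewContext:
    # Document analyzer: normalize whitespace before the LLM sees the text.
    ctx.text = " ".join(ctx.text.split())
    # The compliance reasoning agent would populate ctx.findings here.
    # Escalation workflow: deterministic check, not an LLM decision.
    ctx.needs_human_review = any(
        f.get("severity", "").lower() in {"high", "critical"} for f in ctx.findings
    )
    return ctx

ctx = run_review(ReviewContext("doc-1", "  Patient  message  ", "hipaa-2025.1"))
print(ctx.text)  # -> Patient message
```

The point of the shared context object is traceability: the policy version travels with the document from intake to audit log.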
Implementation
1) Install CrewAI and define the policy model
Use CrewAI’s Agent, Task, and Crew classes. For healthcare, keep the policy schema strict so the model returns structured outputs you can validate.
```python
from pydantic import BaseModel
from crewai import Agent, Task, Crew

class ComplianceFinding(BaseModel):
    issue: str
    severity: str
    evidence: str
    recommendation: str

policy_context = """
You are reviewing healthcare content for compliance risks.
Check for:
- PHI disclosure
- Missing consent language
- Unsupported medical claims
- Retention or access control violations
Return concise findings only.
"""
```
2) Build the compliance agent and task
Use one agent for review and one task for structured analysis. In healthcare workflows, keep temperature low and force the model to cite evidence from the input.
```python
compliance_agent = Agent(
    role="Healthcare Compliance Reviewer",
    goal="Identify compliance risks in healthcare content",
    backstory=policy_context,
    verbose=True,
    allow_delegation=False,
)

review_task = Task(
    description="""
    Review the following healthcare support message for compliance issues:

    Patient John Doe asked us to email his lab results to his personal Gmail account.
    The nurse replied that this is fine as long as he confirms his date of birth.
    """,
    expected_output="A list of compliance findings with issue, severity, evidence, and recommendation.",
    agent=compliance_agent,
)
```
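The snippet above does not show the low-temperature setting mentioned earlier. In recent CrewAI versions you can pin the model explicitly via the LLM class; treat the model name here as a placeholder for whatever your deployment uses:

```python
from crewai import LLM

# Placeholder model name; temperature 0 keeps review output as deterministic
# as the provider allows.
review_llm = LLM(model="openai/gpt-4o-mini", temperature=0.0)

# Then pass it to the reviewer: Agent(..., llm=review_llm)
```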
3) Run the crew and capture findings
This is the actual execution pattern. In production you would pass in document text from your intake pipeline instead of hardcoding it.
```python
crew = Crew(
    agents=[compliance_agent],
    tasks=[review_task],
    verbose=True,
)

result = crew.kickoff()
print(result)
```
If you want stricter downstream handling, wrap output validation around your own schema. CrewAI gives you the orchestration layer; your application should enforce whether a finding blocks release or triggers human review.
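One way to enforce that, sketched with the standard library and assuming you prompt the agent to answer in JSON (the key names mirror ComplianceFinding):

```python
import json

REQUIRED_KEYS = {"issue", "severity", "evidence", "recommendation"}
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}

def parse_findings(raw_output: str) -> list[dict]:
    """Reject malformed agent output before anything downstream trusts it."""
    findings = json.loads(raw_output)  # raises on non-JSON output
    for finding in findings:
        missing = REQUIRED_KEYS - finding.keys()
        if missing:
            raise ValueError(f"finding missing keys: {sorted(missing)}")
        if finding["severity"].lower() not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {finding['severity']}")
    return findings

raw = ('[{"issue": "PHI sent to personal email", "severity": "High", '
       '"evidence": "lab results to Gmail", '
       '"recommendation": "Use the patient portal"}]')
print(parse_findings(raw)[0]["issue"])  # -> PHI sent to personal email
```

A finding that fails validation should be treated like a high-severity finding: route it to a human rather than silently dropping it.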
4) Add escalation logic for high-risk cases
Healthcare compliance should not be “LLM says approved.” Use deterministic routing based on severity keywords or confidence thresholds from your own validator.
```python
def route_finding(finding: dict) -> str:
    """Deterministic routing: severity interpretation never depends on the LLM."""
    if finding["severity"].lower() in {"high", "critical"}:
        return "human_review"
    return "auto_log"

sample_finding = {
    "issue": "PHI sent to personal email without verified secure channel",
    "severity": "high",
    "evidence": "email his lab results to his personal Gmail account",
    "recommendation": "Use secure patient portal messaging and verify consent policy.",
}

print(route_finding(sample_finding))  # -> human_review
```
Production Considerations
- Keep data residency explicit
  - Run inference in-region if your policies require it.
  - Do not send PHI to third-party tools unless your legal team has signed off on BAAs and subprocessors.
- Log everything needed for audit
  - Store prompt version, policy version, model version, input hash, output hash, timestamp, and reviewer ID.
  - Make logs immutable or append-only.
- Add guardrails before the LLM sees data
  - Redact direct identifiers where possible.
  - Classify documents first; only send minimum necessary context to the agent.
- Use human-in-the-loop escalation
  - Any ambiguous case involving consent, diagnosis language, billing disputes, or patient safety should go to a human reviewer.
  - The agent should recommend actions, not make final compliance decisions.
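The audit bullet above can be made concrete with the standard library. Field names are illustrative; hashing the input and output lets you prove what was reviewed without copying raw PHI into the log itself:

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(document: str, decision: str, *,
                 policy_version: str, prompt_version: str,
                 model_version: str, reviewer_id: str) -> dict:
    """Build one append-only audit entry for a single review decision."""
    return {
        "input_hash": hashlib.sha256(document.encode()).hexdigest(),
        "output_hash": hashlib.sha256(decision.encode()).hexdigest(),
        "policy_version": policy_version,
        "prompt_version": prompt_version,
        "model_version": model_version,
        "reviewer_id": reviewer_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

entry = audit_record(
    "patient support message text",
    "high: PHI routed to personal email",
    policy_version="hipaa-2025.1",
    prompt_version="v3",
    model_version="example-model-2025-01",
    reviewer_id="agent-pipeline-1",
)
print(json.dumps(entry, indent=2))
```

Write these entries to append-only storage; the hashes let an auditor verify a stored document against the log without the log ever containing PHI.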
Common Pitfalls
- Treating the agent like an approver
  - Mistake: letting the model greenlight content automatically.
  - Fix: use it as a reviewer that flags risk; require human approval for anything of medium or higher severity.
- Sending raw PHI into every prompt
  - Mistake: dumping full charts or transcripts into the model context.
  - Fix: redact identifiers and pass only the minimal snippet needed for analysis.
- Skipping audit metadata
  - Mistake: storing only the final answer with no traceability.
  - Fix: persist policy version, input source ID, decision timestamp, and evidence references so audits are defensible.
- Ignoring jurisdiction differences
  - Mistake: applying one compliance rule set across all regions.
  - Fix: parameterize policies by state/country residency rules and keep them versioned per deployment region.
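As a starting point for the redaction fix, here is a regex sketch using only the standard library. These patterns are illustrative; production de-identification should use a vetted PHI scrubbing tool, not a handful of regexes:

```python
import re

# Illustrative patterns only; real PHI covers far more identifier types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace direct identifiers with labeled placeholders before the LLM sees them."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Email lab results to john.doe@gmail.com or call 555-123-4567."))
# -> Email lab results to [EMAIL] or call [PHONE].
```

Keeping the placeholder labels (rather than deleting the match) preserves enough context for the agent to still flag "PHI sent to personal email" without ever seeing the address.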
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.