How to Build a Claims Processing Agent Using CrewAI in Python for Healthcare

By Cyprian Aarons | Updated 2026-04-21

A claims processing agent in healthcare takes incoming claim data, checks it against policy and clinical rules, flags missing or inconsistent fields, and prepares a decision-ready output for human review or downstream adjudication. It matters because claims are high-volume, error-prone, and tightly regulated; a bad automation layer can create payment delays, compliance exposure, and patient friction.

Architecture

• Claim intake parser
  - Normalizes claim payloads from EHR, clearinghouse, or FHIR/HL7 sources into a structured Python object.
• Policy validation agent
  - Checks eligibility, benefits coverage, prior authorization requirements, coding consistency, and medical necessity rules.
• Documentation review agent
  - Identifies evidence missing from attachments such as clinical notes, referral letters, and lab results.
• Audit trail store
  - Persists every decision, tool call, and intermediate reasoning artifact for compliance review.
• Human escalation path
  - Routes ambiguous or high-risk claims to a claims examiner or nurse reviewer.
• Secure data boundary
  - Enforces PHI handling, encryption, access control, retention policies, and regional data residency.
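As a rough illustration of the intake parser, the sketch below normalizes a raw payload into a typed object. The raw key names (claimId, dx, cpt, billed) and the helper names are hypothetical stand-ins, not a real clearinghouse or FHIR schema; a real integration would map from the actual source format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class NormalizedClaim:
    claim_id: str
    diagnosis_codes: List[str]
    procedure_codes: List[str]
    amount_billed: float
    attachments: List[str] = field(default_factory=list)


def _clean_codes(raw: Any) -> List[str]:
    """Accept a list or a comma-separated string; return upper-cased, de-duplicated codes."""
    items = raw.split(",") if isinstance(raw, str) else list(raw or [])
    seen: List[str] = []
    for item in items:
        code = str(item).strip().upper()
        if code and code not in seen:
            seen.append(code)
    return seen


def normalize_claim(raw: Dict[str, Any]) -> NormalizedClaim:
    """Map a hypothetical raw payload onto the normalized shape."""
    return NormalizedClaim(
        claim_id=str(raw["claimId"]).strip(),
        diagnosis_codes=_clean_codes(raw.get("dx")),
        procedure_codes=_clean_codes(raw.get("cpt")),
        amount_billed=round(float(raw.get("billed", 0)), 2),
        attachments=list(raw.get("attachments") or []),
    )


# Hypothetical raw payload with stray whitespace, mixed case, and a duplicate code.
claim = normalize_claim(
    {"claimId": " CLM-10021 ", "dx": "e11.9, E11.9", "cpt": ["99213"], "billed": "145.00"}
)
print(claim)
```

Doing this cleanup once at the boundary means the agents and any downstream rules see one consistent shape regardless of where the claim came from.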

Implementation

1) Install CrewAI and define the claim schema

Install the package with pip install crewai (Pydantic is pulled in as a dependency). Keep the input shape strict. Healthcare workflows fail when you let free-form text drift into adjudication logic.

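from typing import List

from pydantic import BaseModel, Field

class ClaimInput(BaseModel):
    """Normalized claim payload handed to the crew."""
    claim_id: str
    member_id: str
    provider_npi: str  # 10-digit National Provider Identifier
    diagnosis_codes: List[str]  # ICD-10-CM codes, e.g. "E11.9"
    procedure_codes: List[str]  # CPT/HCPCS codes, e.g. "99213"
    service_date: str  # ISO date string, e.g. "2026-04-01"
    amount_billed: float
    state: str  # two-letter state code, e.g. "CA"
    # Defaults to an empty list, so the field never needs to be None.
    attachments: List[str] = Field(default_factory=list)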

class ClaimReviewResult(BaseModel):
    claim_id: str
    status: str
    issues: List[str]
    recommended_action: str
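One way to keep the input strict is to reject malformed fields before a claim reaches any agent. The sketch below uses only the standard library. The function name validate_claim_fields and the specific checks are hypothetical, and the regular expressions are loose shape checks only: they do not verify the NPI check digit and do not look codes up in real ICD-10 or CPT code sets.

```python
import re
from datetime import date
from typing import List

NPI_RE = re.compile(r"\d{10}")  # 10 digits; check digit is not verified
ICD10_RE = re.compile(r"[A-Z]\d[0-9A-Z](\.[0-9A-Z]{1,4})?")  # shape check only
CPT_RE = re.compile(r"\d{4}[0-9A-Z]")  # shape check only


def validate_claim_fields(
    provider_npi: str,
    diagnosis_codes: List[str],
    procedure_codes: List[str],
    service_date: str,
    amount_billed: float,
) -> List[str]:
    """Return human-readable problems; an empty list means the shape looks sane."""
    problems: List[str] = []
    if not NPI_RE.fullmatch(provider_npi):
        problems.append("provider_npi must be exactly 10 digits")
    if not diagnosis_codes:
        problems.append("at least one diagnosis code is required")
    problems += [
        f"diagnosis code looks malformed: {code}"
        for code in diagnosis_codes
        if not ICD10_RE.fullmatch(code)
    ]
    if not procedure_codes:
        problems.append("at least one procedure code is required")
    problems += [
        f"procedure code looks malformed: {code}"
        for code in procedure_codes
        if not CPT_RE.fullmatch(code)
    ]
    try:
        date.fromisoformat(service_date)
    except ValueError:
        problems.append("service_date must be an ISO date (YYYY-MM-DD)")
    if amount_billed <= 0:
        problems.append("amount_billed must be positive")
    return problems


ok_problems = validate_claim_fields("1234567890", ["E11.9"], ["99213"], "2026-04-01", 145.00)
bad_problems = validate_claim_fields("12345", ["diabetes"], [], "04/01/2026", -5)
print(ok_problems)
print(bad_problems)
```

The same checks could live in Pydantic validators on ClaimInput; they are shown as a plain function here to keep the example dependency-free.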

2) Create specialized CrewAI agents with explicit roles

Use separate agents for validation and documentation review. That keeps each agent narrow and easier to audit.

from crewai import Agent

policy_agent = Agent(
    role="Claims Policy Validator",
    goal="Validate healthcare claims against coverage and billing rules",
    backstory=(
        "You review healthcare claims for eligibility, coding consistency, "
        "prior authorization requirements, and obvious billing errors."
    ),
    verbose=True,
)

doc_agent = Agent(
    role="Clinical Documentation Reviewer",
    goal="Check whether supporting documentation is sufficient for claim processing",
    backstory=(
        "You inspect attached clinical documents for missing evidence needed "
        "to support reimbursement decisions."
    ),
    verbose=True,
)

3) Define tasks that produce structured outputs

Use Task with clear instructions. In production, I prefer forcing a schema-like response so downstream systems don’t parse prose.

from crewai import Task

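# CrewAI fills the {placeholders} below from the dict passed to
# crew.kickoff(inputs=...). Without them, the claim data never reaches the agents.
policy_task = Task(
    description=(
        "Review the following claim for policy issues. Check diagnosis/procedure "
        "alignment, likely authorization gaps, duplicate billing risk, and obvious "
        "inconsistencies. Return concise issues and a recommended action.\n\n"
        "Claim ID: {claim_id}\n"
        "Provider NPI: {provider_npi}\n"
        "Diagnosis codes: {diagnosis_codes}\n"
        "Procedure codes: {procedure_codes}\n"
        "Service date: {service_date}\n"
        "Amount billed: {amount_billed}\n"
        "State: {state}"
    ),
    expected_output=(
        "A short list of policy issues plus one of: approve_for_review, "
        "request_more_info, escalate_to_human."
    ),
    agent=policy_agent,
)

doc_task = Task(
    description=(
        "Review attached documentation references for sufficiency. Identify if "
        "supporting evidence is missing for the billed services.\n\n"
        "Claim ID: {claim_id}\n"
        "Procedure codes: {procedure_codes}\n"
        "Attachments: {attachments}"
    ),
    expected_output="A short list of documentation gaps plus a recommended action.",
    agent=doc_agent,
)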

4) Run the crew and map results into your workflow

This pattern keeps orchestration simple. The crew does analysis; your application decides whether to auto-route or escalate.

from crewai import Crew, Process

def process_claim(claim: ClaimInput):
    crew = Crew(
        agents=[policy_agent, doc_agent],
        tasks=[policy_task, doc_task],
        process=Process.sequential,
        verbose=True,
    )

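    # Pass only simple scalars: lists are joined into comma-separated strings
    # so they render predictably when CrewAI interpolates inputs into prompts.
    result = crew.kickoff(inputs={
        "claim_id": claim.claim_id,
        "member_id": claim.member_id,
        "provider_npi": claim.provider_npi,
        "diagnosis_codes": ", ".join(claim.diagnosis_codes),
        "procedure_codes": ", ".join(claim.procedure_codes),
        "service_date": claim.service_date,
        "amount_billed": claim.amount_billed,
        "state": claim.state,
        "attachments": ", ".join(claim.attachments or []) or "none",
    })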

    return result

if __name__ == "__main__":
    sample = ClaimInput(
        claim_id="CLM-10021",
        member_id="MBR-44219",
        provider_npi="1234567890",
        diagnosis_codes=["E11.9"],
        procedure_codes=["99213"],
        service_date="2026-04-01",
        amount_billed=145.00,
        state="CA",
        attachments=["visit_note.pdf"]
    )

    output = process_claim(sample)
    print(output)

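If you want deterministic post-processing, wrap the final output in your own parser and convert it into ClaimReviewResult. With a sequential process, the value returned by kickoff() reflects the last task (here, the documentation review), so read the policy task's result from the returned object's tasks_output list as well. Don’t let the model write directly to your claims database.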
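A minimal sketch of such a parser follows, assuming the agent was asked to reply with a JSON object containing issues and recommended_action keys (an assumption; the tasks as written only request a short list). The names parse_review and ALLOWED_ACTIONS are hypothetical, and a dataclass stands in for the Pydantic ClaimReviewResult so the snippet runs with the standard library alone. Anything that fails to parse falls back to human escalation.

```python
import json
from dataclasses import dataclass
from typing import List

ALLOWED_ACTIONS = {"approve_for_review", "request_more_info", "escalate_to_human"}


@dataclass
class ReviewResult:
    claim_id: str
    status: str
    issues: List[str]
    recommended_action: str


def parse_review(claim_id: str, raw_text: str) -> ReviewResult:
    """Convert raw model text into a validated result; never raise on bad output."""
    try:
        # Tolerate prose around the JSON by slicing from the first "{" to the last "}".
        start, end = raw_text.index("{"), raw_text.rindex("}") + 1
        data = json.loads(raw_text[start:end])
        action = data["recommended_action"]
        issues = data.get("issues", [])
        if not isinstance(issues, list):
            raise TypeError("issues must be a list")
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unexpected action: {action!r}")
    except (ValueError, KeyError, TypeError):
        # Fail closed: unparseable or out-of-vocabulary output goes to a human.
        return ReviewResult(
            claim_id, "unparseable", ["model output could not be parsed"], "escalate_to_human"
        )
    return ReviewResult(claim_id, "parsed", [str(i) for i in issues], action)


ok = parse_review(
    "CLM-10021",
    'Here is my review: {"issues": ["missing referral letter"], '
    '"recommended_action": "request_more_info"}',
)
bad = parse_review("CLM-10021", "Looks fine to me, approve it.")
unknown = parse_review("CLM-10021", '{"issues": [], "recommended_action": "deny"}')
print(ok)
print(bad)
```

Restricting recommended_action to a fixed vocabulary keeps the model from inventing outcomes such as a denial that your workflow never asked it to make.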

Production Considerations

• PHI containment
  - Redact unnecessary identifiers before sending data to the agent.
  - Encrypt data in transit and at rest.
  - Use private networking where possible and keep model calls inside approved security boundaries.
• Auditability
  - Log every task input/output pair with timestamps, versioned prompts, model name, and reviewer identity.
  - Store immutable audit records so compliance teams can reconstruct why a claim was escalated or flagged.
• Data residency
  - Route EU/UK or state-regulated records to approved regions only.
  - If your vendor cannot guarantee residency controls for PHI, do not send raw claim content outside your boundary.
• Human-in-the-loop guardrails
  - Auto-route denials or high-dollar claims to human review.
  - Set thresholds for uncertain cases instead of allowing silent auto-adjudication.
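The human-in-the-loop guardrails can be sketched as a small deterministic routing function that runs after the crew. The threshold value, the queue names, and the route_claim function are hypothetical placeholders; real limits would come from payer policy.

```python
from typing import List

HIGH_DOLLAR_THRESHOLD = 5_000.00  # hypothetical limit, not a real payer rule


def route_claim(recommended_action: str, amount_billed: float, issues: List[str]) -> str:
    """Pick a work queue using fixed rules; the model's output is only one input."""
    # Escalations, denials, and any unrecognized action always go to a person.
    if recommended_action not in {"approve_for_review", "request_more_info"}:
        return "human_review"
    # High-dollar claims are reviewed by a person regardless of the recommendation.
    if amount_billed >= HIGH_DOLLAR_THRESHOLD:
        return "human_review"
    if recommended_action == "request_more_info":
        return "provider_follow_up"
    # An approval that still lists issues is contradictory, so treat it as uncertain.
    if issues:
        return "human_review"
    return "standard_queue"


print(route_claim("approve_for_review", 145.00, []))
print(route_claim("approve_for_review", 12_000.00, []))
print(route_claim("deny", 145.00, []))
```

Because the function defaults to human review for anything it does not recognize, a new or malformed recommendation cannot slip into an automated path unnoticed.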

Common Pitfalls

• Using one general-purpose agent for everything
  - This creates noisy outputs and weak accountability.
  - Split policy validation from documentation review so each task is easier to test and audit.
• Sending raw PHI without minimization
  - Claims payloads often include more patient data than the task needs.
  - Strip names, addresses, account numbers, and unrelated clinical details before calling the crew.
• Treating model output as final decision logic
  - CrewAI should assist adjudication, not replace business rules or payer policy engines.
  - Convert agent output into structured recommendations and keep final approval behind deterministic rules or human review.
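The minimization advice can be sketched as an allowlist filter applied before the crew is called. The field names and the minimize_claim helper are illustrative assumptions; the allowlist should reflect what each task needs, and this filter does nothing about PHI embedded in free text or attachments.

```python
from typing import Any, Dict

# Hypothetical allowlist: only fields the review tasks are assumed to need.
ALLOWED_FIELDS = {
    "claim_id",
    "provider_npi",
    "diagnosis_codes",
    "procedure_codes",
    "service_date",
    "amount_billed",
    "state",
}


def minimize_claim(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Keep allowlisted keys only; every other key is dropped."""
    return {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}


raw = {
    "claim_id": "CLM-10021",
    "provider_npi": "1234567890",
    "diagnosis_codes": ["E11.9"],
    "procedure_codes": ["99213"],
    "service_date": "2026-04-01",
    "amount_billed": 145.00,
    "state": "CA",
    "patient_name": "Jane Example",  # fabricated sample value
    "address": "1 Example Street",  # fabricated sample value
    "account_number": "000-000",  # fabricated sample value
}
safe = minimize_claim(raw)
print(sorted(safe))
```

An allowlist is used here rather than a blocklist so that fields added to the payload later are dropped by default instead of leaking to the model.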

By Cyprian Aarons, AI Consultant at Topiax.
