How to Build a Compliance-Checking Agent Using LangGraph in Python for Healthcare

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, langgraph, python, healthcare

A compliance-checking agent for healthcare reviews clinical or administrative text, checks it against policy, and returns a structured decision: approve, reject, or escalate for human review. This matters because healthcare workflows are full of PHI, regulatory constraints, and audit requirements, so you need deterministic checks around the model instead of trusting free-form output.

Architecture

  • Input normalizer

    • Cleans the incoming document or message.
    • Extracts the minimum necessary text to reduce PHI exposure (a redaction sketch follows this list).
  • Policy retrieval node

    • Pulls the relevant compliance rules for HIPAA, internal policy, and jurisdiction-specific requirements.
    • Uses versioned policy documents so every decision is auditable.
  • LLM analysis node

    • Classifies the content against the policy.
    • Produces structured findings, not prose.
  • Rules engine / validator

    • Applies deterministic checks for hard failures.
    • Example: missing consent language, disallowed identifiers, unsupported data transfer region.
  • Decision router

    • Chooses between approve, reject, or human review.
    • Keeps high-risk cases out of automated approval paths.
  • Audit logger

    • Persists inputs, outputs, policy version, timestamps, and final decision.
    • Needed for investigations and regulatory review.
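
The graph in the next section implements most of these nodes; the input normalizer is the one piece it skips, so here is a minimal redaction sketch first. The regex patterns are illustrative assumptions, not a complete PHI scrubber; the point is to mask obvious identifiers before any text reaches a model or a log.

import re

# Illustrative patterns only: tune these to your own data formats.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
MRN_RE = re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)

def normalize_input(raw: str) -> str:
    """Collapse whitespace and mask obvious identifiers."""
    text = " ".join(raw.split())
    text = SSN_RE.sub("[SSN]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return MRN_RE.sub("[MRN]", text)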

Implementation

1) Define the state and graph nodes

Use a typed state so every step in the graph has explicit inputs and outputs. In healthcare workflows this keeps your compliance trail predictable and makes failures easier to trace.

from typing import Literal, TypedDict
from langgraph.graph import StateGraph, START, END

class ComplianceState(TypedDict):
    text: str
    policy_context: str
    findings: list[str]
    risk_score: int
    decision: Literal["approve", "reject", "review"]
    audit_log: list[str]

def load_policy(state: ComplianceState):
    # In production this comes from a versioned policy store or vector retriever.
    return {
        "policy_context": (
            "HIPAA minimum necessary rule; no unmasked PHI in outbound messages; "
            "consent required for sharing outside treatment/payment/operations; "
            "data residency must remain in approved region."
        )
    }

def analyze_compliance(state: ComplianceState):
    # Illustrative keyword heuristics; a production system would combine
    # deterministic rules with an LLM classifier, not bare substring checks.
    text = state["text"].lower()
    findings = []
    risk = 0

    if "ssn" in text or "social security" in text:
        findings.append("Potential direct identifier exposure.")
        risk += 40
    if "vendor" in text:
        findings.append("Possible third-party disclosure.")
        risk += 25
    if "no consent" in text or "consent" not in text:
        findings.append("Consent language missing or explicitly absent.")
        risk += 20

    return {"findings": findings, "risk_score": risk}

def decide(state: ComplianceState):
    if state["risk_score"] >= 50:
        decision = "reject"
    elif state["risk_score"] >= 20:
        decision = "review"
    else:
        decision = "approve"
    return {"decision": decision}

def audit(state: ComplianceState):
    entry = (
        f"decision={state['decision']} | risk={state['risk_score']} | "
        f"findings={state['findings']}"
    )
    return {"audit_log": state["audit_log"] + [entry]}

2) Build the LangGraph workflow

This is the core pattern: load policy first, analyze second, decide third, then write an audit entry. StateGraph is enough here; you do not need a complex multi-agent setup for a compliance checker.

workflow = StateGraph(ComplianceState)

workflow.add_node("load_policy", load_policy)
workflow.add_node("analyze_compliance", analyze_compliance)
workflow.add_node("decide", decide)
workflow.add_node("audit", audit)

workflow.add_edge(START, "load_policy")
workflow.add_edge("load_policy", "analyze_compliance")
workflow.add_edge("analyze_compliance", "decide")
workflow.add_edge("decide", "audit")
workflow.add_edge("audit", END)

app = workflow.compile()

3) Run it with healthcare text

Keep the input small and specific. If you are processing PHI, redact before sending anything to an external model provider unless your deployment contract explicitly allows it.

initial_state = {
    "text": (
        "Please send the patient's lab results and SSN to the external billing vendor. "
        "No consent form was attached."
    ),
    "policy_context": "",
    "findings": [],
    "risk_score": 0,
    "decision": "approve",
    "audit_log": []
}

result = app.invoke(initial_state)
print(result["decision"])
print(result["findings"])
print(result["audit_log"])

4) Add branching for human review

For real healthcare operations you usually want escalation when the agent is uncertain. LangGraph supports conditional edges through add_conditional_edges, which is cleaner than burying routing logic inside one node. Routing review cases through a human-review step before the audit node means every decision, escalated or not, still reaches the audit logger.

def route_after_decision(state: ComplianceState):
    return state["decision"]

def human_review(state: ComplianceState):
    # Placeholder: in production, enqueue the case for a compliance reviewer
    # with findings, source snippet hash, and rule references attached.
    return {"audit_log": state["audit_log"] + ["escalated to human review"]}

review_graph = StateGraph(ComplianceState)
review_graph.add_node("load_policy", load_policy)
review_graph.add_node("analyze_compliance", analyze_compliance)
review_graph.add_node("decide", decide)
review_graph.add_node("human_review", human_review)
review_graph.add_node("audit", audit)

review_graph.add_edge(START, "load_policy")
review_graph.add_edge("load_policy", "analyze_compliance")
review_graph.add_edge("analyze_compliance", "decide")

review_graph.add_conditional_edges(
    "decide",
    route_after_decision,
    {
        "approve": "audit",
        "reject": "audit",
        "review": "human_review",
    },
)

review_graph.add_edge("human_review", "audit")
review_graph.add_edge("audit", END)

app_with_review = review_graph.compile()
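
To see the escalation path fire, here is a quick run with a borderline input; the text is a hypothetical example chosen so that only the vendor and consent checks trip, landing in the review band.

borderline = dict(
    initial_state,
    text="Please send the discharge summary to the billing vendor.",
)
result = app_with_review.invoke(borderline)
print(result["decision"])   # review
print(result["audit_log"])  # includes the human-review escalation entry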

Production Considerations

  • Deploy inside your compliance boundary

    • Keep PHI processing in approved infrastructure and region.
    • If your policies require data residency in-country or on-premises, do not send raw text to a hosted endpoint outside that boundary.
  • Log for auditability

    • Store input hash, policy version, model version, timestamp, risk score, final decision, and reviewer identity if escalated (see the record sketch after this list).
    • Make logs immutable or append-only where possible.
  • Add guardrails before the model call

    • Redact obvious identifiers like MRNs, SSNs, phone numbers, and addresses when they are not needed for analysis.
    • Enforce allowlists for outbound destinations and disallow free-text approvals on high-risk cases.
  • Monitor drift and false negatives

    • Track rejected-to-approved ratios by policy category.
    • Sample human-reviewed cases to catch gaps where the agent misses consent issues or misreads clinical context.
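
As a concrete starting point for the logging bullet above, here is a minimal audit-record sketch. The field names and the "model-v1" placeholder are assumptions rather than a standard schema; the key idea is storing a hash of the input instead of raw PHI.

import hashlib
import json
import time
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class AuditRecord:
    input_sha256: str            # hash of the input, never the raw PHI
    policy_version: str
    model_version: str
    timestamp: float
    risk_score: int
    decision: str
    reviewer: str | None = None  # set when a case is escalated

def make_audit_record(text: str, state: ComplianceState, policy_version: str) -> str:
    record = AuditRecord(
        input_sha256=hashlib.sha256(text.encode()).hexdigest(),
        policy_version=policy_version,
        model_version="model-v1",  # assumption: pin and record the real model version
        timestamp=time.time(),
        risk_score=state["risk_score"],
        decision=state["decision"],
    )
    # Append this JSON line to an append-only or write-once store.
    return json.dumps(asdict(record))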

Common Pitfalls

  • Using one LLM prompt as the whole compliance system

    • A single prompt can’t apply rules deterministically or explain its decisions for an audit.
    • Put hard rules in code and use the model only for classification or extraction where ambiguity exists.
  • Logging raw PHI everywhere

    • This creates its own compliance problem.
    • Mask sensitive fields in application logs and keep full payload access tightly controlled.
  • Ignoring policy versioning

    • A decision without a policy version is hard to defend later.
    • Store the exact policy document ID used by load_policy, not just “HIPAA rules” (see the versioned loader sketch after this list).
  • Skipping human review thresholds

    • Healthcare has too much edge-case risk for fully automatic approval on borderline cases.
    • Route medium-risk outputs to a reviewer with context attached: findings, source snippet hash, and rule references.
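
To make the policy-versioning point concrete, here is a sketch of a versioned load_policy that can replace the earlier stub. The in-memory POLICY_STORE and its layout are assumptions standing in for a real document store; what matters is that the exact document ID and version land in the audit trail.

# Hypothetical store layout; in production this is a versioned document store.
POLICY_STORE = {
    "hipaa-minimum-necessary": {
        "version": "2026-04-01",
        "text": (
            "HIPAA minimum necessary rule; no unmasked PHI in outbound messages; "
            "consent required for sharing outside treatment/payment/operations."
        ),
    },
}

def load_policy(state: ComplianceState):
    doc_id = "hipaa-minimum-necessary"
    doc = POLICY_STORE[doc_id]
    return {
        "policy_context": doc["text"],
        # Record exactly which policy document and version drove this decision.
        "audit_log": state["audit_log"] + [f"policy={doc_id}@{doc['version']}"],
    }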


By Cyprian Aarons, AI Consultant at Topiax.
