How to Integrate LangGraph for healthcare with LangSmith for production AI

By Cyprian Aarons · Updated 2026-04-22

Tags: langgraph-for-healthcare, langsmith, production-ai

LangGraph for healthcare gives you the orchestration layer for clinical workflows, while LangSmith gives you the observability layer to see what your agent actually did in production. Combined, they let you build healthcare agents that are not just functional, but traceable, testable, and auditable across every decision path.

Prerequisites

  • Python 3.10+
  • A LangChain/LangGraph-compatible project environment
  • Installed packages:
    • langgraph
    • langchain
    • langsmith
    • a model provider package such as langchain-openai
  • A LangSmith account and API key
  • Environment variables configured:
    • LANGSMITH_API_KEY
    • LANGSMITH_TRACING=true
    • LANGSMITH_PROJECT=healthcare-agent-prod
    • your model provider key, for example OPENAI_API_KEY
  • A healthcare workflow design that avoids sending PHI unless your compliance posture allows it
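
The environment variables from the checklist above can be exported in the shell before the process starts. The values shown are placeholders, not working credentials:

```shell
# Placeholder values -- substitute your real keys.
export LANGSMITH_API_KEY="lsv2_***"
export LANGSMITH_TRACING="true"
export LANGSMITH_PROJECT="healthcare-agent-prod"
export OPENAI_API_KEY="sk-***"
```

Setting them at the shell level keeps secrets out of source code and lets the same code run against different LangSmith projects per environment.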

Integration Steps

  1. Install the SDKs and enable tracing

    Start by installing the core packages and turning on LangSmith tracing at the process level. LangGraph will emit execution events, and LangSmith will capture them as traces.

    pip install langgraph langchain langsmith langchain-openai
    
    import os
    
    os.environ["LANGSMITH_TRACING"] = "true"
    os.environ["LANGSMITH_PROJECT"] = "healthcare-agent-prod"
    # Placeholder keys for illustration only; in production, load secrets
    # from your secrets manager instead of hardcoding them.
    os.environ["LANGSMITH_API_KEY"] = "lsv2_***"
    os.environ["OPENAI_API_KEY"] = "sk-***"
    
  2. Build a LangGraph workflow for a healthcare use case

    Use a graph to model a simple patient triage flow. In production, this is where you encode routing logic, guardrails, escalation paths, and human review points.

    from typing import TypedDict, Literal
    from langgraph.graph import StateGraph, END
    from langchain_openai import ChatOpenAI
    
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    
    class TriageState(TypedDict):
        symptom_summary: str
        risk_level: str
        recommendation: str
    
    def assess_risk(state: TriageState) -> TriageState:
        prompt = f"""
        You are a clinical triage assistant.
        Classify this symptom summary as low, medium, or high risk.
        Respond with exactly one word: low, medium, or high.
        Symptom summary: {state["symptom_summary"]}
        """
        response = llm.invoke(prompt)
        return {
            **state,
            "risk_level": response.content.strip().lower(),
        }
    
    def recommend_next_step(state: TriageState) -> TriageState:
        risk = state["risk_level"]
        if "high" in risk:
            recommendation = "Escalate to urgent care and notify clinician."
        elif "medium" in risk:
            recommendation = "Schedule same-day nurse review."
        else:
            recommendation = "Provide self-care guidance and monitor symptoms."
        return {**state, "recommendation": recommendation}
    
    graph = StateGraph(TriageState)
    graph.add_node("assess_risk", assess_risk)
    graph.add_node("recommend", recommend_next_step)
    
    graph.set_entry_point("assess_risk")
    graph.add_edge("assess_risk", "recommend")
    graph.add_edge("recommend", END)
    
    app = graph.compile()
    
  3. Attach LangSmith tracing to the graph run

    The key integration point is that LangGraph runs inside a traced execution context. If tracing is enabled via environment variables, LangSmith automatically captures node-level execution when you invoke the compiled graph.

    For more explicit control in production systems, pass metadata such as patient cohort, workflow name, or environment tags.

    result = app.invoke(
        {"symptom_summary": "Chest tightness with shortness of breath after walking upstairs."},
        config={
            "run_name": "triage-workflow",
            "tags": ["healthcare", "triage", "prod"],
            "metadata": {
                "department": "urgent_care",
                "workflow_version": "v1.2.0",
            },
        },
    )
    
    print(result)
    
  4. Add custom spans or structured feedback with LangSmith

    When you need deeper visibility than default traces, use LangSmith’s client directly. This is useful for logging evaluation labels, annotating runs after clinician review, or attaching domain-specific metadata.

     import uuid

     from langsmith import Client

     client = Client()

     # create_run does not return the run object, so generate the run ID
     # client-side and reuse it for updates and feedback.
     run_id = uuid.uuid4()

     client.create_run(
         id=run_id,
         name="manual-triage-check",
         run_type="chain",
         inputs={"symptom_summary": "Severe headache and blurred vision."},
         project_name="healthcare-agent-prod",
         tags=["review"],
     )

     client.update_run(
         run_id,
         outputs={
             "risk_level": "high",
             "recommendation": "Escalate to urgent care and notify clinician.",
         },
     )

     # Attach a clinician review label to the same run.
     client.create_feedback(
         run_id,
         key="clinician_review",
         score=1,
         comment="Escalation confirmed by on-call clinician.",
     )
    
  5. Instrument evaluations for production quality control

    Once the workflow is running, add regression tests and dataset-based evaluations in LangSmith. This is how you catch bad routing changes before they hit patients or staff.

     from langsmith import Client
     from langsmith.evaluation import evaluate

     client = Client()

     # Register a small regression dataset in LangSmith; evaluate runs
     # against a named dataset rather than raw input dicts.
     dataset = client.create_dataset("triage-regression")
     client.create_examples(
         inputs=[
             {"symptom_summary": "Mild sore throat for two days."},
             {"symptom_summary": "Chest pain and dizziness."},
         ],
         dataset_id=dataset.id,
     )

     def run_triage(inputs):
         return app.invoke(inputs)

     results = evaluate(
         run_triage,
         data="triage-regression",
         experiment_prefix="triage-eval",
     )

     print(results)
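
     evaluate also accepts custom evaluators. The sketch below is an illustrative heuristic (the function name and keyword list are assumptions, not a LangSmith built-in): it flags runs where an urgent-sounding input was not escalated, and it duck-types the run and example objects LangSmith passes to evaluators, so the logic can be exercised locally:

```python
from types import SimpleNamespace


def no_missed_escalation(run, example) -> dict:
    """Score 1 unless an urgent-sounding input was not escalated."""
    outputs = run.outputs or {}
    symptoms = (example.inputs or {}).get("symptom_summary", "").lower()
    looks_urgent = any(k in symptoms for k in ("chest pain", "dizziness"))
    escalated = "escalate" in outputs.get("recommendation", "").lower()
    return {"key": "no_missed_escalation", "score": int(escalated or not looks_urgent)}


# Local smoke test with stand-ins for LangSmith's Run and Example objects.
good = no_missed_escalation(
    SimpleNamespace(outputs={"recommendation": "Escalate to urgent care and notify clinician."}),
    SimpleNamespace(inputs={"symptom_summary": "Chest pain and dizziness."}),
)
print(good["score"])
```

     To wire it into the experiment, pass it via evaluators=[no_missed_escalation] in the evaluate call; the score then appears on each run in the LangSmith experiment view.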
    

Testing the Integration

Run a single end-to-end invocation and confirm that both the workflow result and the trace appear in LangSmith.

result = app.invoke(
    {"symptom_summary": "Fever and persistent cough for one week."},
    config={
        "run_name": "integration-test",
        "tags": ["test", "healthcare"],
        "metadata": {"env": "staging"},
    },
)

print(result)

Expected output (the exact risk_level depends on the model's classification):

{
  'symptom_summary': 'Fever and persistent cough for one week.',
  'risk_level': 'medium',
  'recommendation': 'Schedule same-day nurse review.'
}

In LangSmith, you should see:

  • one parent trace for the graph run
  • child spans for each node
  • input/output payloads per step
  • tags and metadata attached to the run

Real-World Use Cases

  • Clinical intake assistants that route patients based on symptoms, then escalate only high-risk cases to clinicians.
  • Prior authorization workflows that collect evidence documents, classify request urgency, and log every decision path for auditability.
  • Care coordination agents that summarize patient messages, trigger follow-up tasks, and give operations teams full trace visibility in LangSmith.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
