How to Integrate LangGraph for healthcare with LangSmith for production AI
LangGraph gives you the orchestration layer for clinical workflows, and LangSmith gives you the observability layer to see what your agent actually did in production. Combined, they let you build healthcare agents that are not only functional but also traceable, testable, and auditable across every decision path.
Prerequisites
- •Python 3.10+
- •A LangChain/LangGraph-compatible project environment
- •Installed packages:
  - •langgraph
  - •langchain
  - •langsmith
  - •a model provider package such as langchain-openai
- •A LangSmith account and API key
- •Environment variables configured:
  - •LANGSMITH_API_KEY
  - •LANGSMITH_TRACING=true
  - •LANGSMITH_PROJECT=healthcare-agent-prod
  - •your model provider key, for example OPENAI_API_KEY
- •A healthcare workflow design that avoids sending PHI unless your compliance posture allows it
Integration Steps
Install the SDKs and enable tracing
Start by installing the core packages and turning on LangSmith tracing at the process level. LangGraph will emit execution events, and LangSmith will capture them as traces.
pip install langgraph langchain langsmith langchain-openai

import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "healthcare-agent-prod"
os.environ["LANGSMITH_API_KEY"] = "lsv2_***"
os.environ["OPENAI_API_KEY"] = "sk-***"
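Setting keys inline works for a local demo, but a deployed service would more likely read them from the environment and stop early when one is missing. The following is a minimal sketch of that check, assuming the variable names listed in the prerequisites; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names and not part of LangSmith or LangGraph.

```python
import os

# Variable names taken from the prerequisites; adjust to your provider.
REQUIRED_VARS = (
    "LANGSMITH_API_KEY",
    "LANGSMITH_TRACING",
    "LANGSMITH_PROJECT",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


# Demo with a sample mapping instead of the real environment.
demo_env = {"LANGSMITH_API_KEY": "lsv2_***", "LANGSMITH_TRACING": "true"}
print(missing_env_vars(demo_env))  # ['LANGSMITH_PROJECT', 'OPENAI_API_KEY']
```

Calling `missing_env_vars()` with no arguments checks the real process environment, so it can run once at startup before the graph is compiled.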
Build a LangGraph workflow for a healthcare use case
Use a graph to model a simple patient triage flow. In production, this is where you encode routing logic, guardrails, escalation paths, and human review points.
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


class TriageState(TypedDict):
    symptom_summary: str
    risk_level: str
    recommendation: str


def assess_risk(state: TriageState) -> TriageState:
    # Ask the model to classify the symptom summary.
    prompt = f"""
    You are a clinical triage assistant.
    Classify this symptom summary into low, medium, or high risk.
    Symptom summary: {state["symptom_summary"]}
    """
    response = llm.invoke(prompt)
    return {
        **state,
        "risk_level": response.content.strip().lower(),
    }


def recommend_next_step(state: TriageState) -> TriageState:
    # Map the model's risk label to a fixed next step.
    risk = state["risk_level"]
    if "high" in risk:
        recommendation = "Escalate to urgent care and notify clinician."
    elif "medium" in risk:
        recommendation = "Schedule same-day nurse review."
    else:
        recommendation = "Provide self-care guidance and monitor symptoms."
    return {**state, "recommendation": recommendation}


# Wire the two nodes into a linear graph: assess_risk -> recommend -> END.
graph = StateGraph(TriageState)
graph.add_node("assess_risk", assess_risk)
graph.add_node("recommend", recommend_next_step)
graph.set_entry_point("assess_risk")
graph.add_edge("assess_risk", "recommend")
graph.add_edge("recommend", END)

app = graph.compile()
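One guardrail worth considering for this flow: the routing above relies on substring checks against free-text model output, so a reply that names no level falls through to the self-care branch. The sketch below shows one possible way to normalize the label first. `normalize_risk` is a hypothetical helper, and defaulting to "high" (so that unclear output escalates) is an assumption about the desired fail-safe, not something the workflow above prescribes.

```python
import re

RISK_LEVELS = ("low", "medium", "high")


def normalize_risk(raw: str, default: str = "high") -> str:
    """Map free-text model output to one of RISK_LEVELS.

    Returns `default` when the text names no level or more than one.
    """
    # Split into lowercase words so "high" matches only as a whole word.
    words = set(re.findall(r"[a-z]+", raw.lower()))
    found = [level for level in RISK_LEVELS if level in words]
    return found[0] if len(found) == 1 else default


print(normalize_risk("Medium."))             # medium
print(normalize_risk("Risk level: HIGH"))    # high
print(normalize_risk("low to medium"))       # high (ambiguous, so escalate)
print(normalize_risk("I cannot classify."))  # high (no label, so escalate)
```

In the graph, this could wrap `response.content` inside `assess_risk` so that `recommend_next_step` only ever sees one of the three known labels.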
Attach LangSmith tracing to the graph run
The key integration point is that LangGraph runs inside a traced execution context. If tracing is enabled via environment variables, LangSmith automatically captures node-level execution when you invoke the compiled graph.
For more explicit control in production systems, pass metadata such as patient cohort, workflow name, or environment tags.
result = app.invoke(
    {"symptom_summary": "Chest tightness with shortness of breath after walking upstairs."},
    config={
        "run_name": "triage-workflow",
        "tags": ["healthcare", "triage", "prod"],
        "metadata": {
            "department": "urgent_care",
            "workflow_version": "v1.2.0",
        },
    },
)

print(result)
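Tags and metadata are stored with the trace, so anything placed in them becomes visible in LangSmith. A minimal sketch of one way to keep identifying fields out of metadata is to build the config from an allowlist. `build_run_config` and `ALLOWED_METADATA_KEYS` are hypothetical names, and the allowed keys are assumptions based on the examples in this article.

```python
# Keys considered safe to attach to traces (an assumption; set per your policy).
ALLOWED_METADATA_KEYS = {"department", "workflow_version", "env"}


def build_run_config(run_name, tags, metadata):
    """Build an invoke() config, dropping metadata keys not on the allowlist."""
    safe = {k: v for k, v in metadata.items() if k in ALLOWED_METADATA_KEYS}
    return {"run_name": run_name, "tags": list(tags), "metadata": safe}


config = build_run_config(
    "triage-workflow",
    ["healthcare", "triage", "prod"],
    {
        "department": "urgent_care",
        "workflow_version": "v1.2.0",
        "patient_name": "Jane Doe",  # not on the allowlist, so it is dropped
    },
)
print(config["metadata"])
# {'department': 'urgent_care', 'workflow_version': 'v1.2.0'}
```

The resulting dict can be passed as `config=` to `app.invoke`. Note that this only filters metadata; the graph inputs and outputs are traced as well and need their own handling if they may contain PHI.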
Add custom spans or structured feedback with LangSmith
When you need deeper visibility than default traces, use LangSmith’s client directly. This is useful for logging evaluation labels, annotating runs after clinician review, or attaching domain-specific metadata.
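import uuid
from datetime import datetime, timezone

from langsmith import Client

client = Client()

# create_run() returns None, so generate the run ID up front and pass it in.
run_id = uuid.uuid4()

client.create_run(
    id=run_id,
    name="manual-triage-check",
    run_type="chain",
    inputs={"symptom_summary": "Severe headache and blurred vision."},
    project_name="healthcare-agent-prod",
    tags=["review"],
)

client.update_run(
    run_id,
    outputs={
        "risk_level": "high",
        "recommendation": "Escalate to urgent care and notify clinician.",
    },
    end_time=datetime.now(timezone.utc),  # mark the run as finished
)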
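For annotating runs after clinician review, the review first has to be turned into a structured record. The sketch below shows one possible shape, assuming agreement between the model's label and the clinician's label is the signal you want to track. `review_to_feedback` and the `clinician_agreement` key are hypothetical; only the `key`, `score`, and `comment` field names are meant to line up with the arguments of LangSmith's `Client.create_feedback`.

```python
def review_to_feedback(model_risk: str, clinician_risk: str, note: str = "") -> dict:
    """Turn a clinician review into keyword arguments for a feedback record."""
    agrees = model_risk.strip().lower() == clinician_risk.strip().lower()
    return {
        "key": "clinician_agreement",  # hypothetical feedback key
        "score": 1.0 if agrees else 0.0,
        "comment": note,
    }


payload = review_to_feedback("high", "medium", "Vitals stable on review.")
print(payload)
# {'key': 'clinician_agreement', 'score': 0.0, 'comment': 'Vitals stable on review.'}

# With a real run ID, this could be sent as:
# client.create_feedback(run_id, **payload)
```

Keeping the mapping in a plain function makes it easy to unit test without network access, and free-text notes should be checked for PHI before they are sent.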
Instrument evaluations for production quality control
Once the workflow is running, add regression tests and dataset-based evaluations in LangSmith. This is how you catch bad routing changes before they hit patients or staff.
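from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# evaluate() expects a LangSmith dataset (name, ID, or Example objects),
# not a list of plain dicts, so create the dataset first.
# The dataset name is a placeholder; create_dataset() fails if it already exists.
dataset_name = "triage-eval-inputs"
dataset = client.create_dataset(dataset_name=dataset_name)
client.create_examples(
    inputs=[
        {"symptom_summary": "Mild sore throat for two days."},
        {"symptom_summary": "Chest pain and dizziness."},
    ],
    dataset_id=dataset.id,
)


def run_triage(inputs: dict) -> dict:
    return app.invoke(inputs)


results = evaluate(
    run_triage,
    data=dataset_name,
    experiment_prefix="triage-eval",
)

print(results)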
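An evaluation run only catches routing regressions if something scores the outputs. The following is a minimal sketch of a scoring function, assuming each dataset example also stores a reference `risk_level` output (the example above does not create one). `risk_level_matches` is a hypothetical name, and the `(outputs, reference_outputs)` argument names are an assumption about the evaluator signature accepted by your installed LangSmith version, so check them against its documentation before passing the function via `evaluators=[...]`.

```python
def risk_level_matches(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1 when the predicted risk level equals the reference label, else 0."""
    predicted = str(outputs.get("risk_level", "")).strip().lower()
    expected = str(reference_outputs.get("risk_level", "")).strip().lower()
    # A missing reference label never counts as a match.
    score = int(bool(expected) and predicted == expected)
    return {"key": "risk_level_match", "score": score}


print(risk_level_matches({"risk_level": "High"}, {"risk_level": "high"}))
# {'key': 'risk_level_match', 'score': 1}
print(risk_level_matches({"risk_level": "low"}, {"risk_level": "high"}))
# {'key': 'risk_level_match', 'score': 0}
```

Exact-match scoring is deliberately strict here: for triage, a "medium" prediction on a "high" reference is a miss that should show up in the experiment results.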
Testing the Integration
Run a single end-to-end invocation and confirm that both the workflow result and the trace appear in LangSmith.
result = app.invoke(
    {"symptom_summary": "Fever and persistent cough for one week."},
    config={
        "run_name": "integration-test",
        "tags": ["test", "healthcare"],
        "metadata": {"env": "staging"},
    },
)
print(result)
Expected output (the exact risk_level depends on the model's classification, so it may differ):
{
    'symptom_summary': 'Fever and persistent cough for one week.',
    'risk_level': 'medium',
    'recommendation': 'Schedule same-day nurse review.'
}
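Because the risk level comes from a model, an automated integration test may be more stable if it checks the shape of the result rather than exact values. A minimal sketch under that assumption follows; `check_triage_result` is a hypothetical helper, and the recommendation strings are copied from the workflow defined earlier.

```python
EXPECTED_KEYS = {"symptom_summary", "risk_level", "recommendation"}
KNOWN_RECOMMENDATIONS = {
    "Escalate to urgent care and notify clinician.",
    "Schedule same-day nurse review.",
    "Provide self-care guidance and monitor symptoms.",
}


def check_triage_result(result: dict) -> list:
    """Return a list of problems found in a triage result (empty means OK)."""
    problems = []
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if result.get("recommendation") not in KNOWN_RECOMMENDATIONS:
        problems.append("unexpected recommendation")
    return problems


sample = {
    "symptom_summary": "Fever and persistent cough for one week.",
    "risk_level": "medium",
    "recommendation": "Schedule same-day nurse review.",
}
print(check_triage_result(sample))  # []
```

Passing the output of `app.invoke(...)` to `check_triage_result` gives a pass/fail signal that does not break when the model returns a different but valid risk level.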
In LangSmith, you should see:
- •one parent trace for the graph run
- •child spans for each node
- •input/output payloads per step
- •tags and metadata attached to the run
Real-World Use Cases
- •Clinical intake assistants that route patients based on symptoms, then escalate only high-risk cases to clinicians.
- •Prior authorization workflows that collect evidence documents, classify request urgency, and log every decision path for auditability.
- •Care coordination agents that summarize patient messages, trigger follow-up tasks, and give operations teams full trace visibility in LangSmith.
By Cyprian Aarons, AI Consultant at Topiax.