How to Integrate LangGraph with LangSmith for Production AI in Pension Funds
Opening
If you’re building AI agents for pension fund operations, you need more than a chat loop. LangGraph gives you stateful orchestration for workflows like benefit eligibility checks, contribution exceptions, and retirement case triage, while LangSmith gives you traceability, evaluation, and debugging in production.
Combined, they let you run agentic workflows with audit-friendly traces, failure analysis, and regression testing. That matters when your system touches regulated member data and every decision path needs to be inspectable.
Prerequisites
- Python 3.10+
- A LangChain/LangGraph-compatible environment
- Access to LangSmith with an API key
- A working OpenAI or other model provider key
- `langgraph`, `langsmith`, `langchain`, and a model SDK installed
- Environment variables configured:
  - `LANGSMITH_API_KEY`
  - `LANGSMITH_TRACING=true`
  - `LANGSMITH_PROJECT=pension-fund-agent`
  - `OPENAI_API_KEY`

Install the packages:

```shell
pip install langgraph langsmith langchain-openai
```
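Before moving on, it can help to confirm that the environment variables listed above are actually set. A minimal standard-library sketch; the `missing_env_vars` helper is mine, not part of any SDK:

```python
import os

# Variable names from the Prerequisites list above.
REQUIRED_VARS = [
    "LANGSMITH_API_KEY",
    "LANGSMITH_TRACING",
    "LANGSMITH_PROJECT",
    "OPENAI_API_KEY",
]

def missing_env_vars(env: dict) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_env_vars(dict(os.environ))
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this before the first graph invocation surfaces configuration gaps early, instead of as opaque authentication errors mid-run.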
Integration Steps
- **Set up LangSmith tracing before you build the graph.**

LangSmith works best when tracing is enabled at process startup, so every node execution in your graph gets captured.

```python
import os

os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "pension-fund-agent"
# LANGSMITH_API_KEY is read from the environment by the LangSmith SDK;
# export it before starting the process rather than assigning it here.
```
- **Define the graph state and nodes in LangGraph.**

For a pension fund workflow, keep the state explicit. Here we track the member query, the model response, and a decision flag for routing.

```python
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END


class PensionState(TypedDict):
    query: str
    answer: str
    needs_human_review: bool


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def assess_query(state: PensionState) -> PensionState:
    # Ask the model to classify the query, then flag the case for
    # escalation if the response signals ambiguity or missing data.
    prompt = (
        "You are a pension fund assistant. "
        "Classify whether this query needs human review due to policy ambiguity, "
        "benefit disputes, or missing data.\n\n"
        f"Query: {state['query']}"
    )
    result = llm.invoke(prompt)
    text = result.content.lower()
    return {
        **state,
        "answer": result.content,
        "needs_human_review": any(
            phrase in text
            for phrase in ["human review", "ambiguous", "missing data", "cannot determine"]
        ),
    }
```
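Because escalation hinges on a keyword check, that check is worth testing on its own, without a model call. A stdlib-only sketch of the same heuristic; the `needs_review` helper is an illustration, not part of the graph:

```python
# The phrase list mirrors the one inside assess_query.
ESCALATION_PHRASES = ["human review", "ambiguous", "missing data", "cannot determine"]

def needs_review(model_text: str) -> bool:
    """Mirror of the keyword check in assess_query, usable without an LLM."""
    text = model_text.lower()
    return any(phrase in text for phrase in ESCALATION_PHRASES)

print(needs_review("This case is AMBIGUOUS and needs human review."))  # True
print(needs_review("Your next contribution date is 1 July."))          # False
```

Keeping the phrase list in one shared constant also makes it easy to tune escalation sensitivity without touching the node itself.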
- **Build the workflow graph and compile it.**

This is where LangGraph becomes useful for pension operations. You can route low-risk questions to automation and send high-risk cases to review queues.

```python
def route_case(state: PensionState) -> str:
    return "human_review" if state["needs_human_review"] else "done"


def human_review(state: PensionState) -> PensionState:
    return {
        **state,
        "answer": f"[HUMAN REVIEW REQUIRED] {state['answer']}",
    }


graph = StateGraph(PensionState)
graph.add_node("assess_query", assess_query)
graph.add_node("human_review", human_review)
graph.set_entry_point("assess_query")
graph.add_conditional_edges(
    "assess_query",
    route_case,
    {
        "human_review": "human_review",
        "done": END,
    },
)
graph.add_edge("human_review", END)
app = graph.compile()
```
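The conditional edge is plain Python, so it can be unit-tested without compiling the graph or calling a model. A self-contained sketch; the state values below are illustrative:

```python
from typing import TypedDict

class PensionState(TypedDict):
    query: str
    answer: str
    needs_human_review: bool

def route_case(state: PensionState) -> str:
    # Same routing rule as the conditional edge in the graph.
    return "human_review" if state["needs_human_review"] else "done"

low = PensionState(query="How do I access my statement?", answer="...", needs_human_review=False)
high = PensionState(query="Disputed benefit calculation", answer="...", needs_human_review=True)

print(route_case(low))   # done
print(route_case(high))  # human_review
```

Testing the router in isolation keeps routing regressions out of the slower, model-dependent smoke tests.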
- **Attach LangSmith tracing at runtime.**

If tracing is enabled via environment variables, every `.invoke()` call on the compiled graph is traced automatically through LangChain instrumentation. You can also log explicit run records with `Client` when you want to capture business events like claim escalation or benefit recalculation.

```python
from langsmith import Client

client = Client()

run_metadata = {
    "system": "pension-fund-agent",
    "workflow": "member-query-triage",
}

result = app.invoke(
    {
        "query": "Can I withdraw my pension early if I moved abroad?",
        "answer": "",
        "needs_human_review": False,
    },
    config={
        "metadata": run_metadata,
        "tags": ["pension", "triage", "production"],
    },
)

client.create_run(
    name="pension-triage-request",
    run_type="chain",
    inputs={"query": result["query"]},
    outputs={"answer": result["answer"]},
)
```
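To keep traces filterable, it helps to build metadata and tags the same way on every call. A small stdlib sketch; the `run_config` helper and its field names are my assumptions layered on the example above, not a LangSmith API:

```python
from datetime import datetime, timezone

def run_config(workflow: str, environment: str) -> dict:
    """Build a consistent config dict for app.invoke(..., config=...)."""
    return {
        "metadata": {
            "system": "pension-fund-agent",
            "workflow": workflow,
            # UTC timestamp helps correlate traces with external audit logs.
            "submitted_at": datetime.now(timezone.utc).isoformat(),
        },
        "tags": ["pension", workflow, environment],
    }

cfg = run_config("member-query-triage", "production")
print(cfg["tags"])  # ['pension', 'member-query-triage', 'production']
```

A single helper like this prevents the tag drift that makes LangSmith project filters unreliable across teams.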
- **Add evaluation hooks for production regression testing.**

LangSmith’s real value shows up when you test prompts and graph behavior against known pension scenarios. Store representative cases and compare outputs across versions before deploying changes.

```python
test_cases = [
    {
        "query": "What happens to my pension if I retire at 60?",
        "answer": "",
        "needs_human_review": False,
    },
    {
        "query": "My employer says contributions were deducted but not received.",
        "answer": "",
        "needs_human_review": False,
    },
]

for case in test_cases:
    output = app.invoke(case, config={"tags": ["eval", "pension"]})
    print(output["answer"])
```
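A version comparison can be as simple as checking the routing flag for each stored case. A minimal stdlib sketch, assuming you record an expected flag per case; the expected values below are illustrative:

```python
def check_routing(results: list, expected: list) -> list:
    """Return indices of cases whose needs_human_review flag diverged."""
    return [
        i for i, (out, want) in enumerate(zip(results, expected))
        if out["needs_human_review"] != want
    ]

# Outputs as they might come back from app.invoke for the cases above.
outputs = [
    {"needs_human_review": False},  # retirement-at-60 question
    {"needs_human_review": True},   # missing-contribution dispute
]

failures = check_routing(outputs, expected=[False, True])
print("regressions:", failures)  # regressions: []
```

Running this against the stored cases before each deploy turns a subjective "does it still behave?" check into a pass/fail gate.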
Testing the Integration
Run a simple smoke test with one low-risk query and one ambiguous query. You should see different routing behavior, and the run should appear in LangSmith under your project.
```python
low_risk = app.invoke(
    {
        "query": "How do I update my beneficiary details?",
        "answer": "",
        "needs_human_review": False,
    },
    config={"tags": ["smoke-test"]},
)

high_risk = app.invoke(
    {
        "query": "I think my defined benefit calculation is wrong and I want compensation.",
        "answer": "",
        "needs_human_review": False,
    },
    config={"tags": ["smoke-test"]},
)

print("LOW RISK:", low_risk["answer"])
print("HIGH RISK:", high_risk["answer"])
```
Expected output:

```
LOW RISK: <model answer about beneficiary updates>
HIGH RISK: [HUMAN REVIEW REQUIRED] <model answer indicating ambiguity or escalation>
```
In LangSmith, verify:

- A project named `pension-fund-agent`
- Traces for each `.invoke()` call
- Metadata tags like `pension`, `triage`, or `smoke-test`
- Inputs/outputs visible per run
Real-World Use Cases
- **Member query triage**
  - Route routine questions like contribution dates or statement access to automation.
  - Escalate disputes, missing records, or policy exceptions to humans with full trace context.
- **Benefit eligibility workflows**
  - Chain steps for age checks, vesting status, contribution history, and plan rules.
  - Use LangSmith traces to debug where eligibility decisions diverge from expected outcomes.
- **Compliance-safe agent operations**
  - Record every branch taken by the agent for audit review.
  - Run regression tests on prompts whenever plan rules or policy language changes.
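For the audit point above, one lightweight pattern is to emit a structured record for every branch the agent takes. A stdlib sketch; the record shape and the `audit_record` helper are assumptions, not part of LangSmith:

```python
import json

def audit_record(query: str, route: str, answer: str) -> str:
    """Serialize one routing decision as a JSON line for audit storage."""
    return json.dumps({
        "workflow": "member-query-triage",
        "route": route,  # "done" or "human_review", per the graph's edges
        "query": query,
        # Truncate the answer so audit logs stay small and scannable.
        "answer_preview": answer[:80],
    })

line = audit_record("How do I update my beneficiary details?", "done", "You can update...")
print(line)
```

Appending these lines to durable storage alongside the LangSmith trace ID gives auditors a branch-by-branch record without needing LangSmith access.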
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit