LangGraph Tutorial (TypeScript): adding cost tracking for advanced developers

By Cyprian Aarons · Updated 2026-04-22

This tutorial shows you how to add per-run cost tracking to a LangGraph app in TypeScript by capturing token usage from model calls and aggregating it in graph state. You need this when you want hard numbers for billing, budget alerts, or per-user usage limits instead of guessing from logs.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • @langchain/langgraph
  • @langchain/openai
  • An OpenAI API key set as OPENAI_API_KEY
  • A project configured for ESM or a bundler that supports ES modules
  • Basic familiarity with LangGraph state, nodes, and edges
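
If you're starting from an empty project, install the packages with npm install @langchain/langgraph @langchain/openai @langchain/core; the code below also imports message classes from @langchain/core.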

Step-by-Step

  1. Start with a graph state that can hold both the conversation and the running cost totals. Keep the cost fields explicit so you can persist them later or emit them to your observability stack.
import { Annotation } from "@langchain/langgraph";
import type { BaseMessage } from "@langchain/core/messages";

export const GraphState = Annotation.Root({
  messages: Annotation<BaseMessage[]>({
    default: () => [],
    reducer: (left, right) => left.concat(right),
  }),
  promptTokens: Annotation<number>({
    default: () => 0,
    reducer: (left, right) => left + right,
  }),
  completionTokens: Annotation<number>({
    default: () => 0,
    reducer: (left, right) => left + right,
  }),
  totalCostUsd: Annotation<number>({
    default: () => 0,
    reducer: (left, right) => left + right,
  }),
});

export type GraphStateType = typeof GraphState.State;
  2. Add a small pricing function. This keeps pricing isolated from the graph logic and makes it easy to update when model prices change.
export function calculateCostUsd(
  promptTokens: number,
  completionTokens: number
): number {
  const promptRatePer1M = 5.00; // gpt-4o example
  const completionRatePer1M = 15.00;

  return (
    (promptTokens / 1_000_000) * promptRatePer1M +
    (completionTokens / 1_000_000) * completionRatePer1M
  );
}
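
As a sanity check, 1,000 prompt tokens and 500 completion tokens at these example rates work out to (1,000 / 1,000,000) × 5.00 + (500 / 1,000,000) × 15.00 = 0.005 + 0.0075 = 0.0125 USD, so calculateCostUsd(1000, 500) should return 0.0125 (up to floating-point rounding). The rates themselves are examples; check your provider's current pricing before relying on them.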
  3. Build a node that calls the model and returns both the assistant message and usage metadata. The important part is reading response_metadata.tokenUsage, which is present on OpenAI chat responses through LangChain wrappers.
import { ChatOpenAI } from "@langchain/openai";
import type { GraphStateType } from "./state.js";
import { calculateCostUsd } from "./cost.js";

const model = new ChatOpenAI({
  model: "gpt-4o",
  temperature: 0,
});

export async function assistantNode(state: GraphStateType) {
  const response = await model.invoke(state.messages);

  const tokenUsage = response.response_metadata?.tokenUsage ?? {};
  const promptTokens = tokenUsage.promptTokens ?? tokenUsage.prompt_tokens ?? 0;
  const completionTokens =
    tokenUsage.completionTokens ?? tokenUsage.completion_tokens ?? 0;

  return {
    messages: [response],
    promptTokens,
    completionTokens,
    totalCostUsd: calculateCostUsd(promptTokens, completionTokens),
  };
}
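
If those fields come back empty on your setup, newer @langchain/core versions also attach a normalized usage_metadata field to the returned message. A hedged fallback sketch for the parsing lines inside assistantNode; whether usage_metadata is populated depends on your installed versions:

// Fallback sketch: prefer the normalized usage_metadata when present,
// then fall back to the provider-specific tokenUsage fields.
const usage = response.usage_metadata;
const promptTokens =
  usage?.input_tokens ?? tokenUsage.promptTokens ?? tokenUsage.prompt_tokens ?? 0;
const completionTokens =
  usage?.output_tokens ?? tokenUsage.completionTokens ?? tokenUsage.completion_tokens ?? 0;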
  4. Wire the node into a minimal graph and compile it. This version is enough to test cost tracking end-to-end without adding routing or tool execution yet.
import { StateGraph, START, END } from "@langchain/langgraph";
import { GraphState } from "./state.js";
import { assistantNode } from "./assistant-node.js";

const builder = new StateGraph(GraphState)
  .addNode("assistant", assistantNode)
  .addEdge(START, "assistant")
  .addEdge("assistant", END);

export const graph = builder.compile();
  5. Run the graph with an initial user message and print the tracked usage after execution. If you want this in production, send these values to your database or metrics backend instead of just logging them.
import { HumanMessage } from "@langchain/core/messages";
import { graph } from "./graph.js";

async function main() {
  const result = await graph.invoke({
    messages: [new HumanMessage("Summarize why banks track LLM spend.")],
  });

  console.log({
    promptTokens: result.promptTokens,
    completionTokens: result.completionTokens,
    totalCostUsd: result.totalCostUsd,
    lastMessage:
      result.messages[result.messages.length - 1]?.content?.toString(),
  });
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
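
In production you would call a recording helper instead of console.log. A hypothetical sketch; recordUsage and its destination are placeholders, not LangGraph API, and you would swap the body for your own database or metrics client:

// Hypothetical helper: replace the body with your database insert or
// metrics emission.
export async function recordUsage(
  runId: string,
  usage: { promptTokens: number; completionTokens: number; totalCostUsd: number }
): Promise<void> {
  // e.g. an INSERT into a usage table keyed by run, customer, or workflow
  console.log(`[usage] run=${runId}`, usage);
}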

Testing It

Run the script with a real OPENAI_API_KEY and confirm that promptTokens, completionTokens, and totalCostUsd are all greater than zero after a successful call. If any of those fields stay at zero, inspect the raw response object and verify the metadata shape returned by your model version.
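
A quick way to inspect that shape is to call the model directly and print both metadata fields. A minimal sketch, assuming the same ChatOpenAI setup as above and an ESM entry point:

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

const model = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });
const response = await model.invoke([new HumanMessage("ping")]);

// See exactly what your installed versions return before trusting the
// parsed fields in assistantNode.
console.log(JSON.stringify(response.response_metadata, null, 2));
console.log(JSON.stringify(response.usage_metadata, null, 2));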

Then run a second request with a much longer prompt and compare totals. The token counts should increase predictably, which tells you the accounting is tied to actual usage rather than fixed estimates.

For more confidence, put two invocations through the same compiled graph and make sure each run starts fresh unless you intentionally persist state between runs. In production systems, that separation matters because cost should usually be tracked per request, per customer, or per workflow execution.
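
Both checks fit in one short script. A sketch, assuming the compiled graph from above and no checkpointer configured:

import { HumanMessage } from "@langchain/core/messages";
import { graph } from "./graph.js";

// Run 1: short prompt.
const short = await graph.invoke({
  messages: [new HumanMessage("Hi.")],
});

// Run 2: much longer prompt; promptTokens should grow with it.
const long = await graph.invoke({
  messages: [new HumanMessage("Hi. " + "Here is extra context. ".repeat(200))],
});

// Without a checkpointer, each invoke starts from the annotation defaults,
// so each run's totals reflect only that run's own usage.
console.log({ shortPrompt: short.promptTokens, longPrompt: long.promptTokens });
console.log({ shortCost: short.totalCostUsd, longCost: long.totalCostUsd });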

Next Steps

  • Add middleware that writes totalCostUsd to Postgres or Redis at the end of each run
  • Extend the pricing table for multiple models and route costs by modelName (see the sketch after this list)
  • Track tool-call costs separately so agentic workflows can expose true end-to-end spend
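
For the multi-model pricing bullet, a hedged sketch; the model names and per-million rates below are illustrative, so check them against your provider's current pricing:

// Illustrative rates only; update from your provider's pricing page.
const PRICING_PER_1M: Record<string, { prompt: number; completion: number }> = {
  "gpt-4o": { prompt: 5.0, completion: 15.0 },
  "gpt-4o-mini": { prompt: 0.15, completion: 0.6 },
};

export function costForModel(
  modelName: string,
  promptTokens: number,
  completionTokens: number
): number {
  const rates = PRICING_PER_1M[modelName];
  if (!rates) return 0; // or throw, if unknown models should fail loudly
  return (
    (promptTokens / 1_000_000) * rates.prompt +
    (completionTokens / 1_000_000) * rates.completion
  );
}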


By Cyprian Aarons, AI Consultant at Topiax.
