LangGraph Tutorial (Python): adding cost tracking for beginners

By Cyprian Aarons · Updated 2026-04-22
Tags: langgraph, adding-cost-tracking-for-beginners, python

This tutorial shows how to add per-run cost tracking to a LangGraph app in Python, using real OpenAI token usage data and a small in-memory tracker. You need this when you want to know what each graph execution costs, which nodes are expensive, or when you need audit-friendly spend reporting for internal tools.

What You'll Need

  • Python 3.10+
  • langgraph
  • langchain-openai
  • langchain-core
  • An OpenAI API key in OPENAI_API_KEY
  • Basic familiarity with LangGraph nodes and state
  • A terminal with pip

Install the packages first:

pip install langgraph langchain-openai langchain-core

Step-by-Step

  1. Set up a minimal graph state and a cost tracker.
    We’ll keep the state small and track total cost in a plain Python object. For beginners, this is easier than wiring cost into the graph state itself.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class GraphState(TypedDict):
    prompt: str
    response: str

class CostTracker:
    def __init__(self):
        self.total_cost = 0.0
        self.runs = []

    def add(self, run_name: str, prompt_tokens: int, completion_tokens: int, model: str):
        # gpt-4o-mini list prices in USD per 1M tokens at the time of writing;
        # update these if OpenAI changes pricing or you switch models.
        input_cost = (prompt_tokens / 1_000_000) * 0.15
        output_cost = (completion_tokens / 1_000_000) * 0.60
        run_cost = input_cost + output_cost
        self.total_cost += run_cost
        self.runs.append(
            {"run_name": run_name, "model": model, "cost": run_cost}
        )
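As a quick sanity check that involves no API calls, you can exercise the tracker directly with made-up token counts. The numbers below are purely illustrative:

demo = CostTracker()
demo.add("demo_run", prompt_tokens=1_000, completion_tokens=500, model="gpt-4o-mini")
print(demo.total_cost)  # 0.00045 with the example rates above
print(demo.runs[0])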
  2. Create an LLM call that records token usage after each invocation.
    LangChain's OpenAI chat wrapper returns token counts in response_metadata["token_usage"]. We'll use that metadata to calculate the cost of each node call.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

tracker = CostTracker()

def generate_response(state: GraphState) -> GraphState:
    result = llm.invoke(state["prompt"])
    usage = result.response_metadata.get("token_usage", {})
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)

    tracker.add(
        run_name="generate_response",
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        model="gpt-4o-mini",
    )

    return {"prompt": state["prompt"], "response": result.content}
  3. Build the LangGraph and compile it.
    This graph has one node, which is enough to show the pattern clearly. You can copy the same tracking approach into every node later.
workflow = StateGraph(GraphState)

workflow.add_node("generate_response", generate_response)
workflow.add_edge(START, "generate_response")
workflow.add_edge("generate_response", END)

app = workflow.compile()
  4. Run the graph and print the tracked cost.
    After execution, inspect both the response and the accumulated spend. This is the part you can wire into logs or metrics later.
if __name__ == "__main__":
    result = app.invoke({"prompt": "Write one sentence explaining what LangGraph does.", "response": ""})

    print("Response:", result["response"])
    print("Total cost:", round(tracker.total_cost, 8))

    for item in tracker.runs:
        print(item)
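Once several runs have been recorded, a small aggregation over tracker.runs gives per-node totals, which is handy before wiring anything into real metrics. This helper is not part of the tutorial code above, just a convenience sketch:

from collections import defaultdict

def cost_by_node(tracker: CostTracker) -> dict[str, float]:
    # Sum recorded run costs per node name.
    totals: dict[str, float] = defaultdict(float)
    for run in tracker.runs:
        totals[run["run_name"]] += run["cost"]
    return dict(totals)

print(cost_by_node(tracker))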
  5. Make it more useful by tracking per-node costs separately.
    In real graphs, you usually want node-level visibility so you can see which step is expensive. The simplest way is to give each node its own function and call tracker.add() inside each one.
def summarize(state: GraphState) -> GraphState:
    result = llm.invoke(f"Summarize this in one line: {state['prompt']}")
    usage = result.response_metadata.get("token_usage", {})
    tracker.add(
        run_name="summarize",
        prompt_tokens=usage.get("prompt_tokens", 0),
        completion_tokens=usage.get("completion_tokens", 0),
        model="gpt-4o-mini",
    )
    return {"prompt": state["prompt"], "response": result.content}

Testing It

Run the script with a short prompt first so you can see a non-zero token count without spending much. If everything is wired correctly, you should get a normal model response plus a small total cost printed at the end.

If token_usage comes back empty, check that you are using a chat model that returns usage metadata through LangChain's OpenAI wrapper, and note that streamed responses may omit usage data unless the integration is configured to include it. Also confirm your OPENAI_API_KEY is set in the environment before running the script.

For real validation, call the graph several times with different prompt lengths and watch the total cost increase accordingly. Longer prompts should produce higher prompt-token counts and higher tracked spend.
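A minimal validation loop might look like this; the prompts are arbitrary, and the point is simply that longer inputs should produce higher tracked spend:

for prompt in [
    "Hi.",
    "Summarize the idea of graph-based LLM orchestration in two sentences.",
    "Explain, step by step and in detail, how graph-based LLM orchestration differs from a linear chain of calls.",
]:
    app.invoke({"prompt": prompt, "response": ""})
    print(f"Total so far: {tracker.total_cost:.8f}")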

Next Steps

  • Add a custom callback handler so cost tracking works across every LLM call without editing each node (a starting sketch follows this list).
  • Store per-run metrics in Postgres or SQLite instead of keeping them in memory.
  • Extend the tracker to support multiple models with different pricing tables.
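As a starting point for the callback approach, here is a rough sketch built on LangChain's BaseCallbackHandler. The exact contents of response.llm_output vary by provider and version, so treat the "token_usage" and "model_name" keys as assumptions to verify against your installed packages:

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult

class CostCallback(BaseCallbackHandler):
    def __init__(self, tracker: CostTracker):
        self.tracker = tracker

    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        # llm_output is provider-specific; OpenAI chat models typically
        # report a "token_usage" dict here.
        output = response.llm_output or {}
        usage = output.get("token_usage", {})
        self.tracker.add(
            run_name="llm_call",
            prompt_tokens=usage.get("prompt_tokens", 0),
            completion_tokens=usage.get("completion_tokens", 0),
            model=output.get("model_name", "unknown"),
        )

llm_with_cost = ChatOpenAI(
    model="gpt-4o-mini",
    callbacks=[CostCallback(tracker)],
)

With this in place, every node that uses llm_with_cost is tracked automatically, and you no longer need a tracker.add() call inside each node function.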

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

