Haystack Tutorial (TypeScript): adding cost tracking for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to add cost tracking to a Haystack TypeScript pipeline so every LLM call is measured and logged with token usage and estimated spend. You need this when you’re running multi-step agents, comparing model costs, or putting guardrails around production usage.

What You'll Need

  • Node.js 18+
  • A TypeScript project with tsx, ts-node, or a build step already set up
  • Haystack TypeScript package installed
  • An OpenAI API key in OPENAI_API_KEY
  • A terminal where you can run the script and inspect JSON output
  • Basic familiarity with Haystack pipelines, generators, and components

Step-by-Step

  1. Install the packages you need. For this example, we’ll use Haystack plus OpenAI’s SDK under the hood so we can read token usage from the response metadata.
npm install @haystack-ai/core openai dotenv
npm install -D typescript tsx @types/node
  2. Set your environment variable and define a small helper for cost calculation. Keep the pricing table in code at first; once this is working, move it to config or a database.
import "dotenv/config";

type ModelPricing = {
  inputPer1M: number;
  outputPer1M: number;
};

const PRICING: Record<string, ModelPricing> = {
  "gpt-4o-mini": { inputPer1M: 0.15, outputPer1M: 0.6 },
};

export function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const pricing = PRICING[model];
  if (!pricing) return 0;

  return (
    (inputTokens / 1_000_000) * pricing.inputPer1M +
    (outputTokens / 1_000_000) * pricing.outputPer1M
  );
}
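As a quick sanity check, here is the arithmetic for a hypothetical call that used 1,200 input tokens and 300 output tokens at the gpt-4o-mini rates above:
// (1_200 / 1_000_000) * 0.15 + (300 / 1_000_000) * 0.6
//   = 0.00018 + 0.00018 = 0.00036
console.log(estimateCost("gpt-4o-mini", 1_200, 300)); // ≈ 0.00036 USD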
  3. Build a simple Haystack pipeline that calls an LLM and returns the raw response. The important part is that we keep the model name explicit so the cost calculator can map usage to pricing.
import { Pipeline } from "@haystack-ai/core";
import { OpenAIChatGenerator } from "@haystack-ai/core/components/generators/openai";

const pipeline = new Pipeline();

pipeline.addComponent("generator", new OpenAIChatGenerator({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o-mini",
}));

// A single-component pipeline needs no connect() calls; the generator's
// replies come back directly in the run() result.

const result = await pipeline.run({
  generator: {
    messages: [{ role: "user", content: "Summarize why token tracking matters in one sentence." }],
  },
});

console.log(JSON.stringify(result, null, 2));
  4. Add a wrapper that extracts token usage from the model response and computes cost after each run. In production, this is where you would send metrics to Datadog, Prometheus, CloudWatch, or your audit store.
import { estimateCost } from "./cost.js";

type Usage = {
  prompt_tokens?: number;
  completion_tokens?: number;
  input_tokens?: number;
  output_tokens?: number;
};

function readUsage(rawResult: unknown): { inputTokens: number; outputTokens: number } {
  const result = rawResult as {
    generator?: {
      replies?: Array<{
        meta?: { usage?: Usage };
      }>;
    };
  };

  // Different adapters expose usage under slightly different keys, so check
  // both the OpenAI-style names and the generic ones.
  const usage = result.generator?.replies?.[0]?.meta?.usage;
  return {
    inputTokens: usage?.prompt_tokens ?? usage?.input_tokens ?? 0,
    outputTokens: usage?.completion_tokens ?? usage?.output_tokens ?? 0,
  };
}

const { inputTokens, outputTokens } = readUsage(result);
const costUSD = estimateCost("gpt-4o-mini", inputTokens, outputTokens);

console.log({
  model: "gpt-4o-mini",
  inputTokens,
  outputTokens,
  costUSD: Number(costUSD.toFixed(6)),
});
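If you want a concrete shape for those metrics, here is a minimal sketch of a structured cost event. The emitCostEvent helper and the tenantId and workflowName fields are placeholder names, and console.log stands in for whatever client your metrics backend or audit store uses.
type CostEvent = {
  timestamp: string;
  tenantId: string;
  workflowName: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUSD: number;
};

// Placeholder emitter: swap console.log for your Datadog, Prometheus,
// CloudWatch, or audit-store client.
function emitCostEvent(event: CostEvent): void {
  console.log(JSON.stringify(event));
}

emitCostEvent({
  timestamp: new Date().toISOString(),
  tenantId: "example-tenant",       // placeholder value
  workflowName: "summarize-demo",   // placeholder value
  model: "gpt-4o-mini",
  inputTokens,
  outputTokens,
  costUSD,
});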
  5. Put it together in one executable script so you can run it locally and see both the answer and its estimated cost. This version is what I’d use as a starting point before wiring metrics into a real service boundary.
import "dotenv/config";
import { Pipeline } from "@haystack-ai/core";
import { OpenAIChatGenerator } from "@haystack-ai/core/components/generators/openai";

const PRICING = {
  "gpt-4o-mini": { inputPer1M: 0.15, outputPer1M: 0.6 },
} as const;

function estimateCost(model: keyof typeof PRICING, inputTokens: number, outputTokens: number) {
  const pricing = PRICING[model];
  return (inputTokens / 1_000_000) * pricing.inputPer1M +
         (outputTokens / 1_000_000) * pricing.outputPer1M;
}

const pipeline = new Pipeline();
pipeline.addComponent("generator", new OpenAIChatGenerator({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o-mini",
}));

const result = await pipeline.run({
  generator: {
    messages: [{ role: "user", content: "Give me one practical reason to track LLM costs." }],
  },
});

const reply = result.generator.replies[0];
const usage = (reply.meta?.usage ?? {}) as Record<string, number | undefined>;
const inputTokens = usage.prompt_tokens ?? usage.input_tokens ?? 0;
// Fall back to a rough word count if the adapter exposes no usage metadata.
const outputTokens =
  usage.completion_tokens ?? usage.output_tokens ?? reply.text.split(/\s+/).length;

console.log(reply.text);
console.log({
  model: "gpt-4o-mini",
  inputTokens,
  outputTokens,
  costUSD: Number(estimateCost("gpt-4o-mini", inputTokens, outputTokens).toFixed(6)),
});
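If you saved the script as track-cost.ts (the filename is just an example), run it with tsx:
npx tsx track-cost.ts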

Testing It

Run the script with OPENAI_API_KEY set and confirm you get both an LLM answer and a JSON object with token counts plus costUSD. If inputTokens comes back as 0, inspect the raw reply.meta object because different provider adapters expose usage under slightly different keys.
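A quick way to do that, assuming the final script above, is to dump the metadata once:
console.log(JSON.stringify(result.generator.replies[0]?.meta, null, 2));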

Try three prompts of different lengths and compare the reported cost; longer prompts should increase inputTokens, while more verbose answers should increase outputTokens. If you want a stronger test, log the raw result object once and verify that your parsing logic matches the exact structure returned by your chosen generator component.
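A minimal sketch of that comparison, assuming the pipeline, readUsage, and estimateCost pieces from the steps above:
const prompts = [
  "Define token.",
  "Explain in two sentences why token usage drives LLM cost.",
  "Write a 150-word overview of how teams track and budget LLM spend.",
];

for (const content of prompts) {
  const run = await pipeline.run({
    generator: { messages: [{ role: "user", content }] },
  });
  const { inputTokens, outputTokens } = readUsage(run);
  console.log({
    promptChars: content.length,
    inputTokens,
    outputTokens,
    costUSD: estimateCost("gpt-4o-mini", inputTokens, outputTokens),
  });
}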

Next Steps

  • Move pricing into configuration so finance or platform teams can update rates without code changes.
  • Emit cost events to your observability stack with fields like tenantId, workflowName, model, inputTokens, and costUSD.
  • Add per-request budgets and reject runs that exceed a maximum estimated spend before calling the model (see the sketch below).
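A minimal sketch of such a guard, reusing the estimateCost helper from step 2; the 4-characters-per-token heuristic and the MAX_COST_USD and MAX_OUTPUT_TOKENS values are assumptions to tune for your own models and limits:
import { estimateCost } from "./cost.js";

const MAX_COST_USD = 0.01;      // assumed per-request budget
const MAX_OUTPUT_TOKENS = 512;  // cap you also pass to the generator

export function assertWithinBudget(model: string, prompt: string): void {
  // Rough pre-call estimate: ~4 characters per token (heuristic, not exact).
  const approxInputTokens = Math.ceil(prompt.length / 4);
  const worstCaseUSD = estimateCost(model, approxInputTokens, MAX_OUTPUT_TOKENS);
  if (worstCaseUSD > MAX_COST_USD) {
    throw new Error(
      `Estimated spend $${worstCaseUSD.toFixed(6)} exceeds budget $${MAX_COST_USD}`,
    );
  }
}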

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

