Haystack Tutorial (TypeScript): caching embeddings for beginners

By Cyprian Aarons · Updated 2026-04-21
Tags: haystack, caching-embeddings-for-beginners, typescript

This tutorial shows you how to cache text embeddings in a TypeScript Haystack pipeline so repeated documents do not get re-embedded on every run. You need this when your ingestion job reprocesses the same content often, because embedding calls are slow, expensive, and usually the first thing to optimize.

What You'll Need

  • Node.js 18+ and npm
  • A TypeScript project with ts-node or a build step
  • Haystack TypeScript packages:
    • @haystack/core
    • @haystack/integrations
  • An embedding model provider package supported by your setup
  • An API key for your embedding provider
  • A place to persist cache data:
    • simple local JSON file for learning
    • Redis, PostgreSQL, or a document store for production

Step-by-Step

  1. Start with a plain embedding function and a deterministic cache key.
    The key should be derived from the exact text you embed plus the model name; otherwise you will serve stale vectors from the old model after you switch models.
import crypto from "node:crypto";

function makeEmbeddingCacheKey(model: string, text: string): string {
  return crypto.createHash("sha256").update(`${model}:${text}`).digest("hex");
}

const modelName = "text-embedding-3-small";
const sampleText = "Policy number 12345 was updated yesterday.";

console.log(makeEmbeddingCacheKey(modelName, sampleText));
  2. Create a small cache adapter around your storage layer.
    For beginners, an in-memory Map is enough to prove the pattern. In production, swap the Map for Redis or a database table with the same get and set shape.
type EmbeddingVector = number[];

class EmbeddingCache {
  private store = new Map<string, EmbeddingVector>();

  get(key: string): EmbeddingVector | undefined {
    return this.store.get(key);
  }

  set(key: string, value: EmbeddingVector): void {
    this.store.set(key, value);
  }
}

const cache = new EmbeddingCache();
cache.set("abc", [0.12, 0.34, 0.56]);
console.log(cache.get("abc"));
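Redis and database clients are asynchronous, so if you plan to swap the Map out later, it helps to define the adapter shape as an async interface from the start. Here is a minimal sketch; the `AsyncEmbeddingCache` interface and `InMemoryEmbeddingCache` names are illustrative, not part of Haystack:

```typescript
type EmbeddingVector = number[];

// Async shape so a Redis- or SQL-backed adapter can drop in later
// without changing any pipeline code.
interface AsyncEmbeddingCache {
  get(key: string): Promise<EmbeddingVector | undefined>;
  set(key: string, value: EmbeddingVector): Promise<void>;
}

// In-memory reference implementation of the same interface.
class InMemoryEmbeddingCache implements AsyncEmbeddingCache {
  private store = new Map<string, EmbeddingVector>();

  async get(key: string): Promise<EmbeddingVector | undefined> {
    return this.store.get(key);
  }

  async set(key: string, value: EmbeddingVector): Promise<void> {
    this.store.set(key, value);
  }
}

async function demo() {
  const cache: AsyncEmbeddingCache = new InMemoryEmbeddingCache();
  await cache.set("abc", [0.12, 0.34, 0.56]);
  console.log(await cache.get("abc"));
}

demo().catch(console.error);
```

A production adapter implements the same two methods against a Redis client or a SQL table, and nothing downstream has to change.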
  3. Wire the cache into a real Haystack embedding component.
    The important part is that you check the cache before calling the model, then write the result back after a miss. This keeps your Haystack pipeline logic clean and makes caching invisible to downstream components.
import { OpenAITextEmbedder } from "@haystack/integrations/openai";

const embedder = new OpenAITextEmbedder({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "text-embedding-3-small",
});

async function embedWithCache(text: string): Promise<number[]> {
  const key = makeEmbeddingCacheKey("text-embedding-3-small", text);
  const cached = cache.get(key);

  if (cached) return cached;

  const result = await embedder.run({ text });
  const vector = result.embedding;
  cache.set(key, vector);
  return vector;
}
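One design wrinkle worth knowing: if two documents with the same text are embedded concurrently, both can miss the cache and call the API anyway. A common fix is to cache the in-flight promise rather than only the resolved vector. The sketch below uses a hypothetical `fakeEmbed` stand-in so it runs without an API key; in your pipeline the real embedder call takes its place:

```typescript
// Cache the in-flight promise so concurrent requests for the same
// text share one API call instead of both missing the cache.
const pending = new Map<string, Promise<number[]>>();

let apiCalls = 0;

// Stand-in for the real embedder call; replace with embedder.run(...).
async function fakeEmbed(text: string): Promise<number[]> {
  apiCalls += 1;
  return [text.length, 0.1, 0.2];
}

function embedDeduped(text: string): Promise<number[]> {
  const existing = pending.get(text);
  if (existing) return existing;

  const promise = fakeEmbed(text);
  pending.set(text, promise);
  return promise;
}

async function demo() {
  // Two concurrent calls for the same text trigger only one embed.
  await Promise.all([embedDeduped("hello"), embedDeduped("hello")]);
  console.log(apiCalls); // 1
}

demo().catch(console.error);
```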
  4. Use the cached embedder inside a simple ingestion loop.
    This is where you get the real benefit: duplicate records skip the external API call entirely. Notice that repeated text values reuse the same cached vector even if they appear in different documents.
const texts = [
  "Customer requested a card replacement.",
  "Customer requested a card replacement.",
  "Claim status changed to pending review.",
];

async function main() {
  for (const text of texts) {
    const vector = await embedWithCache(text);
    console.log(text, vector.length);
  }
}

main().catch(console.error);
  5. Persist cache entries if you want them across restarts.
    A local Map disappears when Node exits, which is fine for learning but useless in real pipelines. The production version should store { key, model, text_hash, embedding_json, created_at } in Redis or SQL so reruns stay warm.
import fs from "node:fs";

type CacheRecord = {
  key: string;
  embedding: number[];
};

class FileBackedEmbeddingCache {
  private store = new Map<string, number[]>();

  constructor(private filePath: string) {
    // Warm the in-memory map from disk if a previous run left a cache file.
    if (fs.existsSync(filePath)) {
      const records: CacheRecord[] = JSON.parse(fs.readFileSync(filePath, "utf8"));
      for (const record of records) {
        this.store.set(record.key, record.embedding);
      }
    }
  }

  get(key: string): number[] | undefined {
    return this.store.get(key);
  }

  set(record: CacheRecord): void {
    this.store.set(record.key, record.embedding);
    this.flush();
  }

  // Rewrites the whole file on every set: fine for a tutorial,
  // too slow for high-volume pipelines (use Redis or SQL there).
  private flush(): void {
    const records: CacheRecord[] = [...this.store].map(
      ([key, embedding]) => ({ key, embedding }),
    );
    fs.writeFileSync(this.filePath, JSON.stringify(records));
  }
}

const persistentCache = new FileBackedEmbeddingCache("embedding-cache.json");
persistentCache.set({ key: "demo", embedding: [1, 2, 3] });
console.log(persistentCache.get("demo"));

Testing It

Run the script twice with the same input texts. On the first run, every unique string should trigger an API call; on the second run, repeated strings should hit the cache and return immediately.

If you want proof beyond console output, add logging around the cache branch and count misses versus hits. For example, log "cache hit" when cache.get(key) returns data and "cache miss" right before calling embedder.run().
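That logging can be folded into a tiny instrumented wrapper. Here is a sketch assuming the same get/set shape as the tutorial's cache; the `CountingCache` name is made up for illustration:

```typescript
// Counts hits and misses so a second run can prove the cache works.
class CountingCache {
  private store = new Map<string, number[]>();
  hits = 0;
  misses = 0;

  get(key: string): number[] | undefined {
    const value = this.store.get(key);
    if (value) {
      this.hits += 1;
      console.log("cache hit", key);
    } else {
      this.misses += 1;
      console.log("cache miss", key);
    }
    return value;
  }

  set(key: string, value: number[]): void {
    this.store.set(key, value);
  }
}

const stats = new CountingCache();
stats.get("a");            // miss
stats.set("a", [0.1]);
stats.get("a");            // hit
console.log({ hits: stats.hits, misses: stats.misses }); // { hits: 1, misses: 1 }
```

On a rerun over the same corpus you want the miss count near zero; a high miss rate usually means your cache key includes something nondeterministic.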

A good sanity check is changing only one character in the input text. That should produce a different hash and force a fresh embedding call.
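This sanity check can be scripted directly with the key helper from step 1, repeated here so the snippet runs on its own:

```typescript
import crypto from "node:crypto";

// Same helper as in step 1, repeated so this snippet is standalone.
function makeEmbeddingCacheKey(model: string, text: string): string {
  return crypto.createHash("sha256").update(`${model}:${text}`).digest("hex");
}

const model = "text-embedding-3-small";
const original = "Customer requested a card replacement.";
const edited = "Customer requested a card replacement!"; // one character changed

const keyA = makeEmbeddingCacheKey(model, original);
const keyB = makeEmbeddingCacheKey(model, edited);

// A single-character edit yields a completely different key,
// forcing a fresh embedding call on the next run.
console.log(keyA === keyB); // false
```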

Next Steps

  • Replace the in-memory Map with Redis and add TTLs for stale embeddings
  • Store embeddings by document chunk hash so re-indexing only processes changed chunks
  • Add cache metrics like hit rate and average embedding latency to your pipeline logs
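The first bullet can be prototyped without Redis by attaching an expiry timestamp to each entry; this mirrors what a Redis TTL would handle for you. A minimal in-memory sketch with illustrative names:

```typescript
type TimedEntry = {
  embedding: number[];
  expiresAt: number; // epoch milliseconds
};

// In-memory cache with a per-entry TTL; expired entries read as misses.
class TtlEmbeddingCache {
  private store = new Map<string, TimedEntry>();

  constructor(private ttlMs: number) {}

  get(key: string, now: number = Date.now()): number[] | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.store.delete(key); // lazy eviction on read
      return undefined;
    }
    return entry.embedding;
  }

  set(key: string, embedding: number[], now: number = Date.now()): void {
    this.store.set(key, { embedding, expiresAt: now + this.ttlMs });
  }
}

const ttlCache = new TtlEmbeddingCache(60_000); // one-minute TTL
ttlCache.set("k", [0.1, 0.2]);
console.log(ttlCache.get("k") !== undefined);                       // true
console.log(ttlCache.get("k", Date.now() + 120_000) !== undefined); // false
```

The explicit `now` parameter keeps the expiry logic testable without sleeping; with Redis you would drop it and let the server expire keys.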

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
