# AutoGen Tutorial (TypeScript): caching embeddings for beginners
This tutorial shows you how to cache embeddings in a TypeScript AutoGen app so repeated document lookups stop paying the embedding cost every time. You need this when your agent re-processes the same chunks across runs, because caching cuts latency and reduces API spend.
## What You'll Need
- Node.js 18+
- A TypeScript project with `ts-node` or a build step
- AutoGen for TypeScript: `@autogenai/autogen`
- An OpenAI-compatible embedding model API key (`OPENAI_API_KEY`)
- A place to store cached embeddings; for this tutorial, a local JSON file is enough
- Basic familiarity with embeddings, chunks, and `async/await`
## Step-by-Step
**Step 1: Install the dependencies and set up your environment.**

We only need AutoGen plus a small filesystem helper for the cache file.

```bash
npm install @autogenai/autogen dotenv
npm install -D typescript ts-node @types/node
```
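If you use `dotenv`, the key can live in a `.env` file at the project root. A minimal example (the key value is a placeholder):

```bash
# .env, read at startup via the "dotenv/config" import
OPENAI_API_KEY=sk-your-key-here
```

Once the files below are in place, you can run everything with `npx ts-node index.ts`.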
**Step 2: Create a simple cache layer for embeddings.**

This cache builds a key from the model name and the raw chunk text, then stores vectors in a JSON file.
```ts
// cache.ts
import fs from "node:fs/promises";
import path from "node:path";

const CACHE_FILE = path.resolve(process.cwd(), ".embedding-cache.json");

export type EmbeddingCache = Record<string, number[]>;

// Load the cache from disk, falling back to an empty object on first run.
export async function loadCache(): Promise<EmbeddingCache> {
  try {
    const raw = await fs.readFile(CACHE_FILE, "utf8");
    return JSON.parse(raw) as EmbeddingCache;
  } catch {
    return {};
  }
}

export async function saveCache(cache: EmbeddingCache): Promise<void> {
  await fs.writeFile(CACHE_FILE, JSON.stringify(cache, null, 2), "utf8");
}

// Keying on model + text means switching models never returns stale vectors.
export function makeCacheKey(model: string, text: string): string {
  return `${model}:${text}`;
}
```
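A quick way to sanity-check the module before wiring it up; a hypothetical scratch script (`scratch.ts` is not part of the final app):

```ts
// scratch.ts: smoke test for the cache module's round-trip behavior.
import { loadCache, saveCache, makeCacheKey } from "./cache";

async function smoke() {
  const cache = await loadCache();
  const key = makeCacheKey("demo-model", "hello world");
  cache[key] = [0.1, 0.2, 0.3]; // fake vector, just to prove persistence
  await saveCache(cache);

  const reloaded = await loadCache();
  console.log(reloaded[key]); // [ 0.1, 0.2, 0.3 ]
}

smoke().catch(console.error);
```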
**Step 3: Wire AutoGen’s embedding client into a cached wrapper.**

The wrapper checks the cache first, calls the model only on misses, and persists new vectors immediately.
```ts
// embeddings.ts
import "dotenv/config";
import { OpenAIClient } from "@autogenai/autogen";
import { loadCache, saveCache, makeCacheKey } from "./cache";

const client = new OpenAIClient({
  apiKey: process.env.OPENAI_API_KEY!,
});

const EMBEDDING_MODEL = "text-embedding-3-small";

export async function embedText(text: string): Promise<number[]> {
  const cache = await loadCache();
  const key = makeCacheKey(EMBEDDING_MODEL, text);

  // Cache hit: return the stored vector without calling the API.
  if (cache[key]) return cache[key];

  // Cache miss: embed the text, then persist the new vector immediately.
  const result = await client.embeddings.create({
    model: EMBEDDING_MODEL,
    input: text,
  });

  const vector = result.data[0].embedding;
  cache[key] = vector;
  await saveCache(cache);
  return vector;
}
```
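If you embed many chunks per run, one API call per miss adds up. Below is a sketch of a batched variant; `embedTexts` is a hypothetical helper meant to live alongside `embedText` in embeddings.ts (reusing its `client` and `EMBEDDING_MODEL`), and it assumes the client accepts an array `input` and returns vectors in input order, as the OpenAI API it mirrors does.

```ts
// Hypothetical batched variant: one API call covers every cache miss.
// Add alongside embedText in embeddings.ts to reuse client and EMBEDDING_MODEL.
export async function embedTexts(texts: string[]): Promise<number[][]> {
  const cache = await loadCache();
  const keys = texts.map((t) => makeCacheKey(EMBEDDING_MODEL, t));

  // Indexes of texts whose vectors are not cached yet.
  const misses = keys
    .map((key, i) => (cache[key] ? -1 : i))
    .filter((i) => i !== -1);

  if (misses.length > 0) {
    const result = await client.embeddings.create({
      model: EMBEDDING_MODEL,
      input: misses.map((i) => texts[i]), // assumes array input is accepted
    });
    // Vectors come back in input order, so pair each with its original key.
    result.data.forEach((item, j) => {
      cache[keys[misses[j]]] = item.embedding;
    });
    await saveCache(cache);
  }

  return keys.map((key) => cache[key]);
}
```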
**Step 4: Use the cached embeddings in an AutoGen workflow.**

This example embeds two chunks twice; the second pass should hit the cache instead of calling the API again.
```ts
// index.ts
import { embedText } from "./embeddings";

async function main() {
  const chunks = [
    "AutoGen agents can use embeddings to retrieve relevant context.",
    "Caching embeddings avoids recomputing vectors for repeated text.",
  ];

  // First pass: every chunk is a cache miss and calls the API.
  for (const chunk of chunks) {
    const vector = await embedText(chunk);
    console.log("vector length:", vector.length);
  }

  // Second pass: every chunk should now be served from the cache.
  for (const chunk of chunks) {
    const vector = await embedText(chunk);
    console.log("cached vector length:", vector.length);
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
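To see the difference concretely, you can time each pass with `console.time`; a hypothetical tweak to the loops in `main()`:

```ts
// Wrap the two passes in timers; the second should be dramatically faster.
console.time("first pass (API calls)");
for (const chunk of chunks) {
  await embedText(chunk);
}
console.timeEnd("first pass (API calls)");

console.time("second pass (cache hits)");
for (const chunk of chunks) {
  await embedText(chunk);
}
console.timeEnd("second pass (cache hits)");
```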
**Step 5: Add one small guard so whitespace changes don’t break your cache hit rate.**

In real systems, chunking often introduces extra spaces and line breaks, so normalize the text before hashing or building the key.
```ts
// normalize.ts
export function normalizeText(text: string): string {
  // Collapse runs of whitespace to single spaces and trim the ends.
  return text.trim().replace(/\s+/g, " ");
}
```
Then use it in your key builder:
```ts
// cache.ts (updated; loadCache and saveCache are unchanged)
import fs from "node:fs/promises";
import path from "node:path";
import { normalizeText } from "./normalize";

const CACHE_FILE = path.resolve(process.cwd(), ".embedding-cache.json");

export type EmbeddingCache = Record<string, number[]>;

// ...loadCache and saveCache exactly as before...

export function makeCacheKey(model: string, text: string): string {
  // "foo  bar" and "foo bar" now resolve to the same cache entry.
  return `${model}:${normalizeText(text)}`;
}
```
## Testing It
Run the script once and watch it create `.embedding-cache.json` in your project root. The first pass should call the embedding API for each chunk.
Run it again with the same input. This time you should see the same vector lengths printed, but no new entries added to the cache file.
If you want to confirm behavior more directly, add a log inside `embedText()` for "cache hit" and "cache miss". In production code, that log should become metrics instead of console output.
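A minimal version of that log, as a sketch of the two branches inside `embedText()`:

```ts
// Inside embedText(), replace the early return with a logged hit/miss.
if (cache[key]) {
  console.log("cache hit:", text.slice(0, 40));
  return cache[key];
}
console.log("cache miss:", text.slice(0, 40));
```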
## Next Steps
- Replace the JSON file with Redis or Postgres when you need shared caching across workers.
- Add TTL or versioning so you can invalidate cached vectors when you change models.
- Extend this pattern to chunk hashes so you cache by document content instead of raw text strings (see the sketch below).
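As a sketch of that last point, a hypothetical key builder that hashes the normalized text with Node's built-in `node:crypto`, which keeps keys short and uniform even for very large chunks:

```ts
// Hypothetical content-hash key builder (drop-in for makeCacheKey).
import { createHash } from "node:crypto";
import { normalizeText } from "./normalize";

export function makeHashedCacheKey(model: string, text: string): string {
  const digest = createHash("sha256").update(normalizeText(text)).digest("hex");
  return `${model}:${digest}`;
}
```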
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.