LangChain Tutorial (TypeScript): caching embeddings for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to cache embeddings in a LangChain TypeScript app so repeated requests for the same text don’t keep hitting your embedding provider. You need this when you’re indexing the same documents multiple times, re-running tests, or rebuilding retrieval pipelines where embedding calls are one of the slowest and most expensive parts.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • A LangChain TypeScript project already set up
  • An OpenAI API key in OPENAI_API_KEY
  • These packages:
    • langchain
    • @langchain/openai
    • @langchain/community
    • typescript
    • tsx or another TypeScript runner
    • dotenv (the examples load OPENAI_API_KEY via import "dotenv/config")

Step-by-Step

  1. Start with a normal embedding model. This is the base object LangChain will wrap with a cache layer, so keep it simple and deterministic.
import "dotenv/config";
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

async function main() {
  const vector = await embeddings.embedQuery("LangChain caching example");
  console.log(vector.length);
}

main().catch(console.error);
  2. Add a cache store. For local development, a LocalFileStore pointed at a scratch directory is enough to prove the pattern, but in production you usually want Redis or another shared store. The important part is that the store remembers vectors keyed by a hash of the text content, prefixed with your namespace. Note that CacheBackedEmbeddings only caches embedDocuments(); embedQuery() goes straight to the underlying model.
import "dotenv/config";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { LocalFileStore } from "@langchain/community/storage/file_system";

const underlying = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const store = await LocalFileStore.fromPath("./embedding-cache");

const cachedEmbeddings = CacheBackedEmbeddings.fromBytesStore(
  underlying,
  store,
  {
    namespace: "openai-text-embedding-3-small",
  }
);

async function main() {
  // Note: embedQuery() bypasses the cache by default; only embedDocuments()
  // reads from and writes to the byte store.
  const [vector] = await cachedEmbeddings.embedDocuments([
    "LangChain caching example",
  ]);
  console.log(vector.length);
}

main().catch(console.error);
  3. Embed multiple documents through the cached wrapper. The first call will populate the cache, and later calls with the same texts should reuse stored vectors instead of calling OpenAI again.
import "dotenv/config";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { LocalFileStore } from "@langchain/community/storage/file_system";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const store = await LocalFileStore.fromPath("./embedding-cache");

const cachedEmbeddings = CacheBackedEmbeddings.fromBytesStore(
  embeddings,
  store,
  { namespace: "docs-v1" }
);

async function main() {
  const texts = [
    "Claims processing workflow",
    "Underwriting risk review",
    "Claims processing workflow",
  ];

  const vectors = await cachedEmbeddings.embedDocuments(texts);
  console.log(vectors.map((v) => v.length));
}

main().catch(console.error);
  4. Use the same cached embeddings inside a retriever pipeline. This is where caching starts paying off, because document ingestion often gets repeated across deployments, test runs, or backfills.
import "dotenv/config";
import { Document } from "@langchain/core/documents";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { LocalFileStore } from "@langchain/community/storage/file_system";

const baseEmbeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const cacheStore = await LocalFileStore.fromPath("./embedding-cache");

const embeddings = CacheBackedEmbeddings.fromBytesStore(
  baseEmbeddings,
  cacheStore,
  { namespace: "policy-docs-v1" }
);

async function main() {
  const docs = [
    new Document({ pageContent: "A claim must be reviewed within five business days." }),
    new Document({ pageContent: "Policyholders can request a coverage summary." }),
  ];

  const vectorStore = await MemoryVectorStore.fromDocuments(docs, embeddings);
  const results = await vectorStore.similaritySearch("claim review timeline", 1);

  console.log(results[0].pageContent);
}

main().catch(console.error);
  5. If you want this to survive process restarts cleanly, keep the namespace stable and version it when your source data changes. A namespace like policy-docs-v1 means old cached vectors won’t collide with a new embedding model or a major document rewrite.
function getEmbeddingNamespace(model: string, datasetVersion: string) {
  return `${model}:${datasetVersion}`;
}

console.log(getEmbeddingNamespace("text-embedding-3-small", "policy-docs-v1"));
console.log(getEmbeddingNamespace("text-embedding-3-small", "policy-docs-v2"));

Testing It

Run the script twice with the same inputs. On the first run, LangChain will create cache files under ./embedding-cache; on the second run, it should reuse those entries and avoid recomputing identical embeddings.

If you want to verify behavior more aggressively, add timing around embedDocuments() and compare first-run vs second-run latency. You can also inspect your API usage in the OpenAI dashboard to confirm repeated runs stop increasing embedding calls for unchanged text.
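
A minimal timing harness along those lines might look like this (the "timing-test" namespace is arbitrary). The second iteration should be dramatically faster, since every text is already in the byte store:

import "dotenv/config";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { LocalFileStore } from "@langchain/community/storage/file_system";

const cached = CacheBackedEmbeddings.fromBytesStore(
  new OpenAIEmbeddings({ model: "text-embedding-3-small" }),
  await LocalFileStore.fromPath("./embedding-cache"),
  { namespace: "timing-test" }
);

const texts = ["Claims processing workflow", "Underwriting risk review"];

for (const label of ["first run", "second run"]) {
  const start = performance.now();
  await cached.embedDocuments(texts);
  console.log(`${label}: ${(performance.now() - start).toFixed(0)} ms`);
}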

For a production-style test, change one document string and leave the others untouched. Only the modified text should trigger a fresh embedding request if your cache keying is working correctly.
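
If you want that check to be self-verifying, you can count how many texts actually reach the provider. CountingEmbeddings below is a hypothetical helper written for this tutorial, not a LangChain class:

import "dotenv/config";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { LocalFileStore } from "@langchain/community/storage/file_system";

// Hypothetical helper: counts how many texts are sent to the embedding API.
class CountingEmbeddings extends OpenAIEmbeddings {
  embeddedTexts = 0;

  async embedDocuments(texts: string[]): Promise<number[][]> {
    this.embeddedTexts += texts.length;
    return super.embedDocuments(texts);
  }
}

const counter = new CountingEmbeddings({ model: "text-embedding-3-small" });

const cached = CacheBackedEmbeddings.fromBytesStore(
  counter,
  await LocalFileStore.fromPath("./embedding-cache"),
  { namespace: "count-test" }
);

await cached.embedDocuments(["unchanged text", "original text"]);
console.log(counter.embeddedTexts); // 2 on a cold cache

await cached.embedDocuments(["unchanged text", "edited text"]);
console.log(counter.embeddedTexts); // 3: only the edited string was re-embedded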

Next Steps

  • Swap LocalFileStore for Redis-backed storage so multiple app instances share the same embedding cache (see the sketch after this list).
  • Add namespace versioning tied to your document pipeline so stale vectors don’t survive schema or content changes.
  • Cache chunked document embeddings during ingestion before building your vector index, not after retrieval starts failing.
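
For the Redis option, a minimal sketch looks like the following. It assumes a Redis server reachable on localhost and the ioredis package installed alongside @langchain/community:

import "dotenv/config";
import { Redis } from "ioredis";
import { OpenAIEmbeddings } from "@langchain/openai";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { RedisByteStore } from "@langchain/community/storage/ioredis";

// Defaults to localhost:6379; pass connection options for a remote instance.
const client = new Redis();

const cachedEmbeddings = CacheBackedEmbeddings.fromBytesStore(
  new OpenAIEmbeddings({ model: "text-embedding-3-small" }),
  new RedisByteStore({ client }),
  { namespace: "policy-docs-v1" }
);

// Every app instance pointed at this Redis now shares one embedding cache.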

By Cyprian Aarons, AI Consultant at Topiax.