LangChain Tutorial (TypeScript): building a RAG pipeline for beginners
This tutorial builds a minimal Retrieval-Augmented Generation (RAG) pipeline in TypeScript using LangChain. You’ll load documents, split them into chunks, embed them into a vector store, and answer questions with retrieved context instead of relying on the model’s memory alone.
What You'll Need

- Node.js 18+
- A TypeScript project with `ts-node` or `tsx`
- An OpenAI API key set as `OPENAI_API_KEY`
- These packages:
  - `langchain`
  - `@langchain/openai`
  - `@langchain/community`
  - `@langchain/core`
  - `typescript`
  - `tsx` or `ts-node`

Install everything with:

```bash
npm install langchain @langchain/openai @langchain/community @langchain/core
npm install -D typescript tsx @types/node
```
Step-by-Step
Start by creating an embeddings model and a chat model. In RAG, embeddings power retrieval, while the chat model turns retrieved context into an answer.

```typescript
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
```
Next, load a few documents and split them into chunks. Chunking matters because retrieval works better on smaller pieces than on long raw files.

```typescript
import { Document } from "@langchain/core/documents";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const docs = [
  new Document({
    pageContent:
      "RAG combines retrieval and generation. The retriever fetches relevant chunks from a vector store.",
    metadata: { source: "notes-1" },
  }),
  new Document({
    pageContent:
      "A vector store keeps embeddings for each chunk. Similarity search returns the closest matches to a query.",
    metadata: { source: "notes-2" },
  }),
];

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 100,
  chunkOverlap: 20,
});

const splitDocs = await splitter.splitDocuments(docs);
console.log(`Split into ${splitDocs.length} chunks`);
```
Now store those chunks in a vector database. For beginners, an in-memory vector store is enough to understand the flow without adding infrastructure.

```typescript
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const vectorStore = await MemoryVectorStore.fromDocuments(
  splitDocs,
  embeddings
);

// Return the top 2 most similar chunks for each query.
const retriever = vectorStore.asRetriever(2);
```
Build the retrieval chain and tell the model how to use the context. This is the core RAG pattern: retrieve first, then generate using only the retrieved text.

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";
import {
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer only using the provided context. If the context does not contain the answer, say you don't know.",
  ],
  ["human", "Question: {question}\n\nContext:\n{context}"],
]);

// Join the retrieved chunks into a single context string.
const formatDocs = (docs: Document[]) =>
  docs.map((doc) => doc.pageContent).join("\n\n");

const ragChain = RunnableSequence.from([
  {
    question: new RunnablePassthrough(),
    context: retriever.pipe(formatDocs),
  },
  prompt,
  llm,
  new StringOutputParser(),
]);
```
Finally, ask a question and print the result. Keep your first test simple so you can verify retrieval is actually happening before you expand the pipeline.

```typescript
const question = "What does RAG combine?";
const answer = await ragChain.invoke(question);

console.log("Question:", question);
console.log("Answer:", answer);
```
Testing It
Run the file with tsx or your preferred TypeScript runner:
```bash
npx tsx src/rag.ts
```
If everything is wired correctly, the output should mention that RAG combines retrieval and generation. Try asking something that exists in one of your chunks, then ask something unrelated like “What is my bank balance?”; the chain should respond that it does not know.
If retrieval looks wrong, print the retrieved chunks before sending them to the model. That tells you whether the issue is chunking, embeddings, or prompt design.
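To inspect retrieval directly, you can call `await retriever.invoke(question)` before invoking the chain and log what comes back. Here is a minimal sketch of a logging helper; the `RetrievedDoc` type and `debugChunks` name are my own for illustration, not LangChain APIs:

```typescript
// Minimal shape of a retrieved chunk; LangChain's Document exposes
// pageContent and metadata fields like this.
type RetrievedDoc = { pageContent: string; metadata: { source?: string } };

// Format each retrieved chunk with its index and source for logging.
const debugChunks = (docs: RetrievedDoc[]): string[] =>
  docs.map(
    (doc, i) =>
      `[${i}] (${doc.metadata.source ?? "unknown"}) ${doc.pageContent}`
  );

// In the pipeline above you would log the real chunks before generation:
//   const retrieved = await retriever.invoke(question);
//   console.log(debugChunks(retrieved).join("\n"));
const sample: RetrievedDoc[] = [
  {
    pageContent: "RAG combines retrieval and generation.",
    metadata: { source: "notes-1" },
  },
];
console.log(debugChunks(sample).join("\n"));
```

Seeing the source of each chunk makes it obvious whether the wrong document is winning the similarity search or the right chunk simply got split badly.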
Next Steps
- Replace `MemoryVectorStore` with a persistent store like Pinecone, pgvector, or Weaviate.
- Add metadata filters so you can scope retrieval by tenant, product line, or document type.
- Learn LangChain’s LCEL composition patterns so you can add reranking, query rewriting, and citations cleanly.
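As a starting point for metadata filters, `MemoryVectorStore` accepts a predicate over documents when you build a retriever (persistent stores like Pinecone use their own filter syntax instead). A sketch, assuming the `asRetriever({ k, filter })` form; the `fromSource` helper and `DocLike` type are names of my own:

```typescript
// Minimal document shape; matches the metadata used in the tutorial docs.
type DocLike = { metadata: { source?: string } };

// A reusable predicate factory: keep only chunks from one source.
const fromSource =
  (source: string) =>
  (doc: DocLike): boolean =>
    doc.metadata.source === source;

// In the pipeline above you could scope retrieval like this:
//   const scopedRetriever = vectorStore.asRetriever({
//     k: 2,
//     filter: fromSource("notes-1"),
//   });
console.log(fromSource("notes-1")({ metadata: { source: "notes-1" } })); // true
```

Keeping the predicate as a pure function makes it easy to unit-test your scoping logic separately from the vector store.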
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.