LangChain Tutorial (TypeScript): building a RAG pipeline for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial builds a production-shaped Retrieval-Augmented Generation pipeline in TypeScript using LangChain: load documents, split them, embed them into a vector store, and answer user questions with retrieved context. You need this when a plain LLM is not enough and you want answers grounded in your own PDFs, docs, or knowledge base instead of model memory.

What You'll Need

  • Node.js 18+ installed
  • A TypeScript project initialized
  • An OpenAI API key exported as OPENAI_API_KEY
  • Packages:
    • langchain
    • @langchain/openai
    • @langchain/community
    • @langchain/core
    • typescript
    • tsx or ts-node for running TypeScript directly
  • A source document to index, for example:
    • PDF
    • .txt
    • Markdown files

Install the dependencies:

npm install langchain @langchain/openai @langchain/community @langchain/core
npm install -D typescript tsx @types/node
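
Optionally, before running any of the scripts below, a quick fail-fast check confirms the API key is actually visible to Node. This is just a suggestion, not part of LangChain itself; the file name check-env.ts is arbitrary, and you can run it with npx tsx check-env.ts:

// check-env.ts - fail fast if the OpenAI key isn't exported in this shell
if (!process.env.OPENAI_API_KEY) {
  console.error("OPENAI_API_KEY is not set. Export it before running the RAG scripts.");
  process.exit(1);
}
console.log("OPENAI_API_KEY found, you're good to go.");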

Step-by-Step

  1. Start by loading a document from disk and splitting it into chunks. RAG works better when you retrieve smaller, focused chunks instead of stuffing whole documents into the prompt. (A quick way to eyeball the resulting chunks follows the code below.)
import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

async function main() {
  const loader = new TextLoader("./data/policy.txt");
  const docs = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 800,
    chunkOverlap: 120,
  });

  const chunks = await splitter.splitDocuments(docs);
  console.log(`Loaded ${docs.length} doc(s), split into ${chunks.length} chunk(s)`);
}

main();
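
If you want to sanity-check what the splitter produced before embedding anything, printing one chunk's text and metadata is usually enough. A small standalone sketch, assuming the same policy.txt file as above:

import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

async function inspectChunks() {
  const docs = await new TextLoader("./data/policy.txt").load();
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 800, chunkOverlap: 120 });
  const chunks = await splitter.splitDocuments(docs);

  // Peek at the first chunk: its text should be a focused slice of the source,
  // and its metadata records where it came from.
  console.log(chunks[0].pageContent);
  console.log(chunks[0].metadata);
}

inspectChunks();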
  2. Next, embed the chunks and store them in a vector database. For local development, an in-memory store is fine; in production, swap it for Pinecone, pgvector, or another persistent backend (a rough pgvector sketch follows the code below).
import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function buildIndex() {
  const loader = new TextLoader("./data/policy.txt");
  const docs = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 800,
    chunkOverlap: 120,
  });

  const chunks = await splitter.splitDocuments(docs);
  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);

  return vectorStore;
}

buildIndex().then(() => console.log("Vector index ready"));
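
The in-memory store above disappears when the process exits. As one possible production-shaped swap, here is a rough sketch of a pgvector-backed index. The connection details, table name, and the buildPersistentIndex helper are placeholders, you would also need npm install pg plus the pgvector extension enabled in Postgres, and the config shape follows the @langchain/community docs, so double-check it against the version you install:

import type { Document } from "@langchain/core/documents";
import { OpenAIEmbeddings } from "@langchain/openai";
import { PGVectorStore } from "@langchain/community/vectorstores/pgvector";

// Placeholder connection details - point these at your own Postgres instance.
const config = {
  postgresConnectionOptions: {
    host: "localhost",
    port: 5432,
    user: "postgres",
    password: "change-me",
    database: "rag_demo",
  },
  tableName: "policy_chunks",
  columns: {
    idColumnName: "id",
    vectorColumnName: "vector",
    contentColumnName: "content",
    metadataColumnName: "metadata",
  },
};

// Hypothetical helper: embed the chunks and write them to Postgres instead of memory.
export async function buildPersistentIndex(chunks: Document[]) {
  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
  const store = await PGVectorStore.initialize(embeddings, config);
  await store.addDocuments(chunks);
  return store;
}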
  3. Now wire retrieval to an LLM. The retriever finds the most relevant chunks for a question, and the model uses those chunks as context to generate an answer. (A retriever-only debugging snippet follows the code below.)
import { ChatOpenAI } from "@langchain/openai";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function main() {
  const loader = new TextLoader("./data/policy.txt");
  const docs = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 800, chunkOverlap: 120 });
  const chunks = await splitter.splitDocuments(docs);

  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);
  const retriever = vectorStore.asRetriever(4);

  const llm = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", "Answer only using the provided context. If the context is insufficient, say you don't know."],
    ["human", "{input}"],
    ["system", "Context:\n{context}"],
  ]);

  const combineDocsChain = await createStuffDocumentsChain({ llm, prompt });
  const retrievalChain = await createRetrievalChain({ retriever, combineDocsChain });

  return retrievalChain;
}

main().then(() => console.log("Retrieval chain wired up"));
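
Before moving on, it can help to see which chunks the retriever actually pulls for a question, independent of the model. Retrievers are runnables, so you can invoke them directly; here is a quick debugging sketch that reuses the same index-building code, with a placeholder question:

import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function debugRetrieval() {
  const docs = await new TextLoader("./data/policy.txt").load();
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 800, chunkOverlap: 120 });
  const chunks = await splitter.splitDocuments(docs);

  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);
  const retriever = vectorStore.asRetriever(4);

  // Invoke the retriever on its own to see the raw chunks it would hand to the LLM.
  const hits = await retriever.invoke("What does the policy say about remote work?");
  for (const hit of hits) {
    console.log("---");
    console.log(hit.pageContent.slice(0, 200));
  }
}

debugRetrieval();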
  4. Finish by invoking the retrieval chain with a real question. This is the part you will actually call from your app or API route.
import { ChatOpenAI } from "@langchain/openai";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

async function main() {
  const docs = await new TextLoader("./data/policy.txt").load();
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 800, chunkOverlap: 120 });
  const chunks = await splitter.splitDocuments(docs);

  const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
  const vectorStore = await MemoryVectorStore.fromDocuments(chunks, embeddings);
  const retriever = vectorStore.asRetriever(4);

  const llm = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });
  const prompt = ChatPromptTemplate.fromMessages([
    ["system", "Answer only using the provided context. If the context is insufficient, say you don't know."],
    ["human", "{input}"],
    ["system", "Context:\n{context}"],
  ]);

  const combineDocsChain = await createStuffDocumentsChain({ llm, prompt });
  const retrievalChain = await createRetrievalChain({ retriever, combineDocsChain });

  // Swap in a question about your own document
  const result = await retrievalChain.invoke({ input: "What does the policy say about remote work?" });
  console.log(result.answer);
}

main();

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
