LangChain Tutorial (TypeScript): building a RAG pipeline for beginners
This tutorial builds a minimal Retrieval-Augmented Generation (RAG) pipeline in TypeScript using LangChain. You’ll load documents, split them into chunks, embed them into a vector store, and answer questions with retrieved context instead of relying on the model’s memory alone.
What You'll Need

- Node.js 18+
- A TypeScript project with `ts-node` or `tsx`
- An OpenAI API key set as `OPENAI_API_KEY`
- These packages:
  - `langchain`
  - `@langchain/openai`
  - `@langchain/community`
  - `@langchain/core`
  - `typescript`
  - `tsx` or `ts-node`

Install everything with:

```bash
npm install langchain @langchain/openai @langchain/community @langchain/core
npm install -D typescript tsx @types/node
```
Step-by-Step
Start by creating an embeddings model and a chat model. In RAG, embeddings power retrieval, while the chat model turns retrieved context into an answer.

```typescript
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
```
Next, load a few documents and split them into chunks. Chunking matters because retrieval works better on smaller pieces than on long raw files.

```typescript
import { Document } from "@langchain/core/documents";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const docs = [
  new Document({
    pageContent:
      "RAG combines retrieval and generation. The retriever fetches relevant chunks from a vector store.",
    metadata: { source: "notes-1" },
  }),
  new Document({
    pageContent:
      "A vector store keeps embeddings for each chunk. Similarity search returns the closest matches to a query.",
    metadata: { source: "notes-2" },
  }),
];

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 100,
  chunkOverlap: 20,
});

const splitDocs = await splitter.splitDocuments(docs);
console.log(`Split into ${splitDocs.length} chunks`);
```
Now store those chunks in a vector database. For beginners, an in-memory vector store is enough to understand the flow without adding infrastructure.

```typescript
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const vectorStore = await MemoryVectorStore.fromDocuments(
  splitDocs,
  embeddings
);

// Return the top 2 most similar chunks for each query.
const retriever = vectorStore.asRetriever(2);
```
Build the retrieval chain and tell the model how to use the context. This is the core RAG pattern: retrieve first, then generate using only the retrieved text.

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";
import {
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer only using the provided context. If the context does not contain the answer, say you don't know.",
  ],
  ["human", "Question: {question}\n\nContext:\n{context}"],
]);

// Join the retrieved chunks into a single context string.
const formatDocs = (docs: Document[]) =>
  docs.map((doc) => doc.pageContent).join("\n\n");

const ragChain = RunnableSequence.from([
  {
    question: new RunnablePassthrough(),
    context: retriever.pipe(formatDocs),
  },
  prompt,
  llm,
  new StringOutputParser(),
]);
```
Finally, ask a question and print the result. Keep your first test simple so you can verify retrieval is actually happening before you expand the pipeline.

```typescript
const question = "What does RAG combine?";
const answer = await ragChain.invoke(question);

console.log("Question:", question);
console.log("Answer:", answer);
```
Testing It
Run the file with tsx or your preferred TypeScript runner:
```bash
npx tsx src/rag.ts
```
If everything is wired correctly, the output should mention that RAG combines retrieval and generation. Try asking something that exists in one of your chunks, then ask something unrelated like “What is my bank balance?”; the chain should respond that it does not know.
If retrieval looks wrong, print the retrieved chunks before sending them to the model. That tells you whether the issue is chunking, embeddings, or prompt design.
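To inspect retrieval directly, you can call `await retriever.invoke(question)` before invoking the chain and log what comes back. Here is a minimal sketch of a logging helper; the `RetrievedDoc` type and `debugChunks` name are my own for illustration, not LangChain APIs:

```typescript
// Minimal shape of a retrieved chunk; LangChain's Document exposes
// pageContent and metadata fields like this.
type RetrievedDoc = { pageContent: string; metadata: { source?: string } };

// Format each retrieved chunk with its index and source for logging.
const debugChunks = (docs: RetrievedDoc[]): string[] =>
  docs.map(
    (doc, i) =>
      `[${i}] (${doc.metadata.source ?? "unknown"}) ${doc.pageContent}`
  );

// In the pipeline above you would log the real chunks before generation:
//   const retrieved = await retriever.invoke(question);
//   console.log(debugChunks(retrieved).join("\n"));
const sample: RetrievedDoc[] = [
  {
    pageContent: "RAG combines retrieval and generation.",
    metadata: { source: "notes-1" },
  },
];
console.log(debugChunks(sample).join("\n"));
```

Seeing the source of each chunk makes it obvious whether the wrong document is winning the similarity search or the right chunk simply got split badly.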
Next Steps
- Replace `MemoryVectorStore` with a persistent store like Pinecone, pgvector, or Weaviate.
- Add metadata filters so you can scope retrieval by tenant, product line, or document type.
- Learn LangChain’s LCEL composition patterns so you can add reranking, query rewriting, and citations cleanly.
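As a starting point for metadata filters, `MemoryVectorStore` accepts a predicate over documents when you build a retriever (persistent stores like Pinecone use their own filter syntax instead). A sketch, assuming the `asRetriever({ k, filter })` form; the `fromSource` helper and `DocLike` type are names of my own:

```typescript
// Minimal document shape; matches the metadata used in the tutorial docs.
type DocLike = { metadata: { source?: string } };

// A reusable predicate factory: keep only chunks from one source.
const fromSource =
  (source: string) =>
  (doc: DocLike): boolean =>
    doc.metadata.source === source;

// In the pipeline above you could scope retrieval like this:
//   const scopedRetriever = vectorStore.asRetriever({
//     k: 2,
//     filter: fromSource("notes-1"),
//   });
console.log(fromSource("notes-1")({ metadata: { source: "notes-1" } })); // true
```

Keeping the predicate as a pure function makes it easy to unit-test your scoping logic separately from the vector store.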
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.