LlamaIndex Tutorial (TypeScript): building a RAG pipeline for beginners
This tutorial shows you how to build a basic Retrieval-Augmented Generation (RAG) pipeline in TypeScript with LlamaIndex, using your own documents as context for an LLM. You’d use this when a plain chat model is not enough and you need answers grounded in company docs, policies, product specs, or support articles.
What You'll Need
- Node.js 18+
- A TypeScript project with `ts-node` or `tsx`
- These packages:
  - `llamaindex`
  - `dotenv`
- An OpenAI API key
- A small set of text files to index, stored locally in a folder like `./data`
Step-by-Step
- First install the dependencies and set up your environment variables. Keep the document source simple for now: plain `.txt` files are enough to validate the full RAG flow.
```bash
npm init -y
npm install llamaindex dotenv
npm install -D typescript tsx @types/node
```
Create a `.env` file:

```
OPENAI_API_KEY=your_openai_api_key_here
```
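A missing key only surfaces later as an authentication error from OpenAI, so it can help to fail fast at startup. A small optional guard (this check is our addition, not something LlamaIndex requires):

```ts
import "dotenv/config";

// Fail fast if the key never made it into the environment,
// instead of hitting an opaque auth error mid-query.
if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set; check your .env file");
}
```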
- Create a few sample documents that your RAG pipeline will search. The point is to give the retriever something real to work with, not to test against empty data.
```bash
mkdir -p data
cat > data/benefits.txt << 'EOF'
Employees are eligible for health insurance after 30 days.
Dental and vision coverage start on day one.
Remote workers receive a monthly home office stipend.
EOF
cat > data/leave-policy.txt << 'EOF'
Full-time employees get 20 days of paid leave per year.
Sick leave does not roll over into the next year.
Managers must approve leave requests at least 7 days in advance.
EOF
```
- Build the index from those documents. This is the ingestion step: load files, chunk them, embed them, and store them in a searchable vector index.
import "dotenv/config";
import { SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";
async function main() {
const docs = await new SimpleDirectoryReader().loadData({
directoryPath: "./data",
});
const index = await VectorStoreIndex.fromDocuments(docs);
console.log(`Indexed ${docs.length} documents`);
return index;
}
main();
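`fromDocuments` applies default chunking and embedding settings under the hood. If you want to experiment with those knobs, LlamaIndex.TS exposes a global `Settings` object; a minimal sketch, assuming your llamaindex version supports `Settings.chunkSize` and `Settings.chunkOverlap` (the numbers are illustrative, not recommendations):

```ts
import { Settings } from "llamaindex";

// Set these before building the index. Smaller chunks retrieve
// more precisely but give the LLM less surrounding context per hit.
Settings.chunkSize = 512;
Settings.chunkOverlap = 50;
```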
- Add retrieval and question answering on top of the index. This is the actual RAG part: retrieve relevant chunks first, then pass them into the LLM as context for the answer.
import "dotenv/config";
import {
SimpleDirectoryReader,
VectorStoreIndex,
} from "llamaindex";
async function main() {
const docs = await new SimpleDirectoryReader().loadData({
directoryPath: "./data",
});
const index = await VectorStoreIndex.fromDocuments(docs);
const queryEngine = index.asQueryEngine({
similarityTopK: 2,
});
const response = await queryEngine.query({
query: "How many vacation days do full-time employees get?",
});
console.log(response.toString());
}
main();
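To check that an answer is grounded in your files rather than in the model's general knowledge, you can print the chunks that were retrieved for it. A small fragment to add inside `main()` after the `query` call, assuming your llamaindex version exposes `sourceNodes` on the query response:

```ts
import { MetadataMode } from "llamaindex";

// Inside main(), after: const response = await queryEngine.query({ ... });
for (const source of response.sourceNodes ?? []) {
  // Each source node carries a retrieved chunk and its similarity score.
  const preview = source.node.getContent(MetadataMode.NONE).slice(0, 80);
  console.log(`score=${source.score?.toFixed(3)} text="${preview}..."`);
}
```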
- Wrap it in a reusable script, saved as `rag.ts`, so you can ask different questions without changing code every time; here the question comes from the command line, with a default fallback. For beginners, keeping the app single-file is fine as long as it’s easy to run and inspect.
import "dotenv/config";
import {
SimpleDirectoryReader,
VectorStoreIndex,
} from "llamaindex";
async function ask(question: string) {
const docs = await new SimpleDirectoryReader().loadData({
directoryPath: "./data",
});
const index = await VectorStoreIndex.fromDocuments(docs);
const queryEngine = index.asQueryEngine({ similarityTopK: 2 });
const response = await queryEngine.query({ query: question });
console.log(`Q: ${question}`);
console.log(`A: ${response.toString()}`);
}
ask("When can employees start dental and vision coverage?");
- Run the script with `tsx`, optionally passing a question as the first argument. If you want to keep iterating later, split ingestion and querying into separate commands so you don’t rebuild the index on every request; a sketch of that split follows below.

```bash
npx tsx rag.ts
npx tsx rag.ts "When does health insurance start?"
```
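One way to do that split is to persist the index to disk during ingestion and load it back when querying. A minimal sketch, assuming your llamaindex version exports `storageContextFromDefaults` and supports `VectorStoreIndex.init`; the `./storage` path is an arbitrary choice:

```ts
import "dotenv/config";
import {
  SimpleDirectoryReader,
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";

// Run once: build the index and persist it to ./storage.
async function ingest() {
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage",
  });
  const docs = await new SimpleDirectoryReader().loadData({
    directoryPath: "./data",
  });
  await VectorStoreIndex.fromDocuments(docs, { storageContext });
}

// Run per question: load the persisted index instead of rebuilding it.
async function load() {
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage",
  });
  return VectorStoreIndex.init({ storageContext });
}
```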
Testing It
Use questions that clearly map to your sample documents, like “How many vacation days do full-time employees get?” or “When does health insurance start?” The answer should come directly from the indexed text, not from generic model knowledge.
If the model hallucinates or gives vague answers, check three things first: your documents contain the fact, `similarityTopK` is high enough to retrieve it, and your API key is valid. Also verify that `./data` points to the right folder and that your files are readable.
A good sanity check is asking one question whose answer exists in only one file and another question whose answer requires combining two facts from different files. If retrieval works, both should be answered with grounded text and minimal guessing.
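A quick way to run that sanity check is to inspect the raw retrieval results before any answer is generated. A minimal sketch, assuming your llamaindex version lets `asRetriever` take `similarityTopK` and `retrieve` take a `{ query }` object:

```ts
import "dotenv/config";
import {
  MetadataMode,
  SimpleDirectoryReader,
  VectorStoreIndex,
} from "llamaindex";

async function debugRetrieval(query: string) {
  const docs = await new SimpleDirectoryReader().loadData({
    directoryPath: "./data",
  });
  const index = await VectorStoreIndex.fromDocuments(docs);

  // Pull raw chunks and scores without generating an answer.
  const retriever = index.asRetriever({ similarityTopK: 4 });
  const nodes = await retriever.retrieve({ query });
  for (const n of nodes) {
    const preview = n.node.getContent(MetadataMode.NONE).slice(0, 60);
    console.log(`${n.score?.toFixed(3)}  ${preview}...`);
  }
}

debugRetrieval("How many vacation days do full-time employees get?").catch(
  console.error,
);
```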
Next Steps
- Split ingestion from querying and persist the vector store instead of rebuilding it every run.
- Add metadata to documents so you can filter by department, policy type, or date (see the sketch after this list).
- Replace plain text files with PDFs or HTML sources once your local pipeline is working.
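Attaching metadata happens at ingestion, when you construct `Document` objects yourself instead of relying on the directory reader. A minimal sketch of that half (the `department` and `policyType` fields are made-up examples); query-time filtering APIs differ across llamaindex versions, so check your version's docs for the matching filter syntax:

```ts
import "dotenv/config";
import { Document, VectorStoreIndex } from "llamaindex";

async function main() {
  // Build documents by hand so each one carries filterable metadata.
  const docs = [
    new Document({
      text: "Full-time employees get 20 days of paid leave per year.",
      metadata: { department: "HR", policyType: "leave" },
    }),
    new Document({
      text: "Remote workers receive a monthly home office stipend.",
      metadata: { department: "HR", policyType: "benefits" },
    }),
  ];
  const index = await VectorStoreIndex.fromDocuments(docs);
  console.log(`Indexed ${docs.length} documents with metadata`);
}

main().catch(console.error);
```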
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.