LlamaIndex Tutorial (TypeScript): chunking large documents for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to split large documents into chunks with LlamaIndex in TypeScript, then feed those chunks into an index you can query. You need this when your source files are too large for a single prompt, or when retrieval quality drops because the model is seeing too much irrelevant text at once.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or a build step
  • An OpenAI API key set as OPENAI_API_KEY
  • These packages:
    • llamaindex
    • dotenv
    • typescript
    • ts-node if you want to run TypeScript directly

Install them:

npm install llamaindex dotenv
npm install -D typescript ts-node @types/node

Step-by-Step

  1. Start by creating a small TypeScript entry file and loading environment variables. This keeps your API key out of code and gives you a clean place to wire the pipeline together.
import "dotenv/config";
import { SimpleDirectoryReader } from "llamaindex";

async function main() {
  const reader = new SimpleDirectoryReader();
  const documents = await reader.loadData({
    directoryPath: "./data",
  });

  console.log(`Loaded ${documents.length} documents`);
}

main().catch(console.error);
  2. Next, define chunking behavior explicitly. For large documents, the default settings are often fine, but in production you want control over chunk size and overlap so retrieval has enough context without bloating embeddings.
import "dotenv/config";
import { Document, Settings, SentenceSplitter } from "llamaindex";

Settings.chunkSize = 800;
Settings.chunkOverlap = 120;
Settings.nodeParser = new SentenceSplitter({
  chunkSize: Settings.chunkSize,
  chunkOverlap: Settings.chunkOverlap,
});

const doc = new Document({
  text: `Your very large document text goes here...`,
  metadata: { source: "policy-handbook.md" },
});

console.log("Chunking configured");
  3. Now split the document into nodes. This is the part that actually turns one long document into retrievable pieces, which is what you want before indexing.
import "dotenv/config";
import {
  Document,
  SentenceSplitter,
  Settings,
} from "llamaindex";

Settings.nodeParser = new SentenceSplitter({
  chunkSize: 800,
  chunkOverlap: 120,
});

async function main() {
  const doc = new Document({
    text: `
      Section 1: Claims handling...
      Section 2: Underwriting rules...
      Section 3: Exceptions...
    `,
    metadata: { source: "manual.txt" },
  });

  const nodes = await Settings.nodeParser.getNodesFromDocuments([doc]);
  console.log(`Created ${nodes.length} chunks`);

  for (const node of nodes.slice(0, 3)) {
    console.log(node.text.slice(0, 120));
    console.log("---");
  }
}

main().catch(console.error);
  4. Build an index from those chunks and query it. Once the document is chunked, LlamaIndex stores each piece separately so retrieval can pull back only the relevant sections instead of the whole file.
import "dotenv/config";
import {
  Document,
  VectorStoreIndex,
} from "llamaindex";

async function main() {
  const doc = new Document({
    text: `
      Claims must be acknowledged within two business days.
      Escalations require manager approval.
      Fraud indicators should be logged immediately.
    `,
    metadata: { source: "claims-policy.md" },
  });

  const index = await VectorStoreIndex.fromDocuments([doc]);
  
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({
    query: "What is the escalation process?",
  });

  console.log(response.toString());
}

main().catch(console.error);
  5. For real workloads, load files from disk and chunk them before indexing. This keeps your code close to how teams actually use LlamaIndex in internal knowledge bases and policy search tools.
import "dotenv/config";
import { SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";

async function main() {
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({ directoryPath: "./data" });

  const index = await VectorStoreIndex.fromDocuments(docs);
  const queryEngine = index.asQueryEngine();

  const response = await queryEngine.query({
    query: "Summarize the exceptions policy",
  });

  console.log(response.toString());
}

main().catch(console.error);

Testing It

Run the script against a few long .txt or .md files in ./data. You should see the number of chunks created increase as document length grows, while queries still return focused answers instead of dumping entire documents back at you.
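
If you want to see those counts per file, you can chunk each loaded document separately before building any index. The sketch below reuses the SimpleDirectoryReader and SentenceSplitter calls from the steps above; the file_path metadata key is an assumption about what your reader version attaches, so the code falls back to the document id.

import "dotenv/config";
import { SentenceSplitter, SimpleDirectoryReader } from "llamaindex";

const splitter = new SentenceSplitter({ chunkSize: 800, chunkOverlap: 120 });

async function main() {
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({ directoryPath: "./data" });

  // Chunk each document on its own so you can compare counts per file
  for (const doc of docs) {
    const nodes = await splitter.getNodesFromDocuments([doc]);
    // file_path may not be set in every llamaindex version; fall back to the id
    const label = doc.metadata?.file_path ?? doc.id_;
    console.log(`${label}: ${nodes.length} chunks`);
  }
}

main().catch(console.error);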

If retrieval looks noisy, reduce chunkSize or increase chunkOverlap slightly and test again. If answers miss context near section boundaries, increase overlap first before making chunks larger.
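
One cheap way to test a change is to compare two splitter configurations on the same text before re-embedding anything. This is a rough sketch; the second configuration (chunkSize 512, chunkOverlap 160) is just an arbitrary tighter variant for illustration.

import { Document, SentenceSplitter } from "llamaindex";

// Two candidate configurations: the tutorial's defaults and a tighter variant
const configs = [
  { chunkSize: 800, chunkOverlap: 120 },
  { chunkSize: 512, chunkOverlap: 160 },
];

async function compare(text: string) {
  const doc = new Document({ text });

  for (const cfg of configs) {
    const splitter = new SentenceSplitter(cfg);
    const nodes = await splitter.getNodesFromDocuments([doc]);
    console.log(
      `chunkSize=${cfg.chunkSize}, overlap=${cfg.chunkOverlap} -> ${nodes.length} chunks`
    );
  }
}

compare("Your very large document text goes here...").catch(console.error);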

A good smoke test is to ask questions that target specific sections, like “What are the escalation rules?” or “What exceptions are listed?” If the answer cites the right part of the document consistently, your chunking setup is doing its job.
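
To make that smoke test repeatable, you can loop a few section-targeted questions through the same query engine built in the earlier steps. A minimal sketch; the questions are placeholders for whatever sections your own documents contain.

import "dotenv/config";
import { SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";

// Placeholder questions; swap in queries that target your own sections
const questions = [
  "What are the escalation rules?",
  "What exceptions are listed?",
];

async function main() {
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({ directoryPath: "./data" });

  const index = await VectorStoreIndex.fromDocuments(docs);
  const queryEngine = index.asQueryEngine();

  for (const query of questions) {
    const response = await queryEngine.query({ query });
    console.log(`Q: ${query}`);
    console.log(`A: ${response.toString()}\n`);
  }
}

main().catch(console.error);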

Next Steps

  • Add metadata filters so queries can target a specific source file or department
  • Swap in a persistent vector store for production workloads (see the sketch after this list)
  • Experiment with different splitters for PDFs, markdown manuals, and OCR’d text
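
For the persistent-store item, one option is the storage context helper that ships with llamaindex, which writes the default document and vector stores to disk. Treat this as a sketch rather than a production recipe: it assumes storageContextFromDefaults and a ./storage directory, and persistence APIs have moved around between llamaindex releases, so check the docs for your version.

import "dotenv/config";
import {
  SimpleDirectoryReader,
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";

async function main() {
  // persistDir tells the default stores to write their data to ./storage
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage",
  });

  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData({ directoryPath: "./data" });

  // Chunks and embeddings are written under ./storage instead of only in memory
  const index = await VectorStoreIndex.fromDocuments(docs, { storageContext });

  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({
    query: "Summarize the exceptions policy",
  });
  console.log(response.toString());
}

main().catch(console.error);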
