AutoGen Tutorial (TypeScript): chunking large documents for intermediate developers
This tutorial shows how to split a large document into token-safe chunks, send those chunks through AutoGen in TypeScript, and keep the output structured enough to use in downstream pipelines. You need this when a single document is too large for one model call, or when you want per-chunk summaries, extraction, or classification before merging results.
What You'll Need
- Node.js 18+ installed
- A TypeScript project with `ts-node` or a build step
- An OpenAI API key set in `OPENAI_API_KEY`
- These packages:
  - `@autogen-ai/core`
  - `@autogen-ai/openai`
  - `tiktoken`
- A large text file to test with, such as a contract, policy, or report
Step-by-Step
- Start by installing the dependencies and setting up a minimal TypeScript project. The important part here is using `tiktoken` for token-aware chunking instead of splitting by character count.
```bash
npm init -y
npm install @autogen-ai/core @autogen-ai/openai tiktoken
npm install -D typescript ts-node @types/node
npx tsc --init
```
- Create a tokenizer-backed chunker. This keeps each chunk under a token limit and preserves sentence boundaries as much as possible. For production use, this is the difference between predictable behavior and random context-window failures.
```ts
// chunkText.ts
import { encoding_for_model } from "tiktoken";

export function chunkText(text: string, maxTokens = 1200): string[] {
  const enc = encoding_for_model("gpt-4o-mini");
  // Split on sentence boundaries; fall back to whitespace-delimited
  // words for text without terminal punctuation.
  const sentences = text.match(/[^.!?]+[.!?]+|\S+/g) ?? [text];
  const chunks: string[] = [];
  let current = "";

  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    // Measure the real token count instead of guessing from characters.
    if (enc.encode(candidate).length > maxTokens && current) {
      chunks.push(current.trim());
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current.trim()) chunks.push(current.trim());

  enc.free(); // tiktoken encoders hold WASM memory; release it explicitly
  return chunks;
}
```
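As a quick smoke test, you can call the chunker on a short string with a deliberately small budget so the split is visible. A minimal sketch; the sample text and the 20-token budget are placeholders, not recommended values:

```ts
import { chunkText } from "./chunkText";

// Hypothetical sample text and a tiny budget, chosen only to make the split visible.
const sample =
  "First sentence about scope. Second sentence about payment terms. " +
  "Third sentence about termination. Fourth sentence about liability.";

for (const [i, chunk] of chunkText(sample, 20).entries()) {
  console.log(`chunk ${i}: ${chunk}`);
}
```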
- Set up an AutoGen agent that can summarize each chunk consistently. The key pattern is to keep the prompt narrow and force structured output so later aggregation is easier.
```ts
// summarizer.ts
import { AssistantAgent } from "@autogen-ai/core";
import { OpenAIChatCompletionClient } from "@autogen-ai/openai";

// Reads the API key from OPENAI_API_KEY.
const modelClient = new OpenAIChatCompletionClient({
  model: "gpt-4o-mini",
});

// A narrow system message keeps per-chunk output uniform and easy to aggregate.
export const summarizer = new AssistantAgent({
  name: "summarizer",
  modelClient,
  systemMessage:
    "You summarize document chunks for downstream processing. Return concise bullet points only.",
});
```
- Wire the chunker and agent together in an executable script. This example reads a file, splits it, summarizes each chunk, and prints the results with chunk indexes so you can trace where each summary came from.
```ts
// index.ts
import fs from "node:fs/promises";
import { chunkText } from "./chunkText";
import { summarizer } from "./summarizer";

async function main() {
  const inputPath = process.argv[2];
  if (!inputPath) throw new Error("Usage: ts-node index.ts <file-path>");

  const text = await fs.readFile(inputPath, "utf8");
  const chunks = chunkText(text, 1200);
  console.log(`Chunks created: ${chunks.length}`);

  // Summarize sequentially so a failure is easy to trace back to its chunk.
  for (let i = 0; i < chunks.length; i++) {
    const result = await summarizer.run([
      {
        role: "user",
        content: `Summarize chunk ${i + 1}/${chunks.length}:\n\n${chunks[i]}`,
      },
    ]);
    console.log(`\n=== Chunk ${i + 1} ===`);
    console.log(result.messages.at(-1)?.content);
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
- If you need a final merged summary, add a second pass over the chunk summaries. This is the standard map-reduce pattern for long documents: map each chunk to a compact representation, then reduce those representations into one answer.
```ts
// reducer.ts
import { AssistantAgent } from "@autogen-ai/core";
import { OpenAIChatCompletionClient } from "@autogen-ai/openai";

const reducerClient = new OpenAIChatCompletionClient({ model: "gpt-4o-mini" });

const reducer = new AssistantAgent({
  name: "reducer",
  modelClient: reducerClient,
  systemMessage:
    "You combine multiple chunk summaries into one concise final summary with no filler.",
});

export async function mergeSummaries(summaries: string[]): Promise<string> {
  const result = await reducer.run([
    {
      role: "user",
      content: `Combine these summaries into one final summary:\n\n${summaries.join("\n\n")}`,
    },
  ]);
  return result.messages.at(-1)?.content ?? "";
}
```
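To wire the reduce step in, collect each per-chunk summary inside the loop from the previous step, then hand the array to `mergeSummaries` once. A minimal sketch, assuming `mergeSummaries` is exported from `reducer.ts` as above and that message content is a plain string:

```ts
import { mergeSummaries } from "./reducer";

// Inside main(), replacing the body of the per-chunk loop:
const summaries: string[] = [];
for (let i = 0; i < chunks.length; i++) {
  const result = await summarizer.run([
    {
      role: "user",
      content: `Summarize chunk ${i + 1}/${chunks.length}:\n\n${chunks[i]}`,
    },
  ]);
  summaries.push(String(result.messages.at(-1)?.content ?? ""));
}

// Reduce step: one final pass over all per-chunk summaries.
console.log("\n=== Final summary ===");
console.log(await mergeSummaries(summaries));
```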
Testing It
Run the script against a real document that is clearly larger than your model’s comfortable context window. You should see multiple chunks printed first, then one summary per chunk.
Check that no single chunk causes token-limit errors and that the summaries stay focused on the content of each section. If your output looks noisy, lower the chunk size or tighten the system message so the agent produces shorter bullets.
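One way to catch oversized chunks before they cost API calls is to re-measure each chunk with the same tokenizer the chunker used. A hedged sketch; note that with the chunker above, a chunk can only exceed the budget when a single sentence is itself longer than `maxTokens`:

```ts
import fs from "node:fs/promises";
import { encoding_for_model } from "tiktoken";
import { chunkText } from "./chunkText";

// Re-encode every chunk and flag any that exceed the budget.
async function checkBudget(path: string, maxTokens = 1200) {
  const text = await fs.readFile(path, "utf8");
  const enc = encoding_for_model("gpt-4o-mini");
  for (const [i, chunk] of chunkText(text, maxTokens).entries()) {
    const tokens = enc.encode(chunk).length;
    if (tokens > maxTokens) {
      console.warn(`Chunk ${i}: ${tokens} tokens (over budget)`);
    }
  }
  enc.free();
}

checkBudget(process.argv[2] ?? "sample.txt").catch(console.error);
```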
A good sanity check is to compare the merged summary against the source document’s table of contents or section headings. If those align, your chunking strategy is probably stable enough for production use.
Next Steps
- Add overlap between chunks so references crossing boundaries are not lost (see the sketch after this list)
- Replace plain summaries with structured JSON extraction using a schema validator
- Add retry logic and rate-limit handling before using this in batch pipelines
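Overlap is the smallest of those changes. One way to sketch it is to carry the tail sentences of each finished chunk into the start of the next one; the 100-token overlap below is an illustrative default, not a recommendation:

```ts
import { encoding_for_model } from "tiktoken";

// Like chunkText, but each new chunk starts with the tail sentences of the
// previous one so cross-boundary references survive.
export function chunkTextWithOverlap(
  text: string,
  maxTokens = 1200,
  overlapTokens = 100
): string[] {
  const enc = encoding_for_model("gpt-4o-mini");
  const sentences = text.match(/[^.!?]+[.!?]+|\S+/g) ?? [text];
  const chunks: string[] = [];
  let current: string[] = [];

  const tokenCount = (parts: string[]) => enc.encode(parts.join(" ")).length;

  // Take sentences from the end of a finished chunk until ~overlapTokens.
  const tail = (parts: string[]) => {
    const kept: string[] = [];
    for (let i = parts.length - 1; i >= 0; i--) {
      kept.unshift(parts[i]);
      if (tokenCount(kept) >= overlapTokens) break;
    }
    return kept;
  };

  for (const sentence of sentences) {
    if (tokenCount([...current, sentence]) > maxTokens && current.length) {
      chunks.push(current.join(" ").trim());
      current = [...tail(current), sentence];
    } else {
      current.push(sentence);
    }
  }
  if (current.length) chunks.push(current.join(" ").trim());

  enc.free();
  return chunks;
}
```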
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.