AutoGen Tutorial (TypeScript): chunking large documents for advanced developers

By Cyprian Aarons. Updated 2026-04-21.

This tutorial shows how to split large documents into token-safe chunks in TypeScript, send them through AutoGen agents, and aggregate the results without blowing past model limits. You need this when a single contract, policy, claim file, or research packet is too large to process in one prompt and you still want deterministic, production-friendly behavior.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or tsx
  • @autogenai/autogen installed
  • An OpenAI-compatible API key in OPENAI_API_KEY
  • A text document to process, ideally .txt for the first pass
  • Basic familiarity with AutoGen agents and async/await

Step-by-Step

  1. Start by installing the package and setting up a small TypeScript project. For document chunking, keep the runtime simple: one agent for analysis, one helper function for splitting text, and one orchestrator script.
npm init -y
npm install @autogenai/autogen
npm install -D typescript tsx @types/node
  2. Create a chunking helper that splits on paragraph boundaries and enforces a rough character budget. This is not perfect tokenization, but it is stable enough for production workflows where you control input format.
// chunk.ts
// Greedily packs paragraphs into chunks: a new chunk starts whenever adding
// the next paragraph would push the current chunk past the character budget.
export function chunkText(text: string, maxChars = 4000): string[] {
  // Split on blank lines so paragraph boundaries survive chunking.
  const paragraphs = text.split(/\n\s*\n/);
  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    const next = current ? `${current}\n\n${paragraph}` : paragraph;
    if (next.length > maxChars && current) {
      // Current chunk is full; flush it and start fresh with this paragraph.
      chunks.push(current);
      current = paragraph;
    } else {
      current = next;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
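Note that chunkText never splits inside a paragraph, so a single paragraph longer than maxChars still becomes one oversized chunk. A minimal fallback sketch for that edge case (hardSplit is a hypothetical helper name, not part of the tutorial's files):

```typescript
// hardSplit slices one oversized paragraph into maxChars-sized pieces so no
// chunk can exceed the budget even when a single paragraph is huge.
// This ignores word boundaries; refine the slicing if mid-word cuts matter.
export function hardSplit(paragraph: string, maxChars: number): string[] {
  const pieces: string[] = [];
  for (let i = 0; i < paragraph.length; i += maxChars) {
    pieces.push(paragraph.slice(i, i + maxChars));
  }
  return pieces;
}
```

You would call this inside chunkText for any paragraph whose length already exceeds maxChars, before the greedy packing loop sees it.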
  3. Wire up an AutoGen assistant agent that will summarize each chunk in a structured way. The important part is to keep the output schema consistent so you can merge results later without manual cleanup.
// agent.ts
import { AssistantAgent } from "@autogenai/autogen";

export const summarizer = new AssistantAgent({
  name: "summarizer",
  modelClient: {
    apiKey: process.env.OPENAI_API_KEY!,
    model: "gpt-4o-mini",
  },
  systemMessage:
    "You summarize document chunks for downstream aggregation. Return concise bullets with facts, risks, dates, names, and open questions.",
});
  4. Process every chunk sequentially and collect the summaries. Sequential execution is easier to reason about than parallel calls when you are working with long legal or insurance documents that need traceability.
// run.ts
import { readFileSync } from "node:fs";
import { chunkText } from "./chunk";
import { summarizer } from "./agent";

async function main() {
  const text = readFileSync("./document.txt", "utf8");
  const chunks = chunkText(text, 4000);
  const summaries: string[] = [];

  for (let i = 0; i < chunks.length; i++) {
    const result = await summarizer.run(
      `Summarize chunk ${i + 1}/${chunks.length}:\n\n${chunks[i]}`
    );
    summaries.push(String(result));
  }

  console.log(JSON.stringify({ chunks: chunks.length, summaries }, null, 2));
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
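Long documents mean many sequential API calls, and any one of them can fail transiently. A retry wrapper like the sketch below keeps a single rate-limit or network error from killing the whole run (withRetry is a hypothetical helper, not an AutoGen API):

```typescript
// withRetry retries a failing async call with exponential backoff
// (baseDelayMs, then 2x, 4x, ...), rethrowing the last error if every
// attempt fails.
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only sleep if another attempt is coming.
      if (attempt < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

In run.ts you would wrap the agent call as `await withRetry(() => summarizer.run(prompt))` so each chunk gets its own retry budget.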
  5. Add a second pass that merges all chunk summaries into one final answer. This gives you a map-reduce pattern that works well for long documents because each chunk stays within context limits while the final synthesis remains compact.
// merge.ts
import { AssistantAgent } from "@autogenai/autogen";

const merger = new AssistantAgent({
  name: "merger",
  modelClient: {
    apiKey: process.env.OPENAI_API_KEY!,
    model: "gpt-4o-mini",
  },
  systemMessage:
    "You combine multiple chunk summaries into one coherent final report. Remove duplicates and preserve critical details.",
});

export async function mergeSummaries(summaries: string[]) {
  const prompt = `Merge these summaries into one report:\n\n${summaries.join("\n\n---\n\n")}`;
  const result = await merger.run(prompt);
  return String(result);
}
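If the document produces dozens of chunk summaries, the merge prompt itself can blow past the context limit. One way around this, sketched below under that assumption, is a two-pass (tree) merge: group summaries into batches under a character budget, merge each batch with the agent, then merge the batch results (batchSummaries is a hypothetical helper name):

```typescript
// batchSummaries groups summaries into batches whose combined length stays
// under maxChars, so each batch can be merged in a single model call.
export function batchSummaries(summaries: string[], maxChars = 8000): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let size = 0;

  for (const summary of summaries) {
    // Start a new batch once adding this summary would exceed the budget.
    if (size + summary.length > maxChars && current.length > 0) {
      batches.push(current);
      current = [];
      size = 0;
    }
    current.push(summary);
    size += summary.length;
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```

You would then call mergeSummaries once per batch and feed those intermediate reports through mergeSummaries a final time.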

Testing It

Put a real multi-page document in document.txt, then run the script with your API key set in the environment. You should see the number of chunks printed along with one summary per chunk, followed by a merged report if you call the merge step.

Check that no single request exceeds your target size by temporarily logging each chunk length before sending it to AutoGen. If you get truncated answers or context errors, reduce maxChars until the worst-case chunk fits comfortably below your model’s practical limit.
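The length check above can be a small helper that logs each chunk's size and returns the worst case, so you can tune maxChars empirically (auditChunks is a hypothetical name):

```typescript
// auditChunks logs every chunk's character count and returns the largest,
// which is the number to compare against your model's practical limit.
export function auditChunks(chunks: string[]): number {
  chunks.forEach((chunk, i) => console.log(`chunk ${i + 1}: ${chunk.length} chars`));
  return Math.max(0, ...chunks.map((chunk) => chunk.length));
}
```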

For a stronger test, use a document with headings, tables copied as plain text, and repeated clauses. Good chunking preserves section boundaries and makes repeated content obvious in the merged output instead of collapsing everything into one noisy summary.

Next Steps

  • Replace character-based splitting with token-based splitting using your model’s tokenizer budget.
  • Add structured outputs so each chunk returns JSON instead of free-form text.
  • Parallelize chunk processing with concurrency limits once you have stable observability and retry handling.
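The last bullet can be sketched as a small concurrency-limited map (mapWithConcurrency is a hypothetical helper; error handling and retries are deliberately omitted here):

```typescript
// mapWithConcurrency processes items in parallel with at most `limit` tasks
// in flight at once, and preserves input order in the results array.
export async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index until items run out.
  // `next++` is safe because JavaScript runs this synchronously per worker.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }

  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```

Applied to this tutorial, you would replace the sequential for-loop in run.ts with `mapWithConcurrency(chunks, 3, (chunk) => summarizer.run(...))` once retries and logging are in place.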

By Cyprian Aarons, AI Consultant at Topiax.
