AutoGen Tutorial (TypeScript): building a RAG pipeline for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to build a retrieval-augmented generation pipeline in TypeScript with AutoGen: ingest documents, index them with embeddings, retrieve the right chunks, and answer user questions with grounded context. You need this when a plain chat model is not enough and you want answers tied to your own docs, policies, or knowledge base.

What You'll Need

  • Node.js 18+ installed
  • A TypeScript project initialized with npm init -y
  • Packages:
    • autogen
    • openai
    • dotenv
    • ts-node and typescript for local execution
  • An OpenAI API key in .env:
    • OPENAI_API_KEY=...
  • A folder of source documents, for example:
    • ./data/policy.txt
    • ./data/faq.txt

Step-by-Step

  1. Start by installing dependencies and setting up a minimal TypeScript config. Keep this boring and explicit; RAG pipelines fail more from bad plumbing than bad prompts. A minimal tsconfig sketch follows the install commands.
npm install autogen openai dotenv
npm install -D typescript ts-node @types/node
npx tsc --init
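npx tsc --init generates a verbose default config. One minimal baseline that works for this setup (an illustrative sketch, not the only valid combination) keeps strict mode on and targets modern Node:

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["."]
}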
  2. Create a small document loader and chunker. For production, you would split by tokens, but for a working baseline, fixed-size chunks are enough to prove the retrieval path end to end; an overlapping variant is sketched after the code.
import fs from "node:fs/promises";

export async function loadAndChunk(path: string, chunkSize = 800) {
  const text = await fs.readFile(path, "utf8");
  const chunks: string[] = [];

  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }

  return chunks.map((content, index) => ({
    id: `${path}:${index}`,
    content,
    source: path,
  }));
}
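Fixed-size slicing can cut a sentence in half right where the answer lives. A common refinement is to overlap consecutive chunks so text near a boundary appears whole in at least one chunk. Here is a minimal sketch; the function name and the 100-character overlap default are illustrative choices, not part of the code above:

import fs from "node:fs/promises";

// Same shape as loadAndChunk, but consecutive chunks share `overlap` characters.
export async function loadAndChunkOverlap(path: string, chunkSize = 800, overlap = 100) {
  const text = await fs.readFile(path, "utf8");
  const step = Math.max(1, chunkSize - overlap); // guard against overlap >= chunkSize
  const chunks: string[] = [];

  for (let i = 0; i < text.length; i += step) {
    chunks.push(text.slice(i, i + chunkSize));
  }

  return chunks.map((content, index) => ({
    id: `${path}:${index}`,
    content,
    source: path,
  }));
}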
  3. Build an embedding index over your chunks. This example uses OpenAI embeddings directly so the retrieval layer stays simple and deterministic. A batched variant follows the code.
import "dotenv/config";
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export type Chunk = { id: string; content: string; source: string };

export async function embedText(text: string) {
  const res = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding;
}

export async function buildIndex(chunks: Chunk[]) {
  const indexed = [];
  for (const chunk of chunks) {
    indexed.push({ ...chunk, embedding: await embedText(chunk.content) });
  }
  return indexed;
}
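buildIndex makes one HTTP round trip per chunk, which gets slow past a few hundred chunks. The embeddings endpoint also accepts an array of inputs and returns embeddings in input order, so a batched variant cuts the build to a handful of calls. The sketch below assumes it lives in the same file as client and Chunk above; the batch size of 100 is an arbitrary safe choice, not an API limit:

export async function buildIndexBatched(chunks: Chunk[], batchSize = 100) {
  const indexed: (Chunk & { embedding: number[] })[] = [];

  for (let i = 0; i < chunks.length; i += batchSize) {
    const batch = chunks.slice(i, i + batchSize);
    const res = await client.embeddings.create({
      model: "text-embedding-3-small",
      input: batch.map((c) => c.content),
    });
    // Embeddings come back in input order; zip each one with its chunk.
    batch.forEach((chunk, j) => {
      indexed.push({ ...chunk, embedding: res.data[j].embedding });
    });
  }

  return indexed;
}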
  4. Add cosine similarity retrieval. This is the core of RAG: turn the question into an embedding, compare it against your chunk embeddings, and keep the top matches. A score-threshold guard is sketched after the code.
import { embedText, type Chunk } from "./embed";

function cosineSimilarity(a: number[], b: number[]) {
  let dot = 0;
  let magA = 0;
  let magB = 0;

  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }

  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

export async function retrieve(
  query: string,
  index: (Chunk & { embedding: number[] })[],
  k = 3
) {
  const queryEmbedding = await embedText(query);

  return index
    .map((chunk) => ({
      ...chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
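Top-k retrieval always returns k chunks, even when nothing in the corpus is actually relevant, and the assistant will then improvise from near-noise. A cheap guard is a minimum-score cutoff: with an empty context, the grounded system prompt makes the model admit it does not know. The helper name and the 0.3 default below are illustrative; calibrate the cutoff against your own corpus. It assumes it sits in the same module as retrieve:

export async function retrieveWithThreshold(
  query: string,
  index: (Chunk & { embedding: number[] })[],
  k = 3,
  minScore = 0.3 // placeholder; tune on real queries
) {
  const results = await retrieve(query, index, k);
  return results.filter((r) => r.score >= minScore);
}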
  5. Wire retrieval into an AutoGen assistant using a system prompt that forces grounded answers. The trick is to pass retrieved context into the chat call every time instead of hoping the model “remembers” anything useful. A citation helper follows the code.
import "dotenv/config";
import { AssistantAgent } from "autogen";

// Minimal agent config; model and credential wiring follow your AutoGen setup.
const assistant = new AssistantAgent({
  name: "rag_assistant",
});

export async function answerWithContext(
  question: string,
  retrievedChunks: { content: string; source: string; score: number }[]
) {
  const context = retrievedChunks
    .map((c, i) => `[#${i + 1} | ${c.source} | score=${c.score.toFixed(3)}]\n${c.content}`)
    .join("\n\n");

  const result = await assistant.generateReply([
    {
      role: "system",
      content:
        "Answer only using the provided context. If the context is insufficient, say you do not know.",
    },
    {
      role: "user",
      content: `Context:\n${context}\n\nQuestion:\n${question}`,
    },
  ]);

  return result;
}
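Since every retrieved chunk carries its source path and score, surfacing citations alongside the answer is nearly free. A small hypothetical wrapper around answerWithContext, assumed to live in the same module, returns the answer plus a deduplicated source list:

export async function answerWithCitations(
  question: string,
  retrievedChunks: { content: string; source: string; score: number }[]
) {
  const answer = await answerWithContext(question, retrievedChunks);
  // Deduplicate source paths so each file is cited once.
  const sources = [...new Set(retrievedChunks.map((c) => c.source))];
  return { answer, sources };
}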
  6. Put it together in one executable entrypoint. This script loads files, indexes them once, retrieves relevant passages per query, and prints the grounded answer. Run instructions follow the script.
import { loadAndChunk } from "./loader";
import { buildIndex } from "./embed";
import { retrieve } from "./retrieve";
import { answerWithContext } from "./agent";

async function main() {
  // Load and chunk each source file, then embed everything once up front.
  const files = ["./data/policy.txt", "./data/faq.txt"];
  const chunks = (await Promise.all(files.map((f) => loadAndChunk(f)))).flat();
  const index = await buildIndex(chunks);
  const question = "What does the refund policy say?"; // example query
  const top = await retrieve(question, index);
  console.log(await answerWithContext(question, top));
}

main().catch(console.error);
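Assuming the modules are saved under the names the imports use (loader.ts, embed.ts, retrieve.ts, agent.ts) and the entrypoint above is main.ts, run the whole pipeline with:

npx ts-node main.ts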