AutoGen Tutorial (TypeScript): building a RAG pipeline for intermediate developers
By Cyprian Aarons · Updated 2026-04-21
This tutorial shows you how to build a retrieval-augmented generation pipeline in TypeScript with AutoGen: ingest documents, index them with embeddings, retrieve the right chunks, and answer user questions with grounded context. You need this when a plain chat model is not enough and you want answers tied to your own docs, policies, or knowledge base.
What You'll Need
- Node.js 18+ installed
- A TypeScript project initialized with `npm init -y`
- Packages:
  - `autogen`
  - `openai`
  - `dotenv`
  - `ts-node` and `typescript` for local execution
- An OpenAI API key in `.env`:
  - `OPENAI_API_KEY=...`
- A folder of source documents, for example:
  - `./data/policy.txt`
  - `./data/faq.txt`
Step-by-Step
- Start by installing dependencies and setting up a minimal TypeScript config. Keep this boring and explicit: RAG pipelines fail more often from bad plumbing than from bad prompts.
npm install autogen openai dotenv
npm install -D typescript ts-node @types/node
npx tsc --init
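`npx tsc --init` generates a config with every option spelled out. If you prefer something smaller, a minimal `tsconfig.json` along these lines is enough for this tutorial (the exact option values here are a suggestion, not a requirement):

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "moduleResolution": "node",
    "strict": true,
    "esModuleInterop": true,
    "outDir": "dist"
  },
  "include": ["src"]
}
```

`esModuleInterop` matters here because the snippets below use default imports like `import OpenAI from "openai"`.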
- Create a small document loader and chunker. For production you would split by tokens, but for a working baseline, fixed-size character chunks are enough to prove the retrieval path end to end.
import fs from "node:fs/promises";
export async function loadAndChunk(path: string, chunkSize = 800) {
const text = await fs.readFile(path, "utf8");
const chunks: string[] = [];
for (let i = 0; i < text.length; i += chunkSize) {
chunks.push(text.slice(i, i + chunkSize));
}
return chunks.map((content, index) => ({
id: `${path}:${index}`,
content,
source: path,
}));
}
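One known weakness of fixed-size slicing is that it can cut a sentence in half at every boundary. A common refinement is to overlap consecutive chunks so boundary text appears whole in at least one chunk. The helper below is a sketch of that idea; `chunkWithOverlap` and its `overlap` parameter are mine, not part of the tutorial's loader:

```typescript
// Sketch: fixed-size chunking with overlap. Both numbers are assumptions;
// tune them against your own documents.
export function chunkWithOverlap(text: string, chunkSize = 800, overlap = 100): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance less than a full chunk each time
  for (let i = 0; i < text.length; i += step) {
    chunks.push(text.slice(i, i + chunkSize));
    if (i + chunkSize >= text.length) break; // the remainder is already covered
  }
  return chunks;
}
```

With the defaults, a 1,000-character document becomes two chunks whose last and first 100 characters coincide.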
- Build an embedding index over your chunks. This example uses OpenAI embeddings directly so the retrieval layer stays simple and deterministic.
import "dotenv/config";
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
export type Chunk = { id: string; content: string; source: string };
export async function embedText(text: string) {
const res = await client.embeddings.create({
model: "text-embedding-3-small",
input: text,
});
return res.data[0].embedding;
}
export async function buildIndex(chunks: Chunk[]) {
const indexed = [];
for (const chunk of chunks) {
indexed.push({ ...chunk, embedding: await embedText(chunk.content) });
}
return indexed;
}
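Embedding every chunk on every run is slow and costs money. A simple improvement, sketched here under the assumption of a single-process CLI, is to cache the built index as JSON on disk; the `IndexedChunk` shape and the file-path convention are mine, not from the tutorial:

```typescript
import fs from "node:fs/promises";

type IndexedChunk = { id: string; content: string; source: string; embedding: number[] };

// Sketch: persist the index so embeddings are computed once per corpus,
// not once per run. Rebuild whenever the source documents change.
export async function saveIndex(path: string, index: IndexedChunk[]): Promise<void> {
  await fs.writeFile(path, JSON.stringify(index));
}

export async function loadIndex(path: string): Promise<IndexedChunk[] | null> {
  try {
    return JSON.parse(await fs.readFile(path, "utf8"));
  } catch {
    return null; // no cache yet (or unreadable): caller should rebuild
  }
}
```

At startup, try `loadIndex` first and fall back to `buildIndex` plus `saveIndex` when it returns `null`.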
- Add cosine-similarity retrieval. This is the core of RAG: turn the question into an embedding, compare it against your chunk embeddings, and keep the top matches.
import { embedText } from "./embed";

function cosineSimilarity(a: number[], b: number[]) {
let dot = 0;
let magA = 0;
let magB = 0;
for (let i = 0; i < a.length; i++) {
dot += a[i] * b[i];
magA += a[i] * a[i];
magB += b[i] * b[i];
}
return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
export async function retrieve(
  query: string,
  index: { id: string; content: string; source: string; embedding: number[] }[],
  k = 3,
) {
const queryEmbedding = await embedText(query);
return index
.map((chunk) => ({
...chunk,
score: cosineSimilarity(queryEmbedding, chunk.embedding),
}))
.sort((a, b) => b.score - a.score)
.slice(0, k);
}
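Cosine similarity behaves predictably on simple vectors, which makes it easy to sanity-check before wiring in real embeddings. The function is repeated here so the check runs standalone:

```typescript
// cosineSimilarity repeated from the retrieval step so this check is self-contained.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Same direction → 1, orthogonal → 0, opposite → ≈ -1. Vector length does not matter,
// which is exactly why cosine (not Euclidean distance) is the usual choice for embeddings.
console.log(cosineSimilarity([1, 0], [3, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 2])); // 0
console.log(cosineSimilarity([1, 1], [-1, -1])); // ≈ -1 (up to floating-point rounding)
```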
- Wire retrieval into an AutoGen assistant using a system prompt that forces grounded answers. The trick is to pass the retrieved context into every chat call instead of hoping the model “remembers” anything useful.
import { AssistantAgent } from "autogen";
import "dotenv/config";
const assistant = new AssistantAgent({
name: "rag_assistant",
});
export async function answerWithContext(
  question: string,
  retrievedChunks: { content: string; source: string; score: number }[],
) {
const context = retrievedChunks
.map((c, i) => `[#${i + 1} | ${c.source} | score=${c.score.toFixed(3)}]\n${c.content}`)
.join("\n\n");
const result = await assistant.generateReply([
{
role: "system",
content:
"Answer only using the provided context. If the context is insufficient, say you do not know.",
},
{
role: "user",
content: `Context:\n${context}\n\nQuestion:\n${question}`,
},
]);
return result;
}
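One practical guard worth adding in front of the model call: if even the best retrieved chunk scores poorly, the context is probably irrelevant, and it is cheaper to say "I don't know" up front than to send junk context to the model. This gate is a sketch of mine, and the 0.25 cutoff is a made-up starting point, not a tutorial value:

```typescript
// Sketch: a cheap relevance gate before answerWithContext. The 0.25 cutoff
// is an assumption; measure typical good/bad scores on your own corpus first.
// Assumes `retrieved` is sorted by descending score, as retrieve() returns it.
export function passesRelevanceGate(retrieved: { score: number }[], minScore = 0.25): boolean {
  return retrieved.length > 0 && retrieved[0].score >= minScore;
}
```

Call it on the output of `retrieve` and return a fixed "I don't have enough context to answer that" reply when it fails.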
- Put it together in one executable entrypoint. This script loads the files, indexes them once, retrieves relevant passages per query, and prints the grounded answer.
import { loadAndChunk } from "./loader";
import { buildIndex } from "./embed";
import { retrieve } from "./retrieve";
import { answerWithContext } from "./agent";
async function main() {
  // Source documents from the prerequisites section.
  const files = ["./data/policy.txt", "./data/faq.txt"];
  const chunks = (await Promise.all(files.map((f) => loadAndChunk(f)))).flat();
  const index = await buildIndex(chunks);

  const question = "What does the policy say about refunds?"; // example query; replace with your own
  const top = await retrieve(question, index, 3);
  console.log(await answerWithContext(question, top));
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
By Cyprian Aarons, AI Consultant at Topiax.