CrewAI Tutorial (TypeScript): building a RAG pipeline for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial builds a production-shaped RAG pipeline in CrewAI TypeScript: ingest documents, chunk them, retrieve relevant context, and generate grounded answers. You’d use this when you need an agent workflow that answers from your own data instead of hallucinating from model memory.

What You'll Need

  • Node.js 20+
  • A TypeScript project with tsconfig.json
  • CrewAI TypeScript package
  • An LLM API key, for example:
    • OPENAI_API_KEY
  • An embeddings provider API key, for example:
    • OPENAI_API_KEY if you use OpenAI embeddings
  • A document source:
    • local .md or .txt files
    • or a database / blob store if you want to extend it later
  • Basic familiarity with:
    • async/await
    • ES modules
    • vector search concepts

Install the core dependencies:

npm install crewai @langchain/openai @langchain/community @langchain/core @langchain/textsplitters langchain dotenv
npm install -D typescript tsx @types/node
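
If you are starting from scratch, a minimal tsconfig.json for Node 20 with ES modules looks roughly like this (adapt it to your own project's conventions):

// tsconfig.json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "outDir": "dist"
  },
  "include": ["src"]
}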

Step-by-Step

1) Set up the project and environment

Keep secrets in .env and make sure your runtime reads them before any agent or tool is created. In a RAG pipeline both the generation model and the embeddings model need credentials, so validate them once at startup and fail fast.

// src/env.ts
import "dotenv/config";

export const env = {
  OPENAI_API_KEY: process.env.OPENAI_API_KEY ?? "",
};

if (!env.OPENAI_API_KEY) {
  throw new Error("Missing OPENAI_API_KEY");
}
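
The matching .env at the project root needs only the key; the value below is a placeholder:

# .env
OPENAI_API_KEY=sk-your-key-here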

2) Load documents and split them into chunks

Retrieval quality degrades quickly when chunks are too large or too noisy. Use a deterministic splitter so retrieval stays predictable across runs.

// src/load-docs.ts
import { DirectoryLoader } from "@langchain/community/document_loaders/fs/directory";
import { TextLoader } from "@langchain/community/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

export async function loadChunks() {
  const loader = new DirectoryLoader("./docs", {
    ".txt": (path) => new TextLoader(path),
    ".md": (path) => new TextLoader(path),
  });

  const docs = await loader.load();
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 800,
    chunkOverlap: 120,
  });

  return splitter.splitDocuments(docs);
}
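
Before you embed anything, it helps to eyeball the chunk distribution. Here is a minimal sanity check (a hypothetical scripts/inspect-chunks.ts that reuses the loader above; run it with npx tsx):

// scripts/inspect-chunks.ts
import { loadChunks } from "../src/load-docs";

async function main() {
  const chunks = await loadChunks();
  console.log(`total chunks: ${chunks.length}`);
  // Outliers in chunk length usually mean a parsing problem, not a model problem.
  for (const c of chunks.slice(0, 5)) {
    console.log(c.metadata.source, `${c.pageContent.length} chars`);
  }
}

main().catch(console.error);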

3) Build a vector store for retrieval

This example uses OpenAI embeddings plus an in-memory vector store. That’s enough to validate the pipeline locally; swap the store later for Pinecone, Weaviate, or pgvector when you need persistence.

// src/vector-store.ts
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import type { Document } from "@langchain/core/documents";
import { env } from "./env";

export async function buildVectorStore(chunks: Document[]) {
  const embeddings = new OpenAIEmbeddings({
    apiKey: env.OPENAI_API_KEY,
    model: "text-embedding-3-small",
  });

  return MemoryVectorStore.fromDocuments(chunks, embeddings);
}
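
If you want retrieval reusable outside the entry point, you can extract the top-k lookup into a small helper. This is a convenience sketch built only on the similaritySearch call already used in step 5:

// src/retrieve.ts
import type { MemoryVectorStore } from "langchain/vectorstores/memory";

export async function retrieveContext(
  store: MemoryVectorStore,
  question: string,
  k = 4
): Promise<string> {
  const docs = await store.similaritySearch(question, k);
  // Separate chunks visibly so the answer agent can cite them individually.
  return docs.map((d) => d.pageContent).join("\n\n---\n\n");
}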

4) Create a retriever tool and CrewAI agents

The retriever should return only the most relevant chunks, and the answer agent should be constrained to those chunks and instructed to cite the retrieved context instead of inventing facts. Here the retrieval callback is resolved once, and its output is embedded in the task descriptions so both agents work from the same evidence.

// src/crew.ts
import { Agent, Task, Crew } from "crewai";

export async function createCrew(
  question: string,
  retrieveContext: (q: string) => Promise<string>
) {
  // Resolve the grounded context up front so both tasks can reference it.
  const context = await retrieveContext(question);

  const retrieverAgent = new Agent({
    role: "Retriever",
    goal: "Find the most relevant context for a user question.",
    backstory: "You are precise and only return evidence from the knowledge base.",
    verbose: true,
    allowDelegation: false,
  });

  const answerAgent = new Agent({
    role: "Answerer",
    goal: "Answer using only retrieved context.",
    backstory: "You produce grounded answers with concise reasoning.",
    verbose: true,
    allowDelegation: false,
  });

  const retrieveTask = new Task({
    description: `Select the passages relevant to this question: "${question}"\n\nKnowledge base context:\n${context}`,
    expectedOutput: "A compact context block.",
    agent: retrieverAgent,
  });

  const answerTask = new Task({
    description: `Answer the question "${question}" using only the retrieved context. If the context does not contain the answer, say so.`,
    expectedOutput: "A final answer with no unsupported claims.",
    agent: answerAgent,
    context: [retrieveTask],
  });

  return new Crew({
    agents: [retrieverAgent, answerAgent],
    tasks: [retrieveTask, answerTask],
    verbose: true,
  });
}

5) Wire retrieval into execution

This is where the pipeline becomes real. Retrieve top-k chunks first, then pass them into the crew as grounded context.

// src/index.ts
import { loadChunks } from "./load-docs";
import { buildVectorStore } from "./vector-store";
import { createCrew } from "./crew";
import "./env";

async function main() {
  const chunks = await loadChunks();
  const store = await buildVectorStore(chunks);

  const question = "What does our policy say about document retention?";
  const docs = await store.similaritySearch(question, 4);
  const context = docs.map((d) => d.pageContent).join("\n\n---\n\n");

  // createCrew resolves the retrieval callback before building its tasks.
  const crew = await createCrew(question, async () => context);
  const result = await crew.kickoff();

  console.log(String(result));
}

main().catch(console.error);

Testing It

Put a few .md or .txt files into ./docs, then run the app with npx tsx src/index.ts. Ask questions whose answers exist in those files first; you should see retrieval logs and an answer that quotes or paraphrases only what was found.

Then ask something outside the corpus. A good RAG setup should respond with uncertainty instead of fabricating details, which tells you your prompt boundaries are doing their job.

If retrieval quality is weak, inspect chunk sizes and top-k results before touching the prompt. In production, most “LLM problems” are actually document parsing or retrieval problems.
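
One way to inspect what retrieval actually returns is similaritySearchWithScore, which the in-memory store supports. Dropping a few lines like these into main() prints each candidate's score and source file:

// Inspect retrieval quality: higher scores mean closer matches.
const scored = await store.similaritySearchWithScore(question, 8);
for (const [doc, score] of scored) {
  console.log(score.toFixed(3), doc.metadata.source);
}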

Next Steps

  • Replace MemoryVectorStore with a persistent store like pgvector or Pinecone.
  • Add metadata filters for tenant ID, document type, and effective date (see the sketch below).
  • Add citation formatting so every answer includes source file names and chunk IDs.
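
As a starting point for metadata filtering, MemoryVectorStore accepts a filter predicate as the third argument to similaritySearch; persistent stores use their own filter syntax instead. A minimal sketch, assuming a hypothetical tenantId field you attach to chunk metadata at ingestion time:

// Restrict retrieval to one tenant's documents. The tenantId metadata
// field is hypothetical: set it yourself when loading or chunking docs.
const docs = await store.similaritySearch(
  question,
  4,
  (doc) => doc.metadata.tenantId === "tenant-123"
);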

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
