CrewAI Tutorial (TypeScript): streaming agent responses for advanced developers

By Cyprian Aarons
Updated 2026-04-21

This tutorial shows you how to stream CrewAI agent responses in a TypeScript app so you can surface partial output as it’s generated, not after the run finishes. You need this when your agent is doing longer reasoning, calling tools, or writing multi-step outputs and you want better UX, live logs, or incremental processing.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or tsx
  • CrewAI installed for TypeScript
  • An LLM API key set in your environment
  • A terminal that supports streaming output
  • Basic familiarity with agents, tasks, and crews in CrewAI

Install the packages:

npm install @crewai/core dotenv
npm install -D typescript tsx @types/node

Set your environment variable:

export OPENAI_API_KEY="your-key-here"
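
Because the entrypoint imports "dotenv/config", you can also keep the key in a .env file at the project root instead of exporting it in each shell session. A minimal .env (the value is a placeholder):

OPENAI_API_KEY=your-key-here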

Step-by-Step

  1. Create a small TypeScript entrypoint that wires up an agent, task, and crew. The important part is enabling streaming on the underlying model so token-level output can be emitted while the task runs.
import "dotenv/config";
import { Agent, Task, Crew } from "@crewai/core";

const agent = new Agent({
  role: "Senior Banking Analyst",
  goal: "Explain fraud detection tradeoffs clearly",
  backstory: "You write concise technical summaries for risk teams.",
});

const task = new Task({
  description: "Write a short fraud detection summary for a product manager.",
  expectedOutput: "A concise summary with practical recommendations.",
  agent,
});

const crew = new Crew({
  agents: [agent],
  tasks: [task],
});
  2. Use a streaming-capable model configuration. In CrewAI TypeScript, the exact model wrapper depends on your provider setup, but the pattern is the same: pass a streaming flag into the LLM configuration used by the agent.
import { OpenAI } from "@crewai/core";

const llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
  temperature: 0.2,
  stream: true,
});

const agent = new Agent({
  role: "Senior Banking Analyst",
  goal: "Explain fraud detection tradeoffs clearly",
  backstory: "You write concise technical summaries for risk teams.",
  llm,
});
  3. Attach a callback handler so you can print each chunk as it arrives. This is the part most developers miss: streaming is only useful if you consume the chunks and do something with them immediately.
const llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
  temperature: 0.2,
  stream: true,
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        process.stdout.write(token);
      },
    },
  ],
});
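
If you need the streamed text for more than terminal echo, the same handler can buffer tokens while printing them. A minimal sketch, assuming the callback shape above; the chunks array is illustrative plumbing, not a CrewAI API:

import { OpenAI } from "@crewai/core";

// Buffer tokens for incremental processing while still echoing them live.
const chunks: string[] = [];

const llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
  temperature: 0.2,
  stream: true,
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        process.stdout.write(token); // live terminal echo
        chunks.push(token);          // keep a copy for later processing
      },
    },
  ],
});

// After crew.kickoff() resolves, chunks.join("") holds the full streamed text.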
  4. Run the crew and await the final result. You still get a completed response at the end, but now the user sees output progressively instead of waiting for the full completion.
async function main() {
  const result = await crew.kickoff();
  console.log("\n\n--- FINAL RESULT ---");
  console.log(result);
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
  5. If you want to stream structured progress instead of raw tokens, wrap task execution with your own event boundary logging. This is useful in production when you need auditability for bank or insurance workflows.
async function main() {
  console.log("[crew] starting");
  const result = await crew.kickoff();
  console.log("\n[crew] completed");
  console.log(JSON.stringify(result, null, 2));
}

main().catch((error) => {
  console.error("[crew] failed", error);
  process.exit(1);
});
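
One way to make those boundaries machine-readable is to log one timestamped JSON object per line, which most log pipelines ingest directly. A sketch under that assumption; the logEvent helper is illustrative, not part of CrewAI:

// Illustrative helper: one JSON object per line, timestamped for audit trails.
function logEvent(event: string, data: Record<string, unknown> = {}) {
  console.log(JSON.stringify({ ts: new Date().toISOString(), event, ...data }));
}

async function main() {
  logEvent("crew.start", { tasks: 1 });
  const result = await crew.kickoff();
  logEvent("crew.complete");
  console.log(JSON.stringify(result, null, 2));
}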
  6. Put it all together in one executable file and run it with tsx. Keep the code path simple until streaming is stable, then add tools, memory, or multiple agents.
import "dotenv/config";
import { Agent, Task, Crew, OpenAI } from "@crewai/core";

const llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
  temperature: 0.2,
  stream: true,
  callbacks: [
    {
      // Without a consumer, stream: true produces no visible output.
      handleLLMNewToken(token: string) {
        process.stdout.write(token);
      },
    },
  ],
});

const agent = new Agent({
  role: "Senior Banking Analyst",
  goal: "Explain fraud detection tradeoffs clearly",
  backstory: "You write concise technical summaries for risk teams.",
  llm,
});

const task = new Task({
  description: "Write a short fraud detection summary for a product manager.",
  expectedOutput: "A concise summary with practical recommendations.",
  agent,
});

const crew = new Crew({ agents: [agent], tasks: [task] });

(async () => {
  const result = await crew.kickoff();
  console.log("\n\n--- FINAL RESULT ---");
  console.log(result);
})().catch(console.error);

Testing It

Run the file with npx tsx src/index.ts and watch for token output in your terminal before the final response prints. If nothing streams until completion, your provider config is probably missing stream: true or your callback isn’t attached to the actual LLM instance used by the agent.

Use a prompt that forces multi-sentence output so streaming is obvious. For example, ask for a comparison table or a step-by-step explanation; short answers often finish too quickly to notice chunking.

If you’re integrating this into an API server, verify that your HTTP response stays open while chunks are emitted. For Node servers, that usually means writing each token to res.write() and ending with res.end() once kickoff() resolves.
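
A minimal sketch of that shape with Node's built-in http module. buildCrew is a hypothetical factory you would write yourself: it constructs the agent, task, and crew from the earlier steps, wiring the LLM's token callback to whatever function you pass in:

import http from "node:http";
import type { Crew } from "@crewai/core";

// Hypothetical factory: builds the agent/task/crew from the earlier steps,
// forwarding each streamed token to `onToken`.
declare function buildCrew(onToken: (token: string) => void): Crew;

http.createServer(async (_req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain; charset=utf-8" });

  const crew = buildCrew((token) => res.write(token));

  await crew.kickoff();
  res.end(); // close the response once the run completes
}).listen(3000);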

Next Steps

  • Add tool calling and confirm tool results don’t break streaming behavior.
  • Replace terminal writes with Server-Sent Events for browser clients (a sketch follows this list).
  • Add per-token tracing so you can audit outputs in regulated workflows.
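
For the SSE item above, the difference from raw chunked writes is mostly framing: a text/event-stream content type and data:-prefixed lines separated by blank lines. A minimal sketch, reusing the hypothetical buildCrew factory from the previous example:

import http from "node:http";

http.createServer(async (_req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  // Same hypothetical buildCrew factory as in the previous sketch.
  const crew = buildCrew((token) => {
    // JSON-encode the token so newlines can't break SSE framing.
    res.write(`data: ${JSON.stringify(token)}\n\n`);
  });

  await crew.kickoff();
  res.write("data: [DONE]\n\n"); // conventional end-of-stream sentinel
  res.end();
}).listen(3000);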

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
