Haystack Tutorial (TypeScript): streaming agent responses for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to build a TypeScript agent that streams partial responses from Haystack instead of waiting for the full answer. You need this when you want a better UX in chat apps, progress indicators, or any workflow where long-running model calls should start showing output immediately.

What You'll Need

  • Node.js 18+ installed
  • A TypeScript project with ts-node or a build step
  • @haystack-ai/core installed
  • An OpenAI API key
  • Basic familiarity with Haystack pipelines and components
  • A terminal where you can set environment variables

Install the package:

npm install @haystack-ai/core
npm install -D typescript ts-node @types/node
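If you don't already have a TypeScript config, a minimal tsconfig.json along these lines works with ts-node (the exact options here are a suggestion, not a requirement):

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "strict": true,
    "esModuleInterop": true,
    "outDir": "dist"
  },
  "include": ["src"]
}
```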

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start by creating a minimal project setup that can run TypeScript directly. Keep the file small and make sure your runtime can read environment variables before Haystack initializes.

async function main() {
  if (!process.env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }

  console.log("Project bootstrapped");
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
  2. Define an OpenAI chat generator that supports streaming. The important part is enabling token-level callbacks so you can forward partial output to your UI as soon as it arrives.
import { OpenAIChatGenerator } from "@haystack-ai/core";

const llm = new OpenAIChatGenerator({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
});

console.log("Generator ready:", llm.constructor.name);
  3. Add a simple prompt pipeline and attach a streaming handler. In practice, you’ll wire this handler to WebSockets, Server-Sent Events, or whatever your frontend uses for incremental rendering.
import { OpenAIChatGenerator } from "@haystack-ai/core";
import { Pipeline } from "@haystack-ai/core/pipeline";

const pipeline = new Pipeline();

pipeline.addComponent("llm", new OpenAIChatGenerator({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
}));

// A single-component pipeline has nothing to connect yet; use
// pipeline.connect() once you add a retriever or prompt builder.

console.log("Pipeline configured");
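As a sketch of what "wire this handler to Server-Sent Events" can look like, the helper below formats one streamed chunk as an SSE frame. It is plain, framework-agnostic Node/TypeScript; the name formatSseChunk is an illustration, not part of Haystack:

```typescript
// Format one streamed chunk as a Server-Sent Events frame.
// SSE frames end with a blank line, and multi-line payloads need a
// "data:" prefix on every line so the client reassembles them intact.
export function formatSseChunk(chunk: string): string {
  const framed = chunk
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return framed + "\n\n";
}
```

Inside the streaming callback you would call res.write(formatSseChunk(chunk)) on an HTTP response whose Content-Type is text/event-stream.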
  4. Run the generator with a streaming callback and print each chunk as it arrives. This is the part that changes the user experience from “wait and hope” to “see it happening.”
import { OpenAIChatGenerator } from "@haystack-ai/core";

async function main() {
  const generator = new OpenAIChatGenerator({
    model: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY!,
  });

  const result = await generator.run({
    messages: [{ role: "user", content: "Explain streaming in one paragraph." }],
    streamingCallback: (chunk: string) => {
      process.stdout.write(chunk);
    },
  });

  console.log("\n\nFinal result:", JSON.stringify(result, null, 2));
}

main().catch(console.error);
  5. Wrap the stream in an agent-style loop if you need tool use later. For now, keep the response path clean so you can verify streaming before adding retrieval or function calls.
import { OpenAIChatGenerator } from "@haystack-ai/core";

async function streamAnswer(prompt: string) {
  const generator = new OpenAIChatGenerator({
    model: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY!,
  });

  await generator.run({
    messages: [{ role: "user", content: prompt }],
    streamingCallback: (chunk: string) => {
      process.stdout.write(chunk);
    },
  });
}

streamAnswer("Write three bullet points about TypeScript async patterns.")
  .catch(console.error);
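If your agent loop would read more naturally with for await than with callbacks, you can adapt the callback into an async iterator. This is a framework-agnostic sketch: iterateChunks is a hypothetical helper that works with any function taking a (chunk) => void callback, including the streaming callback shown above.

```typescript
// Adapt a callback-based stream into an async iterator so downstream
// code can use `for await` instead of managing callbacks directly.
async function* iterateChunks(
  start: (onChunk: (chunk: string) => void) => Promise<unknown>
): AsyncGenerator<string> {
  const buffer: string[] = [];
  let wake: (() => void) | null = null;
  let done = false;

  // Kick off the stream; each chunk is buffered and wakes the loop below.
  const finished = start((chunk) => {
    buffer.push(chunk);
    wake?.();
    wake = null;
  }).finally(() => {
    done = true;
    wake?.();
    wake = null;
  });

  while (true) {
    while (buffer.length > 0) {
      yield buffer.shift()!;
    }
    if (done) break;
    // Sleep until the next chunk (or completion) arrives.
    await new Promise<void>((resolve) => (wake = resolve));
  }
  await finished; // surface any error from the underlying stream
}
```

With the generator from the earlier steps, the adapter would look like iterateChunks((onChunk) => generator.run({ messages, streamingCallback: onChunk })).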

Testing It

Run the script with ts-node or compile it first and execute the output with Node. If streaming is working, you should see text appear incrementally instead of all at once after the request finishes.

Check two things:

  • The first token appears quickly after the request starts.
  • The final printed result matches what was streamed.
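To check the second point, collect the chunks during streaming and compare their concatenation with the final reply afterwards. A small framework-agnostic helper keeps this tidy (makeChunkCollector is an illustrative name, not a Haystack API):

```typescript
// Collect streamed chunks so the concatenated text can be compared
// against the final, non-streamed result after the run completes.
function makeChunkCollector() {
  const chunks: string[] = [];
  return {
    // Pass this as the streaming callback.
    callback: (chunk: string) => {
      chunks.push(chunk);
    },
    // Full text seen so far, and how many chunks produced it.
    text: () => chunks.join(""),
    count: () => chunks.length,
  };
}
```

Pass collector.callback as the streaming callback, then assert that collector.text() matches the reply text in the final result.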

If nothing streams, verify your API key and confirm the model name is valid for your account. Also check that your terminal isn’t buffering stdout in a way that hides incremental writes.

Next Steps

  • Add a WebSocket layer so streamed chunks reach a browser client in real time
  • Combine streaming with RAG by inserting a retriever before the generator
  • Add tool calling so the agent can stream reasoning while still invoking external actions


By Cyprian Aarons, AI Consultant at Topiax.
