Haystack Tutorial (TypeScript): streaming agent responses for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to stream an agent’s intermediate and final responses in Haystack using TypeScript. You need this when you want token-by-token or event-by-event output in a chat UI, CLI, or backend service instead of waiting for the full answer.

What You'll Need

  • Node.js 18+ installed
  • A TypeScript project with ts-node or a build step
  • Haystack JS/TS packages installed:
    • @haystack-ai/core
    • @haystack-ai/components
  • An OpenAI API key set as OPENAI_API_KEY
  • A terminal that can run async Node scripts
  • Basic familiarity with Haystack pipelines and components

Step-by-Step

  1. Install the packages and set up your project.
    If you already have a TypeScript app, just add the Haystack packages and make sure your environment has the OpenAI key available.
npm install @haystack-ai/core @haystack-ai/components
npm install -D typescript ts-node @types/node
export OPENAI_API_KEY="your-key-here"
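If you are starting from scratch, a minimal tsconfig.json like the one below is enough for ts-node. These settings are just a reasonable baseline, not something the Haystack packages require:
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "CommonJS",
    "strict": true,
    "esModuleInterop": true
  }
}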
  2. Create a minimal streaming agent pipeline.
    The key piece is using the streaming entry point, pipeline.stream(), so you can consume events as they arrive instead of waiting for one final result object.
import { Pipeline } from "@haystack-ai/core";
import {
  OpenAIChatGenerator,
  ChatPromptBuilder,
} from "@haystack-ai/components";

// Render the chat template, filling in variables like {{question}} at run time.
const promptBuilder = new ChatPromptBuilder({
  template: [
    { role: "system", content: "You are a concise banking assistant." },
    { role: "user", content: "{{question}}" },
  ],
});

// Picks up the OPENAI_API_KEY you exported earlier from the environment.
const llm = new OpenAIChatGenerator({
  model: "gpt-4o-mini",
});

const pipeline = new Pipeline();
pipeline.addComponent("prompt_builder", promptBuilder);
pipeline.addComponent("llm", llm);
// Feed the rendered chat messages into the generator's input.
pipeline.connect("prompt_builder.prompt", "llm.messages");
  3. Run the pipeline in streaming mode and print chunks as they arrive.
    This is the part beginners usually miss: if you want live output, you do not call the normal run() path. Instead, you iterate over the stream and render each event immediately.
async function main() {
  const stream = await pipeline.stream({
    prompt_builder: {
      question: "Explain what an escrow account is in one paragraph.",
    },
  });

  for await (const event of stream) {
    // Only the LLM component's events carry printable text chunks.
    if (event.component === "llm" && event.output?.content) {
      process.stdout.write(event.output.content);
    }
  }

  process.stdout.write("\n");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
  4. Add basic agent-like behavior with tool use.
    Streaming becomes more useful when the model can decide whether it needs external data. In production this usually means wiring tools into the LLM loop; for a first version, keep the focus on streaming text correctly.
import { ToolInvoker } from "@haystack-ai/components";

const toolInvoker = new ToolInvoker({
  // Empty for now; a hypothetical tool definition is sketched after this step.
  tools: [],
});

const agentPipeline = new Pipeline();
agentPipeline.addComponent("prompt_builder", promptBuilder);
agentPipeline.addComponent("llm", llm);
agentPipeline.addComponent("tool_invoker", toolInvoker);

agentPipeline.connect("prompt_builder.prompt", "llm.messages");
agentPipeline.connect("llm.replies", "tool_invoker.replies");
  5. Make the output usable in a real app.
    In practice, you will want to buffer chunks for persistence while still streaming them to the user interface. That gives you both low-latency UX and a complete transcript for logging or audits.
async function runAndBuffer(question: string) {
  let fullText = "";

  const stream = await pipeline.stream({
    prompt_builder: { question },
  });

  for await (const event of stream) {
    const chunk = event.output?.content ?? "";
    if (event.component === "llm" && chunk) {
      fullText += chunk; // Buffer the complete transcript for persistence...
      process.stdout.write(chunk); // ...while still streaming it to the user.
    }
  }

  return fullText;
}
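For example, a caller can render live output and still keep the transcript for storage afterwards:
async function demo() {
  const transcript = await runAndBuffer(
    "Explain what an escrow account is in one paragraph."
  );
  // The buffered text is now available for logging, redaction, or audits.
  console.log(`\nBuffered ${transcript.length} characters.`);
}

demo().catch(console.error);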

Testing It

Run the script with ts-node or compile it first with tsc and execute the generated JavaScript. If your API key is valid, you should see text appear gradually instead of all at once.
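Assuming your entry file is named stream.ts (any name works) and tsc emits to dist/, either of these will do:
npx ts-node stream.ts
npx tsc && node dist/stream.js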

Test with short prompts first so you can confirm the stream is flowing end to end. Then try a longer prompt like “Summarize KYC checks for retail banking customers” and verify that partial output appears before completion.

If nothing streams, check three things first:

  • OPENAI_API_KEY is present in the shell running Node (the guard after this list makes this check automatic)
  • Your model name is valid for your account
  • You are iterating over pipeline.stream(...), not calling a non-streaming run method
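A fail-fast guard at the top of your script makes the first check automatic:
if (!process.env.OPENAI_API_KEY) {
  // Fail immediately instead of hitting a confusing API error mid-stream.
  throw new Error("OPENAI_API_KEY is not set in this shell.");
}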

Next Steps

  • Add real tools, such as a policy lookup function or internal FAQ search, so the agent can stream both reasoning-adjacent updates and final answers.
  • Wrap the stream in an HTTP endpoint using Server-Sent Events so your frontend can render tokens live (a minimal sketch follows this list).
  • Add transcript logging and redaction before storing streamed responses in production systems.
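As a sketch of the Server-Sent Events idea, here is a minimal endpoint built on Node's built-in http module and the pipeline.stream call from the steps above; the event handling mirrors the earlier loop:
import { createServer } from "node:http";

const server = createServer(async (req, res) => {
  // SSE headers: keep the connection open and push chunks as they arrive.
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  const stream = await pipeline.stream({
    prompt_builder: { question: "What is an escrow account?" },
  });

  for await (const event of stream) {
    if (event.component === "llm" && event.output?.content) {
      // Each SSE message is a "data:" line followed by a blank line.
      res.write(`data: ${JSON.stringify(event.output.content)}\n\n`);
    }
  }

  res.write("data: [DONE]\n\n");
  res.end();
});

server.listen(3000);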

By Cyprian Aarons, AI Consultant at Topiax.