AutoGen Tutorial (TypeScript): implementing guardrails for advanced developers

By Cyprian AaronsUpdated 2026-04-21

autogenimplementing-guardrails-for-advanced-developerstypescript

This tutorial shows how to add guardrails to an AutoGen TypeScript agent so it can validate inputs, constrain tool use, and block unsafe outputs before they reach a user or downstream system. You need this when your agent is handling regulated workflows like claims, KYC, policy servicing, or internal ops where “mostly correct” is not good enough.

What You'll Need

•Node.js 18+ and npm
•A TypeScript project with tsconfig.json
•autogen package installed
•zod for schema validation
•An OpenAI API key exported as OPENAI_API_KEY
•Basic familiarity with AutoGen agents and async/await

Install the dependencies:

npm install autogen zod
npm install -D typescript tsx @types/node

Set your environment variable:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

•Start by defining the shape of allowed requests and allowed model output. In production, guardrails work better when they are explicit schemas instead of “best effort” prompt instructions.

import { z } from "zod";

export const TaskSchema = z.object({
  customerId: z.string().min(1),
  requestType: z.enum(["policy_lookup", "claim_status", "billing_question"]),
  question: z.string().min(10).max(500),
});

export const AnswerSchema = z.object({
  summary: z.string().min(1).max(300),
  riskLevel: z.enum(["low", "medium"]),
  needsHumanReview: z.boolean(),
});

•Create a guardrail function that validates incoming user content before the agent sees it. This is where you reject malformed payloads, unsupported request types, or obvious prompt-injection attempts.

import { TaskSchema } from "./schemas.js";

export function validateUserInput(input: unknown): string {
  const parsed = TaskSchema.safeParse(input);
  if (!parsed.success) {
    throw new Error(`Invalid task payload: ${parsed.error.message}`);
  }

  const text = `${parsed.data.customerId} ${parsed.data.question}`.toLowerCase();
  const blockedPatterns = [
    "ignore previous instructions",
    "reveal system prompt",
    "developer message",
    "exfiltrate",
  ];

  if (blockedPatterns.some((p) => text.includes(p))) {
    throw new Error("Blocked unsafe input");
  }

  return JSON.stringify(parsed.data);
}

•Build the agent with a strict system message and force structured output behavior in the prompt. The important part here is not just asking for JSON; it is making the model produce only what your downstream parser expects.

import { AssistantAgent } from "autogen";
import { AnswerSchema } from "./schemas.js";

export function createGuardedAgent() {
  return new AssistantAgent({
    name: "guarded_support_agent",
    modelClient: {
      model: "gpt-4o-mini",
      apiKey: process.env.OPENAI_API_KEY!,
    },
    systemMessage: [
      "You are a support assistant for regulated customer service.",
      "Only answer using the required JSON shape.",
      "If information is missing or uncertain, set needsHumanReview=true.",
      `Allowed fields: ${JSON.stringify(AnswerSchema.shape)}`,
      "Never provide legal advice, policy interpretation beyond the provided data, or internal reasoning.",
    ].join(" "),
  });
}

•Wrap the agent call with post-generation validation. This is the second guardrail layer: even if the model drifts, your app refuses anything that does not match policy.

import { AnswerSchema } from "./schemas.js";

export function validateAgentOutput(rawText: string) {
  let parsedJson: unknown;

  try {
    parsedJson = JSON.parse(rawText);
  } catch {
    throw new Error("Agent did not return valid JSON");
  }

  const parsed = AnswerSchema.safeParse(parsedJson);
  if (!parsed.success) {
    throw new Error(`Agent output failed validation: ${parsed.error.message}`);
  }

  return parsed.data;
}

•Put it together in a runnable entrypoint. This example sends one validated task to AutoGen, parses the response, and blocks anything outside your contract.

import { createGuardedAgent } from "./agent.js";
import { validateUserInput } from "./input-guard.js";
import { validateAgentOutput } from "./output-guard.js";

async function main() {
  const agent = createGuardedAgent();

  const safeInput = validateUserInput({
    customerId: "CUST-1042",
    requestType: "claim_status",
    question: "What is the current status of claim CLM-88321?",
  });

  const result = await agent.run(safeInput);
  const content =
    typeof result === "string"
      ? result
      : JSON.stringify(result);

  const finalOutput = validateAgentOutput(content);
  console.log(finalOutput);
}

main().catch((err) => {
  console.error(err.message);
  process.exit(1);
});

•Add a human-review path for anything ambiguous or high-risk. In real systems, this prevents the agent from making final decisions on cases that need policy checks or manual approval.

type ReviewDecision =
| { action: "auto_approve"; payload: unknown }
| { action: "send_to_human"; reason: string };

export function decideRouting(output: { needsHumanReview: boolean; riskLevel: string }): ReviewDecision {
  if (output.needsHumanReview || output.riskLevel === "medium") {
    return {
      action: "send_to_human",
      reason: "Model flagged uncertainty or medium risk",
    };
  }

  return {
    action: "auto_approve",
    payload: output,
  };
}

Testing It

Run the entrypoint with a normal request first and confirm you get a JSON object matching AnswerSchema. Then try an input containing a prompt-injection phrase like “ignore previous instructions” and verify it fails before calling the model.

Next, force a bad output by loosening the system message or asking for free-form text; your post-generation validator should reject it. If you are wiring this into an API, test that rejected requests return a controlled error response and never reach downstream business logic.

Next Steps

•Add tool-level guardrails for any external API calls the agent can make
•Store validation failures in logs with request IDs for auditability
•Add policy-based routing so claims, billing, and complaints use different guardrail rules

Keep learning

•The complete AI Agents Roadmap — my full 8-step breakdown
•Free: The AI Agent Starter Kit — PDF checklist + starter code
•Work with me — I build AI for banks and insurance companies

By Cyprian Aarons, AI Consultant at Topiax.

ShareX / Twitter LinkedIn

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit