CrewAI Tutorial (TypeScript): rate limiting API calls for intermediate developers
This tutorial shows how to add rate limiting to CrewAI-powered API calls in TypeScript so your agents stop hammering upstream services and start behaving like production software. You need this when you’re calling APIs with strict quotas, burst limits, or per-minute billing and you want predictable retries instead of random 429 failures.
What You'll Need
- Node.js 18+ installed
- A TypeScript project with tsconfig.json
- The CrewAI TypeScript package installed
- An API key for the model provider you're using with CrewAI
- A target API you want to protect with rate limiting
- p-limit for concurrency control
- bottleneck for request pacing
Install the packages:
npm install @crewai/crewai bottleneck p-limit dotenv
npm install -D typescript tsx @types/node
Step-by-Step
- Start by defining a small rate-limited HTTP client. This keeps the throttling logic outside your agent code, which is where it belongs. You want one reusable boundary that controls request volume no matter how many tasks your crew runs.
import Bottleneck from "bottleneck";
type ApiResponse = {
ok: boolean;
status: number;
data: unknown;
};
const limiter = new Bottleneck({
  maxConcurrent: 2, // at most two requests in flight at once
  minTime: 500, // at least 500 ms between request starts
});
export async function rateLimitedFetch(
url: string,
init?: RequestInit
): Promise<ApiResponse> {
return limiter.schedule(async () => {
const res = await fetch(url, init);
const data = await res.json().catch(() => null);
return {
ok: res.ok,
status: res.status,
data,
};
});
}
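If your vendor enforces a hard per-minute quota rather than just frowning on bursts, Bottleneck's reservoir options can model it directly. A minimal sketch, assuming a hypothetical quota of 60 requests per minute; you could use this in place of the limiter above:

import Bottleneck from "bottleneck";

// Hypothetical quota: 60 requests per minute. The reservoir starts with
// 60 "tokens", each scheduled call consumes one, and Bottleneck resets
// the budget to 60 every 60 seconds, queueing anything beyond it.
const quotaLimiter = new Bottleneck({
  reservoir: 60,
  reservoirRefreshAmount: 60,
  reservoirRefreshInterval: 60 * 1000, // must be a multiple of 250 ms
  maxConcurrent: 2,
  minTime: 500,
});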
- Next, wrap the API call in a tool function. CrewAI agents should call tools, not raw network functions, because that gives you one place to enforce retries, headers, logging, and throttling.
import { rateLimitedFetch } from "./rateLimitedFetch";
export async function getCustomerRiskScore(customerId: string) {
const response = await rateLimitedFetch(
`https://api.example.com/customers/${customerId}/risk`,
{
headers: {
Authorization: `Bearer ${process.env.EXTERNAL_API_KEY}`,
"Content-Type": "application/json",
},
}
);
if (!response.ok) {
throw new Error(`Risk API failed with status ${response.status}`);
}
return response.data;
}
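Since rateLimitedFetch hands back data as unknown, it's worth narrowing it before an agent consumes it. A minimal sketch with a hypothetical response shape (customerId and score are assumptions; match them to whatever your risk API actually returns):

import { getCustomerRiskScore } from "./getCustomerRiskScore";

// Hypothetical payload shape for the risk API.
type RiskScore = {
  customerId: string;
  score: number;
};

function isRiskScore(value: unknown): value is RiskScore {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as RiskScore).customerId === "string" &&
    typeof (value as RiskScore).score === "number"
  );
}

const data = await getCustomerRiskScore("CUST-1001");
if (!isRiskScore(data)) {
  throw new Error("Risk API returned an unexpected payload");
}
console.log(`Score for ${data.customerId}: ${data.score}`);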
- Now create a CrewAI tool and agent around that function. The important part is that the agent sees a single tool entry point while your limiter stays hidden underneath.
import "dotenv/config";
import { Agent, Task, Crew } from "@crewai/crewai";
import { getCustomerRiskScore } from "./getCustomerRiskScore";
const riskTool = {
name: "get_customer_risk_score",
description: "Fetch the customer risk score from the external risk API",
execute: async (input: { customerId: string }) => {
return await getCustomerRiskScore(input.customerId);
},
};
const analyst = new Agent({
role: "Risk Analyst",
goal: "Assess customer risk using external API data",
backstory: "You work in a regulated environment and must respect vendor limits.",
tools: [riskTool],
});
- Add a task that forces the agent to use the tool once per customer instead of firing off uncontrolled parallel requests. If you need multiple customers, handle batching outside the task and let the limiter absorb bursts.
const task = new Task({
description:
"For customer CUST-1001, fetch the risk score and summarize whether the account should be reviewed.",
expectedOutput:
"A short risk summary with the retrieved score and a review recommendation.",
agent: analyst,
});
const crew = new Crew({
agents: [analyst],
tasks: [task],
});
async function main() {
const result = await crew.kickoff();
console.log(result);
}
main().catch((err) => {
console.error(err);
process.exit(1);
});
- If you have multiple requests to run, cap concurrency before they reach CrewAI. This prevents your own worker pool from creating a thundering herd that overwhelms the limiter or upstream API.
import pLimit from "p-limit";
import { getCustomerRiskScore } from "./getCustomerRiskScore";
const limit = pLimit(3);
async function mainBatch() {
const customerIds = ["CUST-1001", "CUST-1002", "CUST-1003", "CUST-1004"];
const results = await Promise.all(
customerIds.map((customerId) =>
limit(async () => ({
customerId,
risk: await getCustomerRiskScore(customerId),
}))
)
);
console.log(results);
}
mainBatch().catch(console.error);
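One caveat: Promise.all rejects the whole batch if a single lookup throws. If partial results are acceptable, Promise.allSettled keeps the successes; here is a drop-in variant for the body of mainBatch:

const settled = await Promise.allSettled(
  customerIds.map((customerId) =>
    limit(async () => ({
      customerId,
      risk: await getCustomerRiskScore(customerId),
    }))
  )
);

// Log successes and failures separately instead of aborting the batch.
for (const result of settled) {
  if (result.status === "fulfilled") {
    console.log(result.value);
  } else {
    console.error("Lookup failed:", result.reason);
  }
}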
Testing It
Run the single-task version first and confirm it returns a valid summary without hitting your API’s quota warnings. Then run the batch version with four or five IDs and watch the timestamps in your logs; requests should spread out instead of firing all at once.
If your vendor returns 429 Too Many Requests, lower maxConcurrent or increase minTime. If responses are too slow but still within quota, raise concurrency slightly while keeping minTime conservative.
A good smoke test is to log a line before and after each scheduled request so you can see Bottleneck doing its job: you should see serialized pacing even when Promise.all sits above it.
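One way to wire that in, assuming plain console.log output is enough for your setup; this is a minimal sketch that drops into the same file as the limiter and ApiResponse type from step 1:

export async function rateLimitedFetch(
  url: string,
  init?: RequestInit
): Promise<ApiResponse> {
  return limiter.schedule(async () => {
    // A timestamp on each side of the call makes Bottleneck's pacing visible.
    console.log(`[${new Date().toISOString()}] -> ${url}`);
    const res = await fetch(url, init);
    console.log(`[${new Date().toISOString()}] <- ${res.status} ${url}`);
    const data = await res.json().catch(() => null);
    return { ok: res.ok, status: res.status, data };
  });
}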
Next Steps
- Add exponential backoff for 429 and 503 responses (a minimal sketch follows this list)
- Move rate-limit settings into environment variables per vendor
- Add Redis-backed distributed limiting if multiple app instances share one API quota
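For the first item, here is a minimal backoff sketch that wraps the existing rateLimitedFetch. The retry count and delays are assumptions, so tune them to your vendor's published guidance:

import { rateLimitedFetch } from "./rateLimitedFetch";

// Retries 429/503 responses with exponential backoff: 1s, 2s, 4s, ...
// maxRetries = 3 is a hypothetical default, not a vendor recommendation.
export async function fetchWithBackoff(
  url: string,
  init?: RequestInit,
  maxRetries = 3
) {
  for (let attempt = 0; ; attempt++) {
    const response = await rateLimitedFetch(url, init);
    const retryable = response.status === 429 || response.status === 503;
    if (!retryable || attempt >= maxRetries) {
      return response;
    }
    const delayMs = 1000 * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}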
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.