LlamaIndex Tutorial (TypeScript): implementing retry logic for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to add retry logic around LlamaIndex TypeScript calls so your agent can recover from transient failures like rate limits, timeouts, and flaky upstream APIs. You need this because production AI workflows fail for boring reasons all the time, and a single failed request should not take down the whole user interaction.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • A LlamaIndex TypeScript project already set up
  • @llamaindex/openai
  • dotenv
  • An OpenAI API key
  • Basic familiarity with Settings, Document, and VectorStoreIndex

Step-by-Step

  1. Install the packages you need.

    Use the OpenAI integration for LlamaIndex TS and a small retry helper. I’m using p-retry here because it keeps the retry policy explicit and easy to test.

    npm install llamaindex @llamaindex/openai dotenv p-retry
    npm install -D typescript tsx @types/node
    
  2. Set up your environment and LlamaIndex client.

    Keep your API key in .env, then configure the LLM once at startup. This keeps retries focused on request execution, not on re-initializing clients.

    import "dotenv/config";
    import { Settings } from "llamaindex";
    import { OpenAI } from "@llamaindex/openai";
    
    if (!process.env.OPENAI_API_KEY) {
      throw new Error("OPENAI_API_KEY is missing");
    }
    
    Settings.llm = new OpenAI({
      model: "gpt-4o-mini",
      apiKey: process.env.OPENAI_API_KEY,
    });
    
  3. Wrap your LlamaIndex call in a retry function.

    The main idea is simple: put the failing operation inside a function that can be called again if it throws a transient error. For beginners, start with one retry wrapper around your query call instead of trying to retry every layer of the stack.

    import pRetry from "p-retry";
    
    function isRetryableError(error: unknown): boolean {
      const message = error instanceof Error ? error.message : String(error);
    
      return (
        message.includes("429") ||
        message.includes("rate limit") ||
        message.includes("timeout") ||
        message.includes("ECONNRESET") ||
        message.includes("ETIMEDOUT")
      );
    }
    
    export async function withRetry<T>(operation: () => Promise<T>): Promise<T> {
      return pRetry(operation, {
        retries: 3,
        factor: 2,
        minTimeout: 500,
        maxTimeout: 4000,
        randomize: true,
        onFailedAttempt: (error) => {
          console.log(
            `Attempt ${error.attemptNumber} failed. ${error.retriesLeft} retries left.`
          );
        },
        shouldRetry: (error) => isRetryableError(error),
      });
    }
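
If you want to understand what p-retry is doing under the hood, or you'd rather not take the dependency, the same policy can be hand-rolled with a loop: exponential backoff with a factor of 2, full jitter, and a cap. This is a minimal sketch of the idea, not a drop-in replacement for p-retry:

```typescript
// Minimal hand-rolled retry with exponential backoff and full jitter.
// Mirrors the policy above: factor 2, delays capped at maxTimeout.
type RetryOptions = {
  retries: number;
  factor: number;
  minTimeout: number;
  maxTimeout: number;
};

async function retry<T>(
  operation: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  opts: RetryOptions
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up once retries are exhausted or the error is permanent.
      if (attempt >= opts.retries || !isRetryable(err)) throw err;
      const base = Math.min(
        opts.minTimeout * opts.factor ** attempt,
        opts.maxTimeout
      );
      // Full jitter: sleep a random fraction of the backoff window.
      const delay = Math.random() * base;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With `retries: 3` this makes at most four calls in total: the initial attempt plus three retries.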
    
  4. Build an index and run queries through the retry wrapper.

    This example uses a tiny document set so you can focus on the retry pattern itself. In real code, you would reuse the same index and only wrap the call that talks to the model or embedding service.

    import "dotenv/config";
    import { Document, Settings, VectorStoreIndex } from "llamaindex";
    import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
    import { withRetry } from "./withRetry";
    
    if (!process.env.OPENAI_API_KEY) {
      throw new Error("OPENAI_API_KEY is missing");
    }
    
    Settings.llm = new OpenAI({
      model: "gpt-4o-mini",
      apiKey: process.env.OPENAI_API_KEY,
    });
    
    // Depending on your llamaindex version, no default embedding model is
    // bundled, so set one explicitly before building a VectorStoreIndex.
    Settings.embedModel = new OpenAIEmbedding({
      model: "text-embedding-3-small",
    });
    
    async function main() {
      const docs = [
        new Document({ text: "LlamaIndex helps build RAG applications." }),
        new Document({ text: "Retry logic protects against temporary API failures." }),
      ];
    
      const index = await VectorStoreIndex.fromDocuments(docs);
      const queryEngine = index.asQueryEngine();
    
      const response = await withRetry(() =>
        queryEngine.query({
          query: "Why do we use retry logic?",
        })
      );
    
      console.log(response.toString());
    }
    
    main().catch((err) => {
      console.error(err);
      process.exit(1);
    });
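
To avoid repeating withRetry at every call site, you can wrap the query function once and hand out the wrapped version. A small sketch, assuming a retry helper with the same shape as the one from step 3:

```typescript
// Sketch: wrap a query-style function in a retry policy once, so call
// sites stay clean. `withRetry` here is any wrapper with this shape,
// e.g. the helper from step 3.
type QueryFn<T> = (params: { query: string }) => Promise<T>;

function retried<T>(
  queryFn: QueryFn<T>,
  withRetry: <U>(op: () => Promise<U>) => Promise<U>
): QueryFn<T> {
  // Every call is routed through the retry wrapper automatically.
  return (params) => withRetry(() => queryFn(params));
}
```

Usage would look like `const safeQuery = retried((p) => queryEngine.query(p), withRetry);`, after which callers invoke safeQuery without thinking about retries.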
    
  5. Add better failure handling for production use.

    Not every error should be retried. Authentication errors, invalid requests, and bad prompts usually fail consistently, so retrying them just burns time and money.

    // Assumes isRetryableError and pRetry from step 3 are in scope.
    export function classifyError(error: unknown): "retry" | "fail" {
      const message = (
        error instanceof Error ? error.message : String(error)
      ).toLowerCase();
    
      // Permanent failures: retrying these burns time and money.
      if (
        message.includes("401") ||
        message.includes("403") ||
        message.includes("invalid api key") ||
        message.includes("bad request")
      ) {
        return "fail";
      }
    
      return isRetryableError(error) ? "retry" : "fail";
    }
    
    export async function withStrictRetry<T>(operation: () => Promise<T>): Promise<T> {
      return pRetry(operation, {
        retries: 3,
        minTimeout: 500,
        maxTimeout: 4000,
        shouldRetry: (error) => classifyError(error) === "retry",
      });
    }
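
Matching on message strings is fragile, so prefer structured status codes whenever the SDK attaches them to thrown errors. Many HTTP clients expose a numeric status property; the property name here is an assumption, so check your SDK's error type. A sketch:

```typescript
// Sketch: classify by HTTP status when available, instead of sniffing
// the error message. The `status` property name is an assumption.
function statusOf(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "status" in error) {
    const s = (error as { status: unknown }).status;
    return typeof s === "number" ? s : undefined;
  }
  return undefined;
}

function classifyByStatus(error: unknown): "retry" | "fail" {
  const status = statusOf(error);
  if (status === undefined) return "fail"; // fall back to message checks
  if (status === 429 || status >= 500) return "retry"; // transient
  return "fail"; // other 4xx responses won't improve on retry
}
```

You can run this check first and fall back to the message-based classifyError only when no status is present.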
    

Testing It

Run the script normally first and confirm you get a valid answer back from the query engine. Then temporarily force a failure, for example by disconnecting the network or pointing at a bad API key, so you can see whether retries are attempted before the final failure.

You should see log lines like Attempt 1 failed... when transient errors happen. If you get an authentication error, it should fail immediately when using the stricter classifier.

For a more realistic test, lower your timeout at the HTTP layer or simulate rate limiting in a staging environment. The important check is that transient failures recover while permanent failures do not loop forever.
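
A deterministic way to exercise the retry path without touching the network is a small fault injector: a wrapper that fails the first N calls with a retryable error and then delegates to the real operation. This helper is hypothetical, purely for testing:

```typescript
// Test helper (hypothetical): fail the first `n` calls with a
// retryable-looking error, then delegate to the real operation.
function failFirst<T>(
  n: number,
  operation: () => Promise<T>
): () => Promise<T> {
  let calls = 0;
  return async () => {
    if (calls++ < n) throw new Error("ETIMEDOUT (injected)");
    return operation();
  };
}
```

Wrapping your query in `failFirst(2, ...)` should produce two logged failed attempts followed by a successful answer.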

Next Steps

  • Add exponential backoff jitter plus structured logging for observability.
  • Retry only idempotent operations like retrieval and generation, not side-effecting tool calls.
  • Move this wrapper into a shared SDK module so every agent in your codebase uses the same policy.
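
The idempotency point can be made concrete with a small guard that only routes safe operations through the retry policy. A sketch, where the `idempotent` flag is a hypothetical convention of your own code, not a LlamaIndex API:

```typescript
// Sketch: route only idempotent operations through the retry policy.
// The `idempotent` flag is a hypothetical convention, not a library API.
type Op<T> = { run: () => Promise<T>; idempotent: boolean };

async function runWithPolicy<T>(
  op: Op<T>,
  withRetry: <U>(f: () => Promise<U>) => Promise<U>
): Promise<T> {
  // Side-effecting tool calls run exactly once; safe calls get retried.
  return op.idempotent ? withRetry(op.run) : op.run();
}
```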

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

