How to Fix 'callback not firing in production' in LlamaIndex (TypeScript)
If you’re seeing callbacks not firing in production with LlamaIndex TypeScript, the problem is usually not the callback system itself: your handler was registered, but the event never reached the code where LlamaIndex actually emits it, or production changed the execution path enough that your dev setup hid the bug.
This usually shows up when moving from local tests to a serverless runtime, a background job, or an API route where the request returns before async work finishes. In LlamaIndex TS, that often involves CallbackManager, BaseCallbackHandler, OpenAI, QueryEngine, or ChatEngine flows.
The Most Common Cause
The #1 cause is not awaiting the async LlamaIndex call chain. In production, the function exits early, the process freezes, or the response is sent before callbacks like onLLMStart, onLLMEnd, or onRetrieveEnd can fire.
Here’s the broken pattern:
```ts
// Broken
import { OpenAI } from "llamaindex";
import { CallbackManager } from "llamaindex/callbacks";

const callbackManager = new CallbackManager({ handlers: [myHandler] });
const llm = new OpenAI({ model: "gpt-4o-mini", callbackManager });

export async function handler(req: Request) {
  llm.complete("Summarize this text"); // not awaited
  return new Response("ok");
}
```

```ts
// Fixed
import { OpenAI } from "llamaindex";
import { CallbackManager } from "llamaindex/callbacks";

const callbackManager = new CallbackManager({ handlers: [myHandler] });
const llm = new OpenAI({ model: "gpt-4o-mini", callbackManager });

export async function handler(req: Request) {
  const result = await llm.complete("Summarize this text");
  return Response.json({ text: result.text });
}
```
If you’re using a query engine, the same bug looks like this:
```ts
// Broken
queryEngine.query("What is in this document?");
return new Response("done");

// Fixed
const response = await queryEngine.query("What is in this document?");
return Response.json({ answer: response.toString() });
```
In production, un-awaited promises are the classic reason you see no callback output even though local logs looked fine.
Other Possible Causes
1. Your handler is attached to the wrong object
LlamaIndex TS has multiple layers. If you attach a callback to one instance but execute another, nothing fires.
```ts
// Broken: the handler lives on managerA, but the LLM uses managerB
const managerA = new CallbackManager({ handlers: [myHandler] });
const managerB = new CallbackManager({ handlers: [] });
const llm = new OpenAI({ model: "gpt-4o-mini", callbackManager: managerB });

// Fixed: one manager, passed to every component
const manager = new CallbackManager({ handlers: [myHandler] });
const llm = new OpenAI({ model: "gpt-4o-mini", callbackManager: manager });
```
Watch for this when creating Settings, ServiceContext, retrievers, and query engines in different files.
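One way to catch this early is a reference-identity check: every component must hold the *same* manager object, not just an equivalently configured one. A minimal self-contained sketch (`CallbackManagerStub` is a stand-in for LlamaIndex's `CallbackManager`, and the options objects stand in for what you pass to your LLM and query engine):

```ts
// Sketch: prove all components share ONE manager by reference, not by shape.
// CallbackManagerStub is a stand-in for LlamaIndex's CallbackManager.
class CallbackManagerStub {
  constructor(public handlers: object[]) {}
}

const shared = new CallbackManagerStub([{ name: "myHandler" }]);

// Imagine these are the options objects passed to your LLM and query engine:
const llmOptions = { callbackManager: shared };
const engineOptions = { callbackManager: shared };

// Same object: handlers registered on one are seen by the other.
const wired = llmOptions.callbackManager === engineOptions.callbackManager;
console.log("shared manager:", wired); // true when wiring is correct

// The broken pattern fails this check even though both managers "look" alike:
const lookalike = new CallbackManagerStub([{ name: "myHandler" }]);
const broken = llmOptions.callbackManager === lookalike;
console.log("lookalike manager:", broken); // false: different object
```

Printing or asserting this identity once at startup is cheap insurance against the two-manager bug spreading across modules.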
2. Production tree-shaking or bundling removed your handler code
Some bundlers strip code that looks unused. If your custom handler is only referenced indirectly, it may disappear from the final bundle.
```ts
// Keep explicit, importable references so bundlers can't drop the handler
export const callbackHandler = new MyCallbackHandler();
export const callbackManager = new CallbackManager({
  handlers: [callbackHandler],
});
```
If you’re on Next.js, Vite SSR, or Bun, make sure the file exporting your handler is actually imported by runtime code.
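A bundler-agnostic way to guarantee the handler survives is to register it through a factory that runtime code must call, rather than relying on module side effects. A hypothetical sketch (the types are stubs, not LlamaIndex's own):

```ts
// Sketch: handler registration via a factory the entry point must call.
// Because runtime code invokes createCallbackManager(), no bundler can
// mark the handler as dead code. Types here are stubs, not LlamaIndex's.
type Handler = { name: string; onLLMStart?: (e: unknown) => void };

function createCallbackManager(): { handlers: Handler[] } {
  const handler: Handler = {
    name: "MyCallbackHandler",
    onLLMStart: (e) => console.log("LLM start", e),
  };
  return { handlers: [handler] };
}

// Entry point: this call is the explicit reference that keeps the code alive.
const manager = createCallbackManager();
console.log(manager.handlers.map((h) => h.name)); // ["MyCallbackHandler"]
```

A direct call from the entry module is harder for dead-code elimination to remove than a side-effect-only import.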
3. The runtime exits before background work completes
This is common in serverless environments. The function returns a response and the platform tears down execution before callbacks flush.
```ts
// Broken in serverless/background jobs: fire-and-forget
queueMicrotask(() => {
  queryEngine.query("Explain policy exclusions");
});
return Response.json({ ok: true });

// Fixed: await before responding
await queryEngine.query("Explain policy exclusions");
return Response.json({ ok: true });
```
If you need background processing, move it into a real job worker and don’t rely on request lifetime.
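The shape of that worker matters: the job must await the full chain before it is marked done. A minimal in-process sketch (a stand-in for a real queue such as BullMQ or an SQS consumer; `fakeQuery` simulates a LlamaIndex call that emits callback events asynchronously):

```ts
// Minimal in-process job queue sketch. In production you'd use a real
// worker (BullMQ, SQS consumer, etc.); the key point is the `await` inside
// the worker loop, which keeps the process alive until callbacks fire.
type Job = () => Promise<void>;
const queue: Job[] = [];
const events: string[] = [];

function enqueue(job: Job) {
  queue.push(job);
}

// fakeQuery simulates a LlamaIndex call emitting callback events async.
async function fakeQuery(q: string): Promise<string> {
  events.push("onRetrieveStart");
  await new Promise((resolve) => setTimeout(resolve, 10));
  events.push("onRetrieveEnd");
  return `answer to: ${q}`;
}

async function runWorker() {
  while (queue.length > 0) {
    const job = queue.shift()!;
    await job(); // without this await, the worker "finishes" mid-flight
  }
}

enqueue(async () => {
  await fakeQuery("Explain policy exclusions");
});
await runWorker();
console.log(events); // ["onRetrieveStart", "onRetrieveEnd"]
```

The same rule applies to managed background primitives (e.g. a platform's `waitUntil`-style API): the promise you hand over must cover the whole LlamaIndex call chain.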
4. Your custom handler doesn’t implement the event method you expect
A typo in a method name fails silently: the event fires, but nothing matches your handler’s method, and no error is thrown unless you inspect logs carefully.
```ts
// Broken: typo in the method name, so the hook never matches
class MyCallbackHandler {
  onLLMStartr(event: any) {
    console.log("start");
  }
}

// Fixed: implementing the interface lets the compiler catch the typo
import type { BaseCallbackHandler } from "llamaindex/callbacks";

class MyCallbackHandler implements BaseCallbackHandler {
  onLLMStart(event: any) {
    console.log("LLM start", event);
  }
}
```
Also verify you’re implementing the right hooks for your flow:
- `onLLMStart`
- `onLLMEnd`
- `onRetrieveStart`
- `onRetrieveEnd`
- `onAgentAction`
- `onToolStart`
- `onToolEnd`
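To find out which hooks your flow actually triggers, a handler that simply records every call is handy. A sketch with stub types (method names are taken from the list above; verify them against the types shipped in your installed llamaindex version):

```ts
// Sketch: a recording handler to discover which hooks actually fire.
// Event is a stub type; use your llamaindex version's real event types.
type Event = { payload?: unknown };

class RecordingHandler {
  fired: string[] = [];
  onLLMStart(_e: Event) { this.fired.push("onLLMStart"); }
  onLLMEnd(_e: Event) { this.fired.push("onLLMEnd"); }
  onRetrieveStart(_e: Event) { this.fired.push("onRetrieveStart"); }
  onRetrieveEnd(_e: Event) { this.fired.push("onRetrieveEnd"); }
}

const handler = new RecordingHandler();
// Simulate the emit sequence a query flow would produce:
handler.onRetrieveStart({});
handler.onRetrieveEnd({});
handler.onLLMStart({});
handler.onLLMEnd({});
console.log(handler.fired);
// ["onRetrieveStart", "onRetrieveEnd", "onLLMStart", "onLLMEnd"]
```

Register a handler like this alongside your real one in staging: if `fired` stays empty, the problem is wiring; if it stops after a start event, the problem is an exception or early exit.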
How to Debug It
1. Confirm the call is awaited
   - Search for `.query(`, `.complete(`, `.chat(`, and any custom pipeline calls.
   - If there’s no `await`, fix that first.
   - Add a log immediately before and after the call.
2. Log inside every callback method
   - Put a direct `console.log()` in `onLLMStart`, `onLLMEnd`, and retrieval hooks.
   - If start fires but end does not, you likely have an exception or early exit.
   - If none fire, your handler isn’t wired correctly.
3. Verify one shared `CallbackManager` instance
   - Print object references where you create and consume them.
   - Make sure your `OpenAI` client, retriever, and query engine all use the same manager.
   - Avoid constructing a second client without callbacks in another module.
4. Run production-like locally
   - Build and run the bundled artifact instead of relying on ts-node/dev mode.
   - For Next.js: `npm run build && npm run start`
   - For serverless code paths, test with a cold start and no debugger attached.
If you see errors like:
- `TypeError: handler.onLLMStart is not a function`
- `UnhandledPromiseRejectionWarning`
- missing spans/events with no corresponding logs
then you’ve narrowed it down to wiring, lifecycle, or runtime termination.
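A process-level listener makes the un-awaited case impossible to miss in production logs. This uses Node.js's standard `unhandledRejection` event; the simulated rejection below stands in for a fire-and-forget LlamaIndex call:

```ts
// Surface un-awaited rejections that would otherwise show up only as a
// vague UnhandledPromiseRejectionWarning (standard Node.js process event).
let captured: unknown = null;

process.on("unhandledRejection", (reason) => {
  captured = reason;
  console.error("Un-awaited call rejected:", reason);
});

// Simulates the bug: a rejecting promise nobody awaits.
Promise.reject(new Error("LLM request failed"));

// Give the event loop a turn so the rejection event can fire.
await new Promise((resolve) => setTimeout(resolve, 20));
console.log("captured an Error:", captured instanceof Error);
```

Wire this up once at process start; any fire-and-forget LlamaIndex call that fails will then leave a clear trace instead of silently dropping its callbacks.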
Prevention
- Always await LlamaIndex calls that trigger callbacks.
- Keep callback registration close to client construction so there’s one source of truth.
- Test under production bundling and runtime conditions before shipping.
The pattern is simple: if LlamaIndex TS callbacks work locally but not in production, assume lifecycle mismatch first. In most cases, fixing the await chain and making sure one shared CallbackManager reaches every component resolves it fast.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist plus starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.