How to Fix 'callback not firing when scaling' in LlamaIndex (TypeScript)
When you see callback not firing when scaling in a LlamaIndex TypeScript app, it usually means your callback handler is attached in one place, but the actual query/indexing work is happening in another. In practice, this shows up when you move from a single-node setup to parallel workers, a larger Document batch, or a custom retriever/query engine where callbacks are not propagated.
The symptom is simple: your code runs, the index returns results, but your CallbackManager events like onRetrieveStart, onLLMEnd, or custom trace handlers never fire.
The Most Common Cause
The #1 cause is creating the index, retriever, or query engine without passing the same CallbackManager instance all the way through.
In LlamaIndex TypeScript, callbacks are not magical global state. If you instantiate a new Settings, ServiceContext, or handler inside a worker or helper function, you can end up with one callback tree at construction time and another at execution time.
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Callback manager created in one scope, but index/retriever uses defaults | Same CallbackManager passed into every component |
| Worker builds its own engine | Shared engine/config passed into workers |
| Handler attached after objects are already constructed | Handler attached before index/query engine creation |
```ts
// BROKEN
import {
  Document,
  VectorStoreIndex,
  CallbackManager,
  ConsoleCallbackHandler,
} from "llamaindex";

const cb = new CallbackManager([new ConsoleCallbackHandler()]);

async function buildIndex() {
  const docs = [new Document({ text: "Hello world" })];
  // Index created without cb
  return await VectorStoreIndex.fromDocuments(docs);
}

async function run() {
  const index = await buildIndex();
  // Query engine also created without cb
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({
    query: "What is in the document?",
  });
  console.log(response.toString());
}
```
```ts
// FIXED
import {
  Document,
  VectorStoreIndex,
  CallbackManager,
  ConsoleCallbackHandler,
} from "llamaindex";

const cb = new CallbackManager([new ConsoleCallbackHandler()]);

async function buildIndex() {
  const docs = [new Document({ text: "Hello world" })];
  return await VectorStoreIndex.fromDocuments(docs, {
    callbackManager: cb,
  });
}

async function run() {
  const index = await buildIndex();
  const queryEngine = index.asQueryEngine({
    callbackManager: cb,
  });
  const response = await queryEngine.query({
    query: "What is in the document?",
  });
  console.log(response.toString());
}
```
If you are using a newer LlamaIndex TS API where callbacks hang off Settings, the same rule applies:
```ts
import { Settings } from "llamaindex";

Settings.callbackManager = cb;
```
Set it once at startup, before constructing indexes, retrievers, agents, or tools.
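Why order matters: many components snapshot their configuration at construction time, so a manager attached afterward is never seen. The timing issue can be illustrated with plain TypeScript, independent of any LlamaIndex API (every name below is illustrative, not a real library type):

```ts
// Illustrative sketch only: models how a component can capture shared
// config at construction time. None of these names come from LlamaIndex.
type Handler = (event: string) => void;

const settings: { handlers: Handler[] } = { handlers: [] };

class Engine {
  // The handler list is copied once, when the engine is constructed.
  private handlers: Handler[];
  constructor() {
    this.handlers = [...settings.handlers];
  }
  query(q: string): string {
    for (const h of this.handlers) h(`query:${q}`);
    return `answer to ${q}`;
  }
}

const seen: string[] = [];

// Too late: this engine was built before the handler existed.
const early = new Engine();
settings.handlers.push((e) => seen.push(e));
early.query("a"); // no events recorded

// Correct order: handler attached before construction.
const late = new Engine();
late.query("b"); // event recorded

console.log(seen); // ["query:b"]
```

The same reasoning explains why "attach handler, then build" works while "build, then attach handler" silently drops events.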
Other Possible Causes
1) Parallel execution creates separate contexts
If you scale with Promise.all, each task may construct its own internal pipeline and lose your shared callback state.
```ts
// Problematic
await Promise.all(
  chunks.map(async (chunk) => {
    const index = await VectorStoreIndex.fromDocuments([chunk]);
    return index.asQueryEngine().query({ query: "..." });
  })
);
```
Fix by building once and reusing the same engine:
```ts
const index = await VectorStoreIndex.fromDocuments(chunks, {
  callbackManager: cb,
});
const engine = index.asQueryEngine({ callbackManager: cb });

await Promise.all(queries.map((query) => engine.query({ query })));
```
2) You attached the handler too late
If you create the retriever first and set callbacks afterward, those internal references may already be frozen.
```ts
const index = await VectorStoreIndex.fromDocuments(docs);
const engine = index.asQueryEngine();
Settings.callbackManager = cb; // too late
```
Move the setup earlier:
```ts
Settings.callbackManager = cb;
const index = await VectorStoreIndex.fromDocuments(docs);
const engine = index.asQueryEngine();
```
3) A custom retriever or tool drops the callback context
This happens when you wrap LlamaIndex classes and forget to forward config.
```ts
class MyRetriever {
  constructor(private baseRetriever: any) {}

  async retrieve(query: string) {
    return this.baseRetriever.retrieve(query); // no callback propagation
  }
}
```
Pass through options explicitly:
```ts
class MyRetriever {
  constructor(private baseRetriever: any) {}

  async retrieve(query: string, options?: { callbackManager?: any }) {
    return this.baseRetriever.retrieve(query, options);
  }
}
```
4) Mixed package versions between core and integrations
A classic TypeScript failure mode is using mismatched versions of llamaindex and an integration package. You may see errors like:
- `TypeError: handler.onEvent is not a function`
- `Cannot read properties of undefined (reading 'callbackManager')`
- Callbacks registering but never firing
Check your dependency tree:
```bash
npm ls llamaindex @llamaindex/core @llamaindex/openai
```
Then align versions in package.json so all LlamaIndex packages come from the same release line.
How to Debug It
1. Confirm where the callback manager is created.
   - Search for `new CallbackManager(...)` and `Settings.callbackManager`.
   - Make sure it runs before any `VectorStoreIndex`, query engine, or agent construction.
2. Add a noisy handler.
   - Use `ConsoleCallbackHandler` first.
   - If console events do not appear, the issue is propagation, not your custom handler logic.
3. Check whether you are rebuilding objects per request.
   - If each request creates a fresh retriever or engine inside a worker, compare that path with your working local path.
   - The bug often appears only under load because the object graphs differ.
4. Log resolved config at runtime. Print whether your engine sees a callback manager:

```ts
console.log("callback manager:", !!(engine as any).callbackManager);
console.log("settings callback:", !!(Settings as any).callbackManager);
```
If one is true and the other is false, you found the break in propagation.
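You can promote that runtime check into a fail-fast assertion at bootstrap, so a missing manager crashes immediately instead of silently dropping traces. A sketch using a duck-typed settings object (swap in your real `Settings` import; `AnySettings` and the function name are placeholders, not LlamaIndex APIs):

```ts
// Fail fast if tracing is not wired up, instead of silently losing events.
// `AnySettings` is a stand-in for whatever settings object your app uses.
type AnySettings = { callbackManager?: unknown };

function assertCallbacksConfigured(settings: AnySettings, label: string): void {
  if (!settings.callbackManager) {
    throw new Error(
      `[${label}] no callbackManager configured; ` +
        `attach handlers before constructing indexes or query engines`
    );
  }
}

// Usage sketch:
// assertCallbacksConfigured(Settings, "bootstrap");
```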
Prevention
- Set `Settings.callbackManager` once at app bootstrap, before any LlamaIndex object creation.
- Pass `callbackManager` explicitly into indexes, retrievers, agents, and query engines instead of assuming inheritance.
- Keep all LlamaIndex packages on compatible versions and verify with `npm ls` during upgrades.
If you want stable tracing in production, treat callbacks like database connections: create them early, pass them explicitly, and never assume they survive wrapper layers or worker boundaries.
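One way to enforce "create early, pass explicitly" is to build every component through a single factory that takes the callback manager as a required argument, so no code path can construct anything without it. A generic sketch (the interfaces are placeholders, not LlamaIndex classes):

```ts
// All components are created through one factory that requires the shared
// callback manager up front; nothing can be constructed without it.
// These interfaces are placeholders for your real LlamaIndex objects.
interface Cb {
  emit(event: string): void;
}
interface Retriever {
  retrieve(q: string): string[];
}
interface Pipeline {
  cb: Cb;
  retriever: Retriever;
}

function createPipeline(cb: Cb): Pipeline {
  const retriever: Retriever = {
    retrieve(q: string) {
      cb.emit(`retrieve:${q}`); // every component closes over the same cb
      return [`doc for ${q}`];
    },
  };
  return { cb, retriever };
}

// Usage sketch: build once at bootstrap, hand the pipeline to workers.
const events: string[] = [];
const pipeline = createPipeline({ emit: (e) => events.push(e) });
pipeline.retriever.retrieve("hello");
console.log(events); // ["retrieve:hello"]
```

Workers then receive the whole `pipeline` object instead of rebuilding their own, which is exactly the fix for the `Promise.all` failure mode above.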
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.