# How to Fix 'intermittent 500 errors in production' in LlamaIndex (TypeScript)
Intermittent 500s in LlamaIndex TypeScript usually mean your app is throwing inside the retrieval or synthesis path, but only for certain requests. In practice, this shows up when a specific document chunk, query shape, model call, or async boundary trips an exception that your API layer turns into a generic 500.
The annoying part is that the same code can work 90% of the time and fail under load, with long prompts, empty indexes, or malformed metadata. The fix is usually not “retry harder”; it’s identifying the unstable part of the pipeline and making it deterministic.
## The Most Common Cause
The #1 cause I see is passing invalid or incomplete data into VectorStoreIndex or query-time components, then letting the error bubble up from inside LlamaIndex.
Typical symptoms:
- `Error: Cannot read properties of undefined`
- `TypeError: Cannot read properties of null`
- `LlamaIndexError: Failed to retrieve nodes`
- `LlamaIndexError: OpenAI API error: 400 Bad Request`
This often happens when documents are built from partial DB rows, optional fields, or inconsistent metadata.
Broken pattern:

```ts
import { Document, VectorStoreIndex } from "llamaindex";

const docs = rows.map(
  (row) =>
    new Document({
      text: row.content,
      metadata: {
        customerId: row.customer_id,
        source: row.source.toLowerCase(), // crashes if source is null
      },
    })
);

const index = await VectorStoreIndex.fromDocuments(docs);
```

Fixed pattern:

```ts
import { Document, VectorStoreIndex } from "llamaindex";

const docs = rows
  .filter((row) => typeof row.content === "string" && row.content.length > 0)
  .map(
    (row) =>
      new Document({
        text: row.content,
        metadata: {
          customerId: String(row.customer_id ?? ""),
          source: String(row.source ?? "unknown").toLowerCase(),
        },
      })
  );

const index = await VectorStoreIndex.fromDocuments(docs);
```

The broken version fails intermittently because one bad row is enough to poison the whole request. The fixed version normalizes input before LlamaIndex touches it.

## Other Possible Causes

### 1. Model client misconfiguration

If your OpenAI/Anthropic client is missing a key, using the wrong model name, or hitting rate limits, LlamaIndex will surface a runtime error during query execution.

```ts
// Broken
const response = await queryEngine.query({
  query: "Summarize this policy",
});

// Fixed
const response = await queryEngine.query({
  query: "Summarize this policy",
});
```

That looks identical because the real fix is outside the call site:

```ts
import { OpenAI } from "llamaindex";

const llm = new OpenAI({
  model: process.env.OPENAI_MODEL ?? "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});
```
Watch for errors like:
- `OpenAI API error: 401 Unauthorized`
- `OpenAI API error: 429 Too Many Requests`
- `LlamaIndexError: Failed to generate response`
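A missing key is easier to deal with at boot than on the first unlucky request. Here is a minimal sketch of a fail-fast config check. `loadLlmConfig` and the `Env` type are hypothetical helpers of mine, not part of LlamaIndex, and the default model name is simply the one used in the snippet above.

```ts
// Hypothetical helper: not a LlamaIndex API, just a startup guard.
type Env = Record<string, string | undefined>;

interface LlmConfig {
  apiKey: string;
  model: string;
}

function loadLlmConfig(env: Env): LlmConfig {
  const apiKey = env["OPENAI_API_KEY"]?.trim();
  if (!apiKey) {
    // Fail when the process starts instead of inside a user request.
    throw new Error("OPENAI_API_KEY is missing or empty");
  }
  // Fall back to a default model when the variable is unset or blank.
  const model = env["OPENAI_MODEL"]?.trim() || "gpt-4o-mini";
  return { apiKey, model };
}
```

Call it once at startup, for example `loadLlmConfig(process.env)`, and pass the result to the `OpenAI` constructor. This only catches missing configuration; a wrong model name or a rate limit still shows up at query time.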
### 2. Empty retrieval results
Some chains assume at least one node exists. If your retriever returns nothing, downstream synthesis can throw.
```ts
const retriever = index.asRetriever({ similarityTopK: 5 });
const nodes = await retriever.retrieve("claim status");

// Broken if nodes.length === 0
const answer = await queryEngine.query({ query: "claim status" });
```
Fix by handling empty results explicitly:
```ts
const nodes = await retriever.retrieve("claim status");

if (nodes.length === 0) {
  return {
    answer: "No relevant context found.",
    sources: [],
  };
}
```
### 3. Context window overflow
Large chunks or too many retrieved nodes can push prompts past model limits. That often appears as sporadic failures on longer queries.
```ts
const index = await VectorStoreIndex.fromDocuments(docs, {
  chunkSize: 4096,
});
```
Use smaller chunks and cap retrieval:
```ts
const index = await VectorStoreIndex.fromDocuments(docs, {
  chunkSize: 1024,
});

const retriever = index.asRetriever({ similarityTopK: 3 });
```
Typical messages:
- `LlamaIndexError: Context length exceeded`
- `OpenAI API error: This model's maximum context length is ...`
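Capping the node count helps, but chunk lengths vary, so a token budget makes prompt growth more predictable. The sketch below is one way to do that, under loud assumptions: `ScoredChunk` is a simplified stand-in for whatever your retriever returns, and the four-characters-per-token estimate is a rough heuristic for English text, not a real tokenizer.

```ts
// Simplified stand-in for a retrieved node (assumption, not the LlamaIndex type).
interface ScoredChunk {
  text: string;
  score: number;
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the highest-scoring chunks whose combined estimate fits the budget.
function capByTokenBudget(
  chunks: ScoredChunk[],
  maxContextTokens: number
): ScoredChunk[] {
  const sorted = chunks.slice().sort((a, b) => b.score - a.score);
  const kept: ScoredChunk[] = [];
  let used = 0;
  for (const chunk of sorted) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > maxContextTokens) continue; // skip chunks that would overflow
    kept.push(chunk);
    used += cost;
  }
  return kept;
}
```

If you need exact numbers, swap `estimateTokens` for the tokenizer that matches your model and leave headroom for the system prompt and the answer.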
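### 4. Unhandled async errors in your route handler
A lot of “intermittent” production failures are just unhandled promise rejections.
```ts
// Broken: the promise is never awaited, so a rejection bypasses your error handling
app.post("/ask", (req, res) => {
  const result = queryEngine.query({ query: req.body.question });
  res.json(result); // serializes a pending Promise, not the answer
});
```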
Fix with explicit await and try/catch:
```ts
app.post("/ask", async (req, res) => {
  try {
    const result = await queryEngine.query({ query: req.body.question });
    res.json(result);
  } catch (err) {
    // Log the full error server-side; return a generic message to the client.
    console.error("LlamaIndex query failed", err);
    res.status(500).json({ error: "Query failed" });
  }
});
```
## How to Debug It
- Log the exact failing input
  - Print the raw document row, query string, and metadata before constructing `Document`.
  - You want to know whether failures correlate with a specific tenant, file type, or field value.
- Wrap each LlamaIndex stage separately
  - Split ingestion, indexing, retrieval, and synthesis into separate try/catch blocks.
  - Don’t catch everything at the route level only; that hides which stage is failing.
- Check for empty or malformed data
  - Validate:
    - `text` is non-empty
    - metadata values are strings/numbers only
    - retrieved node count is not zero
  - If you use Zod or Valibot upstream, validate before creating `Document`.
- Reproduce with one failing payload
  - Save the exact request body that produced the `500`.
  - Run it locally against staging credentials and enable verbose logging around:
    - document creation
    - embedding calls
    - retriever output
    - final prompt size
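Wrapping each stage separately does not have to mean four copies of the same try/catch. A minimal sketch, assuming a hypothetical `runStage` helper of my own (not a LlamaIndex API), that tags any failure with the stage name and elapsed time:

```ts
// Hypothetical helper: runs one pipeline stage and labels its failures.
async function runStage<T>(stage: string, fn: () => Promise<T>): Promise<T> {
  const startedAt = Date.now();
  try {
    return await fn();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    const elapsedMs = Date.now() - startedAt;
    // Log the original error so its stack trace is not lost.
    console.error(`[${stage}] failed`, err);
    throw new Error(`[${stage}] failed after ${elapsedMs}ms: ${reason}`);
  }
}

// Usage sketch (retriever and queryEngine come from your own setup):
//   const nodes = await runStage("retrieval", () => retriever.retrieve(question));
//   const answer = await runStage("synthesis", () => queryEngine.query({ query: question }));
```

The route-level catch then sees messages that start with `[retrieval]` or `[synthesis]`, which is usually enough to tell which part of the pipeline is unstable.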
## Prevention
- Normalize all inbound content before building `Document` objects.
- Put hard caps on chunk size and retrieval count so prompt growth stays predictable.
- Add request-level tracing around `VectorStoreIndex`, retrievers, and query execution so production failures point to one stage instead of “somewhere in LlamaIndex.”
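For the normalization point, silently filtering bad rows fixes the crash but hides the data problem. A rough sketch of a normalizer that also reports what it dropped follows. The `RawRow` shape, the field names, and `normalizeRows` itself are assumptions for illustration, so adjust them to your own schema.

```ts
// Hypothetical row shape coming from your database.
interface RawRow {
  id: string;
  content?: string | null;
  source?: string | null;
}

interface CleanRow {
  id: string;
  text: string;
  metadata: { source: string };
}

interface RejectedRow {
  id: string;
  reason: string;
}

// Split rows into ones that are safe to index and ones to log and investigate.
function normalizeRows(rows: RawRow[]): { clean: CleanRow[]; rejected: RejectedRow[] } {
  const clean: CleanRow[] = [];
  const rejected: RejectedRow[] = [];
  for (const row of rows) {
    const text = typeof row.content === "string" ? row.content.trim() : "";
    if (text.length === 0) {
      rejected.push({ id: row.id, reason: "empty content" });
      continue;
    }
    const source = (row.source ?? "unknown").toLowerCase();
    clean.push({ id: row.id, text, metadata: { source } });
  }
  return { clean, rejected };
}
```

Build `Document` objects only from `clean`, and log `rejected` with the tenant or batch identifier so a recurring bad source is visible before it becomes a 500.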
If you’re seeing intermittent 500 Internal Server Error responses with LlamaIndex TypeScript, treat it like a data-shape or boundary problem first. In most cases the fix is boring engineering discipline: validate inputs, constrain prompt size, and stop letting undefined values reach the indexing pipeline.
By Cyprian Aarons, AI Consultant at Topiax.