How to Fix 'deployment crash during development' in LlamaIndex (Python)
A deployment crash during development error in LlamaIndex usually means your app is trying to instantiate or call a deployment-backed component before the runtime is actually ready. In practice, this shows up when you wire up an LLM, embedding model, or vector store incorrectly, then trigger a request during local dev, notebook execution, or hot reload.
The message often appears alongside ValueError, RuntimeError, OpenAI API key not found, or failures inside ServiceContext, Settings, VectorStoreIndex.from_documents(), or QueryEngine.query().
The Most Common Cause
The #1 cause is initializing LlamaIndex components at import time, then reusing stale state during development reloads. This is common in FastAPI, Streamlit, Flask debug mode, and notebooks where the module gets re-imported.
Here’s the broken pattern:
```python
# broken.py
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o")  # created at import time
docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("Summarize the documents")
print(response)
```
And here’s the fixed version:
```python
# fixed.py
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

def build_query_engine():
    llm = OpenAI(model="gpt-4o")
    docs = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(docs)
    return index.as_query_engine(llm=llm)

if __name__ == "__main__":
    query_engine = build_query_engine()
    response = query_engine.query("Summarize the documents")
    print(response)
```
Why this works:
- Import-time side effects are removed.
- Reloads don’t create half-initialized objects.
- The LLM and index are built only when the process is actually running.
If you’re using FastAPI or Streamlit, do the same thing inside a startup hook or cached factory function.
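As a minimal sketch of that cached-factory idea: `functools.lru_cache` runs the expensive build once per process and reuses the result afterwards. The counter and the `object()` placeholder below are stand-ins for the real LlamaIndex setup, so the example is self-contained.

```python
# Sketch of a cached factory; the real body would call something like
# the build_query_engine() shown above.
from functools import lru_cache

build_count = 0  # instrumentation for this sketch only

@lru_cache(maxsize=1)
def get_query_engine():
    """Build the engine once per process; later calls reuse it."""
    global build_count
    build_count += 1
    return object()  # placeholder for index.as_query_engine(...)

engine_a = get_query_engine()
engine_b = get_query_engine()
print(engine_a is engine_b, build_count)  # True 1
```

In FastAPI you would call `get_query_engine()` inside a route handler or a lifespan hook; in Streamlit, `st.cache_resource` plays the same role.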
Other Possible Causes
1) Missing or invalid API credentials
A very common failure is:
- ValueError: No API key found for OpenAI
- openai.AuthenticationError: Incorrect API key provided
Broken:
```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o")  # crashes later if OPENAI_API_KEY is not set
```
Fixed:
```python
import os
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],  # fails fast if the key is missing
)
```
If you rely on .env, make sure it loads before LlamaIndex initializes.
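One way to make that ordering explicit is a small stdlib-only guard that runs before any LlamaIndex object is constructed. `require_env` is a hypothetical helper, and the hardcoded key below stands in for a value your `.env` loader would have set.

```python
import os

def require_env(name: str) -> str:
    """Fail fast, before any LlamaIndex object is built."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; load your .env before initializing LlamaIndex"
        )
    return value

os.environ["OPENAI_API_KEY"] = "sk-demo"  # stand-in for a value from .env
key = require_env("OPENAI_API_KEY")
print(key)  # sk-demo
```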
2) Mixing old and new LlamaIndex APIs
A lot of “crash during development” reports come from code written for older versions of LlamaIndex. You’ll see errors around ServiceContext deprecation or missing imports.
Broken:
```python
from llama_index import ServiceContext  # pre-0.10 import path
from llama_index.llms.openai import OpenAI

service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4o")
)
```
Fixed:
```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o")
```
If your project upgraded from pre-0.10 code, audit all imports. The new package layout matters.
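A rough way to audit is to flag imports from the flat `llama_index` package. `find_legacy_imports` below is a hypothetical helper, and its regex only catches the common `from llama_index import ...` form, not every pre-0.10 pattern.

```python
import re

# Matches the pre-0.10 flat import, but not llama_index.core / submodules
OLD_IMPORT = re.compile(r"^\s*from llama_index import\b")

def find_legacy_imports(source: str) -> list[int]:
    """Return 1-based line numbers using the pre-0.10 flat import."""
    return [
        i for i, line in enumerate(source.splitlines(), start=1)
        if OLD_IMPORT.match(line)
    ]

sample = (
    "from llama_index import ServiceContext\n"
    "from llama_index.core import Settings\n"
)
print(find_legacy_imports(sample))  # [1]
```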
3) Bad document loading path or empty input set
Another common issue is building an index from no documents. That can trigger downstream failures that look like deployment instability.
Broken:
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader("./missing-folder").load_data()  # raises here
index = VectorStoreIndex.from_documents(docs)
```
Fixed:
```python
from pathlib import Path
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

data_dir = Path("./data")
if not data_dir.exists():
    raise FileNotFoundError(f"Missing data directory: {data_dir}")

docs = SimpleDirectoryReader(str(data_dir)).load_data()
if not docs:
    raise ValueError("No documents loaded")

index = VectorStoreIndex.from_documents(docs)
```
This one matters in development because hot reload can run your loader before files are mounted or copied into place.
4) Event loop or async misuse in notebooks and web apps
If you see errors like:
- RuntimeError: This event loop is already running
- asyncio.run() cannot be called from a running event loop
you’re probably calling async LlamaIndex code incorrectly.
Broken:
```python
import asyncio

# Inside a notebook or an already-running event loop, this raises RuntimeError
result = asyncio.run(query_engine.aquery("What is in the docs?"))
```
Fixed:
```python
# In an async context:
result = await query_engine.aquery("What is in the docs?")
```
Or if you’re in a sync app, keep everything sync and avoid mixing patterns.
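A stdlib-only sketch of that decision: check whether a loop is already running before calling `asyncio.run()`. Here `fake_aquery` is a stand-in for a real `query_engine.aquery(...)` call.

```python
import asyncio

async def fake_aquery(question: str) -> str:
    # Stand-in for query_engine.aquery(...)
    return f"answer to: {question}"

def run_query(question: str) -> str:
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop running (plain script): asyncio.run() is safe here
        return asyncio.run(fake_aquery(question))
    # A loop is already running (notebook, async framework): must await
    raise RuntimeError("already inside an event loop; use 'await' instead")

print(run_query("What is in the docs?"))  # answer to: What is in the docs?
```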
How to Debug It
1) Read the first real exception
- Don’t stop at “deployment crash during development.”
- Find the root error: ValueError, AuthenticationError, RuntimeError, or a deprecation warning from LlamaIndex.

2) Disable reload/hot-reload temporarily
- For FastAPI: run without --reload.
- For Streamlit: remove expensive initialization from global scope.
- If the crash disappears, you have an import-time side-effect problem.

3) Print each initialization step
- Log when you load documents.
- Log when you create OpenAI(...).
- Log when you call VectorStoreIndex.from_documents(...).
- The last printed line tells you where it dies.

4) Check version compatibility
- Run: pip show llama-index openai
- Then verify your code matches that version’s API.
- If you see old imports like ServiceContext, migrate them.
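The step-by-step logging idea can be sketched like this. The commented-out lines mark where the real LlamaIndex calls would go, and the `steps` list mirrors the log so the last entry is inspectable.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("llamaindex-init")

steps = []  # mirror of the log; the last entry is the last step reached

def checkpoint(message: str) -> None:
    log.info(message)
    steps.append(message)

checkpoint("loading documents")
# docs = SimpleDirectoryReader("./data").load_data()
checkpoint("creating LLM")
# llm = OpenAI(model="gpt-4o")
checkpoint("building index")
# index = VectorStoreIndex.from_documents(docs)
checkpoint("init complete")

print(steps[-1])  # init complete (if it crashed earlier, this shows where)
```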
Prevention
- Build LlamaIndex objects inside functions, startup hooks, or dependency factories, not at module import time.
- Pin versions of llama-index and provider SDKs like openai so upgrades don’t break your app mid-development.
- Add guardrails around document loading:
  - validate paths
  - check for empty document lists
  - fail fast with explicit errors instead of letting index creation explode later
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit