# How to Fix 'async event loop error during development' in LlamaIndex (Python)
If you’re seeing `RuntimeError: This event loop is already running` or `RuntimeError: asyncio.run() cannot be called from a running event loop` while developing with LlamaIndex, you’re hitting an async/sync boundary problem. It usually shows up in notebooks, FastAPI apps, Streamlit, or any environment that already owns the event loop.
In LlamaIndex, this often happens when you call a synchronous helper that internally tries to start its own loop, while your app is already inside async def code or an interactive runtime.
## The Most Common Cause
The #1 cause is calling `asyncio.run()` or a sync LlamaIndex wrapper from inside an already-running event loop.
This happens a lot with code like `index.as_query_engine().query(...)`, `VectorStoreIndex.from_documents(...)` inside notebook cells, or custom wrappers around `RetrieverQueryEngine` and `QueryEngineTool` that mix sync and async calls.
### Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Starts a new loop inside an existing async context | Uses await all the way through |
| Common in Jupyter, FastAPI, Streamlit | Works in async-native apps |
```python
# BROKEN
import asyncio
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)

async def main():
    query_engine = index.as_query_engine()
    # This can trigger:
    # RuntimeError: asyncio.run() cannot be called from a running event loop
    result = asyncio.run(query_engine.aquery("What is in the documents?"))
    print(result)

asyncio.run(main())
```
```python
# FIXED
import asyncio
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)

async def main():
    query_engine = index.as_query_engine()
    result = await query_engine.aquery("What is in the documents?")
    print(result)

# In a script, call asyncio.run() exactly once, at the top level:
asyncio.run(main())
```
If you’re already inside a framework-managed async function, don’t call `asyncio.run()` at all. Use `await` directly.
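If you are unsure which situation a code path is in, you can probe for a running loop before choosing between `await` and `asyncio.run()`. A stdlib-only sketch (`fake_aquery` and `loop_is_running` are illustrative placeholders, not LlamaIndex APIs):

```python
import asyncio

async def fake_aquery(text: str) -> str:
    # Placeholder for an async LlamaIndex call such as query_engine.aquery(...)
    await asyncio.sleep(0)
    return f"answer to: {text}"

def loop_is_running() -> bool:
    """True if this code is executing inside a running event loop."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

# In a plain script there is no running loop, so asyncio.run() is safe:
if not loop_is_running():
    result = asyncio.run(fake_aquery("What is in the documents?"))
    print(result)  # answer to: What is in the documents?
```

In a notebook or inside an `async def`, `loop_is_running()` returns `True`, which is your signal to use `await` instead.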
## Other Possible Causes
### 1. Using sync LlamaIndex APIs inside async framework handlers
FastAPI routes, Starlette endpoints, and some LangServe-style handlers run on an active event loop. Calling `.query()` instead of `.aquery()` can force blocking behavior and trigger loop issues later.
```python
# BROKEN
@app.get("/search")
async def search(q: str):
    response = query_engine.query(q)  # sync call blocks the event loop
    return {"answer": str(response)}
```

```python
# FIXED
@app.get("/search")
async def search(q: str):
    response = await query_engine.aquery(q)
    return {"answer": str(response)}
```
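If a component only exposes a sync API, you can offload the blocking call to a worker thread instead of invoking it directly on the loop. A stdlib-only sketch using `asyncio.to_thread` (Python 3.9+); `blocking_query` stands in for a sync call like `query_engine.query(q)`:

```python
import asyncio
import time

def blocking_query(q: str) -> str:
    # Stand-in for a synchronous call like query_engine.query(q)
    time.sleep(0.1)  # simulated blocking I/O
    return f"result for {q}"

async def handler(q: str) -> dict:
    # Run the blocking call in a thread so the event loop stays responsive
    response = await asyncio.to_thread(blocking_query, q)
    return {"answer": response}

print(asyncio.run(handler("refund policy")))  # {'answer': 'result for refund policy'}
```

This keeps the handler async-friendly without rewriting the underlying sync code path.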
### 2. Notebook/Jupyter auto-await conflicts
Jupyter already runs an event loop. If you use `asyncio.run()` or create your own loop manually, you’ll get:

- `RuntimeError: This event loop is already running`
- sometimes followed by `RuntimeWarning: coroutine was never awaited`
```python
# BROKEN in Jupyter
import asyncio
result = asyncio.run(query_engine.aquery("Summarize this"))
```

```python
# FIXED in Jupyter (top-level await is supported in notebook cells)
result = await query_engine.aquery("Summarize this")
```
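If you genuinely need `asyncio.run()`-style calls in a notebook (for example, shared code that also runs as a script), the `nest_asyncio` package patches the running loop to tolerate nested `run()` calls; it is often suggested in LlamaIndex notebook examples. A minimal sketch:

```python
# Requires: pip install nest-asyncio
import nest_asyncio
nest_asyncio.apply()  # patch the current loop to allow nested asyncio.run()

import asyncio

# After apply(), this also works inside a notebook cell that already has a loop:
result = asyncio.run(asyncio.sleep(0, result="ok"))
print(result)  # ok
```

Treat this as a workaround for interactive environments, not a fix for production code paths.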
### 3. Mixing callback handlers that block the loop
Some custom callback handlers and instrumentation hooks do blocking I/O inside async callbacks. That can surface as an event-loop failure when LlamaIndex emits events through classes like `CallbackManager`.
```python
# BROKEN
import time
from llama_index.core.callbacks import CallbackManager

class MyHandler:
    def on_event_start(self, *args, **kwargs):
        time.sleep(2)  # blocks the event loop for two seconds

callback_manager = CallbackManager([MyHandler()])
```
Use non-blocking code or move heavy work out of the callback path.
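One pattern for moving heavy work out of the callback path: hand it to a dedicated worker thread via a queue, so the callback itself returns immediately. A stdlib-only sketch (`MyHandler` is the illustrative class from above, not a real LlamaIndex base class):

```python
import queue
import threading
import time

# Heavy work is enqueued and handled by a dedicated worker thread,
# so the callback returns immediately and never blocks the event loop.
work_queue: "queue.Queue" = queue.Queue()
processed = []

def worker():
    while True:
        item = work_queue.get()
        if item is None:
            break
        time.sleep(0.05)  # the slow part, now off the loop's thread
        processed.append(item)
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

class MyHandler:
    def on_event_start(self, *args, **kwargs):
        work_queue.put("event")  # non-blocking: just enqueue

MyHandler().on_event_start()
work_queue.join()  # wait only for demonstration; callbacks never do this
print(processed)  # ['event']
```

The same idea applies to logging, metrics, and any other side effect triggered from an async code path.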
### 4. Creating nested loops in worker threads or task runners
If you’re using Celery, background jobs, or custom thread pools and calling async LlamaIndex methods incorrectly, you can end up with:

- `RuntimeError: There is no current event loop in thread`
- `RuntimeError: Event loop is closed`
```python
# BROKEN: spins up and tears down a fresh event loop on every task
import asyncio

def worker():
    asyncio.run(query_engine.aquery("Check policy terms"))
```
Prefer one top-level async entrypoint per process. If you must bridge sync to async in worker code, manage the boundary once and keep it consistent.
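One way to manage that boundary exactly once is a long-lived loop owned by a single worker thread, bridged with `asyncio.run_coroutine_threadsafe`. A stdlib-only sketch (`AsyncWorker` and `fake_aquery` are illustrative names, not LlamaIndex APIs):

```python
import asyncio
import threading

class AsyncWorker:
    """One long-lived event loop per worker: create once, reuse for every job."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def run(self, coro):
        # Bridge sync -> async exactly once, at this boundary
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result()

async def fake_aquery(q: str) -> str:
    # Placeholder for an async LlamaIndex call such as query_engine.aquery(q)
    await asyncio.sleep(0)
    return f"answer: {q}"

worker = AsyncWorker()
print(worker.run(fake_aquery("Check policy terms")))   # answer: Check policy terms
print(worker.run(fake_aquery("second job, same loop")))
```

Every job in the process reuses the same loop, so you never hit "Event loop is closed" between tasks.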
## How to Debug It
- **Check whether your code path is already async.**
  - If you’re inside `async def`, do not use `asyncio.run()`.
  - Replace `.query()` with `.aquery()` and `.chat()` with `.achat()` where available.
- **Look at the exact traceback.**
  - `RuntimeError: This event loop is already running`
  - `RuntimeError: asyncio.run() cannot be called from a running event loop`
  - `RuntimeWarning: coroutine was never awaited`

  These point to different boundary mistakes.
- **Find the first sync call into LlamaIndex.**
  - Search for `.query(`, `.chat(`, `.retrieve(`, or any wrapper that hides an internal coroutine.
  - Check whether your app framework expects async handlers.
- **Test outside the framework.**
  - Run the same LlamaIndex call in a plain Python script.
  - If it works there but fails in FastAPI/Jupyter/Streamlit, the issue is your runtime integration, not LlamaIndex itself.
## Prevention
- Use async end-to-end when your app framework is async.
  - Pair `aquery`, `achat`, and other async methods with `await`.
- Don’t wrap coroutine calls with `asyncio.run()` unless you control the whole process entrypoint.
- Keep notebook code notebook-native.
  - In Jupyter use `await ...`, not manual loop management.
- Add one integration test for each runtime you support:
  - plain script
  - notebook-style execution
  - web handler / API route
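The per-runtime tests above can be sketched with plain functions; `fake_aquery` is a placeholder you would swap for a real LlamaIndex call:

```python
import asyncio

async def fake_aquery(q: str) -> str:
    # Placeholder for a real async LlamaIndex call
    await asyncio.sleep(0)
    return f"ok: {q}"

def test_plain_script_entrypoint():
    # Plain-script runtime: no loop running, asyncio.run() owns the entrypoint
    assert asyncio.run(fake_aquery("ping")) == "ok: ping"

def test_async_handler_style():
    # Web-handler runtime: already inside async def, so only await is used
    async def handler():
        return await fake_aquery("ping")
    assert asyncio.run(handler()) == "ok: ping"

test_plain_script_entrypoint()
test_async_handler_style()
```

Under pytest you would drop the manual calls at the bottom and let the test runner collect the functions; the notebook-style case is easiest to cover with a tool like `nbconvert` or `papermill`.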
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.