How to Fix 'deployment crash in production' in LangChain (Python)
A “deployment crash in production” error in LangChain usually means your app started fine in development, but something in the runtime environment is different enough to kill the process once traffic hits it. In practice, this shows up during model initialization, tool execution, or when your chain hits a path you never exercised locally.
Most of the time, the root cause is not LangChain itself. It’s bad config, missing secrets, unhandled exceptions inside a chain/tool, or an async/sync mismatch that only surfaces under real load.
The Most Common Cause
The #1 cause is an exception escaping from a LangChain runnable, tool, or model call and taking down your process because nothing catches it.
A common pattern is calling invoke() directly inside request handlers without guarding failures from the model provider or tool layer.
| Broken pattern | Fixed pattern |
|---|---|
| No exception handling around ChatOpenAI.invoke() | Wrap model calls and return a controlled error |
| Tool exceptions bubble up to the web server | Catch tool errors and map them to 4xx/5xx responses |
```python
# broken.py
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(model="gpt-4o-mini")

def handle_request(user_input: str):
    # If this fails with AuthenticationError, RateLimitError,
    # APIConnectionError, or ValidationError, your worker can crash.
    response = llm.invoke([HumanMessage(content=user_input)])
    return {"answer": response.content}
```
```python
# fixed.py
import logging

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

logger = logging.getLogger(__name__)
llm = ChatOpenAI(model="gpt-4o-mini")

def handle_request(user_input: str):
    try:
        response = llm.invoke([HumanMessage(content=user_input)])
        return {"answer": response.content}
    except Exception as e:
        logger.exception("LangChain request failed")
        return {
            "error": "model_request_failed",
            "detail": str(e),
        }
```
If you are using tools, the same problem applies. A tool that raises ValueError, KeyError, or a network exception will bubble through AgentExecutor unless you explicitly handle it.
```python
from langchain_core.tools import tool

@tool
def lookup_policy(policy_id: str) -> str:
    """Look up a policy by its id."""  # @tool requires a docstring or description
    if not policy_id.startswith("POL-"):
        raise ValueError("Invalid policy id")
    return "policy data"
```
If that tool is attached to an agent and called with bad input, you’ll often see stack traces ending in langchain_core.tools.base.ToolException or the original Python exception.
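Until you wire up proper tool error handling, a plain wrapper at the call boundary keeps a failing tool from taking the worker down with it. This is a framework-free sketch; `safe_tool_call` and the undecorated `lookup_policy` are illustrative helpers, not LangChain APIs:

```python
def safe_tool_call(tool_fn, *args, **kwargs):
    """Run a tool function and convert any exception into a structured result."""
    try:
        return {"ok": True, "result": tool_fn(*args, **kwargs)}
    except Exception as exc:
        # The caller gets a controlled payload instead of a crashed process.
        return {"ok": False, "error": type(exc).__name__, "detail": str(exc)}

def lookup_policy(policy_id: str) -> str:
    if not policy_id.startswith("POL-"):
        raise ValueError("Invalid policy id")
    return "policy data"
```

Your API layer can then branch on `ok` and map failures to a 4xx/5xx response instead of dying.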
Other Possible Causes
1. Missing environment variables in production
This is the classic deployment-only failure. Locally you have .env; in production the container starts without OPENAI_API_KEY, ANTHROPIC_API_KEY, or whatever provider key your chain needs.
```
# broken deployment env
OPENAI_API_KEY=
LANGCHAIN_TRACING_V2=true
```

```
# fixed deployment env
OPENAI_API_KEY=sk-...
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=lsv2_...
```
Typical error messages:

- openai.AuthenticationError: Error code: 401
- langchain_openai.OpenAIError: The api_key client option must be set
- ValueError: Did not find openai_api_key
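A cheap safeguard is to verify required variables at process startup, before the first request ever reaches the model. A minimal standard-library sketch; the variable list and helper names are illustrative:

```python
import os

REQUIRED_VARS = ["OPENAI_API_KEY"]  # extend with whatever keys your chain needs

def missing_env_vars(required=REQUIRED_VARS, env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]

def assert_env_ready():
    missing = missing_env_vars()
    if missing:
        # Fail fast with a clear message instead of a 401 under real traffic.
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
```

Call `assert_env_ready()` once at startup so a misconfigured container refuses to boot rather than crashing on the first request.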
2. Async code running in a sync context
If you call ainvoke(), astream(), or async tools from a sync web handler without awaiting them correctly, you can get event loop failures or hung workers.
```python
# broken
result = chain.ainvoke({"question": "What is fraud?"})  # returns a coroutine, never runs

# fixed
import asyncio

result = asyncio.run(chain.ainvoke({"question": "What is fraud?"}))
```
In FastAPI or any async framework, prefer:
```python
@app.post("/chat")
async def chat(payload: dict):
    return await chain.ainvoke(payload)
```
Common errors:

- RuntimeWarning: coroutine was never awaited
- RuntimeError: This event loop is already running
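You can exercise the broken/fixed pair above without any model provider by substituting a stub coroutine for the chain call. This sketch uses only the standard library; `fake_chain_ainvoke` is a stand-in for `chain.ainvoke`, not a LangChain API:

```python
import asyncio

async def fake_chain_ainvoke(payload: dict) -> dict:
    # Stand-in for chain.ainvoke(); the real call would hit a model provider.
    await asyncio.sleep(0)
    return {"answer": f"echo: {payload['question']}"}

# Sync entry point: asyncio.run() is fine because no event loop is running yet.
result = asyncio.run(fake_chain_ainvoke({"question": "test"}))

async def handler():
    # Inside a running loop, asyncio.run() raises
    # "RuntimeError: asyncio.run() cannot be called from a running event loop".
    # Await the coroutine instead:
    return await fake_chain_ainvoke({"question": "test"})
```

The rule of thumb: `asyncio.run()` only at the top of a sync entry point; plain `await` everywhere inside async code.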
3. Prompt / schema mismatch after deployment
If your prompt expects keys that your app no longer sends, LangChain will fail at runtime with validation errors.
```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Answer based on {context} and {question}"
)

# broken payload missing 'context'
prompt.invoke({"question": "Explain lapse risk"})
```
Typical errors:

- KeyError: 'context'
- pydantic_core._pydantic_core.ValidationError
Fix by validating input before invoking the chain:
```python
payload = {"context": context_text, "question": user_question}
```
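A small guard function makes the missing-key failure explicit and raises it at your API boundary, before the chain runs. A sketch with illustrative names:

```python
def validate_prompt_input(payload: dict, required=("context", "question")) -> dict:
    """Reject payloads that would make the prompt template fail later."""
    missing = [key for key in required if not payload.get(key)]
    if missing:
        raise ValueError(f"Missing prompt variables: {', '.join(missing)}")
    return payload
```

Raising here lets your handler return a clean 400 instead of a KeyError deep inside the chain.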
4. Dependency drift between local and production
LangChain moves fast. A version mismatch between langchain, langchain-core, and provider packages can break imports or runtime behavior.
```
# broken mix
langchain==0.2.15
langchain-core==0.1.52
langchain-openai==0.1.21
```
Pin compatible versions together and rebuild cleanly:
```
langchain==0.2.15
langchain-core==0.2.36
langchain-openai==0.1.23
```
You may see errors like:
- ImportError: cannot import name ...
- TypeError: ... got an unexpected keyword argument ...
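To confirm what is actually installed inside the running container, log the resolved versions at startup with the standard library. A sketch; the package list is an example:

```python
from importlib import metadata

def installed_versions(packages=("langchain", "langchain-core", "langchain-openai")):
    """Map each package to its installed version, or None if it is absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions
```

Comparing this output between local and production immediately exposes version drift.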
How to Debug It
- Read the first real exception in the stack trace
  - Ignore the final “deployment crashed” wrapper.
  - Look for the first application-level error, like openai.AuthenticationError, KeyError, ValidationError, or ToolException.
- Run the exact production config locally
  - Use the same Python version.
  - Use the same package versions.
  - Load env vars exactly as prod does.
  - If Docker is used in prod, run Docker locally too.
- Isolate the failing LangChain call
  - Comment out tools.
  - Replace retrievers with static text.
  - Call only one step at a time.

```python
print("before invoke")
result = chain.invoke({"question": "test"})
print("after invoke", result)
```
- Add structured logging around every external dependency
  - Log the model name.
  - Log prompt keys.
  - Log tool input shape.
  - Log whether env vars are present, not their values.

```python
import os

print({
    "has_openai_key": bool(os.getenv("OPENAI_API_KEY")),
    "model": "gpt-4o-mini",
})
```
Prevention
- Wrap all LangChain entry points
  - Every .invoke(), .ainvoke(), and tool call should have exception handling at the boundary of your API worker.
- Pin versions and lock dependencies
  - Keep langchain, provider integrations, and pydantic aligned.
  - Rebuild from lockfiles in CI and production.
- Validate inputs before calling chains
  - Use Pydantic models for request payloads.
  - Make missing prompt variables impossible to ship.
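If you want the fail-fast validation without adding Pydantic, the same idea can be sketched with a standard-library dataclass (the model and field names are illustrative):

```python
from dataclasses import dataclass, fields

@dataclass
class ChatRequest:
    context: str
    question: str

    def __post_init__(self):
        # Reject empty or non-string values before the chain ever runs.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{f.name} must be a non-empty string")
```

Constructing the request object at the API boundary means a malformed payload fails loudly there, never inside a LangChain runnable.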
If you’re seeing a deployment crash in production with LangChain Python, assume an environment mismatch before you assume a model failure. In most cases, one of four things is true: bad secrets, unhandled exceptions, async misuse, or version drift between environments.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.