How to Fix 'rate limit exceeded during development' in LangGraph (Python)
What this error means
A "rate limit exceeded" error during development usually means your LangGraph app is calling an LLM provider too often, too fast, or in a loop. In practice, this shows up when you’re testing a graph with recursion, retries, multiple nodes hitting the same model, or a control flow that keeps re-invoking the same step.
The actual provider error is often something like:
- `openai.RateLimitError: Error code: 429`
- `anthropic.RateLimitError: 429 Too Many Requests`
- `google.api_core.exceptions.ResourceExhausted: 429 Resource has been exhausted`
LangGraph does not create the rate limit by itself. It just makes it easier to accidentally trigger one because graphs can fan out, retry, and recurse.
The Most Common Cause
The #1 cause is an accidental loop in your graph logic. A node returns a state that routes back to itself or to another node that immediately calls the model again, so one user action becomes dozens of LLM requests.
Here’s the broken pattern I see most often:
| Broken | Fixed |
|---|---|
| The graph keeps routing back into the same model node | The graph stops at a terminal state or conditionally exits |
| No guard on iteration count | Explicit max-steps / done flag |
| Model call happens on every pass | Model call happens once per turn |
```python
# BROKEN
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    messages: list
    should_continue: bool

def agent_node(state: State):
    response = llm.invoke(state["messages"])
    return {
        "messages": state["messages"] + [response],
        "should_continue": True,  # always True = infinite loop risk
    }

def route(state: State):
    return "agent" if state["should_continue"] else END

graph = StateGraph(State)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", route)
app = graph.compile()
```
```python
# FIXED
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    messages: list
    steps: int

MAX_STEPS = 3

def agent_node(state: State):
    response = llm.invoke(state["messages"])
    return {
        "messages": state["messages"] + [response],
        "steps": state["steps"] + 1,  # count every pass through the node
    }

def route(state: State):
    if state["steps"] >= MAX_STEPS:
        return END  # explicit exit condition
    return "agent"

graph = StateGraph(State)
graph.add_node("agent", agent_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", route)
app = graph.compile()
```
If you are using StateGraph, MessageGraph, or a custom router, check whether your edge conditions ever allow the graph to terminate. A lot of “rate limit” bugs are really “my graph never stops” bugs.
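As a safety net on top of your own routing logic, LangGraph can also cap the total number of super-steps per run via the `recursion_limit` config key; when the cap is hit, the run raises `GraphRecursionError` instead of silently burning API calls. A minimal config sketch, assuming the compiled `app` and input state from the FIXED example above:

```python
from langgraph.errors import GraphRecursionError

try:
    # Stop the run after at most 8 super-steps instead of looping forever.
    result = app.invoke(
        {"messages": messages, "steps": 0},
        config={"recursion_limit": 8},
    )
except GraphRecursionError:
    print("Graph hit the recursion limit - check your routing conditions.")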
Other Possible Causes
1. Multiple nodes call the same model in one run
If you have parallel branches or sequential nodes that all invoke ChatOpenAI, one input can become 5–10 API calls immediately.
```python
def summarize(state):
    return {"summary": llm.invoke(state["messages"])}

def classify(state):
    return {"label": llm.invoke(state["messages"])}

# Both nodes hit the provider in the same execution path.
```
Fix by caching shared outputs in state and reusing them instead of calling the model twice.
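A minimal sketch of that caching pattern, using a stand-in `fake_llm` so it runs without an API key (the node and state-field names here are illustrative, not LangGraph APIs):

```python
call_count = 0

def fake_llm(messages):
    """Stand-in for llm.invoke so the pattern runs without an API key."""
    global call_count
    call_count += 1
    return f"analysis of {messages[-1]}"

def analyze(state):
    # Call the model once and cache the result in state.
    if "analysis" not in state:
        state = {**state, "analysis": fake_llm(state["messages"])}
    return state

def summarize(state):
    # Reuse the cached analysis instead of invoking the model again.
    return {**state, "summary": "summary: " + state["analysis"]}

def classify(state):
    # Same: read from state, don't re-invoke the model.
    return {**state, "label": "label: " + state["analysis"]}

state = classify(summarize(analyze({"messages": ["hello"]})))
print(call_count)  # 1 - one provider call instead of three
```

The same idea applies inside a real graph: have one node populate the shared field, and make downstream nodes read it from state.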
2. Retry settings are too aggressive
Some wrappers retry on 429s automatically. If your app retries instantly without backoff, you can burn through limits faster.
```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_retries=6,
)
```
If you already have graph-level retries plus provider-level retries, reduce one layer. Double retry stacks are common in development.
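If you do keep retries, make them back off instead of firing instantly. A generic exponential-backoff sketch in pure Python (`RuntimeError` stands in for the provider's `RateLimitError` so the example runs offline):

```python
import random
import time

def invoke_with_backoff(call, max_retries=3, base_delay=0.5):
    """Retry `call` with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RuntimeError:  # stand-in for openai.RateLimitError
            if attempt == max_retries:
                raise
            # 0.5s, 1s, 2s, ... plus jitter so parallel callers don't sync up.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Demo: fails twice with a 429-style error, then succeeds.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(invoke_with_backoff(flaky_call, base_delay=0.01))  # ok
```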
3. Streaming or UI refresh is triggering repeated invocations
A frontend rerender can accidentally re-run the backend endpoint that compiles and invokes the graph.
```python
# Example smell:
# Every page refresh calls app.invoke(...)
result = app.invoke({"messages": messages})
```
Make sure invocation happens on user submit, not on render. In FastAPI/Streamlit/Next.js setups, this is a very common hidden source of duplicate requests.
4. Your prompt is causing tool-call churn
If the model keeps asking for tools because your tool output is incomplete or ambiguous, LangGraph may keep cycling through tool nodes and model nodes.
```python
# Bad tool output example:
return {"result": "ok"}  # too vague for the agent to conclude anything
```

Return structured outputs with enough signal for the agent to stop:

```python
return {"status": "done", "data": {...}}
```
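In the routing function, that status field gives the agent/tool cycle a concrete stop signal. A sketch with illustrative names (in a real LangGraph graph the terminal return would be `END`):

```python
def route_after_tool(state):
    result = state.get("tool_result", {})
    if result.get("status") == "done":
        return "end"    # in a real graph: return END
    return "agent"      # otherwise loop back to the model node

print(route_after_tool({"tool_result": {"status": "done"}}))  # end
print(route_after_tool({"tool_result": {"result": "ok"}}))    # agent
```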
How to Debug It
- **Count how many times each node runs.** Add logging inside every node. If `agent_node` prints 20 times for one request, you’ve found a loop or fan-out problem.

  ```python
  def agent_node(state):
      print(f"agent_node steps={state['steps']}")
      ...
  ```

- **Inspect your routing function.** Look at every conditional edge and confirm there is a valid exit path. If all branches point back into model nodes, you will hit rate limits quickly.
- **Temporarily disable retries.** Set `max_retries=0` or lower it. If the error changes from repeated 429s to a single failure, retries were amplifying the issue.
- **Run with a hard step cap.** Add `steps` to state and stop after 2–3 iterations. If rate limits disappear, your graph logic was looping more than expected.
Prevention
- Add an explicit termination condition to every cyclic LangGraph workflow.
- Track per-run call counts in state so you can fail fast before hitting provider limits.
- Keep one layer of retries only: either wrapper-level retries or application-level retries, not both.
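That per-run call budget can live in a thin wrapper around the model client, so a runaway graph fails fast with a clear error instead of a 429. A sketch (the `CountingLLM` wrapper and budget value are illustrative, not a LangChain API):

```python
class CountingLLM:
    """Wraps a model client, counts calls, and fails fast past a budget."""

    def __init__(self, llm, budget=10):
        self.llm = llm
        self.budget = budget
        self.calls = 0

    def invoke(self, messages):
        self.calls += 1
        if self.calls > self.budget:
            raise RuntimeError(
                f"LLM call budget of {self.budget} exceeded: "
                "the graph is probably looping"
            )
        return self.llm.invoke(messages)

# Demo with a stub client instead of ChatOpenAI.
class StubLLM:
    def invoke(self, messages):
        return "reply"

llm = CountingLLM(StubLLM(), budget=3)
for _ in range(3):
    llm.invoke(["hi"])

try:
    llm.invoke(["hi"])  # fourth call blows the budget
except RuntimeError as e:
    print(e)
```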
If you want a quick rule: when LangGraph throws rate-limit errors during development, assume your graph is making more calls than you think until proven otherwise.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.