How to Fix 'cold start latency during development' in LangGraph (TypeScript)

By Cyprian Aarons · Updated 2026-04-22
Tags: cold-start-latency-during-development, langgraph, typescript

Opening

“Cold start latency during development” in LangGraph usually means your graph is paying the startup cost on every request instead of reusing initialized state. In TypeScript projects, this shows up most often during local dev with hot reload, in serverless-style handlers, or when the graph is rebuilt inside a request path.

The symptom is simple: the first call feels slow, and every subsequent call in dev still feels slow because your process keeps recreating the graph, model client, or compiled workflow.

The Most Common Cause

The #1 cause is building the StateGraph and compiling it inside a request handler or function that runs on every invocation.

That pattern forces LangGraph to reconstruct nodes, edges, and model clients repeatedly. If you are also creating a new ChatOpenAI, database client, or embedding client inside that same path, the cold start gets worse.

Broken vs fixed

Broken pattern                             Fixed pattern
Graph is created per request               Graph is created once at module scope
Model client is recreated per request      Model client is reused
Compile happens inside handler             Compile happens during startup
// broken.ts
import { StateGraph } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";

export async function handleRequest(input: string) {
  // All of this runs on every single request: new client, new graph, new compile.
  const llm = new ChatOpenAI({
    model: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY,
  });

  const graph = new StateGraph({
    channels: {
      messages: {
        // reducer: append incoming messages to the existing list
        value: (x: string[], y: string[]) => x.concat(y),
        default: () => [],
      },
    },
  });

  graph.addNode("callModel", async (state) => {
    const res = await llm.invoke(state.messages);
    return { messages: [res.content as string] };
  });

  graph.setEntryPoint("callModel");
  graph.setFinishPoint("callModel");

  const app = graph.compile();
  return app.invoke({ messages: [input] });
}
// fixed.ts
import { StateGraph } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";

// Created once when the module is first loaded, then reused by every request.
const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});

const graph = new StateGraph({
  channels: {
    messages: {
      // reducer: append incoming messages to the existing list
      value: (x: string[], y: string[]) => x.concat(y),
      default: () => [],
    },
  },
});

graph.addNode("callModel", async (state) => {
  const res = await llm.invoke(state.messages);
  return { messages: [res.content as string] };
});

graph.setEntryPoint("callModel");
graph.setFinishPoint("callModel");

// Compiled once at startup; handlers only invoke.
export const app = graph.compile();

export async function handleRequest(input: string) {
  return app.invoke({ messages: [input] });
}

If you are using Express, Next.js route handlers, or a worker process, keep the compiled graph in module scope. Only pass per-request data into .invoke().
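
For example, the Express wiring might look like the minimal sketch below. It assumes the fixed.ts module above sits at ./fixed; the route path and port are illustrative.

// server.ts (sketch)
import express from "express";
import { handleRequest } from "./fixed";

const server = express();
server.use(express.json());

server.post("/chat", async (req, res) => {
  // only per-request data crosses this boundary; the compiled graph is reused
  const result = await handleRequest(req.body.input);
  res.json(result);
});

server.listen(3000);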

Other Possible Causes

1. Hot reload is reloading the module too often

In dev mode, Vite, Next.js, ts-node-dev, and nodemon can reload modules aggressively. That makes it look like LangGraph has cold start issues when the real problem is repeated process/module initialization.

// nextjs route.ts
export async function POST(req: Request) {
  // bad if this file rebuilds expensive clients every refresh
}

Fix by moving expensive initialization into a separate singleton module:

// lib/graph.ts
export const app = buildGraph();

Then import app from your route handler.
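
If the singleton module itself is re-evaluated by hot reload, one common workaround is to stash the compiled app on globalThis so it survives module reloads within the same dev process. A minimal sketch, assuming a hypothetical buildGraph() helper that constructs and compiles your graph; the global key name is arbitrary:

// lib/graph.ts (hot-reload-safe variant, sketch)
import { buildGraph } from "./build-graph"; // hypothetical module that builds and compiles the graph

const globalStore = globalThis as unknown as {
  __langgraphApp?: ReturnType<typeof buildGraph>;
};

export const app =
  globalStore.__langgraphApp ?? (globalStore.__langgraphApp = buildGraph());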


2. You are creating a new database or vector store connection inside a node

A LangGraph node should do work, not bootstrap infrastructure. If each node opens a fresh Postgres pool or vector store client, latency spikes hard during development.

// bad
graph.addNode("retrieve", async () => {
  // opens a brand-new connection pool on every node run
  const db = new Pool({ connectionString: process.env.DATABASE_URL });
  const res = await db.query("select * from docs limit 5");
  return { docs: res.rows };
});
// good
import { Pool } from "pg";

const db = new Pool({ connectionString: process.env.DATABASE_URL });

graph.addNode("retrieve", async () => {
  const res = await db.query("select * from docs limit 5");
  return { docs: res.rows };
});

3. Your node does synchronous heavy work before the first await

If your node parses large files, loads schemas, or reads local assets before yielding control, dev latency looks like a cold start issue.

graph.addNode("parse", async () => {
  const schema = JSON.parse(fs.readFileSync("./big-schema.json", "utf8"));
  return { schema };
});

Move that work to startup:

import fs from "node:fs";

// read and parse the schema once at startup, not inside the node
const schema = JSON.parse(fs.readFileSync("./big-schema.json", "utf8"));

graph.addNode("parse", async () => {
  return { schema };
});

4. You are not caching compiled graphs across test runs or local scripts

If you run tsx src/index.ts repeatedly or restart tests for every file change, you will always pay compile cost again. That is expected behavior.

Use a cached export:

let appCache: ReturnType<typeof buildGraph> | undefined;

export function getApp() {
  if (!appCache) appCache = buildGraph();
  return appCache;
}
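
Scripts and tests then go through getApp() instead of rebuilding, so repeated invocations inside one process reuse the same compiled graph. For example (the input value is illustrative):

const result = await getApp().invoke({ messages: ["hello"] });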

How to Debug It

  1. Time graph construction separately from invocation

    • Add logs around new StateGraph(...), .compile(), and .invoke() (see the timing sketch after this list).
    • If compile time dominates, you are rebuilding too often.
  2. Check whether initialization code lives inside a handler

    • Search for new ChatOpenAI, new Pool, new StateGraph, and .compile() inside route handlers or controller methods.
    • Anything there should usually move to module scope.
  3. Run with module reload disabled

    • If latency disappears when hot reload is off, your problem is dev tooling re-importing modules.
    • That points to Next.js/Vite/nodemon behavior rather than LangGraph itself.
  4. Inspect node-level logs

    • Add timestamps at the top of each node.
    • If delay happens before the first node log appears, startup/init code is the culprit.
    • If delay happens after entry into a specific node, that node has heavy sync work or slow I/O.
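
For step 1, here is a minimal timing sketch. It assumes the startup path from fixed.ts is factored into a hypothetical buildGraphDefinition() helper that adds nodes and edges and returns the uncompiled StateGraph; the timer labels are illustrative.

// timing.ts (sketch)
import { buildGraphDefinition } from "./build-graph"; // hypothetical helper

console.time("graph:construct");
const graph = buildGraphDefinition();
console.timeEnd("graph:construct");

console.time("graph:compile");
export const app = graph.compile();
console.timeEnd("graph:compile");

export async function handleRequest(input: string) {
  console.time("graph:invoke");
  const result = await app.invoke({ messages: [input] });
  console.timeEnd("graph:invoke");
  return result;
}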

Prevention

  • Build graphs once at startup and export the compiled app.
  • Keep expensive clients outside nodes unless they must be request-scoped.
  • Add timing logs for compile() and each major node so regressions show up immediately (see the node wrapper sketch below).
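
One low-effort way to get per-node timings is a small wrapper around each node function. A sketch, assuming your nodes are plain async functions over the state object; timedNode is a hypothetical helper, not a LangGraph API.

// node-timing.ts (sketch)
type NodeFn<S> = (state: S) => Promise<Partial<S>>;

// Wraps a node function and logs how long each invocation takes.
export function timedNode<S>(name: string, fn: NodeFn<S>): NodeFn<S> {
  return async (state) => {
    const start = Date.now();
    try {
      return await fn(state);
    } finally {
      console.log(`[node:${name}] ${Date.now() - start}ms`);
    }
  };
}

// usage: graph.addNode("callModel", timedNode("callModel", callModelNode));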

If you want stable dev performance in LangGraph TypeScript projects, treat graph construction like application bootstrapping, not request logic. That one change fixes most “cold start latency during development” reports before they become production problems.

