How to Fix 'output parsing error in production' in CrewAI (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

What the error means

An output parsing error in production usually means CrewAI received a model response it could not convert into the structured shape your task expected. In practice, this shows up when you ask for JSON, a typed object, or a specific schema, and the LLM returns extra prose, malformed JSON, or mismatched fields.

In TypeScript projects, this often happens after you wire a Task to expect structured output and then deploy with slightly different prompts, temperatures, or model behavior than you tested locally.

The Most Common Cause

The #1 cause is asking the agent for structured output without enforcing a strict format. CrewAI’s parser expects something machine-readable, but the model answers like a chat assistant.

Here’s the broken pattern:

import { Agent, Task, Crew } from "crewai";

const analyst = new Agent({
  role: "Financial Analyst",
  goal: "Summarize customer complaints",
  backstory: "You analyze support tickets for banking ops.",
});

const task = new Task({
  description: `
    Summarize the complaint and return JSON with:
    - category
    - severity
    - summary
  `,
  expectedOutput: "Valid JSON only",
  agent: analyst,
});

const crew = new Crew({
  agents: [analyst],
  tasks: [task],
});

const result = await crew.kickoff();

This looks fine until the model returns something like:

{
  "category": "fraud",
  "severity": "high",
  "summary": "Customer reports unauthorized card charges."
}

followed by extra text like:

Here is the structured output you requested.

That is enough to trigger errors such as:

  • Error: Output parsing error
  • CrewAIOutputParserError
  • Failed to parse task output as JSON

The fix is to make the contract explicit and validate it in code.

Broken pattern → fixed pattern:

  • Loose instruction like “return JSON only” → explicit schema + strict formatting instructions
  • No validation after completion → parse and validate the output before using it
  • Free-form agent response → a structured output type or formatter

Fixed version:

import { z } from "zod";
import { Agent, Task, Crew } from "crewai";

const ComplaintSchema = z.object({
  category: z.string(),
  severity: z.enum(["low", "medium", "high"]),
  summary: z.string(),
});

type ComplaintSummary = z.infer<typeof ComplaintSchema>;

const analyst = new Agent({
  role: "Financial Analyst",
  goal: "Summarize customer complaints into strict JSON",
  backstory: "You analyze support tickets for banking ops.",
});

const task = new Task({
  description: `
Return ONLY valid JSON matching this schema:
{
  "category": string,
  "severity": "low" | "medium" | "high",
  "summary": string
}

No markdown. No explanation.
`,
  expectedOutput: 'Strict JSON matching { category, severity, summary }',
  agent: analyst,
});

const crew = new Crew({
  agents: [analyst],
  tasks: [task],
});

const result = await crew.kickoff();
const parsed: ComplaintSummary = ComplaintSchema.parse(JSON.parse(result.toString()));

If your model still drifts, lower temperature and make the output contract even narrower. In production, vague formatting instructions are not enough.
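If occasional drift survives even a tight contract, you can absorb transient formatting failures with a single retry before failing the request. This is a generic sketch, not a CrewAI API: `runTask` stands in for something like `() => crew.kickoff().then(r => r.toString())`, and `parse` for a validator such as `ComplaintSchema.parse` composed with `JSON.parse`.

```typescript
// Sketch: retry a structured-output call once before giving up.
// `runTask` and `parse` are placeholders for your crew call and validator.
async function parseWithRetry<T>(
  runTask: () => Promise<string>,
  parse: (raw: string) => T,
  attempts = 2,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    const raw = await runTask();
    try {
      return parse(raw);
    } catch (err) {
      lastError = err; // keep the failure and try one more time
    }
  }
  throw lastError;
}
```

Keep the attempt count low: if the model fails twice at temperature 0, the prompt contract is the problem, not luck.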

Other Possible Causes

1. Your prompt contains conflicting instructions

If one part says “write a summary” and another says “return only JSON,” the model often tries to satisfy both.

description: `
Summarize the ticket in plain English.
Then return JSON with category and severity.
`

Fix it by making the task singular:

description: `
Return ONLY this JSON:
{
  "category": string,
  "severity": string,
  "summary": string
}
`

2. The schema changed but your parser did not

This happens when you add fields in TypeScript but forget to update the prompt or downstream parser.

// Parser expects:
{ category: string; severity: string; summary: string }

// Model now returns:
{ category: string; severity: string; summary: string; confidence: number }

Keep prompt and validation aligned:

const ComplaintSchema = z.object({
  category: z.string(),
  severity: z.enum(["low", "medium", "high"]),
  summary: z.string(),
  confidence: z.number().optional(),
});

3. The model is too creative for structured output

High temperature increases format drift.

const analyst = new Agent({
  role: "Analyst",
  goal: "...",
  backstory: "...",
  llmConfig: {
    temperature: 0.7,
  },
});

For production parsing tasks, keep it deterministic:

llmConfig: {
  temperature: 0,
}

4. Your task context is too large or noisy

Long context can push the model to ignore formatting constraints. This is common when you dump full ticket histories or logs into one task.

Bad:

description: `Analyze all of these logs:\n${hugeLogBlob}\nReturn JSON.`

Better:

description: `
Analyze this single ticket excerpt:
${ticketExcerpt}

Return ONLY valid JSON.
`
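One lightweight guard is to cap how much raw context you inline into the description before it reaches the model. The sketch below keeps the most recent characters; the 4000-character limit is an arbitrary example, not a CrewAI rule.

```typescript
// Sketch: cap inline context so formatting instructions stay salient.
// The default limit is illustrative; tune it for your model and task.
function excerpt(text: string, maxChars = 4000): string {
  if (text.length <= maxChars) return text;
  // Keep the tail: for logs and ticket histories, recent lines matter most.
  return "…(truncated)…\n" + text.slice(-maxChars);
}
```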

How to Debug It

  1. Log the raw model output before parsing

    • Don’t inspect only the final exception.
    • Print result.toString() or equivalent before JSON.parse.
  2. Check whether the failure is malformed JSON or schema mismatch

    • Malformed JSON means syntax issues like trailing commas or markdown fences.
    • Schema mismatch means valid JSON with wrong keys or wrong types.
  3. Temporarily reduce the prompt to one hard constraint

    • Remove examples, explanations, and secondary instructions.
    • Keep only “Return ONLY valid JSON matching this schema.”
  4. Run with deterministic settings

    • Set temperature to 0.
    • Use a smaller prompt and test on one representative input.
    • If it works locally but fails in prod, compare environment-specific prompts and model versions.
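Step 2 above can be automated with a small helper that tells the two failure modes apart. This is a sketch: the function name and the `requiredKeys` parameter are illustrative stand-ins for whatever your schema actually expects.

```typescript
// Sketch: distinguish a JSON syntax failure from a schema mismatch.
function classifyParseFailure(
  raw: string,
  requiredKeys: string[],
): "malformed-json" | "schema-mismatch" | "ok" {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return "malformed-json"; // trailing commas, markdown fences, extra prose
  }
  if (typeof value !== "object" || value === null) return "schema-mismatch";
  const keys = Object.keys(value as Record<string, unknown>);
  return requiredKeys.every((k) => keys.includes(k)) ? "ok" : "schema-mismatch";
}
```

A "malformed-json" result points at prompt formatting; a "schema-mismatch" points at a prompt/parser drift like the one in cause 2.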

A practical debug loop looks like this:

try {
  const result = await crew.kickoff();
  console.log("RAW OUTPUT:", result.toString());

  const parsed = ComplaintSchema.parse(JSON.parse(result.toString()));
  console.log("PARSED:", parsed);
} catch (err) {
  console.error("CrewAI parse failure:", err);
}

If you see fenced code blocks (```json ... ```), strip them before parsing, or instruct the model not to emit markdown at all.
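The stripping side can be handled with a small regex helper; this is a minimal sketch (the function name is illustrative) that removes one surrounding fence if present and passes everything else through unchanged.

```typescript
// Sketch: remove a surrounding ```json ... ``` fence before JSON.parse.
function stripJsonFences(raw: string): string {
  const trimmed = raw.trim();
  // Matches an optional "json" language tag after the opening fence.
  const match = trimmed.match(/^```(?:json)?\s*\n?([\s\S]*?)\n?```$/);
  return match ? match[1].trim() : trimmed;
}
```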

Prevention

  • Use strict schemas with runtime validation (zod, valibot, or equivalent) before consuming any CrewAI output.
  • Set temperature: 0 for tasks that must parse cleanly.
  • Keep prompts narrow:
    • one task
    • one format
    • one output contract

If you’re building multi-step workflows, validate each step independently instead of letting one bad LLM response cascade into a production incident. That’s how these errors become expensive.
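Per-step validation can be as simple as a wrapper that parses and validates each task's output before the next step sees it. This is a generic sketch, not a CrewAI feature: `run` stands in for a single task execution, and `validate` for any runtime check such as `ComplaintSchema.parse` from the examples above.

```typescript
// Sketch: validate each workflow step's output before passing it on.
async function runStep<T>(
  name: string,
  run: () => Promise<string>,
  validate: (value: unknown) => T,
): Promise<T> {
  const raw = await run();
  try {
    return validate(JSON.parse(raw));
  } catch (err) {
    // Fail loudly at the step that broke, not three steps downstream.
    throw new Error(`Step "${name}" produced unusable output: ${err}`);
  }
}
```

The payoff is that a failure names the step that drifted, instead of surfacing as a confusing error deep in a later task.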


By Cyprian Aarons, AI Consultant at Topiax.