Error Handling Modernization - ADR (Connector-Agnostic)
Status: Proposed
Date: 2025-11-07
Author: System Architecture
Supersedes: None
Related: LOGGING_ARCHITECTURE.md, connector-error-handling-standard.md
Context
The emp-job-queue system integrates with multiple external services (ComfyUI, OpenAI, Gemini, Glif, etc.), and currently suffers from inconsistent error handling:
User-Facing Problems:
- "Error in component 'Group Image': [object Object]" (ComfyUI serialization)
- "Unknown error type. Simple retry may help." (Generic fallback)
- "Rate limit exceeded" (Lost component context)
- Inconsistent error messages across different connectors
Root Causes:
- Service-Specific Issues: Each external service returns errors differently
- Context Loss: Component/workflow info lost during error propagation
- Inconsistent Connector Handling: Each connector handles errors differently
- Poor Descriptions: FailureClassifier fallbacks are too generic
Critical Principle:
ComfyUI is just one external service among many. TypeScript must be the source of truth for error classification - NOT Python, NOT any external service.
Current State
Multi-Connector Architecture
┌─────────────────────────────────────────────────────────────┐
│ External Services (We Don't Control) │
│ │
│ ComfyUI OpenAI Gemini Glif │
│ Python REST API REST API REST API │
│ WebSocket JSON errors JSON errors JSON errors │
│ execution_error {"error": {... {"error": {... {"message":│
│ str(x) → [obj] "code": 429} "code": 400} "fail"} │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ TypeScript Connector Layer (WE CONTROL) │
│ │
│ ComfyUIRestStreamConnector OpenAITextConnector │
│ ComfyUIWebSocketConnector GeminiImageConnector │
│ ComfyUIRemoteConnector GlifConnector │
│ │
│ ⚠️ PROBLEM: Each connector handles errors differently │
│ ⚠️ PROBLEM: Pattern matching is fragile and inconsistent │
│ ⚠️ PROBLEM: Context (component, workflow) often lost │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ FailureClassifier (TypeScript - Our Code) │
│ ✅ Already connector-agnostic │
│ ✅ Already works across all services │
│ ⚠️ BUT: Fallback messages too generic │
│ ⚠️ BUT: Doesn't preserve context │
└─────────────────────────────────────────────────────────────┘
Specific Problems by Connector
| Connector | Service | Current Issue | User Impact |
|---|---|---|---|
| ComfyUI REST Stream | ComfyUI | Pattern matching only, generic errors | "Unknown error" for uncommon issues |
| ComfyUI WebSocket | ComfyUI | 15+ service-specific patterns | Works but fragile (breaks if ComfyUI changes wording) |
| OpenAI Text/Image | OpenAI | Uses HTTP error parser | Lost OpenAI error codes, generic descriptions |
| Gemini Image | Gemini | Uses HTTP error parser | Lost Google error details |
| Glif | Glif API | Uses HTTP error parser | Lost Glif-specific context |
Key Insight: We already have ConnectorError.fromHTTPError() but it's too generic. Each connector needs service-specific enhancement while using common TypeScript infrastructure.
Decision
Implement connector-agnostic error handling with TypeScript as source of truth:
Principle 1: TypeScript Owns Error Classification
// Source of Truth: @emp/core/src/types/failure-classification.ts
export enum FailureType { ... }
export enum FailureReason { ... }
export class FailureClassifier { ... }

NOT:
- ❌ Python error codes (duplicates logic, ComfyUI-specific)
- ❌ Service-specific enums (doesn't scale to N services)
- ❌ External service error codes (we don't control them)
Principle 2: Service-Specific Parsers Enhance Generic Errors
Each service gets a lightweight parser that maps service errors → FailureClassifier:
// Service-specific knowledge
class OpenAIErrorEnhancer {
static enhance(httpError: any): ConnectorError {
// Map OpenAI error codes to FailureReason
if (httpError.response?.data?.error?.code === 'context_length_exceeded') {
return new ConnectorError(
FailureType.VALIDATION_ERROR,
FailureReason.INVALID_PAYLOAD,
'Input exceeds OpenAI token limit',
false,
{
serviceType: 'openai',
maxTokens: httpError.response.data.error.param,
suggestion: 'Reduce input text length or use a larger model'
}
);
}
// ... 10 more OpenAI-specific mappings
// Fallback to generic HTTP handler
return ConnectorError.fromHTTPError(httpError, 'openai');
}
}

Principle 3: Always Preserve Context
Every ConnectorError must capture:
- serviceType (comfyui, openai, gemini, etc.)
- component (if from Studio V2 workflow)
- workflow (workflow name/ID)
- suggestion (actionable next step)
- rawError (full error for debugging)
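Taken together, these fields suggest a minimum context shape along the following lines. This is a sketch only: the interface name and optionality markers are assumptions, not the actual ConnectorErrorContext definition in @emp/core.

```typescript
// Sketch of the minimum fields Principle 3 requires on every error context.
// This mirrors, but is not copied from, the real ConnectorErrorContext in
// @emp/core - treat names and optionality here as illustrative.
interface RequiredErrorContext {
  serviceType: string;   // 'comfyui', 'openai', 'gemini', ...
  component?: string;    // Studio V2 component name, when the job has one
  workflow?: string;     // workflow name/ID
  suggestion?: string;   // actionable next step for the user
  rawError?: unknown;    // full original error, kept for debugging
}

// A context is "complete" for reporting purposes when the service is named
// and the user has a next step; component/workflow are attached when known.
function isReportable(ctx: RequiredErrorContext): boolean {
  return ctx.serviceType.length > 0 && typeof ctx.suggestion === 'string';
}

const ctx: RequiredErrorContext = {
  serviceType: 'openai',
  component: 'Text Generator',
  workflow: 'summarize-v1',
  suggestion: 'Wait 60 seconds and try again',
  rawError: { code: 'rate_limit_exceeded' },
};
```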
Proposed Architecture (Connector-Agnostic)
Component Overview
┌────────────────────────────────────────────────────────────────┐
│ External Services (Unchanged) │
│ - ComfyUI, OpenAI, Gemini, Glif, etc. │
│ - Each returns errors in different format │
└────────────────────────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────────────────────────┐
│ Service-Specific Error Enhancers (NEW) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ OpenAIErrorEnhancer │ │
│ │ - Maps OpenAI error codes → FailureReason │ │
│ │ - Adds OpenAI-specific suggestions │ │
│ │ │ │
│ │ GeminiErrorEnhancer │ │
│ │ - Maps Google error codes → FailureReason │ │
│ │ - Adds Gemini-specific suggestions │ │
│ │ │ │
│ │ ComfyUIErrorEnhancer │ │
│ │ - Parses ComfyUI log messages → FailureReason │ │
│ │ - Adds node/workflow context │ │
│ │ - Fallback to FailureClassifier pattern matching │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────────────────────────┐
│ FailureClassifier (Enhanced - Core Package) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ EXISTING (Keep): │ │
│ │ - classify(message: string) → FailureClassification │ │
│ │ - Pattern matching for generic errors │ │
│ │ │ │
│ │ NEW (Add): │ │
│ │ - Better fallback descriptions │ │
│ │ - Suggestion generation │ │
│ │ - Context preservation helpers │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────────────────────────┐
│ ConnectorError (Enhanced - Core Package) │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ EXISTING (Keep): │ │
│ │ - failureType, failureReason, retryable │ │
│ │ - context: ConnectorErrorContext │ │
│ │ - fromError(), fromHTTPError() │ │
│ │ │ │
│ │ NEW (Add): │ │
│ │ - getUserMessage(): hierarchical display │ │
│ │ - getSuggestion(): actionable next step │ │
│ │ - preserveContext(component, workflow) │ │
│ └──────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────────────────────────┐
│ Connector Base Pattern (Standard) │
│ All connectors follow standard error handling pattern: │
│ │
│ 1. Catch service error │
│ 2. Use service-specific enhancer (if available) │
│ 3. Preserve context (component, workflow) │
│ 4. Return ConnectorError │
└────────────────────────────────────────────────────────────────┘
Implementation Phases (Quick Wins First)
Phase 1: Fix ComfyUI Object Serialization (1 Hour) ✅ QUICK WIN
Problem: "model=[object Object]" in ComfyUI errors
Solution: Fix Python serialization (ComfyUI-specific, one-time fix)
File: packages/comfyui/execution.py:393
Replace format_value() with safe JSON serialization:
import json
def format_value_safe(x, max_depth=3, current_depth=0):
"""Serialize value for error reporting (prevents [object Object])."""
if current_depth >= max_depth:
return f"<{type(x).__name__}>"
if isinstance(x, (int, float, bool, str, type(None))):
return x
elif isinstance(x, dict):
return {k: format_value_safe(v, max_depth, current_depth + 1)
for k, v in list(x.items())[:10]}
elif isinstance(x, (list, tuple)):
return [format_value_safe(v, max_depth, current_depth + 1)
for v in list(x)[:10]]
else:
try:
json.dumps(x)
return x
except (TypeError, ValueError):
            return f"<{type(x).__name__}>"

Impact: Zero code changes in TypeScript, fixes one service's serialization issue.
Phase 2: Enhance FailureClassifier Descriptions (2 Hours) ✅ QUICK WIN
Problem: Generic fallback messages like "Unclassified error"
Solution: Better descriptions in existing ErrorDescriptions map
File: packages/core/src/types/connector-errors.ts:311
Enhance existing descriptions with context variables:
export const ErrorDescriptions: Record<FailureReason, string> = {
// OLD: Generic and unhelpful
[FailureReason.UNKNOWN_ERROR]:
'An unknown error occurred. Please try again or contact support.',
// NEW: Contextual and actionable
[FailureReason.UNKNOWN_ERROR]:
'An unexpected error occurred in {serviceType}: {message}. ' +
'This may be a temporary issue - try again or contact support if it persists.',
// OLD: Technical jargon
[FailureReason.INVALID_PAYLOAD]:
'Request contains invalid data. Please check your input parameters.',
// NEW: Service-aware guidance
[FailureReason.INVALID_PAYLOAD]:
'{serviceType} rejected the request due to invalid data. ' +
'Common causes: incorrect parameter format, missing required fields, or unsupported values. ' +
'Check your workflow configuration and try again.',
// ... enhance all 40+ error descriptions
};

Impact: All connectors get better error messages immediately (no connector changes needed).
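The {serviceType}/{message} placeholders above imply a small interpolation step before display. A minimal sketch of such a helper follows; the function name is an assumption, nothing like it is confirmed to exist in @emp/core.

```typescript
// Fills {placeholder} slots in an enhanced ErrorDescriptions template.
// Unknown placeholders are left intact rather than rendered as 'undefined'.
function renderDescription(
  template: string,
  vars: Record<string, string | undefined>
): string {
  return template.replace(/\{(\w+)\}/g, (match, key: string) =>
    vars[key] !== undefined ? (vars[key] as string) : match
  );
}

const template =
  'An unexpected error occurred in {serviceType}: {message}. ' +
  'This may be a temporary issue - try again or contact support if it persists.';

const rendered = renderDescription(template, {
  serviceType: 'openai',
  message: 'socket hang up',
});
```

Leaving unknown placeholders untouched keeps a half-filled template debuggable instead of silently printing `undefined`.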
Phase 3: Standard Connector Error Pattern (4 Hours) ✅ CRITICAL
Problem: Each connector handles errors differently
Solution: Base class method for standardized error handling
File: apps/worker/src/connectors/base-connector.ts
Add standard error handling method:
export abstract class BaseConnector {
/**
* Standard error handling pattern for all connectors
*
* Usage in child connectors:
* catch (error) {
* throw this.createConnectorError(error, jobData, {
* suggestion: 'Try reducing image size'
* });
* }
*/
protected createConnectorError(
error: any,
jobData: JobData,
options?: {
suggestion?: string;
context?: Record<string, any>;
retryable?: boolean;
}
): ConnectorError {
// Extract context from job data
const component = this.extractComponentFromJobData(jobData);
const workflow = this.extractWorkflowFromJobData(jobData);
// Use service-specific enhancer if available
let connectorError: ConnectorError;
if (this.service_type === 'openai' && error.response) {
connectorError = OpenAIErrorEnhancer.enhance(error);
} else if (this.service_type === 'gemini' && error.response) {
connectorError = GeminiErrorEnhancer.enhance(error);
} else if (error instanceof Error) {
connectorError = ConnectorError.fromError(error, this.service_type);
} else {
connectorError = ConnectorError.fromHTTPError(error, this.service_type);
}
// Preserve context
connectorError.context = {
...connectorError.context,
serviceType: this.service_type,
component,
workflow,
jobId: jobData.id,
...options?.context
};
// Override suggestion if provided
if (options?.suggestion) {
connectorError.context.suggestion = options.suggestion;
}
// Override retryability if specified
if (options?.retryable !== undefined) {
(connectorError as any).retryable = options.retryable;
}
return connectorError;
}
/** Extract component name from job data (Studio V2 workflows) */
private extractComponentFromJobData(jobData: JobData): string | undefined {
return jobData.metadata?.component_name ||
jobData.ctx?.component_name ||
undefined;
}
/** Extract workflow name from job data */
private extractWorkflowFromJobData(jobData: JobData): string | undefined {
return jobData.payload?.workflow_name ||
jobData.metadata?.workflow_name ||
undefined;
}
}

Impact: All connectors now have standard error handling (consistent, context-preserving).
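With context now preserved, the getUserMessage() addition described in the component overview can format it hierarchically (service, failure type, failure reason), as in the appendix examples. A self-contained sketch follows; every name and the exact layout here are assumptions, not the real ConnectorError API.

```typescript
// Minimal stand-in for the pieces getUserMessage() would read off a
// ConnectorError; the real method would use failureType/failureReason enums.
interface DisplayParts {
  serviceLabel: string;    // e.g. 'OpenAI API Error'
  failureType: string;     // e.g. 'Rate Limit'
  failureReason: string;   // e.g. 'Requests Per Minute'
  message: string;
  suggestion?: string;
}

function getUserMessage(parts: DisplayParts): string {
  // Hierarchical header: service → type → reason, matching the appendix style.
  const header = [parts.serviceLabel, parts.failureType, parts.failureReason].join(' → ');
  const lines = [header, parts.message];
  if (parts.suggestion) {
    lines.push(`💡 Suggestion: ${parts.suggestion}`);
  }
  return lines.join('\n');
}

const msg = getUserMessage({
  serviceLabel: 'OpenAI API Error',
  failureType: 'Rate Limit',
  failureReason: 'Requests Per Minute',
  message: 'OpenAI rate limit exceeded.',
  suggestion: 'Wait 60 seconds and try again.',
});
```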
Phase 4: Service-Specific Enhancers (1 Day - Optional)
Problem: HTTP errors lose service-specific details
Solution: Create lightweight enhancers for each major service
OpenAI Error Enhancer
File: packages/core/src/error-enhancers/openai-error-enhancer.ts (NEW)
export class OpenAIErrorEnhancer {
static enhance(httpError: any): ConnectorError {
const errorData = httpError.response?.data?.error;
if (!errorData) {
return ConnectorError.fromHTTPError(httpError, 'openai');
}
// Map OpenAI error codes to our FailureReason
const codeMap: Record<string, [FailureType, FailureReason, string]> = {
'context_length_exceeded': [
FailureType.VALIDATION_ERROR,
FailureReason.INVALID_PAYLOAD,
'Input exceeds OpenAI token limit. Reduce input text or use a larger model (gpt-4-turbo has 128k context).'
],
'rate_limit_exceeded': [
FailureType.RATE_LIMIT,
FailureReason.REQUESTS_PER_MINUTE,
'OpenAI rate limit exceeded. Wait a moment before retrying.'
],
'invalid_api_key': [
FailureType.AUTH_ERROR,
FailureReason.INVALID_API_KEY,
'OpenAI API key is invalid or expired. Check credentials configuration.'
],
'insufficient_quota': [
FailureType.RATE_LIMIT,
FailureReason.DAILY_QUOTA_EXCEEDED,
'OpenAI account quota exceeded. Add credits or upgrade plan.'
],
'model_not_found': [
FailureType.VALIDATION_ERROR,
FailureReason.MODEL_NOT_FOUND,
'OpenAI model not found. Check model name or your account access.'
]
};
const mapping = codeMap[errorData.code];
if (mapping) {
const [failureType, failureReason, suggestion] = mapping;
return new ConnectorError(
failureType,
failureReason,
errorData.message || 'OpenAI API error',
failureType === FailureType.RATE_LIMIT, // Rate limits are retryable
{
serviceType: 'openai',
httpStatus: httpError.response?.status,
openaiErrorCode: errorData.code,
openaiErrorType: errorData.type,
suggestion,
rawRequest: httpError.config?.data,
rawResponse: errorData
}
);
}
// Fallback to HTTP handler
return ConnectorError.fromHTTPError(httpError, 'openai');
}
}

Gemini Error Enhancer
File: packages/core/src/error-enhancers/gemini-error-enhancer.ts (NEW)
export class GeminiErrorEnhancer {
static enhance(httpError: any): ConnectorError {
const errorData = httpError.response?.data?.error;
const status = httpError.response?.status;
// Google Cloud error format
if (errorData?.status) {
const statusMap: Record<string, [FailureType, FailureReason, string]> = {
'PERMISSION_DENIED': [
FailureType.AUTH_ERROR,
FailureReason.INSUFFICIENT_PERMISSIONS,
'Google Cloud API access denied. Check service account permissions.'
],
'RESOURCE_EXHAUSTED': [
FailureType.RATE_LIMIT,
FailureReason.REQUESTS_PER_MINUTE,
'Gemini API quota exceeded. Wait before retrying.'
],
'INVALID_ARGUMENT': [
FailureType.VALIDATION_ERROR,
FailureReason.INVALID_PAYLOAD,
'Gemini rejected request parameters. Check input format and requirements.'
],
'FAILED_PRECONDITION': [
FailureType.VALIDATION_ERROR,
FailureReason.UNSUPPORTED_OPERATION,
'Gemini model doesn\'t support this operation. Check model capabilities.'
]
};
const mapping = statusMap[errorData.status];
if (mapping) {
const [failureType, failureReason, suggestion] = mapping;
return new ConnectorError(
failureType,
failureReason,
errorData.message || 'Gemini API error',
failureType === FailureType.RATE_LIMIT,
{
serviceType: 'gemini',
httpStatus: status,
geminiErrorStatus: errorData.status,
suggestion,
rawResponse: errorData
}
);
}
}
return ConnectorError.fromHTTPError(httpError, 'gemini');
}
}

ComfyUI Error Enhancer
File: packages/core/src/error-enhancers/comfyui-error-enhancer.ts (NEW)
export class ComfyUIErrorEnhancer {
/**
* Enhance ComfyUI errors (both log stream and WebSocket)
*/
static enhance(error: any, context?: { component?: string; workflow?: string }): ConnectorError {
const message = error.message || error.exception_message || '';
// Check for specific ComfyUI error patterns
// GPU OOM
if (message.includes('CUDA out of memory') || message.includes('GPU memory')) {
return new ConnectorError(
FailureType.RESOURCE_LIMIT,
FailureReason.GPU_MEMORY_FULL,
'GPU ran out of memory during workflow execution',
true, // Retryable
{
serviceType: 'comfyui',
node: error.node_id ? { id: error.node_id, type: error.node_type } : undefined,
component: context?.component,
workflow: context?.workflow,
suggestion: 'Try reducing image size, batch size, or use a lower resolution. ' +
'Current settings exceeded GPU capacity.',
rawError: error
}
);
}
// Missing custom node
if (message.match(/node .+ does not exist/i)) {
const nodeMatch = message.match(/node (\S+) does not exist/i);
const nodeName = nodeMatch ? nodeMatch[1] : 'unknown';
return new ConnectorError(
FailureType.VALIDATION_ERROR,
FailureReason.UNSUPPORTED_OPERATION,
`Custom node '${nodeName}' is not installed`,
false, // Not retryable
{
serviceType: 'comfyui',
node: { id: error.node_id, type: nodeName },
component: context?.component,
workflow: context?.workflow,
suggestion: `Install the '${nodeName}' custom node from ComfyUI Manager before running this workflow.`,
rawError: error
}
);
}
// Model not found
if (message.match(/checkpoint|model.+not found|does not exist/i)) {
const modelMatch = message.match(/'([^']+)'/);
const modelName = modelMatch ? modelMatch[1] : 'unknown';
return new ConnectorError(
FailureType.VALIDATION_ERROR,
FailureReason.MODEL_NOT_FOUND,
`Model '${modelName}' not found in ComfyUI`,
false,
{
serviceType: 'comfyui',
node: error.node_id ? { id: error.node_id, type: error.node_type } : undefined,
component: context?.component,
workflow: context?.workflow,
suggestion: `Install the model '${modelName}' or select a different model from the available list.`,
rawError: error
}
);
}
// Fallback to FailureClassifier pattern matching
const classification = FailureClassifier.classify(message, { serviceType: 'comfyui' });
return new ConnectorError(
classification.failure_type,
classification.failure_reason,
message,
classification.retryable ?? true,
{
serviceType: 'comfyui',
node: error.node_id ? { id: error.node_id, type: error.node_type } : undefined,
component: context?.component,
workflow: context?.workflow,
rawError: error
}
);
}
}

Export in Core:
// packages/core/src/index.ts
export * from './error-enhancers/openai-error-enhancer.js';
export * from './error-enhancers/gemini-error-enhancer.js';
export * from './error-enhancers/comfyui-error-enhancer.js';

Impact: Service-specific errors get detailed context while maintaining connector-agnostic architecture.
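All three enhancers follow the same shape: a table from service-specific codes to classifications, with ConnectorError.fromHTTPError() as the fallback. Stripped of the real classes, the lookup pattern reduces to this sketch, where plain strings stand in for the FailureType/FailureReason enums.

```typescript
// Service code → [failureType, failureReason, suggestion], with a generic
// bucket for anything the table doesn't know - mirroring the fallback to
// ConnectorError.fromHTTPError() in the enhancers above.
type Mapping = [failureType: string, failureReason: string, suggestion: string];

const codeMap: Record<string, Mapping> = {
  rate_limit_exceeded: [
    'RATE_LIMIT', 'REQUESTS_PER_MINUTE', 'Wait a moment before retrying.',
  ],
  invalid_api_key: [
    'AUTH_ERROR', 'INVALID_API_KEY', 'Check credentials configuration.',
  ],
};

function classifyCode(code: string): Mapping {
  // Unknown codes degrade gracefully instead of throwing.
  return codeMap[code] ?? ['UNKNOWN', 'UNKNOWN_ERROR', 'Try again or contact support.'];
}
```

Because the table is data, supporting a new service error is one added entry, not new branching logic.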
Phase 5: Update Connectors to Use Pattern (4 Hours)
Problem: Existing connectors don't use standard pattern
Solution: Refactor all connectors to use createConnectorError()
Example: OpenAI Text Connector
Before:
// apps/worker/src/connectors/openai-text-connector.ts
try {
const response = await this.client.chat.completions.create(params);
// ...
} catch (error: any) {
// Generic HTTP error handling - loses OpenAI context
throw ConnectorError.fromHTTPError(error, 'openai');
}

After:
try {
const response = await this.client.chat.completions.create(params);
// ...
} catch (error: any) {
// Use standard pattern with service-specific enhancement
throw this.createConnectorError(error, jobData);
}

Example: ComfyUI REST Stream Connector
Before:
// Inline error classification
const classification = FailureClassifier.classify(message.data.message);
return new ConnectorError(
classification.failure_type,
classification.failure_reason,
message.data.message,
classification.retryable,
{ serviceType: 'comfyui_rest_stream' }
);

After:
// Use standard pattern + ComfyUI enhancer
throw this.createConnectorError(message.data, jobData);

Impact: All connectors now have consistent error handling, context preservation, and service-specific enhancements.
Consequences
Benefits
1. Connector-Agnostic Architecture ✅
- Works for ANY external service (ComfyUI, OpenAI, Gemini, future services)
- TypeScript is single source of truth
- No duplication between languages
2. Better User Experience ✅
- Context preserved: "Error in component 'Image Generator' → OpenAI rate limit exceeded"
- Actionable suggestions: "Wait 60 seconds and try again"
- Service-aware messages: "OpenAI token limit (4096) exceeded. Use gpt-4-turbo for larger inputs."
3. Maintainability ✅
- Standard pattern for all connectors
- Service-specific logic isolated in enhancers
- Easy to add new services (just add new enhancer)
4. Quick Wins ✅
- Phase 1: Fixes ComfyUI serialization (1 hour, zero TypeScript changes)
- Phase 2: Better error descriptions (2 hours, all connectors benefit)
- Phase 3: Standard pattern (4 hours, foundation for everything)
Drawbacks
1. Service-Specific Knowledge Required
Each new service needs an enhancer with service-specific error code mappings.
Mitigation: Enhancers are optional - fallback to generic HTTP/FailureClassifier always works.
2. Migration Effort
All existing connectors need refactoring to use standard pattern.
Mitigation: Can be done incrementally (connector by connector). Old code still works.
Success Metrics
Quantitative
- Zero "[object Object]" in production error logs (Phase 1)
- 100% context preservation - all errors include component/workflow (Phase 3)
- < 5% "Unknown error" fallback - most errors get specific classification (Phase 4)
- 90%+ errors have suggestions (Phase 2-4)
Qualitative
- Support ticket reduction - fewer "what does this error mean?" tickets
- Developer productivity - faster debugging with preserved context
- User satisfaction - clear, actionable error messages
Migration Strategy
Incremental Rollout
Week 1: Quick wins (Phases 1-2)
- Fix ComfyUI serialization
- Enhance error descriptions
- Deploy to production (low risk, high impact)
Week 2: Foundation (Phase 3)
- Add standard pattern to BaseConnector
- Refactor 1-2 connectors as proof of concept
- A/B test with old error handling
Week 3: Service Enhancers (Phase 4 - Optional)
- Create OpenAI, Gemini, ComfyUI enhancers
- Test with real errors from production
Week 4: Full Migration (Phase 5)
- Refactor all remaining connectors
- Remove old error handling code
- Monitor error rates
Dependencies
TypeScript
- Existing: @emp/core (FailureClassifier, ConnectorError)
- New: Service-specific enhancers (lightweight, ~100 LOC each)
Python (ComfyUI only)
- One-time fix: format_value_safe() in execution.py
Alternative Approaches Considered
❌ Alternative 1: Python Error Codes (Original ADR)
Why Rejected:
- ComfyUI-specific, doesn't help OpenAI/Gemini/etc.
- Duplicates logic between Python and TypeScript
- Makes Python the source of truth (wrong direction)
❌ Alternative 2: Each Connector Handles Errors Independently
Why Rejected:
- Inconsistent error messages
- Duplicated logic across connectors
- No context preservation standard
✅ Alternative 3: TypeScript-First with Service Enhancers (THIS ADR)
Why Chosen:
- Connector-agnostic
- TypeScript is source of truth
- Service-specific knowledge isolated and optional
- Incremental implementation path
Related ADRs
- connector-error-handling-standard.md - ConnectorError class design
- LOGGING_ARCHITECTURE.md - Redis log streaming (Phase 2 implemented)
Appendix: Error Message Examples
Before Modernization
ComfyUI:
Error: model=[object Object], vae=[object Object]
OpenAI:
HTTP 429: Rate limit exceeded
Generic:
Unknown error type. Simple retry may help.
After Modernization
ComfyUI:
ComfyUI Workflow Error → Resource Limit → GPU Memory Exceeded
GPU ran out of memory while processing node 'KSampler' (ID: 3)
in workflow 'Image Generation'.
💡 Suggestion: Try reducing image size from 2048x2048 to 1024x1024,
or reduce batch size from 4 to 2.
Component: Image Generator
Workflow: txt2img-basic
Node: KSampler (ID: 3)
OpenAI:
OpenAI API Error → Validation Error → Token Limit Exceeded
Input exceeds OpenAI token limit (4096 tokens).
💡 Suggestion: Reduce input text length or use a larger model like
gpt-4-turbo which supports 128,000 tokens.
Component: Text Generator
Model: gpt-3.5-turbo
Input Tokens: 5200
Max Tokens: 4096
Gemini:
Gemini API Error → Rate Limit → Request Quota Exceeded
Gemini API quota exceeded for your project.
💡 Suggestion: Wait a moment before retrying, or upgrade your
Google Cloud quota limit.
Component: Image Analysis
Model: gemini-pro-vision
Quota: 60 requests/minute (exceeded)
End of ADR
