[M1][Epic 1.6] Implement comprehensive error handling and user feedback system #70

@POWDER-RANGER

Description

Problem

No standardized error handling exists across the frontend, backend, and ML services. Users receive raw error messages or generic "Something went wrong" alerts, and API errors aren't logged consistently, which makes debugging difficult. We need production-grade error handling with user-friendly messages, structured logging, and error tracking.

Tasks

Backend Error Handling Middleware

  • Create global error handler middleware for Express
  • Implement custom error classes (ValidationError, AuthenticationError, NotFoundError, etc.)
  • Add error serialization with consistent response format
  • Implement correlation ID tracking across services
  • Add stack trace sanitization (hide in production, show in development)
  • Configure error logging with Winston or Pino
  • Add Sentry integration for production error tracking

Frontend Error Boundary System

  • Implement React Error Boundary components
  • Create global error context/state management
  • Build user-friendly error toast notification system
  • Add retry mechanisms for failed API calls
  • Implement offline detection and graceful degradation
  • Add form validation error display
  • Create error page components (404, 500, etc.)

ML Service Error Handling

  • Add FastAPI exception handlers
  • Implement model loading error recovery
  • Add timeout handling for long-running inference
  • Create fallback mechanisms (e.g., switch to lighter model on failure)
  • Add input validation with Pydantic models
  • Log ML-specific errors (model errors, GPU errors, OOM)
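
The timeout and fallback tasks above follow a common control-flow pattern. The sketch below shows that pattern in TypeScript to match the issue's other examples; the ML service itself is Python (FastAPI), so this illustrates the flow (for instance, as the backend might call the ML service) rather than the service code. `withTimeout`, `inferWithFallback`, and the stand-in model functions are hypothetical names.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// Note: this stops waiting, it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Tries the primary model first; on failure or timeout, tries the lighter one.
async function inferWithFallback<T>(
  primary: () => Promise<T>,
  lighter: () => Promise<T>,
  timeoutMs: number
): Promise<T> {
  try {
    return await withTimeout(primary(), timeoutMs);
  } catch (primaryError) {
    // A real service would log primaryError here before falling back
    return withTimeout(lighter(), timeoutMs);
  }
}

// Stand-in "models": the primary is slower than the timeout allows.
const slowPrimary = () =>
  new Promise<string>((resolve) =>
    setTimeout(() => resolve('large-model result'), 200)
  );
const fastFallback = async () => 'light-model result';

inferWithFallback(slowPrimary, fastFallback, 50).then((result) =>
  console.log(result)
);
// prints "light-model result"
```

If the lighter model also fails or times out, the rejection propagates to the caller, where it could be surfaced as `INFERENCE_ERROR`.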

API Error Response Standardization

  • Define unified error response schema
  • Add HTTP status code mapping guide
  • Implement field-level validation errors
  • Add request ID in error responses
  • Create error code enum/registry
  • Document all error codes in API documentation

User Feedback Components

  • Build toast notification system (success, warning, error, info)
  • Create loading skeleton screens for async operations
  • Add progress indicators for long-running tasks
  • Implement confirmation dialogs for destructive actions
  • Add empty state components with helpful guidance
  • Create inline validation feedback for forms

Error Monitoring & Alerting

  • Integrate Sentry or similar error tracking service
  • Configure error sampling and filtering rules
  • Set up alerts for critical error rate thresholds
  • Add custom error context (user ID, source ID, etc.)
  • Create error dashboard in frontend for admin users
  • Implement client-side error reporting API
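
As one way to reason about the "critical error rate thresholds" task, here is a minimal in-memory sketch of a sliding-window error-rate check. In practice the alert rule would more likely live in Sentry or the monitoring stack; `createErrorRateMonitor` and its parameters are hypothetical and only illustrate the calculation.

```typescript
interface Sample {
  at: number;      // timestamp in milliseconds
  failed: boolean; // whether the request ended in an error
}

// windowMs: how far back to look; threshold: failing fraction that triggers an
// alert (0.05 = 5%); minRequests: avoids alerting on very small samples.
function createErrorRateMonitor(
  windowMs: number,
  threshold: number,
  minRequests: number
) {
  let samples: Sample[] = [];

  // Drop samples that have fallen out of the window
  const prune = (now: number): void => {
    samples = samples.filter((s) => s.at >= now - windowMs);
  };

  return {
    record(failed: boolean, at: number = Date.now()): void {
      samples.push({ at, failed });
      prune(at);
    },
    shouldAlert(now: number = Date.now()): boolean {
      prune(now);
      if (samples.length < minRequests) return false;
      const failures = samples.filter((s) => s.failed).length;
      return failures / samples.length >= threshold;
    },
  };
}

// 60-second window, alert at >= 50% failures, at least 4 requests
const monitor = createErrorRateMonitor(60000, 0.5, 4);
monitor.record(false, 0);
monitor.record(true, 1000);
monitor.record(true, 2000);
monitor.record(true, 3000);
console.log(monitor.shouldAlert(3000));  // true: 3 of 4 requests failed
console.log(monitor.shouldAlert(70000)); // false: all samples have expired
```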

Acceptance Criteria

  • ✅ All API errors return consistent JSON format: {error: {code, message, details?, requestId, timestamp, path?}}
  • ✅ Frontend displays user-friendly error messages (never raw stack traces)
  • ✅ Network errors trigger automatic retry with exponential backoff
  • ✅ Form validation errors appear inline next to relevant fields
  • ✅ All errors logged with correlation IDs for request tracing
  • ✅ Critical errors (auth failures, DB connection loss) trigger admin alerts
  • ✅ Error boundaries prevent entire app crashes from component failures
  • ✅ Sentry captures 100% of production errors with appropriate context

Technical Implementation

Unified Error Response Schema

// Consistent across all API endpoints
interface ErrorResponse {
  error: {
    code: string;           // Machine-readable error code (e.g., "INVALID_RSS_URL")
    message: string;        // Human-readable error message
    details?: any;          // Additional context (validation errors, etc.)
    requestId: string;      // Correlation ID for tracking
    timestamp: string;      // ISO 8601 timestamp
    path?: string;          // API endpoint that failed
  };
}
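
To illustrate how field-level validation errors could fit into the `details` slot of this schema, here is a minimal sketch. The `FieldError` shape, the `buildErrorResponse` helper, and the `/api/sources` path are assumptions for illustration, not part of the schema above.

```typescript
// Hypothetical shape for one field-level validation failure (an assumption).
interface FieldError {
  field: string;   // name of the offending request field
  message: string; // human-readable explanation for that field
}

interface ErrorResponse {
  error: {
    code: string;
    message: string;
    details?: unknown;
    requestId: string;
    timestamp: string;
    path?: string;
  };
}

// Hypothetical helper: assembles a response body matching the schema.
function buildErrorResponse(
  code: string,
  message: string,
  requestId: string,
  options: { details?: unknown; path?: string; now?: Date } = {}
): ErrorResponse {
  const now = options.now ?? new Date();
  const error: ErrorResponse['error'] = {
    code,
    message,
    requestId,
    timestamp: now.toISOString(),
  };
  // Optional keys are only set when a value is supplied
  if (options.details !== undefined) error.details = options.details;
  if (options.path !== undefined) error.path = options.path;
  return { error };
}

const fieldErrors: FieldError[] = [
  { field: 'url', message: 'Must be a valid http(s) URL' },
];

const responseBody = buildErrorResponse(
  'VALIDATION_ERROR',
  'Validation failed',
  'req-123',
  {
    details: { fields: fieldErrors },
    path: '/api/sources',
    now: new Date('2024-01-01T00:00:00.000Z'),
  }
);
console.log(JSON.stringify(responseBody, null, 2));
```

A form on the frontend could then look up `details.fields` by field name to render each message inline next to its input.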

Backend Global Error Handler

// backend/src/middleware/errorHandler.ts
import { Request, Response, NextFunction } from 'express';
import { logger } from '../utils/logger';

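export class AppError extends Error {
  constructor(
    public statusCode: number,
    public code: string,
    message: string,
    public details?: any,
    public isOperational: boolean = true
  ) {
    super(message);
    // Restore the prototype chain via new.target so that subclasses
    // (ValidationError, NotFoundError, ...) still pass instanceof checks
    // when compiling to ES5
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}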

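export function errorHandler(
  err: Error,
  req: Request,
  res: Response,
  next: NextFunction // Express recognizes error handlers by their four-argument signature
) {
  // If the response has already started, defer to Express's default handler
  if (res.headersSent) {
    return next(err);
  }

  // Default to 500 if not an AppError
  const statusCode = err instanceof AppError ? err.statusCode : 500;
  const code = err instanceof AppError ? err.code : 'INTERNAL_ERROR';
  // Only operational AppErrors carry client-safe messages; anything else gets
  // a generic message so internal details are not leaked (see #68)
  const message =
    err instanceof AppError && err.isOperational
      ? err.message
      : 'An unexpected error occurred';

  // Log the full error with correlation ID.
  // Note: req.id and req.user are not part of Express's Request type; they are
  // assumed to be set by request-ID and auth middleware, with the type augmented.
  logger.error({
    requestId: req.id,
    error: {
      message: err.message,
      stack: err.stack,
      code,
    },
    request: {
      method: req.method,
      path: req.path,
      userId: req.user?.id,
    },
  });

  // Send sanitized error response
  res.status(statusCode).json({
    error: {
      code,
      message,
      details: err instanceof AppError ? err.details : undefined,
      requestId: req.id,
      timestamp: new Date().toISOString(),
      path: req.path,
      // Only include stack in development
      ...(process.env.NODE_ENV === 'development' && { stack: err.stack }),
    },
  });
}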

// Custom error classes
export class ValidationError extends AppError {
  constructor(details: any, message = 'Validation failed') {
    super(400, 'VALIDATION_ERROR', message, details);
  }
}

export class AuthenticationError extends AppError {
  constructor(message = 'Authentication required') {
    super(401, 'AUTHENTICATION_ERROR', message);
  }
}

export class AuthorizationError extends AppError {
  constructor(message = 'Insufficient permissions') {
    super(403, 'AUTHORIZATION_ERROR', message);
  }
}

export class NotFoundError extends AppError {
  constructor(resource: string) {
    super(404, 'NOT_FOUND', `${resource} not found`);
  }
}

export class RateLimitError extends AppError {
  constructor(retryAfter: number) {
    super(429, 'RATE_LIMIT_EXCEEDED', 'Too many requests', { retryAfter });
  }
}
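
The handler above reads a correlation ID from `req.id`, which something earlier in the request pipeline has to provide. A minimal sketch of that decision, assuming the backend reuses the `X-Request-ID` header sent by the frontend client when it is well-formed and generates a new ID otherwise; `resolveRequestId` and the validation pattern are hypothetical.

```typescript
// Accept only short, log-safe IDs from clients (letters, digits, '-', '_').
const SAFE_REQUEST_ID = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical helper: reuse a well-formed incoming ID, otherwise create one.
// Header values may arrive as a string, a string array, or not at all.
function resolveRequestId(
  incoming: string | string[] | undefined,
  generate: () => string
): string {
  const candidate = Array.isArray(incoming) ? incoming[0] : incoming;
  if (candidate !== undefined && SAFE_REQUEST_ID.test(candidate)) {
    return candidate;
  }
  return generate();
}

// Stand-in generator so the sketch runs anywhere; the backend could pass a
// UUID generator such as crypto.randomUUID instead.
let generatedCount = 0;
const nextId = () => `generated-${++generatedCount}`;

console.log(resolveRequestId('3f2b9c1e-aaaa-4bbb-8ccc-1234567890ab', nextId));
// prints "3f2b9c1e-aaaa-4bbb-8ccc-1234567890ab"
console.log(resolveRequestId('bad id\nwith newline', nextId));
// prints "generated-1"
console.log(resolveRequestId(undefined, nextId));
// prints "generated-2"
```

An Express middleware registered before the routes could assign the result to `req.id` and echo it back in a response header. Rejecting malformed IDs keeps client-supplied values from injecting arbitrary text into structured logs.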

Frontend Error Boundary

// frontend/src/components/ErrorBoundary.tsx
import React, { Component, ErrorInfo, ReactNode } from 'react';
import * as Sentry from '@sentry/react';

interface Props {
  children: ReactNode;
  fallback?: ReactNode;
}

interface State {
  hasError: boolean;
  error?: Error;
}

export class ErrorBoundary extends Component<Props, State> {
  state: State = { hasError: false };

  static getDerivedStateFromError(error: Error): State {
    return { hasError: true, error };
  }

  componentDidCatch(error: Error, errorInfo: ErrorInfo) {
    // Log to error tracking service
    Sentry.captureException(error, {
      contexts: {
        react: {
          componentStack: errorInfo.componentStack,
        },
      },
    });
    
    console.error('Error boundary caught:', error, errorInfo);
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback || (
        <div className="error-container">
          <h2>😞 Something went wrong</h2>
          <p>We've been notified and are looking into it.</p>
          <button onClick={() => window.location.reload()}>
            Reload page
          </button>
        </div>
      );
    }

    return this.props.children;
  }
}

API Client with Retry Logic

// frontend/src/api/client.ts
import axios, { AxiosError } from 'axios';
import { toast } from 'react-toastify';

const MAX_RETRIES = 3;
const RETRY_DELAY = 1000;

export const apiClient = axios.create({
  baseURL: import.meta.env.VITE_API_BASE_URL,
  timeout: 30000,
});

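// Add correlation ID to requests; keep an existing ID so that retries of the
// same request share one correlation ID
apiClient.interceptors.request.use((config) => {
  if (!config.headers['X-Request-ID']) {
    config.headers['X-Request-ID'] = crypto.randomUUID();
  }
  return config;
});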

// Retry logic for network errors
apiClient.interceptors.response.use(
  (response) => response,
  // Type the response body so the error message lookup below type-checks
  async (error: AxiosError<{ error?: { message?: string } }>) => {
    const config = error.config as any;

    // Don't retry on 4xx errors (client errors), or when there is no
    // request config to replay
    if (!config || (error.response && error.response.status < 500)) {
      return Promise.reject(error);
    }

    // Retry on network errors or 5xx
    config.retryCount = config.retryCount || 0;

    if (config.retryCount < MAX_RETRIES) {
      config.retryCount++;
      // Exponential backoff: 1s, 2s, 4s
      const delay = RETRY_DELAY * Math.pow(2, config.retryCount - 1);

      await new Promise((resolve) => setTimeout(resolve, delay));
      return apiClient(config);
    }

    // Retries exhausted: show user-friendly error message
    const errorMessage = error.response?.data?.error?.message ||
                        'Network error. Please check your connection.';
    toast.error(errorMessage);

    return Promise.reject(error);
  }
);
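
The interceptor above retries any request that fails with a network error or 5xx. Retrying non-idempotent requests such as POST can repeat side effects, so one possible refinement is to separate the retry decision from the interceptor. The sketch below is an assumption about how that could look; `shouldRetry`, `backoffDelay`, and the 10-second cap are hypothetical.

```typescript
// Methods that are safe to repeat without duplicating side effects
const IDEMPOTENT_METHODS = new Set(['GET', 'HEAD', 'OPTIONS', 'PUT', 'DELETE']);

interface RetryInput {
  method: string;      // HTTP method of the failed request
  httpStatus?: number; // undefined when no response arrived (network error)
  attempt: number;     // retries already made for this request
  maxRetries: number;
}

function shouldRetry(input: RetryInput): boolean {
  if (input.attempt >= input.maxRetries) return false;
  if (!IDEMPOTENT_METHODS.has(input.method.toUpperCase())) return false;
  // Retry when there was no response at all, or the server reported 5xx
  return input.httpStatus === undefined || input.httpStatus >= 500;
}

// Exponential backoff: base, 2*base, 4*base, ... capped at maxDelayMs
function backoffDelay(
  retryNumber: number,
  baseMs: number = 1000,
  maxDelayMs: number = 10000
): number {
  return Math.min(baseMs * Math.pow(2, retryNumber - 1), maxDelayMs);
}

console.log([1, 2, 3, 4, 5].map((n) => backoffDelay(n)));
// prints [ 1000, 2000, 4000, 8000, 10000 ]
console.log(shouldRetry({ method: 'get', httpStatus: 503, attempt: 0, maxRetries: 3 }));
// prints true
console.log(shouldRetry({ method: 'POST', httpStatus: 503, attempt: 0, maxRetries: 3 }));
// prints false
```

Whether POST requests should ever be retried depends on the API; endpoints that accept an idempotency key could be allowed through.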

ML Service Error Handling

# ml/src/exceptions.py
import logging
from datetime import datetime, timezone
from typing import Optional

from fastapi import Request
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)

class MLServiceError(Exception):
    """Base exception for ML service errors"""
    def __init__(self, message: str, code: str, details: Optional[dict] = None):
        self.message = message
        self.code = code
        self.details = details or {}
        super().__init__(message)

class ModelLoadError(MLServiceError):
    """Raised when model fails to load"""
    def __init__(self, model_name: str, reason: str):
        super().__init__(
            f"Failed to load model: {model_name}",
            "MODEL_LOAD_ERROR",
            {"model": model_name, "reason": reason}
        )

class InferenceError(MLServiceError):
    """Raised when inference fails"""
    def __init__(self, reason: str):
        super().__init__(
            "Inference failed",
            "INFERENCE_ERROR",
            {"reason": reason}
        )

# Global exception handler. Register it on the FastAPI app with:
#   app.add_exception_handler(MLServiceError, ml_exception_handler)
async def ml_exception_handler(request: Request, exc: MLServiceError):
    request_id = request.headers.get("X-Request-ID")

    logger.error(
        "ML Error: %s",
        exc.code,
        extra={
            "request_id": request_id,
            "path": request.url.path,
            "details": exc.details,
        }
    )

    # Field names follow the unified error response schema (camelCase)
    return JSONResponse(
        status_code=500,
        content={
            "error": {
                "code": exc.code,
                "message": exc.message,
                "details": exc.details,
                "requestId": request_id,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "path": request.url.path,
            }
        }
    )

Toast Notification System

// frontend/src/components/Toast.tsx
import { ToastContainer, toast as toastify } from 'react-toastify';
import 'react-toastify/dist/ReactToastify.css';

export const toast = {
  success: (message: string) => toastify.success(message, {
    position: 'top-right',
    autoClose: 3000,
    hideProgressBar: false,
  }),
  
  error: (message: string) => toastify.error(message, {
    position: 'top-right',
    autoClose: 5000,
    hideProgressBar: false,
  }),
  
  warning: (message: string) => toastify.warning(message, {
    position: 'top-right',
    autoClose: 4000,
  }),
  
  info: (message: string) => toastify.info(message, {
    position: 'top-right',
    autoClose: 3000,
  }),
};

export function ToastProvider() {
  return <ToastContainer />;
}

Error Code Registry

Authentication/Authorization (401/403)

  • AUTHENTICATION_ERROR - Missing or invalid JWT token
  • AUTHORIZATION_ERROR - User lacks required permissions
  • INVALID_CREDENTIALS - Wrong email/password
  • TOKEN_EXPIRED - JWT token expired

Validation Errors (400)

  • VALIDATION_ERROR - Request payload validation failed
  • INVALID_RSS_URL - RSS feed URL is malformed or unreachable
  • INVALID_FILE_FORMAT - Uploaded file format not supported
  • MISSING_REQUIRED_FIELD - Required field missing from request

Resource Errors (404)

  • NOT_FOUND - Requested resource doesn't exist
  • SOURCE_NOT_FOUND - Source ID not found
  • USER_NOT_FOUND - User ID not found

Rate Limiting (429)

  • RATE_LIMIT_EXCEEDED - Too many requests from client

Server Errors (500)

  • INTERNAL_ERROR - Unhandled server error
  • DATABASE_ERROR - Database query failed
  • ML_SERVICE_UNAVAILABLE - ML service unreachable
  • MODEL_LOAD_ERROR - ML model failed to load
  • INFERENCE_ERROR - ML inference failed
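
The registry above could be kept in code as a single lookup table, so that the HTTP status mapping and the set of valid codes stay in one place. This is a minimal sketch: `ERROR_STATUS`, `isErrorCode`, and `statusFor` are hypothetical names, and the exact statuses for `INVALID_CREDENTIALS` and `TOKEN_EXPIRED` (401 here) are an assumption, since the registry groups them with the other authentication and authorization codes.

```typescript
// Error code -> HTTP status, following the registry groupings above
const ERROR_STATUS = {
  AUTHENTICATION_ERROR: 401,
  AUTHORIZATION_ERROR: 403,
  INVALID_CREDENTIALS: 401, // assumed status
  TOKEN_EXPIRED: 401,       // assumed status
  VALIDATION_ERROR: 400,
  INVALID_RSS_URL: 400,
  INVALID_FILE_FORMAT: 400,
  MISSING_REQUIRED_FIELD: 400,
  NOT_FOUND: 404,
  SOURCE_NOT_FOUND: 404,
  USER_NOT_FOUND: 404,
  RATE_LIMIT_EXCEEDED: 429,
  INTERNAL_ERROR: 500,
  DATABASE_ERROR: 500,
  ML_SERVICE_UNAVAILABLE: 500,
  MODEL_LOAD_ERROR: 500,
  INFERENCE_ERROR: 500,
} as const;

// Union of all registered codes, derived from the table
type ErrorCode = keyof typeof ERROR_STATUS;

function isErrorCode(value: string): value is ErrorCode {
  return Object.prototype.hasOwnProperty.call(ERROR_STATUS, value);
}

// Unknown codes fall back to 500, matching the handler's INTERNAL_ERROR default
function statusFor(code: string): number {
  return isErrorCode(code) ? ERROR_STATUS[code] : 500;
}

console.log(statusFor('RATE_LIMIT_EXCEEDED')); // prints 429
console.log(statusFor('SOMETHING_UNREGISTERED')); // prints 500
```

Deriving `ErrorCode` from the table means a new code only has to be added once, and the same table could be used to generate the error code section of the API documentation.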

Dependencies

Requires: #58 (API endpoints), #60 (frontend dashboard)
Blocks: #61 (alerting), #67 (observability)
Related: #68 (security - error messages shouldn't leak sensitive info)

Estimated Effort

Time: 8-12 hours
Complexity: Medium-High
Skills: TypeScript, React, Express.js, FastAPI, Error tracking (Sentry)

Definition of Done

  • All tasks checked off
  • All acceptance criteria met
  • Error handling tested for all API endpoints
  • Frontend error scenarios tested (network failures, validation errors, etc.)
  • Sentry integration verified in staging environment
  • Error handling documentation added to docs/
  • Code reviewed and merged

Priority: P1 - High
Labels: backend, frontend, ml, error-handling, M1, P1, user-experience
