🔊 Add nudge audio caching to reduce TTS calls #84

@adrianwedd

Summary

Implement intelligent audio caching for nudges to reduce Text-to-Speech API calls, improve response times, and optimize resource usage.

Benefits

  • Faster nudge delivery: Pre-generated audio eliminates TTS generation delay
  • Reduced API costs: Fewer calls to TTS services (Google Cloud TTS, etc.)
  • Offline capability: Cached audio works during internet disruptions
  • Consistent quality: Same audio quality for repeated nudges
  • Resource efficiency: Lower CPU and network usage

Caching Strategy

Cache Categories

  • Static nudges: Common reminders that don't change ("Time for a break!")
  • Template nudges: Parameterized content with variable substitution
  • Personalized nudges: User-specific content with moderate reuse
  • Dynamic nudges: Highly contextual, single-use content
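
One way to make the four categories actionable is to attach a small policy to each. The sketch below is only an illustration: `NudgeCategory`, `CachePolicy`, `POLICIES`, and the TTL value are hypothetical names and numbers, not anything this issue specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NudgeCategory(Enum):
    STATIC = "static"
    TEMPLATE = "template"
    PERSONALIZED = "personalized"
    DYNAMIC = "dynamic"


@dataclass(frozen=True)
class CachePolicy:
    cacheable: bool
    pregenerate: bool
    ttl_seconds: Optional[int]  # None means no time-based expiry


# Illustrative values only; the issue does not prescribe these numbers.
POLICIES = {
    NudgeCategory.STATIC: CachePolicy(True, True, None),
    NudgeCategory.TEMPLATE: CachePolicy(True, False, None),
    NudgeCategory.PERSONALIZED: CachePolicy(True, False, 7 * 24 * 3600),
    NudgeCategory.DYNAMIC: CachePolicy(False, False, 0),
}


def should_cache(category: NudgeCategory) -> bool:
    """Single-use dynamic nudges skip the cache; everything else is stored."""
    return POLICIES[category].cacheable
```

Keeping the policy in one table means expiration and pre-generation rules can later be driven from configuration rather than code.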

Cache Management

  • Intelligent pre-generation: Create audio for common nudges at startup
  • Usage-based caching: Cache frequently used nudge patterns
  • LRU eviction: Remove least recently used audio files
  • Size limits: Configurable cache size with automatic cleanup
  • Expiration: Time-based expiration for personalized content
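
The LRU eviction and size-limit points above could be combined roughly as follows. This is a minimal sketch, assuming a hypothetical `LRUAudioIndex` that tracks only keys and byte sizes; deleting the evicted files is left to the caller.

```python
from collections import OrderedDict
from typing import List


class LRUAudioIndex:
    """Tracks cached audio sizes and evicts least recently used entries."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._entries: "OrderedDict[str, int]" = OrderedDict()  # key -> size

    def touch(self, key: str) -> bool:
        """Mark a key as recently used; returns False on a cache miss."""
        if key not in self._entries:
            return False
        self._entries.move_to_end(key)
        return True

    def add(self, key: str, size: int) -> List[str]:
        """Record an entry and return the keys evicted to respect the limit."""
        if key in self._entries:
            self.total_bytes -= self._entries.pop(key)
        self._entries[key] = size
        self.total_bytes += size
        evicted = []
        # Never evict the entry just added, even if it alone exceeds the limit.
        while self.total_bytes > self.max_bytes and len(self._entries) > 1:
            old_key, old_size = self._entries.popitem(last=False)
            self.total_bytes -= old_size
            evicted.append(old_key)
        return evicted
```

For example, with a 100-byte limit, adding three 40-byte entries after touching the first one evicts the second, since it is then the least recently used.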

Technical Implementation

Cache Storage

  • Local filesystem cache with organized directory structure
  • Audio format optimization (MP3, Opus, AAC comparison)
  • Metadata storage for cache management (SQLite or JSON)
  • Checksum verification for cache integrity
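
As a rough illustration of the storage points above, the sketch below keeps audio files in a sharded directory, records metadata in SQLite, and checks a SHA-256 checksum on read. The table name `audio_cache`, the function names, and the `.mp3` extension are assumptions made for the example; the issue leaves both the format and the metadata store open.

```python
import hashlib
import sqlite3
from pathlib import Path
from typing import Optional


def open_metadata(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audio_cache ("
        "key TEXT PRIMARY KEY, path TEXT NOT NULL, sha256 TEXT NOT NULL)"
    )
    return conn


def store_audio(conn: sqlite3.Connection, cache_dir: Path,
                key: str, audio: bytes) -> Path:
    # Shard by the first two key characters to keep directories small.
    path = cache_dir / key[:2] / f"{key}.mp3"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    digest = hashlib.sha256(audio).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO audio_cache (key, path, sha256) VALUES (?, ?, ?)",
        (key, str(path), digest),
    )
    conn.commit()
    return path


def load_verified(conn: sqlite3.Connection, key: str) -> Optional[bytes]:
    """Return cached bytes, or None if missing or the checksum does not match."""
    row = conn.execute(
        "SELECT path, sha256 FROM audio_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    path, expected = Path(row[0]), row[1]
    if not path.exists():
        return None
    data = path.read_bytes()
    if hashlib.sha256(data).hexdigest() != expected:
        return None  # treat a corrupted file as a cache miss
    return data
```

Treating a checksum mismatch as a miss lets the normal TTS fallback regenerate the file instead of playing damaged audio.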

Cache Key Generation

  • Content-based hashing for static nudges
  • Template + parameter hashing for template nudges
  • Voice/personality parameter inclusion
  • User-specific cache partitioning
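
A cache key covering the points above might be derived like this. The function name `cache_key` and its fields are hypothetical; the idea is simply to hash a canonical JSON form of everything that affects the generated audio.

```python
import hashlib
import json
from typing import Optional


def cache_key(text: str, voice: str, user_id: Optional[str] = None,
              template_id: Optional[str] = None,
              params: Optional[dict] = None) -> str:
    """Hash everything that changes the audio into a stable hex key."""
    payload = {
        "text": " ".join(text.split()),  # collapse whitespace differences
        "voice": voice,                  # separate entries per voice
        "user": user_id,                 # None for nudges shared by all users
        "template": template_id,
        "params": params or {},
    }
    # sort_keys makes the key independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Static nudges would pass no `user_id`, so one audio file serves every user, while personalized nudges get their own partition through the user field.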

TTS Integration

  • Fallback to real-time TTS for cache misses
  • Background cache warming for predicted nudges
  • Multi-voice support with separate cache partitions
  • Quality settings tuned separately for cached and real-time audio
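
The cache-then-fallback flow could look roughly like the sketch below. `CachedTTS` is a hypothetical wrapper, `synthesize` stands in for whatever TTS backend is used, and an in-memory dict stands in for the on-disk cache.

```python
from typing import Callable, Dict, Tuple


class CachedTTS:
    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize  # hypothetical backend: (text, voice) -> audio
        self._store: Dict[Tuple[str, str], bytes] = {}  # stand-in for the disk cache
        self.tts_calls = 0

    def get_audio(self, text: str, voice: str, cacheable: bool = True) -> bytes:
        key = (voice, text)  # voice in the key gives each voice its own partition
        cached = self._store.get(key)
        if cached is not None:
            return cached
        # Cache miss: fall back to real-time synthesis.
        self.tts_calls += 1
        audio = self._synthesize(text, voice)
        if cacheable:
            self._store[key] = audio
        return audio
```

Passing `cacheable=False` for single-use dynamic nudges keeps them from filling the cache with entries that will never be reused.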

Smart Features

Predictive Caching

  • Analyze user patterns to predict likely nudges
  • Pre-generate audio for upcoming scheduled nudges
  • Calendar integration for event-specific nudge preparation
  • Task-based nudge prediction and preparation
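
For the simplest of these cases, pre-generating audio for upcoming scheduled nudges, the selection step might be sketched as follows. `ScheduledNudge` and `nudges_to_warm` are hypothetical names, and the schedule is assumed to be available as a plain list; pattern analysis and calendar integration are out of scope here.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple, Set


class ScheduledNudge(NamedTuple):
    due: datetime
    text: str


def nudges_to_warm(schedule: Iterable[ScheduledNudge], now: datetime,
                   horizon: timedelta, already_cached: Set[str]) -> List[str]:
    """Texts due within the horizon that are not cached yet, soonest first."""
    upcoming = sorted(n for n in schedule if now <= n.due <= now + horizon)
    seen = set(already_cached)
    result = []
    for nudge in upcoming:
        if nudge.text not in seen:  # skip cached texts and duplicates
            seen.add(nudge.text)
            result.append(nudge.text)
    return result
```

The returned list can be handed to a background warming task, so the soonest nudges are generated first.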

Dynamic Content Handling

  • Template-based audio generation with parameter substitution
  • Voice modulation for slight variations without full regeneration
  • Partial caching for common nudge components
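
Partial caching could start by splitting a template into static and parameter segments, so the static parts are synthesized once and reused. The sketch below assumes templates use `str.format`-style named placeholders and uses hypothetical helper names; whether separately generated segments sound natural when joined is an open question it does not address.

```python
from string import Formatter
from typing import List, Tuple


def split_template(template: str) -> List[Tuple[str, str]]:
    """Split a template into ("static", text) and ("param", name) parts."""
    parts = []
    for literal, field, _spec, _conv in Formatter().parse(template):
        if literal:
            parts.append(("static", literal))
        if field:  # named placeholders only; positional "{}" is ignored here
            parts.append(("param", field))
    return parts


def segments_for(template: str, params: dict) -> List[Tuple[str, bool]]:
    """Return (text, reusable) pairs; static parts are reusable across renders."""
    out = []
    for kind, value in split_template(template):
        if kind == "static":
            out.append((value, True))
        else:
            out.append((str(params[value]), False))
    return out
```

Only the segments flagged as not reusable would need a fresh TTS call, or their own cache entry when the same parameter values recur.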

Performance Optimization

  • Asynchronous cache warming
  • Parallel audio generation for batch operations
  • Compressed audio storage with quality balance
  • Memory caching for frequently accessed audio
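
Asynchronous, parallel cache warming might be sketched with `asyncio` as below. `warm_cache` is a hypothetical helper, `synthesize` is an assumed async TTS call, and a dict stands in for the real cache; the semaphore caps how many generations run at once.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def warm_cache(texts: Iterable[str],
                     synthesize: Callable[[str], Awaitable[bytes]],
                     cache: Dict[str, bytes],
                     max_parallel: int = 4) -> int:
    """Generate audio for uncached texts concurrently; returns how many were added."""
    limit = asyncio.Semaphore(max_parallel)
    # dict.fromkeys removes duplicates while preserving order.
    pending = [t for t in dict.fromkeys(texts) if t not in cache]

    async def one(text: str) -> None:
        async with limit:
            cache[text] = await synthesize(text)

    await asyncio.gather(*(one(t) for t in pending))
    return len(pending)
```

Bounding concurrency keeps batch warming from saturating the TTS service or the local CPU while nudges are still being delivered.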

Configuration Options

  • Cache size limits (disk space and file count)
  • Audio quality settings (bitrate, format)
  • Cache warming strategies (startup, background, on-demand)
  • Expiration policies per nudge type
  • Voice and personality cache partitioning
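
These options could be gathered into a single validated settings object. The sketch below is illustrative: `AudioCacheConfig`, its field names, and every default value are assumptions, not settings the project already defines.

```python
from dataclasses import dataclass, field
from typing import Dict

WARMING_STRATEGIES = {"startup", "background", "on-demand"}


@dataclass
class AudioCacheConfig:
    max_disk_mb: int = 200          # disk space limit
    max_files: int = 1000           # file count limit
    audio_format: str = "mp3"
    bitrate_kbps: int = 64
    warming: str = "background"     # one of WARMING_STRATEGIES
    ttl_hours: Dict[str, int] = field(
        default_factory=lambda: {"personalized": 168}  # per nudge type
    )
    partition_by_voice: bool = True

    def __post_init__(self) -> None:
        if self.max_disk_mb <= 0 or self.max_files <= 0:
            raise ValueError("cache limits must be positive")
        if self.warming not in WARMING_STRATEGIES:
            raise ValueError(f"unknown warming strategy: {self.warming}")
```

Validating at construction time surfaces a bad setting at startup rather than during the first cache cleanup.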

Files to Implement/Modify

  • Audio cache management module
  • TTS integration layer
  • Nudge generation and delivery system
  • Configuration and settings management
  • Cache monitoring and cleanup utilities

Acceptance Criteria

  • Common nudges are served from cache without TTS calls
  • Cache warming runs efficiently in the background
  • Cache size stays within configured limits
  • Audio quality is maintained across cache and real-time generation
  • Fallback to real-time TTS works seamlessly
  • Performance improvement is measurable (response time, API usage)
  • Cache management is automated and requires no manual intervention
  • Configuration options allow customization for different setups

Metrics to Track

  • Cache hit/miss ratios
  • TTS API call reduction percentage
  • Average nudge delivery time improvement
  • Cache storage utilization
  • Background processing impact on system performance
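
The first two metrics can be computed from simple hit and miss counters. `CacheMetrics` below is a hypothetical helper; its reduction figure assumes every request would otherwise have triggered one TTS call and ignores calls spent on pre-generation.

```python
from dataclasses import dataclass


@dataclass
class CacheMetrics:
    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    @property
    def tts_call_reduction_pct(self) -> float:
        # Assumption: without a cache, each request is one TTS call,
        # so each hit counts as one call avoided.
        return 100.0 * self.hit_ratio
```

Delivery time and storage utilization would need separate instrumentation, such as timers around delivery and periodic directory size checks.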

Related to

Priority

Medium - Performance and cost optimization feature

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), performance (Performance improvements), phase-2 (Phase 2 development tasks)
