@ekwek1
🚀 Feature: CPU Support for Soprano TTS
Summary
I've implemented full CPU support for Soprano TTS, enabling the model to run on systems without CUDA-enabled GPUs. This makes Soprano accessible to a much wider range of users and deployment scenarios.
Motivation
Currently, Soprano requires a CUDA-enabled GPU to run, which limits its accessibility. Many users want to:
- Test Soprano on laptops or servers without GPUs
- Deploy in CPU-only environments
- Use Soprano for offline/non-real-time generation where speed is less critical
Changes Made
I've submitted a pull request that implements CPU support across the entire codebase:
1. Core TTS Module (`soprano/tts.py`)
- Added `'cpu'` to recognized devices
- Replaced hardcoded `.cuda()` calls with `.to(device)`
- Added `map_location=device` to weight loading
- Made all tensor operations device-agnostic
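The weight-loading pattern described above can be sketched as follows; the model class and helper function here are illustrative stand-ins, not the actual Soprano code:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Illustrative stand-in for a TTS model."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

def load_model(checkpoint_path: str, device: str = "cpu") -> nn.Module:
    model = TinyModel()
    # map_location=device lets checkpoints saved on a CUDA machine
    # deserialize cleanly on CPU-only hosts
    state = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state)
    # .to(device) instead of a hardcoded .cuda() keeps the call device-agnostic
    return model.to(device)
```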
2. Backend Support
- LMDeploy (`soprano/backends/lmdeploy.py`): Added CPU mode support
- Transformers (`soprano/backends/transformers.py`): Enhanced CPU compatibility with proper dtype handling
3. Decoder Components
- Spectral Operations (`soprano/vocos/spectral_ops.py`): Removed the hardcoded CUDA device from the window buffer
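The window-buffer fix relies on a standard PyTorch pattern: registering the window with `register_buffer` lets `.to(device)` move it along with the module instead of pinning it to a hardcoded CUDA device. A minimal sketch (the module name is hypothetical, not the actual vocos code):

```python
import torch
import torch.nn as nn

class SpectralHead(nn.Module):
    """Hypothetical module illustrating the buffer fix."""
    def __init__(self, n_fft: int = 1024):
        super().__init__()
        # Before: self.window = torch.hann_window(n_fft, device="cuda")
        # After: a registered buffer follows the module across devices
        self.register_buffer("window", torch.hann_window(n_fft))
```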
4. Demo Application
- Updated the Gradio app to automatically detect and display the current device
- Removed the `@spaces.GPU` decorator for CPU compatibility
- Added performance notes for CPU vs. GPU usage
5. Documentation
- Updated README.md with CPU usage examples
- Added changelog section for v0.0.3
- Updated installation requirements
- Checked off CPU support in roadmap
Technical Details
Device Detection:

```python
import torch

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
model = SopranoTTS(backend="auto", device=DEVICE)
```
Automatic Backend Selection:
- CUDA device → LMDeploy (if available) or Transformers
- CPU device → Transformers backend with float32 precision
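That selection logic might look roughly like the following; the function name and the returned backend strings are assumptions for illustration, not the actual Soprano internals:

```python
import importlib.util

import torch

def pick_backend(device: str) -> tuple:
    """Return a (backend_name, dtype) pair for the requested device."""
    if device == "cuda":
        # Prefer LMDeploy on GPU when it is installed; bfloat16 for speed
        if importlib.util.find_spec("lmdeploy") is not None:
            return "lmdeploy", torch.bfloat16
        return "transformers", torch.bfloat16
    # CPU path: Transformers backend with full float32 precision
    return "transformers", torch.float32
```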
Key Implementation Points:
- All `.cuda()` calls replaced with `.to(self.device)`
- PyTorch buffers properly registered for automatic device movement
- dtype selection based on device (bfloat16 for CUDA, float32 for CPU)
- Cache management only applied to CUDA mode
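The CUDA-only cache management point can be sketched as a small guard (the function name is hypothetical):

```python
import torch

def clear_device_cache(device: str) -> None:
    # empty_cache() is a CUDA-only concept; make it a no-op on CPU
    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
```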
Testing
The implementation has been tested with:
- ✅ CPU inference (single and batch)
- ✅ CUDA inference (backwards compatibility maintained)
- ✅ Automatic device detection
- ✅ Both LMDeploy and Transformers backends
- ✅ Gradio demo on both devices
Performance Notes
- CUDA: ~2000× real-time factor (unchanged)
- CPU: Slower than CUDA but fully functional for offline generation
Breaking Changes
None - all changes are backwards compatible. Existing CUDA code continues to work exactly as before.
Pull Request
I've submitted a pull request with all these changes. The implementation is clean, well-tested, and maintains full backwards compatibility with existing CUDA deployments.
Benefits
- Wider Accessibility: Users without GPUs can now use Soprano
- Testing & Development: Easier local development on laptops
- Flexible Deployment: Support for CPU-only server environments
- Cost Reduction: Option to use cheaper CPU instances for non-real-time workloads
Files Changed
soprano/tts.py
soprano/backends/lmdeploy.py
soprano/backends/transformers.py
soprano/vocos/spectral_ops.py
app.py (Gradio demo)
README.md
Looking forward to your feedback! Let me know if you'd like any changes or have questions about the implementation.
Related Roadmap Item: Closes the "CPU support" checkbox in the roadmap ✅