Add ONNX GPU support and benchmarking suite for NeuTTS-Air #60
NemesisGuy wants to merge 16 commits into neuphonic:main from …
Conversation
- Added benchmarking utilities in `neuttsair/benchmark.py` and a CLI for profiling ONNX providers.
- Updated `README.md` and `examples/README.md` with new benchmarking instructions and device selection options.
- Modified `basic_example.py` to support device arguments for backbone and codec.
- Updated `CHANGELOG.md` to reflect new features and changes.
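The benchmark code itself isn't shown in this thread, but the core of a provider-profiling utility like the one described is a timing loop that discards warm-up runs and reports mean ± standard deviation. A minimal sketch (the `benchmark` helper name is an assumption, not the PR's actual API):

```python
import statistics
import time


def benchmark(fn, runs: int = 5, warmup: int = 1) -> tuple[float, float]:
    """Time `fn` over several runs and return (mean, stdev) in seconds."""
    for _ in range(warmup):
        fn()  # warm-up iterations are discarded (JIT, cache, provider init)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)
```

In a real run, `fn` would be a closure around a synthesis call for one specific ONNX provider, and the loop would be repeated once per provider.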
will this work with Mac? I have an M1 Pro with a 16-core GPU, 10-core CPU, and 16 GB RAM. Right now it chunks, separates, and then puts them back together all in the terminal, but it takes SOOO long to generate. If I need something that's like an hour long, I do it overnight.
Hey @ThomasChan06 👋 I don’t have a Mac to test directly, so if you’re able to try running it on your setup, that’d be super helpful 🙏. Once it’s running, you can also try the benchmark script — it’ll help you see whether MPS or CPU performs better for your specific hardware. M1/M2 chips sometimes behave differently depending on tensor precision (fp16 vs fp32) and task type, so this can help find the sweet spot for long jobs. Would be awesome if you could share your results after running a short test!
@NemesisGuy I tried testing this on an M3 Mac, but the benchmark files (both under …) are missing from here as well as from your fork repo: https://github.com/neuphonic/neutts-air/pull/60/files Apart from that, I can validate that the code correctly chooses the …
Thanks for adding the missing files! Here is the benchmark on an Apple Silicon M3 Max (16" MBP), running the standard benchmark script as listed above.

System information: OS: macOS-15.5-arm64-arm-64bit-Mach-O

Benchmark summary (mean ± standard deviation): …

Here is the JSON file as well: … These are the results with …
It seems that with this version, part of the auto-detection fails and the MPS backend is not used at all for the "backbone": it defaults to CPU due to "CUDA unavailable", despite MPS being perfectly fine. If I can run any other tests, please let me know.
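The failure mode described above is that the backbone auto-detection stops at the CUDA check and never probes MPS. A hedged sketch of a fix, assuming a PyTorch backbone (the `resolve_device` name is hypothetical, not the PR's actual function):

```python
def resolve_device(requested: str = "auto") -> str:
    """Resolve "auto" to the best available torch backend: cuda > mps > cpu."""
    if requested != "auto":
        return requested
    try:
        import torch
    except ImportError:
        return "cpu"  # no torch installed: CPU-only inference
    if torch.cuda.is_available():
        return "cuda"
    # MPS must be probed explicitly; "CUDA unavailable" alone should not
    # force CPU on Apple Silicon.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

On an M-series Mac with PyTorch installed, this would return `"mps"` instead of falling through to `"cpu"`.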
Commits:
- …ackbones; docs and benchmark updates
- Revert "…r GGUF backbones; docs and benchmark updates" (reverts commit 88147e5)
- …when precomputed file missing
Pull Request: ONNX GPU Support
Summary
- Extend `NeuTTSAir` to auto-select CUDA/MPS/CPU for the backbone and ONNX codec, configuring CUDA, DirectML, or ROCm providers when present and falling back to CPU with clear warnings when unavailable.
- Update `README.md`, `examples/README.md`, and the freshly added `examples/onnx_example_gpu.py`; ship `requirements-gpu.txt` for quick GPU setup.
- Add `tests/test_device_selection.py` to cover device routing and ONNX provider selection so regressions surface quickly.

Benchmarks
Windows 11 · NVIDIA GeForce GTX 1080 Ti
Testing
```shell
pytest tests/test_device_selection.py
```
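The PR's actual tests aren't shown here; a minimal self-contained sketch of the kind of provider-selection logic and tests described above. The `PREFERRED` order and `select_providers` helper are assumptions; the provider strings themselves (`CUDAExecutionProvider`, `DmlExecutionProvider`, `ROCMExecutionProvider`, `CPUExecutionProvider`) are real ONNX Runtime names:

```python
import warnings

# Preference order mirroring the providers named in the PR summary.
PREFERRED = [
    "CUDAExecutionProvider",  # NVIDIA CUDA
    "DmlExecutionProvider",   # DirectML (Windows)
    "ROCMExecutionProvider",  # AMD ROCm
    "CPUExecutionProvider",   # always-available fallback
]


def select_providers(available: list[str]) -> list[str]:
    """Keep preferred providers that are actually available; warn on CPU-only."""
    chosen = [p for p in PREFERRED if p in available]
    if chosen == ["CPUExecutionProvider"]:
        warnings.warn("No GPU execution provider found; falling back to CPU.")
    return chosen or ["CPUExecutionProvider"]


def test_gpu_provider_preferred():
    got = select_providers(["CPUExecutionProvider", "CUDAExecutionProvider"])
    assert got[0] == "CUDAExecutionProvider"


def test_cpu_fallback_warns():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        assert select_providers(["CPUExecutionProvider"]) == ["CPUExecutionProvider"]
    assert caught  # fallback produced a warning
```

In real use the `available` list would come from `onnxruntime.get_available_providers()`; keeping it as a parameter lets the tests run without a GPU or ONNX Runtime installed.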