fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable #19
fadenb wants to merge 1 commit into pretyflaco:main from
Conversation
Instead of raising ValueError when the requested CUDA device is not present, automatically fall back to CPU and downgrade compute_type from float16 to int8 (float16 is unsupported on CPU). Also indicate whether CPU is forced or a fallback in the model-loading print message.
pretyflaco
approved these changes
May 7, 2026
Owner
pretyflaco
left a comment
Direction is right — the current ValueError is genuinely user-hostile on no-GPU machines. The PR is well-scoped and the motivation is clear. Approving.
A few small follow-ups I'd want before this hits a release:
- Unit test for the `__post_init__` fallback. Mocking `_torch_device_available` lets us cover all three of your test scenarios in CI without the hardware mix you flagged. ~15 lines in `tests/test_transcribe.py`.
- Warning log should mention the compute_type change. Right now a user who passed `--compute-type float16` explicitly sees only the device fallback warning, while compute_type silently flips to `int8`. One-line append.
- (Out of scope, just noting:) `meet check` on a CUDA-less machine should keep working — your change preserves the `if available is None: continue` branch so this should be fine, I'll smoke-test it on my side.
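The suggested unit test might be sketched like this. All names here (`TranscribeConfig`, `_torch_device_available`, the fallback logic itself) are stand-ins reconstructed from the review comments so the sketch is self-contained; the real test would import them from the project instead.

```python
from dataclasses import dataclass
from unittest import mock


def _torch_device_available(device: str) -> bool:
    """Stand-in for the project's helper; the real one probes hardware."""
    return device == "cpu"


@dataclass
class TranscribeConfig:
    """Minimal stand-in mirroring the fallback behavior under test."""
    device: str = "cuda"
    compute_type: str = "float16"

    def __post_init__(self) -> None:
        if self.device != "cpu" and not _torch_device_available(self.device):
            self.device = "cpu"
            if self.compute_type == "float16":
                self.compute_type = "int8"


# Mock the availability probe so the test scenarios run in CI
# without needing any particular hardware.
def test_falls_back_when_cuda_missing():
    with mock.patch(f"{__name__}._torch_device_available", return_value=False):
        cfg = TranscribeConfig(device="cuda", compute_type="float16")
    assert cfg.device == "cpu"
    assert cfg.compute_type == "int8"


def test_cpu_request_untouched():
    cfg = TranscribeConfig(device="cpu", compute_type="int8")
    assert cfg.device == "cpu"
    assert cfg.compute_type == "int8"
```

Patching the probe rather than `torch.cuda` keeps the test independent of whether PyTorch is even installed on the CI runner.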
You're blocked on hardware for two of your three test scenarios anyway, so happy to land this as-is and push a follow-up commit with the unit test + warning tweak — or if you'd rather do it yourself for the learning, take a few days and add them here. Either works for me, just let me know which you prefer.
Either way, thanks for the clean PR — the diagnosis in the description was excellent.
Summary
Motivation
Running meet run on a machine without a GPU (laptop, container without passthrough, CI runner) currently crashes with an unhelpful ValueError. The user must know to pass --device cpu --compute-type int8 manually. This change makes it "just work" - the common case shouldn't require flags.
Test plan
I can currently only run one of them, as I lack a device with a CUDA-capable GPU.