Merged
5 changes: 3 additions & 2 deletions README.md
@@ -144,7 +144,7 @@ To label the transcript with speaker ID's (set number of speakers if known e.g.

To run on CPU instead of GPU (and for running on Mac OS X):

- whisperx path/to/audio.wav --compute_type int8
+ whisperx path/to/audio.wav --compute_type int8 --device cpu
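
For illustration only, the sketch below builds the same command line from Python, picking the device and compute type together. The helper names `pick_device_and_compute_type` and `build_whisperx_command` are hypothetical and not part of whisperx. The `float16` default for GPU is an assumption; only the `--compute_type int8 --device cpu` pairing comes from the change above.

```python
from typing import List, Tuple


def pick_device_and_compute_type(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, compute_type). Hypothetical helper, not a whisperx API."""
    if cuda_available:
        # Assumption: float16 is a reasonable GPU default.
        return "cuda", "float16"
    # CPU path from the README change: int8 together with an explicit CPU device.
    return "cpu", "int8"


def build_whisperx_command(audio_path: str, cuda_available: bool) -> List[str]:
    """Assemble the whisperx CLI invocation as an argv list (nothing is executed)."""
    device, compute_type = pick_device_and_compute_type(cuda_available)
    return [
        "whisperx",
        audio_path,
        "--compute_type",
        compute_type,
        "--device",
        device,
    ]


if __name__ == "__main__":
    # Print the CPU command for a machine without CUDA.
    print(" ".join(build_whisperx_command("path/to/audio.wav", cuda_available=False)))
```

The argv list could be passed to `subprocess.run` if the `whisperx` executable is on the `PATH`. Passing both flags explicitly avoids relying on the CLI's default device.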

### Other languages

@@ -166,6 +166,7 @@ See more examples in other languages [here](EXAMPLES.md).
```python
import whisperx
import gc
+ from whisperx.diarize import DiarizationPipeline

device = "cuda"
audio_file = "audio.mp3"
@@ -196,7 +197,7 @@ print(result["segments"]) # after alignment
# import gc; import torch; gc.collect(); torch.cuda.empty_cache(); del model_a

# 3. Assign speaker labels
- diarize_model = whisperx.diarize.DiarizationPipeline(use_auth_token=YOUR_HF_TOKEN, device=device)
+ diarize_model = DiarizationPipeline(use_auth_token=YOUR_HF_TOKEN, device=device)

# add min/max number of speakers if known
diarize_segments = diarize_model(audio)
3 changes: 2 additions & 1 deletion whisperx/asr.py
@@ -319,7 +319,8 @@ def load_model(
whisper_arch - The name of the Whisper model to load.
device - The device to load the model on.
compute_type - The compute type to use for the model.
- vad_method - The vad method to use. vad_model has higher priority if is not None.
+ vad_model - The vad model to manually assign.
+ vad_method - The vad method to use. vad_model has a higher priority if it is not None.
options - A dictionary of options to use for the model.
language - The language of the model. (use English for now)
model - The WhisperModel instance to use.
1 change: 0 additions & 1 deletion whisperx/transcribe.py
@@ -119,7 +119,6 @@ def transcribe_task(args: dict, parser: argparse.ArgumentParser):

# Part 1: VAD & ASR Loop
results = []
- tmp_results = []
# model = load_model(model_name, device=device, download_root=model_dir)
model = load_model(
model_name,