Qwen3-ASR-CLI

A command-line tool for speech-to-text transcription using the Qwen3-ASR model.

Features

  • 🎤 High-quality speech recognition powered by Qwen3-ASR
  • 🚀 Supports CUDA, MPS (Apple Silicon), and CPU
  • 📝 Clean text output, perfect for piping and scripting
  • 🔌 Works as a CLI provider in OpenClaw

Installation

Prerequisites

  • Python 3.12+
  • uv package manager

Quick Install (Recommended)

Install directly from GitHub:

uv tool install git+https://github.com/cnjack/qwen3-asr-cli.git

Install from Source (for development)

  1. Clone the repository:
git clone https://github.com/cnjack/qwen3-asr-cli.git
cd qwen3-asr-cli
  2. Install globally:
uv tool install -e .
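Either install method places a qwen3-asr-cli executable on your PATH (uv tool install puts entry points under ~/.local/bin by default). A quick sanity check, assuming the tool exposes the usual argparse-style --help flag:

```shell
# Confirm the entry point is installed and reachable. The --help flag
# (standard for argparse-based CLIs; assumed here) prints usage text.
if command -v qwen3-asr-cli >/dev/null 2>&1; then
  qwen3-asr-cli --help
else
  echo "qwen3-asr-cli not found on PATH" >&2
fi
```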

(Optional) Download Models Locally

By default, the tool uses Qwen/Qwen3-ASR-1.7B and will download it automatically from Hugging Face on first use. If you prefer to download models manually or need offline access, use one of the following methods:

Download through ModelScope (recommended for users in Mainland China):

pip install -U modelscope
modelscope download --model Qwen/Qwen3-ASR-1.7B --local_dir ./Qwen3-ASR-1.7B
modelscope download --model Qwen/Qwen3-ASR-0.6B --local_dir ./Qwen3-ASR-0.6B
modelscope download --model Qwen/Qwen3-ForcedAligner-0.6B --local_dir ./Qwen3-ForcedAligner-0.6B

Download through Hugging Face:

pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen3-ASR-1.7B --local-dir ./Qwen3-ASR-1.7B
huggingface-cli download Qwen/Qwen3-ASR-0.6B --local-dir ./Qwen3-ASR-0.6B
huggingface-cli download Qwen/Qwen3-ForcedAligner-0.6B --local-dir ./Qwen3-ForcedAligner-0.6B

Note: If you download models locally, you'll need to specify the model path using the --model parameter when running the CLI.

Usage

qwen3-asr-cli <audio_file> [--model MODEL_NAME_OR_PATH]

Options

  • audio_file: Path to the audio file to transcribe (required)
  • --model: Model name or local path (default: Qwen/Qwen3-ASR-1.7B)
    • Use official model name: Qwen/Qwen3-ASR-1.7B, Qwen/Qwen3-ASR-0.6B
    • Use local path: ./Qwen3-ASR-1.7B, ./Qwen3-ASR-0.6B

Examples

# Transcribe using default model (Qwen/Qwen3-ASR-1.7B)
qwen3-asr-cli recording.mp3

# Transcribe using a different official model
qwen3-asr-cli recording.mp3 --model Qwen/Qwen3-ASR-0.6B

# Transcribe using a locally downloaded model
qwen3-asr-cli meeting.wav --model ./Qwen3-ASR-1.7B

# Transcribe and save to file
qwen3-asr-cli meeting.wav > transcript.txt

# Use with other commands
qwen3-asr-cli audio.mp3 | wc -w  # Count words
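Because the tool writes plain text to stdout, batch transcription is a short shell loop. A sketch, where ./recordings is a hypothetical directory of inputs:

```shell
# Transcribe every .wav under ./recordings, writing one .txt per input.
# The [ -e ] guard skips the literal pattern when the glob matches nothing.
for f in ./recordings/*.wav; do
  [ -e "$f" ] || continue
  qwen3-asr-cli "$f" > "${f%.wav}.txt"
done
```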

OpenClaw Integration

You can use qwen3-asr-cli as a CLI audio transcription provider in OpenClaw.

Add the following to your OpenClaw configuration:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "qwen3-asr-cli",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 120
          }
        ]
      }
    }
  }
}

To use a different model or a locally downloaded model, add the --model parameter:

{
  "type": "cli",
  "command": "qwen3-asr-cli",
  "args": ["{{MediaPath}}", "--model", "Qwen/Qwen3-ASR-0.6B"],
  "timeoutSeconds": 120
}

This allows OpenClaw to automatically transcribe voice messages and audio attachments using Qwen3-ASR locally, without relying on external API providers.
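Since OpenClaw substitutes the attachment's path for {{MediaPath}} and presumably reads the command's stdout as the transcript, the integration can be simulated from a shell. A sketch, where voice-note.ogg is a hypothetical attachment:

```shell
# Simulate the provider call: pass a media path, capture stdout as
# the transcript (mirrors the "args": ["{{MediaPath}}"] config above).
MEDIA_PATH=./voice-note.ogg
if [ -e "$MEDIA_PATH" ]; then
  transcript=$(qwen3-asr-cli "$MEDIA_PATH")
  printf '%s\n' "$transcript"
fi
```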

Supported Audio Formats

All formats supported by librosa:

  • MP3, WAV, FLAC, OGG, M4A, and more

License

MIT
