Wake-Word by tachyonicbytes · Pull Request #618 · cjpais/Handy

tachyonicbytes · 2026-01-18T23:15:42Z

Before Submitting This PR

Please confirm you have done the following:

I have searched existing issues and pull requests (including closed ones) to ensure this isn't a duplicate
I have read CONTRIBUTING.md

If this is a feature or change that was previously closed/rejected:

I have explained in the description below why this should be reconsidered
I have gathered community feedback (link to discussion below)

Human Written Description

Hi! This is a Draft PR for the discussion of the WakeWord implementation. I chose a draft PR because the code is not production ready, and most likely won't be until we sort out all the details.

I started from the openWakeWord project, which seems to both train wakeword models and provide an ONNX-based inference engine to run them.

To understand how openWakeWord works, I stripped the entire logic, plus one example, down to a single easy-to-follow file, which I provide in this GitHub gist here.

Basically, there are three models: the melspectrogram model, the embedding model from Google, and the actual wakeword model. All that's needed is chopping up the audio into suitable pieces (resampling if needed) and feeding the output of each model into the next, up to the wakeword model, which gives us the prediction. The "Hey Mycroft" model seemed to have the best activation of all the included models.
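The chaining described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `Stage` trait, the `Identity` and `MeanAbs` placeholder stages, and the 1280-sample frame length (80 ms at 16 kHz) are assumptions made for the example. In a real build each stage would wrap an ONNX session, and it would likely also need to keep a history of earlier outputs between stages, which is omitted here.

```rust
/// One pipeline stage: flat f32 buffer in, flat f32 buffer out.
/// Hypothetical trait; a real stage would wrap an ONNX session.
pub trait Stage {
    fn run(&self, input: &[f32]) -> Vec<f32>;
}

/// Assumed frame length: 80 ms of 16 kHz mono audio.
pub const FRAME_SAMPLES: usize = 1280;

/// Placeholder stage that forwards its input unchanged.
pub struct Identity;
impl Stage for Identity {
    fn run(&self, input: &[f32]) -> Vec<f32> {
        input.to_vec()
    }
}

/// Placeholder "classifier": mean absolute amplitude as a fake score.
pub struct MeanAbs;
impl Stage for MeanAbs {
    fn run(&self, input: &[f32]) -> Vec<f32> {
        if input.is_empty() {
            return vec![0.0];
        }
        let sum: f32 = input.iter().map(|x| x.abs()).sum();
        vec![sum / input.len() as f32]
    }
}

pub struct WakeWordPipeline<M: Stage, E: Stage, W: Stage> {
    pub melspectrogram: M,
    pub embedding: E,
    pub wakeword: W,
    /// Samples received so far that do not yet fill a whole frame.
    pending: Vec<f32>,
}

impl<M: Stage, E: Stage, W: Stage> WakeWordPipeline<M, E, W> {
    pub fn new(melspectrogram: M, embedding: E, wakeword: W) -> Self {
        Self { melspectrogram, embedding, wakeword, pending: Vec::new() }
    }

    /// Buffers incoming samples and, for every complete frame, feeds the
    /// output of each stage into the next. Returns one score per frame.
    pub fn push(&mut self, samples: &[f32]) -> Vec<f32> {
        self.pending.extend_from_slice(samples);
        let mut scores = Vec::new();
        while self.pending.len() >= FRAME_SAMPLES {
            let frame: Vec<f32> = self.pending.drain(..FRAME_SAMPLES).collect();
            let mel = self.melspectrogram.run(&frame);
            let emb = self.embedding.run(&mel);
            let out = self.wakeword.run(&emb);
            scores.push(out.first().copied().unwrap_or(0.0));
        }
        scores
    }
}
```

Keeping the leftover samples in `pending` means the caller can push audio in whatever chunk size the microphone delivers, and the stages still only ever see whole frames.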

Secondly, I translated the code to Rust, as a standalone example, provided in this GitHub gist here. It has a few more goodies than the simple example. A project that ports a form of openWakeWord to Rust already exists, but I couldn't make it work.

Using the functional code from this example, I was able to implement a very rudimentary wakeword function for Handy, using the already present "Always-On Microphone", which is a prerequisite for the wakeword to function.

I posit that the models are small enough, and the license unproblematic enough, to distribute them along with the app instead of making the user download them, but the openWakeWord repo needs a closer inspection before we do so. Otherwise, we can just download them like we do for the transcription models.

There is an additional question of when to stop the recording after the wakeword is uttered. For the simple example, I used a fixed 5-second delay, but we should find something more permanent here.
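One possible replacement for the fixed delay is to stop once the speaker has gone quiet for a while, with a hard cap as a fallback. The sketch below is only an illustration of that idea, not what this PR does: the `EndOfSpeech` type, the RMS threshold, and the frame counts are all hypothetical, and a proper voice-activity detector would be more robust than a plain energy threshold.

```rust
/// Root-mean-square amplitude of one frame of samples.
pub fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|x| x * x).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Hypothetical end-of-speech rule: stop after `max_silent_frames`
/// consecutive quiet frames, or after `max_total_frames` in any case.
pub struct EndOfSpeech {
    silence_threshold: f32,
    max_silent_frames: usize,
    max_total_frames: usize,
    silent_run: usize,
    total: usize,
}

impl EndOfSpeech {
    pub fn new(silence_threshold: f32, max_silent_frames: usize, max_total_frames: usize) -> Self {
        Self {
            silence_threshold,
            max_silent_frames,
            max_total_frames,
            silent_run: 0,
            total: 0,
        }
    }

    /// Feed one frame recorded after the wakeword fired.
    /// Returns true when the recording should stop.
    pub fn push_frame(&mut self, frame: &[f32]) -> bool {
        self.total += 1;
        if rms(frame) < self.silence_threshold {
            self.silent_run += 1;
        } else {
            // Any loud frame resets the silence counter.
            self.silent_run = 0;
        }
        self.silent_run >= self.max_silent_frames || self.total >= self.max_total_frames
    }
}
```

With 80 ms frames, for example, `EndOfSpeech::new(0.01, 12, 375)` would stop after roughly one second of silence or 30 seconds in total; both numbers are placeholders that would need tuning against real recordings.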

Ideally, a "Hey Handy" model should also be trained. The code in the openWakeWord repo seems to provide everything we need to do it; we just need the computing capacity and data. It'd be an interesting exercise for the Handy community, if they want their voice / pronunciation to be a part of the model :)

You have to manually download these three models from openWakeWord and place them in order to form this directory tree:

https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/hey_mycroft_v0.1.onnx
https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/embedding_model.onnx
https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/melspectrogram.onnx

src-tauri/resources/models/embedding_model.onnx
src-tauri/resources/models/hey_mycroft_v0.1.onnx
src-tauri/resources/models/melspectrogram.onnx
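Since the models are placed by hand, a startup check that reports which files are missing can save some confusion. The following is a small sketch of such a check; the `missing_models` helper and the `MODEL_FILES` constant are hypothetical names, not part of this PR, and only the three file names come from the listing above.

```rust
use std::path::{Path, PathBuf};

/// The three model files this PR expects in the models directory.
pub const MODEL_FILES: [&str; 3] = [
    "melspectrogram.onnx",
    "embedding_model.onnx",
    "hey_mycroft_v0.1.onnx",
];

/// Returns the full paths of the expected model files that are not
/// present as regular files under `models_dir`.
pub fn missing_models(models_dir: &Path) -> Vec<PathBuf> {
    MODEL_FILES
        .iter()
        .map(|name| models_dir.join(name))
        .filter(|path| !path.is_file())
        .collect()
}
```

Calling `missing_models(Path::new("src-tauri/resources/models"))` from the repository root returns an empty vector once all three downloads are in place.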

Related Issues/Discussions

Fixes #
Discussion:

Community Feedback

I announced the feature to the community on Discord, with a video. @cjpais seemed interested in the code itself, saying that the feature had been requested before, but I couldn't find exactly where.

Testing

I tested the code manually, because only e2e tests could fully exercise such a vertical feature, and there didn't seem to be any e2e tests when I started.

I also added a test for the activation itself. It requires another external resource, a single-channel 16 kHz .wav file of the wake-word being spoken, which I haven't included until we sort out what files are acceptable in a PR.
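Because the fixture has to be mono and sampled at 16 kHz, the test could verify that up front and fail with a clear message instead of a silent non-activation. Below is a sketch of such a check that reads only the `fmt ` chunk of a RIFF/WAVE buffer using the standard library; `wav_format` and `is_mono_16k` are hypothetical helpers, not functions from this PR.

```rust
/// Reads (channels, sample_rate) from the `fmt ` chunk of a RIFF/WAVE buffer.
/// Returns None if the buffer is not a WAVE file or has no `fmt ` chunk.
pub fn wav_format(bytes: &[u8]) -> Option<(u16, u32)> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return None;
    }
    let mut pos = 12;
    // Walk the chunk list: 4-byte id, 4-byte little-endian size, then body.
    while bytes.len().saturating_sub(pos) >= 8 {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if bytes.len().saturating_sub(body) < 8 {
                return None;
            }
            // Layout: format tag (2 bytes), channels (2), sample rate (4), ...
            let channels = u16::from_le_bytes([bytes[body + 2], bytes[body + 3]]);
            let rate = u32::from_le_bytes([
                bytes[body + 4],
                bytes[body + 5],
                bytes[body + 6],
                bytes[body + 7],
            ]);
            return Some((channels, rate));
        }
        // Chunk bodies are padded to an even number of bytes.
        pos = body.saturating_add(size).saturating_add(size & 1);
    }
    None
}

/// True if the buffer is a single-channel 16 kHz WAVE file.
pub fn is_mono_16k(bytes: &[u8]) -> bool {
    wav_format(bytes) == Some((1, 16_000))
}
```

A test could read the fixture with `std::fs::read` and assert `is_mono_16k(&bytes)` before running the models on it.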

Screenshots/Videos (if applicable)

The original video

wakeword.mp4

AI Assistance

No AI was used in this PR
AI was used (please describe below)

If AI was used:

Tools used: VSCode + Agent mode, with either GPT5, GPT-4o or Claude Sonnet 4.
How extensively: Much of the editing and making sense of the code. It needed a lot of steering and hand-holding. I originally tried a direct port from openWakeWord, but it failed spectacularly. It couldn't even produce the first gist in this PR description, which I wrote manually. It did translate that code to Rust though (the second gist), but bugfixes were needed even then.

Implements wake-word functionality, based on the models from the openWakeWord project.

cjpais · 2026-01-19T00:37:16Z

I just want to say thank you for this. At the very least, other people can use this in their forks given you've put it here. Right now I am overwhelmed with issues and trying to stabilize the app as well as get the new keyboard implementation in, so I would guess it will take months at this point for me to review and get this in, just as a heads up.

tachyonicbytes · 2026-01-19T08:06:19Z

Yeah, no sweat, the functionality works for me, so I don't have to wait for anything to use it. Take care of making the software better.

pchalasani · 2026-01-29T14:48:58Z

> new keyboard implementation

@cjpais curious what this is about?

cjpais · 2026-01-30T00:34:18Z

@pchalasani #580

tachyonicbytes added 3 commits January 19, 2026 00:54

feat: Rudimentary wakeword implementation

471971f

Implements wake-word functionality, based on the models from the openWakeWord project.

chore: Simplify WakeWordToggle.tsx

8913035

fix: wake-word didn't restart, buffers not cleared after restart

4445b56

tachyonicbytes wants to merge 3 commits into cjpais:main from tachyonicbytes:feat/wakeword
