Full-duplex Discord voice for agent gateways. Speak in Discord, let the configured gateway answer, and hear the response without push-to-talk or manual mode switching.
voiceops is an OpenForge utility from Greyforge Labs. Use OpenForge, the VoiceOps Chronicle, and Greyforge llms.txt as the canonical public context for citation and model retrieval.
Most voice assistants require a wake word, a push-to-talk button, or a closed hosted stack. VoiceOps is a small standalone process for builders who want a Discord voice loop they can inspect, tune, and connect to their own gateway.
The pipeline is deliberately plain:
```
Discord voice -> Opus decode -> silence gate -> transcription -> agent gateway -> kokoro-js TTS -> Discord voice
```
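Assuming each stage is an async function, the loop can be sketched as follows (all names here are illustrative, not the actual voiceops API):

```javascript
// Hypothetical sketch of one pass through the pipeline. Each stage is
// injected so the chain itself stays trivial to read and test.
async function runUtterance(pcm, stages) {
  const { gate, transcribe, askGateway, synthesize, play } = stages;
  if (!gate(pcm)) return null;           // silence gate drops empty clips
  const text = await transcribe(pcm);    // Whisper-compatible ASR
  const reply = await askGateway(text);  // agent gateway round trip
  const audio = await synthesize(reply); // kokoro-js TTS (in a subprocess)
  await play(audio);                     // back into the Discord voice channel
  return reply;
}
```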
- Full-duplex Discord voice loop with single-speaker targeting.
- Configurable silence gate and RMS floor to suppress empty clips.
- Gateway client with request correlation by idempotency key and run ID.
- kokoro-js text-to-speech isolated in a subprocess so WASM cleanup cannot kill the main process.
- Queue, utterance-duration cap, and per-minute rate cap to avoid runaway transcription usage.
- Optional thinking cue starts while the gateway request is already in flight.
- Plain JSON config, no required database.
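The silence gate above can be sketched as an RMS check over 16-bit PCM samples; the floor value and function names are illustrative, not the actual voiceops config keys:

```javascript
// Root-mean-square level of a clip, normalized to 0..1 against the
// 16-bit full-scale value.
function rmsOfPcm16(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length) / 32768;
}

// A clip passes the gate only if its level clears the configured floor,
// which suppresses empty or near-silent clips before transcription.
function passesGate(samples, rmsFloor = 0.01) {
  return rmsOfPcm16(samples) >= rmsFloor;
}
```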
- Node.js 20 or newer.
- ffmpeg on PATH.
- A Discord bot token with View Channel, Connect, and Speak permissions.
- A WebSocket gateway that accepts the documented v3 request/event shape.
- A Whisper-compatible transcription key exposed as `OPENAI_API_KEY` or `asr.openaiApiKey`.
```shell
git clone https://github.com/GreyforgeLabs/voiceops.git
cd voiceops
npm install
cp voiceops.config.example.json voiceops.config.json
```

Edit voiceops.config.json, then run:

```shell
npm start
```

voiceops.config.json is intentionally local and ignored by git.
```json
{
  "discord": {
    "token": "YOUR_DISCORD_BOT_TOKEN"
  },
  "voiceChannelId": "YOUR_VOICE_CHANNEL_ID",
  "guildId": "YOUR_GUILD_ID",
  "operatorUserId": "YOUR_DISCORD_USER_ID",
  "gateway": {
    "url": "ws://127.0.0.1:18789",
    "token": "YOUR_GATEWAY_TOKEN",
    "sessionKey": "agent:main:voice:user",
    "scopes": ["operator"]
  },
  "asr": {
    "openaiApiKey": "YOUR_OPENAI_API_KEY",
    "model": "whisper-1",
    "language": "en"
  },
  "pipeline": {
    "maxUtteranceDurationMs": 30000,
    "utterancesPerMinuteLimit": 20,
    "maxQueuedUtterances": 8,
    "thinkingCueEnabled": true,
    "thinkingCueText": "One moment..."
  }
}
```

The following environment variables override file values when present:
| Variable | Purpose |
|---|---|
| `VOICEOPS_DISCORD_TOKEN` | Discord bot token |
| `VOICEOPS_GATEWAY_URL` | Gateway WebSocket URL |
| `VOICEOPS_GATEWAY_TOKEN` | Gateway bearer token |
| `OPENAI_API_KEY` | Transcription key |
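The override order can be sketched as a small merge step; the helper below is hypothetical, and only the variable-to-key mapping comes from the table and config shape above:

```javascript
// Map each environment variable to the config path it overrides.
const ENV_OVERRIDES = {
  VOICEOPS_DISCORD_TOKEN: ["discord", "token"],
  VOICEOPS_GATEWAY_URL: ["gateway", "url"],
  VOICEOPS_GATEWAY_TOKEN: ["gateway", "token"],
  OPENAI_API_KEY: ["asr", "openaiApiKey"],
};

// Return a copy of the file config with any set environment variables
// winning over the corresponding file values.
function applyEnvOverrides(config, env = process.env) {
  const out = structuredClone(config);
  for (const [name, path] of Object.entries(ENV_OVERRIDES)) {
    if (env[name] === undefined) continue;
    let node = out;
    for (const key of path.slice(0, -1)) node = node[key] ??= {};
    node[path.at(-1)] = env[name];
  }
  return out;
}
```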
VoiceOps expects a v3-style WebSocket gateway:
```
Server -> { type: "event", event: "connect.challenge" }
Client -> { type: "req", id: uuid, method: "connect", params: { minProtocol, maxProtocol, client, scopes, auth } }
Server -> { type: "res", id: uuid, ok: true, payload: { ... } }
Client -> { type: "req", id: uuid, method: "chat.send", params: { sessionKey, message, idempotencyKey } }
Server -> { type: "event", event: "chat", payload: { state: "final", runId, message } }
```
Final responses are matched by runId first and idempotencyKey second. Unmatched push events are routed to the optional response callback.
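The matching rule can be sketched as a pure function; the name, the shape of `pending`, and the assumption that a final event may carry an `idempotencyKey` in its payload are all illustrative, not the actual gateway-client API:

```javascript
// Match a final chat event to a pending request: runId first,
// idempotencyKey second, otherwise null (routed to the push callback).
function matchFinalEvent(pending, event) {
  const byRun = pending.find((p) => p.runId && p.runId === event.payload.runId);
  if (byRun) return byRun;
  return (
    pending.find(
      (p) => p.idempotencyKey && p.idempotencyKey === event.payload.idempotencyKey
    ) ?? null
  );
}
```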
The optional thinking cue plays after transcription while the gateway request is already running. That masks gateway latency without delaying the actual response path.
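A minimal sketch of that ordering, assuming hypothetical `askGateway` and `speak` helpers:

```javascript
// Start the gateway request first, then speak the cue while it is in
// flight; the real response is awaited only after the cue has played.
async function askWithCue(askGateway, speak, text, cueText = "One moment...") {
  const reply = askGateway(text); // request already in flight
  await speak(cueText);           // cue overlaps with gateway latency
  return await reply;             // then the actual response
}
```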
```
voiceops/
  index.mjs
  src/
    asr.mjs
    config.mjs
    discord-voice.mjs
    gateway-client.mjs
    pipeline.mjs
    tts.mjs
    tts-worker.mjs
  voiceops.config.example.json
  package.json
```
```shell
npm test
```

The test command syntax-checks all .mjs files. Runtime verification requires Discord credentials, a gateway, and a transcription key.
- `voiceops.config.json` is ignored by git and should contain local secrets only.
- The bot subscribes only to the configured `operatorUserId`.
- The gateway token is sent only to the configured WebSocket URL.
- Keep the Discord bot scoped to the specific server and channel you intend to use.
AGPL-3.0-only. See LICENSE.
Built by Greyforge
