Voice MIDI Controller

Control hardware synthesizers with your voice. An AI-powered musical collaborator that listens, executes commands, proposes ideas, and learns from your references.

Three Workflows

This project combines three creative workflows:

A — Voice → MIDI Hardware

Speak musical ideas to control your Elektron Syntakt (or any MIDI synth) in real time.

"Add a note on E3 on step 5"     → MIDI note sent
"Make the filter darker"          → CC 74 lowered
"Propose a bassline"              → AI generates, you say "next" or "yes"
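
As a rough illustration of the first example, here is a minimal sketch of turning a spoken note name into raw MIDI bytes. It assumes the C4 = 60 octave convention (the octave labels shown on a given synth may differ), and `note_to_midi` and `note_on_bytes` are hypothetical helpers, not part of this project's MCP server.

```python
import re

# Semitone offsets within an octave for the natural note names.
_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a note name such as "E3" or "F#2" to a MIDI note number.

    Assumption: C4 = 60. Some manufacturers label octaves differently.
    """
    match = re.fullmatch(r"([A-Ga-g])([#b]?)(-?\d+)", name.strip())
    if match is None:
        raise ValueError(f"not a note name: {name!r}")
    letter, accidental, octave = match.groups()
    semitone = _SEMITONES[letter.upper()]
    if accidental == "#":
        semitone += 1
    elif accidental == "b":
        semitone -= 1
    number = (int(octave) + 1) * 12 + semitone
    if not 0 <= number <= 127:
        raise ValueError(f"{name!r} is outside the MIDI range 0-127")
    return number


def note_on_bytes(name: str, velocity: int = 100, channel: int = 0) -> bytes:
    """Build a 3-byte Note On message (status 0x90 plus channel 0-15)."""
    return bytes([0x90 | (channel & 0x0F), note_to_midi(name), velocity & 0x7F])


if __name__ == "__main__":
    print(note_to_midi("E3"))          # MIDI note number for E3
    print(note_on_bytes("E3").hex())   # raw bytes as hex
```

Placing the note on "step 5" is left to the sequencer, which this sketch does not cover.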

B — Reference-Based Composition

Feed 5 reference tracks. The AI analyzes them, extracts patterns, and proposes ideas inspired by the pool.

5 tracks → stems separated → key/BPM/chords extracted → AI proposes
"Propose something inspired by the references"
"More like track 2 but slower"
"Next" / "Yes" / "Darker"

C — Audio Analysis & Stem Separation

Separate any track into stems (drums, bass, vocals, and an "other" stem that typically holds synths) and extract musical data (key, BPM, chords, MIDI).

python analyzer/pipeline.py references/ --output output/
# → stems/ + midi/ + analysis.json for each track

How They Connect

C (analyze refs) → feeds → B (creative proposals) → outputs via → A (MIDI to hardware)

Each workflow works independently. Use just A for live control, just C for analysis, or the full pipeline.

Architecture

Component	Tech	Runtime
MIDI Engine	Node.js + `easymidi` (MCP server)	Real-time
Creative Engine	Claude AI (MCP tools)	Interactive
Audio Analyzer	Python: Demucs + Essentia + Basic Pitch	Offline/batch
Voice Input	STT → Claude → MCP	Real-time

See docs/architecture-overview.md for the full breakdown.

Primary Hardware: Elektron Syntakt

First target is the Syntakt (12-track digital/analog groovebox). Full CC mapping included. Works with any MIDI-capable hardware — just update the config file.

See examples/syntakt-config.json for the complete mapping.
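
To show how a config-driven mapping can turn a phrase like "make the filter darker" into a Control Change message, here is a minimal sketch. The `CC_MAP` dictionary and the `adjust_cc` helper are hypothetical; the real schema lives in examples/syntakt-config.json and may look different. Only the CC 74 filter assignment is taken from the example earlier in this README.

```python
from typing import Dict

# Hypothetical mapping for illustration; the project's config file may
# use a different structure and different parameter names.
CC_MAP: Dict[str, int] = {"filter_cutoff": 74}


def clamp_cc(value: int) -> int:
    """Keep a controller value inside the valid MIDI range 0-127."""
    return max(0, min(127, int(value)))


def adjust_cc(state: Dict[str, int], param: str, delta: int, channel: int = 0) -> bytes:
    """Nudge a parameter by delta and return the 3-byte Control Change message.

    state holds the last value sent for each parameter; unknown parameters
    are assumed to start at the midpoint (64).
    """
    controller = CC_MAP[param]
    new_value = clamp_cc(state.get(param, 64) + delta)
    state[param] = new_value
    # 0xB0 is the Control Change status byte; the low nibble is the channel.
    return bytes([0xB0 | (channel & 0x0F), controller, new_value])


if __name__ == "__main__":
    state: Dict[str, int] = {}
    message = adjust_cc(state, "filter_cutoff", -20)  # "darker"
    print(message.hex(), state)
```

Tracking the last sent value in `state` is one way to make relative commands ("darker", "brighter") work without reading values back from the hardware.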

Key Concept: The "Next" Paradigm

Claude doesn't just execute commands — it proposes musical ideas. The musician navigates:

Voice	Effect
"Propose a bassline"	AI generates, sends to synth
"Next"	New variation
"Yes" / "Keep"	Lock it in
"More like that"	Variations in same direction
"Darker" / "Simpler"	Guide the direction
"Something completely different"	Reset approach

See examples/collaborative-workflow.md for a full session example.
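
The navigation loop in the table above can be sketched as a small session object. This is an illustration under assumptions, not the project's proposal engine: `ProposalSession` and its `generate` callback are hypothetical, and a real implementation would call Claude and send MIDI instead of returning strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ProposalSession:
    # Hypothetical generator: takes the accumulated direction hints and
    # returns a new idea (a plain string here, for illustration).
    generate: Callable[[List[str]], str]
    hints: List[str] = field(default_factory=list)
    current: Optional[str] = None
    kept: List[str] = field(default_factory=list)

    def handle(self, phrase: str) -> Optional[str]:
        """React to one spoken phrase and return the idea now in play."""
        word = phrase.strip().lower()
        if word in ("yes", "keep"):
            # Lock in the current idea without generating a new one.
            if self.current is not None:
                self.kept.append(self.current)
            return self.current
        if word == "something completely different":
            self.hints.clear()  # reset the approach
        elif word != "next" and not word.startswith("propose"):
            self.hints.append(word)  # e.g. "darker", "simpler", "more like that"
        self.current = self.generate(list(self.hints))
        return self.current


if __name__ == "__main__":
    session = ProposalSession(generate=lambda hints: f"bassline ({', '.join(hints) or 'no hints'})")
    for phrase in ["Propose a bassline", "Next", "Darker", "Yes"]:
        print(phrase, "->", session.handle(phrase))
```

Keeping the hints as a growing list lets "Darker" and "Simpler" stack, while "Something completely different" clears them.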

Documentation

Doc	Contents
Architecture Overview	Three workflows, tool choices, how they connect
Research	5 approaches evaluated, STT options, existing tools
Design	Syntakt MIDI mapping, interaction modes, MCP tools
Speech-to-Text	STT engines compared
Approaches	Detailed evaluation of each technical approach

Tool Stack

Real-time (MCP Server)

Tool	Role
easymidi	MIDI output to hardware
MCP SDK	Claude ↔ MIDI bridge

Audio Analysis (Python CLI)

Tool	Role
Demucs v4	Stem separation (best OSS)
Essentia	Key, BPM, chord detection
Basic Pitch	Audio → MIDI conversion

Alternatives Evaluated

Tool	Verdict
ableton-mcp	Good if you use Ableton (not required)
strudel-mcp	Good for patterns, but browser-dependent
daw-mcp	Supports Bitwig + Ableton
Spleeter	Outdated, surpassed by Demucs
LALAL.ai / AudioShake	Commercial, unnecessary for this use

Status

Phase: Research & Design complete. Implementation next.

Completed:
Research: 5 approaches evaluated
Design: Architecture decided, MCP tools defined
Hardware: Syntakt MIDI CC mapping complete
Tool stack: Demucs + Essentia + Basic Pitch + easymidi
Examples: Patterns, CC control, notes, collaborative workflow

Planned:
Phase 1: Build MCP server + sequencer
Phase 2: Musical intelligence + proposal engine
Phase 3: Voice integration (STT)
Phase 4: Reference analysis pipeline

Quick Start (Coming Soon)

# Install MCP server
cd mcp-server && npm install

# Install analyzer
cd ../analyzer && pip install demucs essentia basic-pitch

# Configure Claude Code
# Add MCP server to ~/.claude.json

# Start creating
claude "Connect to Syntakt and play a C minor arpeggio"

License

MIT
