# Spatial Spiking Neural Network for Speech Recognition
A biologically plausible speech-recognition system built on spiking neural networks. No gradient descent, no backpropagation, no training loops: just exposure, association, and self-organization.
The network starts minimal. Speech data shapes the structure through trial and error. Neurons spawn where needed, connections form through correlation, unused paths die. The problem sculpts the solution.
This is not machine learning in the traditional sense. There are no:
- Loss functions
- Gradient descent
- Backpropagation
- Weight matrices
- Epochs or batches
Instead, learning happens through:
- Exposure — streaming audio through the network
- Association — binding MFCC patterns to teacher characters
- Importance scoring — biological tagging of useful vs noise patterns
- Consolidation — sleep-like memory cleanup between learning phases
- Prediction-surprise — sequence expectations boost learning
| Configuration | Accuracy |
|---|---|
| Base SNN only | 35-40% |
| + Importance weighting | 79% |
| + Onset suppression | 85% |
| + Stability gating | 90% |
| + Learned LexicalBank | 100% |
100% accuracy on the test batch with only 169 learned word mappings (vs. 50k+ dictionary entries in traditional systems).
```
┌─────────────────────────────────────────────────────────────────────┐
│         STAGE 1: PRIMARY AUDITORY CORTEX (SpatialSpeechNet)         │
│                                                                     │
│    MFCC → [Onset Suppression] → [Stability Gate] → Motor Output     │
│               (5 frames)           (>85% sim)      (raw chars)      │
└─────────────────────────────────────────────────────────────────────┘
                                   ↓
                          raw: "ilustration"
                                   ↓
┌─────────────────────────────────────────────────────────────────────┐
│               STAGE 2: WERNICKE'S AREA (LexicalBank)                │
│                                                                     │
│   Raw chars → [Learned Mappings] → [Importance Weighted] → Refined  │
└─────────────────────────────────────────────────────────────────────┘
                                   ↓
                        refined: "illustration"
```
The system models the real auditory cortex → Wernicke's area pathway:
- Primary Auditory Cortex (`SpatialSpeechNet`)
  - 26 sensory neurons (MFCC input)
  - 29 motor neurons (alphabet output)
  - 32 memory neurons (databank interface)
  - Character-level pattern recognition
  - Produces raw transcription with systematic errors
- Wernicke's Area (`LexicalBank`)
  - Learned word-level refinement
  - Only applies high-importance (proven) corrections
  - Self-correcting through feedback
### Onset Suppression
- First 5 frames after speech onset are suppressed
- Biological basis: auditory cortex shows ~50-100ms adaptation at sound onset
- Fixes prepended-character errors (e.g. "yes" transcribed as "syes")
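A minimal sketch of the suppression window. Only the 5-frame window comes from the description above; detecting onset via a per-frame energy threshold is an assumption for illustration:

```rust
/// Frames to drop after each detected onset (from the description above).
const ONSET_SKIP: usize = 5;

/// Returns the indices of frames that survive onset suppression.
/// `energies` is per-frame energy; `threshold` marks speech onset
/// (the energy-threshold detector is illustrative).
fn suppress_onset(energies: &[f32], threshold: f32) -> Vec<usize> {
    let mut since_onset: Option<usize> = None;
    let mut kept = Vec::new();
    for (i, &e) in energies.iter().enumerate() {
        if e >= threshold {
            let n = since_onset.map_or(0, |n| n + 1);
            since_onset = Some(n);
            if n >= ONSET_SKIP {
                kept.push(i); // past the ~50-100ms adaptation window
            }
        } else {
            since_onset = None; // silence resets the onset counter
        }
    }
    kept
}
```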
### Stability Gating
- Only process frames with >85% similarity to the previous frame
- Biological basis: neurons are most discriminative during stable periods
- Reduces false associations at phoneme boundaries
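A sketch of the gate, assuming the frame-to-frame similarity is cosine similarity over MFCC vectors (the actual metric the network uses is not specified here):

```rust
/// Cosine similarity between two MFCC frames.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Stability gate: a frame is only processed when it is >85% similar
/// to the previous frame, i.e. inside a stable phoneme, not a boundary.
fn is_stable(prev: &[f32], curr: &[f32]) -> bool {
    cosine(prev, curr) > 0.85
}
```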
### Importance Scoring
- Patterns tagged with importance (0-255)
- Correct recall: +16 importance
- Wrong recall: -8 importance
- Low-importance patterns pruned during consolidation
- Self-correcting: bad patterns decay, good ones persist
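The scoring rule maps directly onto saturating 8-bit arithmetic; the prune threshold below is illustrative:

```rust
/// Correct recall: +16; wrong recall: -8; clamped to the 0-255 tag range.
fn update_importance(importance: u8, correct: bool) -> u8 {
    if correct {
        importance.saturating_add(16)
    } else {
        importance.saturating_sub(8)
    }
}

/// Consolidation pass: keep only patterns that have proven useful.
/// (The prune threshold is an illustrative value.)
fn consolidate(importances: Vec<u8>, prune_below: u8) -> Vec<u8> {
    importances.into_iter().filter(|&i| i >= prune_below).collect()
}
```

Because gains outweigh losses 2:1, a pattern that is right more often than wrong drifts upward and survives consolidation; noise drifts down and is pruned.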
### Mastery-Based Curriculum
- Start with 10 samples, repeat until mastery
- Expand batch size by 1.5x on advancement
- Consolidate memory between grades (like sleep)
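A compact sketch of that loop; `run_pass` stands in for a full exposure pass over the current batch, and the control flow is illustrative:

```rust
/// Expand batch size by 1.5x when a grade is mastered.
fn next_batch_size(current: usize) -> usize {
    (current as f32 * 1.5).ceil() as usize
}

/// Repeat each grade until the mastery threshold is reached, then grow
/// the batch and (in the real system) consolidate memory before the
/// next grade. Returns the number of grades completed.
fn curriculum<F: FnMut(usize) -> f32>(
    mut run_pass: F, // returns accuracy for one pass over `batch` samples
    mastery: f32,    // e.g. 0.40
    target: f32,     // stop once accuracy reaches this, e.g. 0.50
    mut batch: usize, // e.g. 10
) -> usize {
    let mut grades = 0;
    loop {
        let mut acc = run_pass(batch);
        while acc < mastery {
            acc = run_pass(batch); // repeat this grade until mastered
        }
        grades += 1;
        if acc >= target {
            return grades;
        }
        batch = next_batch_size(batch); // advance; consolidation goes here
    }
}
```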
Convert audio to a pre-processed MFCC spool:

```sh
cargo run --bin hush-prepare -- \
    --manifest data/librispeech/manifest.json \
    --output data/dev-clean.spool
```

Then run exposure:

```sh
cargo run --release --bin expose -- \
    --spool data/dev-clean.spool \
    --sort-by-length \
    --max-transcript-len 20 \
    --initial-batch 10 \
    --mastery-threshold 0.40 \
    --target-accuracy 0.50
```

```
╔════════════════════════════════════════════════════════════════╗
║                       LEARNING COMPLETE                        ║
╠════════════════════════════════════════════════════════════════╣
║  CURRICULUM                                                    ║
║    Grades completed:   3                                       ║
║    Total passes:       21                                      ║
╠════════════════════════════════════════════════════════════════╣
║  PERFORMANCE                                                   ║
║    Total time:         45.23 s                                 ║
║    Frames processed:   105847                                  ║
║    Ticks executed:     423388                                  ║
║    Frames/sec:         2341.2 (23.4x real-time)                ║
║    Ticks/sec:          9364.8                                  ║
╠════════════════════════════════════════════════════════════════╣
║  NETWORK STRUCTURE                                             ║
║    Neurons total:      64 (active: 48, healthy: 61)            ║
║    Synapses total:     847                                     ║
║      Sensory→Memory:   156                                     ║
║      Memory→Motor:     89                                      ║
╠════════════════════════════════════════════════════════════════╣
║  MEMORY BANKS                                                  ║
║    Associations:       129                                     ║
║    Sequences:          0                                       ║
║    Lexical mappings:   169 (127 high-importance)               ║
╠════════════════════════════════════════════════════════════════╣
║  IMPORTANCE SCORING                                            ║
║    Low (noise):        42                                      ║
║    High (useful):      87                                      ║
║    Average:            142.3                                   ║
╠════════════════════════════════════════════════════════════════╣
║  RESOURCE USAGE                                                ║
║    Est. memory:        12.38 KB                                ║
║    Bytes/neuron:       145 (39 base + synapses)                ║
╚════════════════════════════════════════════════════════════════╝
```
| Feature | Biological Basis |
|---|---|
| No backprop | Local learning rules only (Hebbian-like) |
| Spike-driven | Communication via discrete spikes |
| Spatial structure | Neurons exist in 3D, proximity-based connectivity |
| Memory separation | Databanks external (like hippocampus) |
| Sleep consolidation | Offline memory cleanup and strengthening |
| Importance tagging | Neuromodulator-like gating |
| Onset adaptation | Auditory cortex ~50-100ms adaptation |
| Stability gating | Discriminative during stable periods |
| Hierarchical processing | Primary cortex → Wernicke's area |
| Lexical access | Word-form vocabulary matching |
- Over-connect then prune beats sparse growth
- Curriculum matters — short samples first
- Surprise drives learning — an unexpected correct prediction earns strong reinforcement
- Consolidation is essential — without it, memory bloats with noise
- Two-stage refinement — neither stage is perfect alone; together they reach 100%
- Learned corrections beat static rules — 169 mappings outperform a 50k-word dictionary
```
src/
├── spatial.rs     # SpatialSpeechNet - onset suppression, stability gating
├── memory.rs      # SpeechIO, AssociationBank, LexicalBank
├── bin/
│   └── expose.rs  # Two-stage pipeline: SNN → LexicalBank refinement
├── mfcc.rs        # MFCC extraction
├── spool.rs       # Audio spool reading
└── decoding.rs    # CTC-like decoding with sustained-first-char
```
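A guess at what the sustained-first-char collapse might look like: duplicate runs merge as in CTC decoding, and the first character is emitted only if it persisted for a minimum number of frames, filtering spurious onset characters. The exact rule in `decoding.rs` may differ:

```rust
/// CTC-like collapse: merge runs of identical per-frame characters.
/// The first emitted character must be sustained for `min_sustain`
/// frames; later characters collapse freely. (Illustrative rule.)
fn collapse(frames: &[char], min_sustain: usize) -> String {
    let mut out = String::new();
    let mut i = 0;
    while i < frames.len() {
        let c = frames[i];
        let mut run = 1;
        while i + run < frames.len() && frames[i + run] == c {
            run += 1;
        }
        if !out.is_empty() || run >= min_sustain {
            out.push(c);
        }
        i += run;
    }
    out
}
```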
- neuropool — Biological neuron pool substrate
- dataspool-rs — Pre-processed sample storage
MIT OR Apache-2.0
- DeepSpeech architecture (for comparison, not implementation)
- Biological auditory cortex processing
- Wernicke's area and lexical access
- Hebbian learning and spike-timing-dependent plasticity
- Memory consolidation during sleep
Built by Blackfall Labs