A Python-based animatronic raven integrating speech recognition, face detection & tracking, low-latency audio I/O, capacitive touch sensing, and precise servo-driven movements for lifelike interaction.
- Offline speech recognition - Uses Vosk models locally on the Pi - no cloud or internet required
- Real-time face detection, recognition & tracking - Follows faces via the `face-recognition` library
- Low-latency audio I/O - Playback & recording through `sounddevice` for smooth responses
- Capacitive touch interaction - Detects touch through feathers or perches with a custom `smbus2`-based MPR121 driver
- Precise servo control - Pololu Maestro interface drives beak, wings, body, and two-axis head movement (up/down & left/right)
- Asynchronous task coordination - Non-blocking routines let sensing, speaking, and moving run in parallel
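A minimal sketch of that coordination pattern with `asyncio` (the coroutine names here are hypothetical stand-ins, not the project's actual routines):

```python
import asyncio

# Hypothetical placeholder tasks - the real routines live in animatron_move.py,
# animatron_speak.py, and samuel_async.py.
async def watch_for_faces():
    while True:
        await asyncio.sleep(0.1)   # poll the camera without blocking other tasks

async def listen_for_speech():
    while True:
        await asyncio.sleep(0.1)   # feed microphone audio to the recognizer

async def idle_movements():
    while True:
        await asyncio.sleep(1.0)   # occasional head turns, wing shuffles, blinks

async def main():
    # Run sensing, speaking, and moving concurrently in one event loop.
    await asyncio.gather(watch_for_faces(), listen_for_speech(), idle_movements())

if __name__ == "__main__":
    asyncio.run(main())
```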
Project layout:

```
.
├── Servo.py                        # Maestro servo control interface
├── animatron_move.py               # Background movement routines
├── animatron_speak.py              # Speech playback and event handling
├── samuel_main.py                  # Entry point for standard operation
├── samuel_async.py                 # Async-driven operation example
├── config.py                       # Behavior and threshold configurations
├── global_state.py                 # Shared events and state definitions
├── timer_window_for_programmer.py  # Developer timing visualization tool
├── touch_sensor.py                 # MPR121 capacitive-touch sensor driver
├── requirements.txt                # Python dependency list
├── tox.ini                         # Testing & linting configuration
└── README.md                       # Project documentation
```
- Raspberry Pi (preferably model 5, with Python 3.8+)
- Pololu Maestro servo controller (USB connection, 6–12 channels)
- Standard hobby servos for beak, two-axis head, wings, and body
- USB class-compliant microphone (any generic USB mic)
- Speaker with 3.5 mm jack or USB audio interface
- LED indicators - Two LEDs for the raven's eyes (with a blinking mechanism)
- Adafruit MPR121 capacitive-touch breakout for interactive touch sensing
Samuel’s speech movements are driven by a preprocessing pipeline that converts audio into time-synced servo instructions. This is handled by a separate project called `volume-analyzer`, which analyzes raven sound clips and generates movement maps for Samuel’s mouth.
Unlike real-time volume tracking, this system generates multiple servo instruction maps per audio file in advance. This enables lifelike variation and gives Samuel a sense of "free will" when choosing how to respond.
Each raven call is analyzed into several binary movement sequences - lists of `1`s (mouth open) and `0`s (mouth closed) - based on patterns in vocal intensity.
These sequences are grouped by different sensitivity levels, producing a range of expressive options per file.
Example output:
```json
{
  "head_pat5.mp3": {
    "0": [0, 0, 1, 1, ..., 0],
    "1": [0, 1, 1, 1, ..., 0],
    "2": [...],
  }
}
```
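A rough sketch of the thresholding idea (this is not the `volume-analyzer` implementation; the envelope values and thresholds below are made up):

```python
import numpy as np

def binary_maps(envelope, thresholds):
    """Turn a per-frame loudness envelope into one 0/1 sequence per sensitivity level."""
    return {
        str(i): [1 if frame >= t else 0 for frame in envelope]
        for i, t in enumerate(thresholds)
    }

# Example: a short fake envelope and three sensitivity levels (strict -> loose).
envelope = np.array([0.05, 0.20, 0.60, 0.70, 0.30, 0.10])
maps = binary_maps(envelope, thresholds=[0.5, 0.25, 0.1])
print(maps)  # {'0': [0, 0, 1, 1, 0, 0], '1': [0, 0, 1, 1, 1, 0], '2': [0, 1, 1, 1, 1, 1]}
```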
The binary 0/1 sequences generated by `volume-analyzer` are integrated into `speak_dictionary` entries and used during audio playback. Each `1` triggers a mouth servo pulse (open), while each `0` keeps the beak closed - producing realistic movement that mirrors the rhythm and intensity of the raven's call.
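A simplified sketch of that playback loop (the frame duration and servo callbacks are stand-ins, not the project's actual Maestro interface):

```python
import time

FRAME_SECONDS = 0.05  # assumed duration represented by each 0/1 entry

def play_mouth_sequence(sequence, open_beak, close_beak):
    """Step through a 0/1 map in time with the audio: 1 opens the beak, 0 closes it."""
    for value in sequence:
        if value:
            open_beak()
        else:
            close_beak()
        time.sleep(FRAME_SECONDS)

# Stand-in callbacks instead of real servo commands:
play_mouth_sequence(
    [0, 1, 1, 0, 1, 0],
    open_beak=lambda: print("beak open"),
    close_beak=lambda: print("beak closed"),
)
```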
To make Samuel’s behavior feel more lifelike and less robotic, each audio file is preprocessed into multiple motion maps, each based on a different sensitivity threshold. These maps vary in how expressive Samuel is (e.g., more or fewer beak movements depending on the threshold used).
At runtime, Samuel randomly selects one of the available maps for the current audio clip. This randomness introduces variation in timing and expressiveness—even when repeating the same sound—creating the illusion of "free will" and making his performances more engaging.
This precomputed and randomized approach also ensures low-latency playback while preserving a sense of spontaneity in Samuel’s responses.
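In code terms, the selection step amounts to something like this (illustrative only, using shortened example maps):

```python
import random

# Precomputed maps for one clip, keyed by sensitivity level (shortened example).
movement_maps = {
    "head_pat5.mp3": {
        "0": [0, 0, 1, 1, 0],
        "1": [0, 1, 1, 1, 0],
    }
}

clip = "head_pat5.mp3"
chosen = random.choice(list(movement_maps[clip]))   # pick a sensitivity level at random
sequence = movement_maps[clip][chosen]              # the 0/1 map used for this playback
```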
Developer Tools (Optional Graphs)
The `volume-analyzer` project includes tools for visualizing the audio analysis process during development.
Available graphs:
- RMS energy plot
- STFT-based power spectrogram
- Decibel-scaled volume curve
```python
from volume_analyzer import generate_graphs

generate_graphs(
    generate_rms=True,
    generate_power=True,
    generate_volume=True
)
```
These graphs help ensure clusters align well with audio features and offer insight into how the servo timing was derived.
- Python 3.8+
- Install dependencies:
  ```bash
  git clone https://github.com/Anatw/samuel_the_raven.git
  cd samuel_the_raven
  pip install -r requirements.txt
  ```
- Key libraries:
  - `vosk` for offline speech recognition (minimal usage sketch below)
- `sounddevice` & `PySoundFile` for audio I/O
- `face-recognition` for vision-based interaction
- `smbus2` for I²C communication with MPR121
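Of these, `vosk` handles the offline recognition. A minimal microphone-recognition loop looks roughly like the sketch below (the model path, sample rate, and block size are assumptions, not values taken from this project):

```python
import json
import queue

import sounddevice as sd
from vosk import Model, KaldiRecognizer

audio_queue = queue.Queue()

def on_audio(indata, frames, time_info, status):
    audio_queue.put(bytes(indata))          # hand raw frames to the recognition loop

model = Model("model")                      # path to a downloaded Vosk model directory
recognizer = KaldiRecognizer(model, 16000)

with sd.RawInputStream(samplerate=16000, blocksize=8000, dtype="int16",
                       channels=1, callback=on_audio):
    while True:
        data = audio_queue.get()
        if recognizer.AcceptWaveform(data):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                print(text)
```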
Edit `config.py` to adjust:
- Movement pulse-width ranges and repetition counts
- Speech repetition intervals
- Touch thresholds and debounce settings
- Face-recognition upsample factor (speed vs. accuracy)
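As a rough illustration of the kinds of values involved (the names and numbers below are invented; the real `config.py` defines its own):

```python
# Hypothetical settings - for illustration only, not the project's actual names or values.
BEAK_OPEN_PULSE_US = 1700        # servo pulse width for an open beak
BEAK_CLOSED_PULSE_US = 1200      # servo pulse width for a closed beak
SPEECH_REPEAT_INTERVAL_S = 30    # minimum seconds before repeating the same clip
TOUCH_THRESHOLD = 12             # MPR121 touch detection threshold
TOUCH_DEBOUNCE_S = 0.2           # ignore repeat touches within this window
FACE_UPSAMPLE = 1                # higher values find smaller faces but run slower
```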
Run Samuel with:

```bash
python samuel_main.py
```
On Raspberry Pi, this also automatically starts the developer timing visualization tool.
- Connect any USB mic to the Pi’s USB port and verify with:
  ```bash
  arecord -l
  ```
- In Python, select your mic via `sounddevice`:
  ```python
  import sounddevice as sd
  sd.default.device = 'Your USB Mic Name'
  ```
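- If the exact device name is unknown, list what `sounddevice` reports first (the `'USB'` substring below is an assumption about how your mic names itself):

  ```python
  import sounddevice as sd

  print(sd.query_devices())   # shows the index and name of every audio device
  sd.default.device = 'USB'   # a device index or a unique name substring also works
  ```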
- Wire VIN → Pi 3.3 V (pin 1); GND → Pi GND (pin 6)
- Wire SDA → SDA1 (pin 3, BCM 2); SCL → SCL1 (pin 5, BCM 3)
- Enable I²C on the Pi:
  ```bash
  sudo raspi-config   # Interfacing Options → I2C → enable → reboot
  ```
- Install `smbus2`:
  ```bash
  pip3 install smbus2
  ```
- Use `touch_sensor.py` driver for MPR121 initialization, threshold tuning, debounce, and polling.
- Example usage:
  ```python
  from touch_sensor import MPR121TouchSensor

  sensor = MPR121TouchSensor(
      touch_thresh=12,
      release_thresh=6,
      touch_conf=3,
      release_conf=3,
      dt=1,
      dr=1,
      poll_interval=0.1
  )
  ```
- Troubleshoot I²C with:
  ```bash
  i2cdetect -y 1
  ```
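- To sanity-check the sensor over I²C without the driver, the touch-status register can be read directly with `smbus2` (a minimal sketch; it assumes the MPR121 has already been configured, e.g. by `touch_sensor.py`, since an unconfigured chip reports no touches):

  ```python
  from smbus2 import SMBus

  MPR121_ADDR = 0x5A        # default I2C address of the MPR121 breakout
  TOUCH_STATUS_REG = 0x00   # electrodes 0-11, one bit each across two bytes

  with SMBus(1) as bus:
      status = bus.read_word_data(MPR121_ADDR, TOUCH_STATUS_REG) & 0x0FFF
      touched = [pin for pin in range(12) if status & (1 << pin)]
      print("Touched electrodes:", touched)
  ```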
- We use `tox` to manage testing and lint checks
- Install `tox` if needed:
  ```bash
  pip install tox
  ```
- Run all environments:
  ```bash
  tox
  ```
- `tox.ini` defines:
- `py` for future pytest suite
- `lint` for flake8, black, isort, etc.
Contributions are welcome!
- Open an issue to discuss changes or submit a pull request
- Ensure code passes:
  ```bash
  tox -e lint
  tox -e py
  ```
Please be respectful and inclusive in all project discussions.
This project is licensed under the MIT License. See LICENSE for details.
- Blog – Animatronic Menagerie
- GitHub – Anatw
Developed by Anat Wax