A Python-based animatronic raven integrating speech recognition, face detection & tracking, low-latency audio I/O, capacitive touch sensing, and precise servo-driven movements for lifelike interaction.
- Offline speech recognition - Uses Vosk models locally on the Pi - no cloud or internet required
- Real-time face detection, recognition & tracking - Follows faces via the `face-recognition` library
- Low-latency audio I/O - Playback & recording through `sounddevice` for smooth responses
- Capacitive touch interaction - Detects touch through feathers or perches with a custom `smbus2`-based MPR121 driver
- Precise servo control - Pololu Maestro interface drives beak, wings, body, and two-axis head movement (up/down & left/right)
- Asynchronous task coordination - Non-blocking routines let sensing, speaking, and moving run in parallel
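A minimal sketch of that coordination pattern with `asyncio` (the coroutine names here are hypothetical stand-ins, not the project's actual routines):

```python
import asyncio

# Hypothetical placeholder tasks - the real routines live in animatron_move.py,
# animatron_speak.py, and samuel_async.py.
async def watch_for_faces():
    while True:
        await asyncio.sleep(0.1)   # poll the camera without blocking other tasks

async def listen_for_speech():
    while True:
        await asyncio.sleep(0.1)   # feed microphone audio to the recognizer

async def idle_movements():
    while True:
        await asyncio.sleep(1.0)   # occasional head turns, wing shuffles, blinks

async def main():
    # Run sensing, speaking, and moving concurrently in one event loop.
    await asyncio.gather(watch_for_faces(), listen_for_speech(), idle_movements())

if __name__ == "__main__":
    asyncio.run(main())
```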
Project layout:

```
.
├── Servo.py                        # Maestro servo control interface
├── animatron_move.py               # Background movement routines
├── animatron_speak.py              # Speech playback and event handling
├── samuel_main.py                  # Entry point for standard operation
├── samuel_async.py                 # Async-driven operation example
├── config.py                       # Behavior and threshold configurations
├── global_state.py                 # Shared events and state definitions
├── timer_window_for_programmer.py  # Developer timing visualization tool
├── touch_sensor.py                 # MPR121 capacitive-touch sensor driver
├── requirements.txt                # Python dependency list
├── tox.ini                         # Testing & linting configuration
└── README.md                       # Project documentation
```
- Raspberry Pi (preferably model 5, with Python 3.8+)
- Pololu Maestro servo controller (USB connection, 6–12 channels)
- Standard hobby servos for beak, two-axis head, wings, and body
- USB class-compliant microphone (any generic USB mic)
- Speaker with 3.5 mm jack or USB audio interface
- LED indicators - Two LEDs for the raven's eyes (with a blinking mechanism)
- Adafruit MPR121 capacitive-touch breakout for interactive touch sensing
Samuel’s speech movements are driven by a preprocessing pipeline that converts audio into time-synced servo instructions. This is handled by a separate project called `volume-analyzer`, which analyzes raven sound clips and generates movement maps for Samuel’s mouth.
Unlike real-time volume tracking, this system generates multiple servo instruction maps per audio file in advance. This enables lifelike variation and gives Samuel a sense of "free will" when choosing how to respond.
Each raven call is analyzed into several binary movement sequences - lists of `1`s (mouth open) and `0`s (mouth closed) - based on patterns in vocal intensity.
These sequences are grouped by different sensitivity levels, producing a range of expressive options per file.
Example output:
```json
{
  "head_pat5.mp3": {
    "0": [0, 0, 1, 1, ..., 0],
    "1": [0, 1, 1, 1, ..., 0],
    "2": [...],
  }
}
```
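A rough sketch of the thresholding idea (this is not the `volume-analyzer` implementation; the envelope values and thresholds below are made up):

```python
import numpy as np

def binary_maps(envelope, thresholds):
    """Turn a per-frame loudness envelope into one 0/1 sequence per sensitivity level."""
    return {
        str(i): [1 if frame >= t else 0 for frame in envelope]
        for i, t in enumerate(thresholds)
    }

# Example: a short fake envelope and three sensitivity levels (strict -> loose).
envelope = np.array([0.05, 0.20, 0.60, 0.70, 0.30, 0.10])
maps = binary_maps(envelope, thresholds=[0.5, 0.25, 0.1])
print(maps)  # {'0': [0, 0, 1, 1, 0, 0], '1': [0, 0, 1, 1, 1, 0], '2': [0, 1, 1, 1, 1, 1]}
```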
The binary 0/1 sequences generated by `volume-analyzer` are integrated into `speak_dictionary` entries and used during audio playback. Each `1` triggers a mouth servo pulse (open), while each `0` keeps the beak closed - producing realistic movement that mirrors the rhythm and intensity of the raven's call.
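A simplified sketch of that playback loop (the frame duration and servo callbacks are stand-ins, not the project's actual Maestro interface):

```python
import time

FRAME_SECONDS = 0.05  # assumed duration represented by each 0/1 entry

def play_mouth_sequence(sequence, open_beak, close_beak):
    """Step through a 0/1 map in time with the audio: 1 opens the beak, 0 closes it."""
    for value in sequence:
        if value:
            open_beak()
        else:
            close_beak()
        time.sleep(FRAME_SECONDS)

# Stand-in callbacks instead of real servo commands:
play_mouth_sequence(
    [0, 1, 1, 0, 1, 0],
    open_beak=lambda: print("beak open"),
    close_beak=lambda: print("beak closed"),
)
```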
To make Samuel’s behavior feel more lifelike and less robotic, each audio file is preprocessed into multiple motion maps, each based on a different sensitivity threshold. These maps vary in how expressive Samuel is (e.g., more or fewer beak movements depending on the threshold used).
At runtime, Samuel randomly selects one of the available maps for the current audio clip. This randomness introduces variation in timing and expressiveness—even when repeating the same sound—creating the illusion of "free will" and making his performances more engaging.
This precomputed and randomized approach also ensures low-latency playback while preserving a sense of spontaneity in Samuel’s responses.
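In code terms, the selection step amounts to something like this (illustrative only, using shortened example maps):

```python
import random

# Precomputed maps for one clip, keyed by sensitivity level (shortened example).
movement_maps = {
    "head_pat5.mp3": {
        "0": [0, 0, 1, 1, 0],
        "1": [0, 1, 1, 1, 0],
    }
}

clip = "head_pat5.mp3"
chosen = random.choice(list(movement_maps[clip]))   # pick a sensitivity level at random
sequence = movement_maps[clip][chosen]              # the 0/1 map used for this playback
```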
Developer Tools (Optional Graphs)
The `volume-analyzer` project includes tools for visualizing the audio analysis process during development.
Available graphs:
- RMS energy plot
- STFT-based power spectrogram
- Decibel-scaled volume curve
```python
from volume_analyzer import generate_graphs

generate_graphs(
    generate_rms=True,
    generate_power=True,
    generate_volume=True
)
```
These graphs help ensure clusters align well with audio features and offer insight into how the servo timing was derived.
- Python 3.8+
- Install dependencies:
  ```bash
  git clone https://github.com/Anatw/samuel_the_raven.git
  cd samuel_the_raven
  pip install -r requirements.txt
  ```
- Key libraries:
  - `vosk` for offline speech recognition (minimal usage sketch below)
- `sounddevice` & `PySoundFile` for audio I/O
- `face-recognition` for vision-based interaction
- `smbus2` for I²C communication with MPR121
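Of these, `vosk` handles the offline recognition. A minimal microphone-recognition loop looks roughly like the sketch below (the model path, sample rate, and block size are assumptions, not values taken from this project):

```python
import json
import queue

import sounddevice as sd
from vosk import Model, KaldiRecognizer

audio_queue = queue.Queue()

def on_audio(indata, frames, time_info, status):
    audio_queue.put(bytes(indata))          # hand raw frames to the recognition loop

model = Model("model")                      # path to a downloaded Vosk model directory
recognizer = KaldiRecognizer(model, 16000)

with sd.RawInputStream(samplerate=16000, blocksize=8000, dtype="int16",
                       channels=1, callback=on_audio):
    while True:
        data = audio_queue.get()
        if recognizer.AcceptWaveform(data):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                print(text)
```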
Edit `config.py` to adjust:
- Movement pulse-width ranges and repetition counts
- Speech repetition intervals
- Touch thresholds and debounce settings
- Face-recognition upsample factor (speed vs. accuracy)
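As a rough illustration of the kinds of values involved (the names and numbers below are invented; the real `config.py` defines its own):

```python
# Hypothetical settings - for illustration only, not the project's actual names or values.
BEAK_OPEN_PULSE_US = 1700        # servo pulse width for an open beak
BEAK_CLOSED_PULSE_US = 1200      # servo pulse width for a closed beak
SPEECH_REPEAT_INTERVAL_S = 30    # minimum seconds before repeating the same clip
TOUCH_THRESHOLD = 12             # MPR121 touch detection threshold
TOUCH_DEBOUNCE_S = 0.2           # ignore repeat touches within this window
FACE_UPSAMPLE = 1                # higher values find smaller faces but run slower
```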
Run Samuel with:

```bash
python samuel_main.py
```
On Raspberry Pi, this also automatically starts the developer timing visualization tool.
- Connect any USB mic to the Pi’s USB port and verify with:
  ```bash
  arecord -l
  ```
- In Python, select your mic via `sounddevice`:
  ```python
  import sounddevice as sd
  sd.default.device = 'Your USB Mic Name'
  ```
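- If the exact device name is unknown, list what `sounddevice` reports first (the `'USB'` substring below is an assumption about how your mic names itself):

  ```python
  import sounddevice as sd

  print(sd.query_devices())   # shows the index and name of every audio device
  sd.default.device = 'USB'   # a device index or a unique name substring also works
  ```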
- Wire VIN → Pi 3.3 V (pin 1); GND → Pi GND (pin 6)
- Wire SDA → SDA1 (pin 3, BCM 2); SCL → SCL1 (pin 5, BCM 3)
- Enable I²C on the Pi:
  ```bash
  sudo raspi-config   # Interfacing Options → I2C → enable → reboot
  ```
- Install `smbus2`:
  ```bash
  pip3 install smbus2
  ```
- Use `touch_sensor.py` driver for MPR121 initialization, threshold tuning, debounce, and polling.
- Example usage:
  ```python
  from touch_sensor import MPR121TouchSensor

  sensor = MPR121TouchSensor(
      touch_thresh=12,
      release_thresh=6,
      touch_conf=3,
      release_conf=3,
      dt=1,
      dr=1,
      poll_interval=0.1
  )
  ```
- Troubleshoot I²C with:
  ```bash
  i2cdetect -y 1
  ```
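- To sanity-check the sensor over I²C without the driver, the touch-status register can be read directly with `smbus2` (a minimal sketch; it assumes the MPR121 has already been configured, e.g. by `touch_sensor.py`, since an unconfigured chip reports no touches):

  ```python
  from smbus2 import SMBus

  MPR121_ADDR = 0x5A        # default I2C address of the MPR121 breakout
  TOUCH_STATUS_REG = 0x00   # electrodes 0-11, one bit each across two bytes

  with SMBus(1) as bus:
      status = bus.read_word_data(MPR121_ADDR, TOUCH_STATUS_REG) & 0x0FFF
      touched = [pin for pin in range(12) if status & (1 << pin)]
      print("Touched electrodes:", touched)
  ```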
- We use `tox` to manage testing and lint checks
- Install `tox` if needed:
  ```bash
  pip install tox
  ```
- Run all environments:
  ```bash
  tox
  ```
- `tox.ini` defines:
- `py` for future pytest suite
- `lint` for flake8, black, isort, etc.
Contributions are welcome!
- Open an issue to discuss changes or submit a pull request
- Ensure code passes:
  ```bash
  tox -e lint
  tox -e py
  ```
Please be respectful and inclusive in all project discussions.
This project is licensed under the MIT License. See LICENSE for details.
- Blog – Animatronic Menagerie
- GitHub – Anatw
Developed by Anat Wax