PepperWizard

PepperWizard is a Python 3 command-line application for teleoperating the SoftBank Pepper robot. It provides an interactive TUI for voice interaction, teleop, and scripted behaviours, backed by a Dockerised NAOqi bridge.

Features

Modern Bridge Architecture: Integrates with PepperBox to wrap the legacy Python 2.7 / NAOqi SDK in a Docker container, so your code stays in Python 3.
Self-contained MVP: Three-service compose runs on any Linux + Docker host that can reach the robot on the network. Optional services (joystick, vision tracking, proprioception) live in a developer overlay.
Talk Modes: A unified interface for speech and animation, with push-to-talk voice input (CPU Whisper), slash-command autocomplete, spellcheck, and emoticon animation triggers.
Teleop: Keyboard by default; joystick (DualShock via Dualshock-ZMQ) when the dev overlay is enabled.
Experimental Logging: Timestamped JSONL logging of all interactions.
Social State Control: Toggle the robot's autonomous social behaviours.
Battery & Temperature Status: Battery level and joint-temperature warnings polled every 10s.

Architecture

The default docker-compose.yml is a self-contained MVP — three services, no other repos required:

pepper-robot-env: Python 2.7 "shim server" bridging the robot's NAOqi OS to modern Python 3 callers over HTTP :5000.
stt-service: Whisper CPU speech-to-text. Captures audio from the host microphone; communicates over ZeroMQ.
pepper-wizard: The Python 3 CLI in this repo. Drives the robot through the shim and orchestrates optional autonomous features (tracking, perception) when they're available.

A separate docker-compose.dev.yml overlay adds dualshock-publisher (joystick teleop), perception-service (YOLO/mediapipe vision), and proprioception-service (state publishing). It is aimed at developers and requires a sibling checkout of PepperBox; a PepperPerception checkout is optional (see "Developer stack" below).
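After starting the stack it can be useful to confirm that the shim is accepting connections on port 5000 before launching the CLI. The following is a minimal sketch, not part of PepperWizard: wait_for_port is a hypothetical helper, and it only checks that a TCP connection succeeds, since the shim's HTTP routes are not described here.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("127.0.0.1", 5000) on the Docker host would then block until the shim is reachable or the timeout expires; the loopback address is an assumption that follows from the services using host networking.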

The pepper-wizard application itself is structured as follows:

main.py: The main application entry point.
robot_client.py: A client class that handles all direct communication with the robot.
teleop.py / keyboard_teleop.py: Joystick and keyboard teleop threads.
command_handler.py: Maps user commands to specific actions.
cli.py: Handles all command-line interface (UI) elements.
config.py: Loads configuration files.

Getting Started

Prerequisites

Linux host
Docker + Docker Compose
A Pepper robot reachable on your network (or a NAOqi simulator)

OS Compatibility: PepperWizard uses network_mode: host and host PulseAudio for the microphone, so it currently requires Linux. macOS and Windows are on the long-term roadmap but are not supported for the MVP.

The NAOqi bridge runs from the public jwgcurrie/pepper-box image — Docker pulls it automatically on first run.

1. Point PepperWizard at your robot

The connection is configured via a robot.env file. Copy the example and edit if your robot isn't at the lab default:

cp robot.env.example robot.env

# robot.env
NAOQI_IP=192.168.123.50   # robot IP (use 127.0.0.1 for a local NAOqi sim)
NAOQI_PORT=9559           # 9559 on physical robots, sim-specific otherwise
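
To sanity-check a robot.env before starting the stack, the file can be parsed with a few lines of Python. This is a sketch under stated assumptions: parse_env and describe_target are hypothetical helpers, the parser treats everything after a # as a comment, and the fallback port of 9559 is assumed here for illustration only.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and stripping # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def describe_target(env):
    """Summarise where the NAOqi connection will point."""
    ip = env.get("NAOQI_IP", "")
    port = int(env.get("NAOQI_PORT", "9559"))  # assumed fallback
    kind = "local NAOqi sim" if ip == "127.0.0.1" else "robot"
    return f"{kind} at {ip}:{port}"


EXAMPLE = """\
# robot.env
NAOQI_IP=192.168.123.50   # robot IP
NAOQI_PORT=9559
"""

print(describe_target(parse_env(EXAMPLE)))  # robot at 192.168.123.50:9559
```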

2. Build and run (MVP)

docker compose up -d --build                 # build images and start the background services
docker compose stop pepper-wizard            # free port 5561 for the interactive container
docker compose run --rm -it pepper-wizard    # launch the interactive CLI

Subsequent CLI sessions don't need up or stop again — just re-run docker compose run --rm -it pepper-wizard.

Teleop defaults to Keyboard mode; tracking entries are hidden if the optional services aren't running.

Why the stop step? docker compose up instantiates pepper-wizard as a background container that binds port 5561 for external commands. docker compose run creates a second interactive container that needs the same port. Stopping the first frees it for the second; both use network_mode: host.
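
Whether port 5561 is still held can be checked from the host before launching the interactive container. A minimal sketch, assuming the port is bound over TCP (the transport is not stated above); port_is_free is a hypothetical helper, not part of PepperWizard:

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False  # another process (or container) holds the port
            raise
        return True
```

If port_is_free(5561) returns False after docker compose up, the background pepper-wizard container is most likely still running and needs the stop step.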

3. Developer stack (optional)

The developer overlay (docker-compose.dev.yml) adds joystick teleop, proprioception, optional GPU vision tracking, and swaps the physical-robot shim for the qiBullet simulator baked into pepper-box:latest. It requires a sibling checkout of PepperBox; PepperPerception is optional (the bind-mount is auto-created empty if absent, and the perception service itself is gated behind --profile gpu).

docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile gpu up -d --build
docker compose -f docker-compose.yml -f docker-compose.dev.yml stop pepper-wizard
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm -it pepper-wizard

Drop --profile gpu on hosts without an NVIDIA runtime; the perception service stays gated and the rest of the stack runs as normal. The stop pepper-wizard step serves the same role as in the MVP path: it frees port 5561 for the interactive CLI.

The overlay mounts both sibling repos into the pepper-wizard container for live editing.

Shortening the dev-overlay commands (optional)

Add a .env file (gitignored) to the repo root:

COMPOSE_FILE=docker-compose.yml:docker-compose.dev.yml
COMPOSE_PROFILES=gpu

Docker Compose reads these automatically, so plain docker compose <cmd> then picks up both files and the GPU profile:

docker compose up -d --build
docker compose stop pepper-wizard
docker compose run --rm -it pepper-wizard

Omit COMPOSE_PROFILES=gpu on hosts without an NVIDIA runtime.
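
To make the equivalence concrete, the sketch below expands the two variables into the flags they stand for. compose_flags is a hypothetical helper written for illustration; it assumes the Linux path separator (:) for COMPOSE_FILE and a comma-separated COMPOSE_PROFILES.

```python
def compose_flags(env):
    """Translate COMPOSE_FILE / COMPOSE_PROFILES into equivalent CLI flags."""
    flags = []
    for path in filter(None, env.get("COMPOSE_FILE", "").split(":")):
        flags += ["-f", path]
    for profile in filter(None, env.get("COMPOSE_PROFILES", "").split(",")):
        flags += ["--profile", profile]
    return flags


env = {
    "COMPOSE_FILE": "docker-compose.yml:docker-compose.dev.yml",
    "COMPOSE_PROFILES": "gpu",
}
print(" ".join(["docker", "compose", *compose_flags(env), "up", "-d", "--build"]))
# docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile gpu up -d --build
```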

Running against the simulator

Set NAOQI_IP=127.0.0.1 in robot.env to trigger sim mode — pepper-robot-env's entrypoint detects the local IP and boots qiBullet instead of pynaoqi. On first boot it auto-seeds the qiBullet asset cache (Pepper URDF + meshes) into ../PepperBox/.qibullet/.

The cache directory is auto-created by Docker as root-owned on first mount, which blocks the container's pepperdev user (UID 1000) from writing. If the entrypoint prints a permission-denied message, chown the host directory once and recreate the container:

sudo chown -R 1000:1000 ../PepperBox/.qibullet/
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --force-recreate pepper-robot-env
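
To find out whether the chown is needed before the entrypoint fails, the ownership of the cache directory can be inspected from the host. This is a rough sketch: not_owned_by is a hypothetical helper, and it only compares owner UIDs rather than evaluating full write permissions.

```python
import os


def not_owned_by(root, uid=1000):
    """List paths under root (including root itself) whose owner is not uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the entry itself and does not follow symlinks.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched
```

A non-empty result from not_owned_by("../PepperBox/.qibullet") suggests the container user will hit the permission error described above.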

Note: in sim mode, proprioception-service is redundant — the qiBullet shim already publishes joint state on :5560. Its restart loop is expected and can be ignored. audio-publisher-service will likewise loop "broker unreachable" because qiBullet has no NAOqi broker; this is also expected.

Usage

Once the application is running, it presents an interactive selection menu in the terminal.

Arrow Keys (↑ / ↓): Navigate selection.
Enter: Confirm selection.

Select Action:
 > Talk Mode [Voice]
   Teleop Mode [Joystick]
   Set Social State [Disabled]
   Tracking Mode [Head]
   Robot State [Wake]
   Gaze at Marker
   Track Object
   Joint Temperatures
   Exit Application

Talk Mode

Talk Mode has two interfaces, toggled with [Tab] in the main menu:

Voice — Push-to-talk speech-to-text. Press [Space] to start recording, [Enter] to stop. Transcriptions are shown for review by default; press [Enter] to confirm, edit inline, or [Esc] to discard. Toggle review mode with /review.
Text — Type sentences directly, with spellcheck, emoticon triggers, and slash-command autocomplete.

Advanced Features

1. Proactive Spellcheck & Confirmation

As you type, the system checks your grammar. If a correction is found:

Interactive UI: You will see a prompt like Pepper (Suggestion) [tag]:.
Tab-Toggle: Press [Tab] to switch between the Suggestion (Cyan) and your Raw Input (White).
Confirm: Press [Enter] to confirm the selected text.

2. Slash-Autocomplete

Type / at any time to see a menu of available commands and animations.

Context Aware: Works at the start of a line or mid-sentence (e.g., Hello /).
Tags: Includes full animation tags (e.g., /happy, /bow).
Safety: Only triggers when you explicitly type /, preventing accidental activations.

3. Available Inputs

Plain Speech: Enter any text.
Emoticon-Triggered Animation: Include a recognized emoticon (e.g., :), XD).
Hotkey-Triggered Blocking Animation: Include a hotkey (e.g., /N, /Y).
Tag-Triggered Animation: Use the autocomplete menu to select a tag (e.g., /happy).
/help - Show contextual help for the talk mode.
/q - Quit talk mode and return to the main menu.
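
The input kinds listed above can be pictured as a small dispatch step. The sketch below is illustrative only and does not reflect command_handler.py: classify is a hypothetical function, tokens are assumed to be whitespace-delimited, the first recognised slash token wins, and the hotkey and tag sets are example values.

```python
def classify(line, hotkeys, tags):
    """Return a (kind, value) pair describing one line of talk-mode input."""
    text = line.strip()
    if text in ("/q", "/help"):
        return ("command", text[1:])
    for token in text.split():
        if token.startswith("/"):
            name = token[1:]
            if name in hotkeys:
                return ("hotkey", name)
            if name in tags:
                return ("tag", name)
    # Anything else, including unknown slash tokens, is treated as speech.
    return ("speech", text)


HOTKEYS = {"N", "Y"}       # example values
TAGS = {"happy", "bow"}    # example values

print(classify("Hello /happy", HOTKEYS, TAGS))  # ('tag', 'happy')
```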

Configuration

You can customise some of the robot's behaviours by editing these JSON files:

animations.json: Maps animation names to single-character keys. These tags are used internally and by emoticon_map.json.
emoticon_map.json: Maps emoticons (e.g., :), :() to animation names (e.g., happy, sad). This allows for dynamic animation triggering in Unified Talk Mode.
quick_responses.json: Defines phrases and animations that can be triggered by hotkeys (e.g., /N) in Unified Talk Mode. Each entry's animation field determines which animation plays.
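
As an illustration of how an emoticon map could be applied, the sketch below assumes emoticon_map.json is a flat JSON object from emoticon to animation name; the actual file layout is not shown here, and animations_for is a hypothetical helper.

```python
import json

# Assumed shape of emoticon_map.json, inlined so the example is self-contained.
EMOTICON_MAP_JSON = '{":)": "happy", ":(": "sad"}'


def animations_for(text, emoticon_map):
    """Return animation names for each recognised emoticon, in order of appearance."""
    return [emoticon_map[token] for token in text.split() if token in emoticon_map]


emoticon_map = json.loads(EMOTICON_MAP_JSON)
print(animations_for("Nice to meet you :)", emoticon_map))  # ['happy']
```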

Logging

PepperWizard includes a logging system that captures robot interactions, user commands, and application events.

Log Files

Logs are automatically saved to the logs/ directory in JSON Lines (JSONL) format.

Default Naming: Log files are automatically timestamped: logs/session_YYYY-MM-DD_HH-MM-SS.jsonl

Custom Session ID: You can specify a custom session ID to create a specific filename (e.g., logs/session_P01.jsonl):

docker compose run --rm -it pepper-wizard python3 -m pepper_wizard.main --proxy-ip host.docker.internal --proxy-port 5000 --session-id P01
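
Because each line of a JSONL file is an independent JSON document, session logs can be loaded with the standard library alone. A minimal sketch: default_log_name reproduces the timestamped naming pattern above, read_jsonl is a hypothetical helper, and no particular record fields are assumed.

```python
import json
from datetime import datetime


def default_log_name(now):
    """Build a path following the logs/session_YYYY-MM-DD_HH-MM-SS.jsonl pattern."""
    return now.strftime("logs/session_%Y-%m-%d_%H-%M-%S.jsonl")


def read_jsonl(path):
    """Parse one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, read_jsonl("logs/session_P01.jsonl") would return the records of the custom session started above as a list of parsed JSON values.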

Console Output

By default, the console output is minimal, only showing critical warnings or errors.

Verbose Mode: To see all logs (INFO/DEBUG) in the console in real-time, use the --verbose flag:

docker compose run --rm -it pepper-wizard python3 -m pepper_wizard.main --proxy-ip host.docker.internal --proxy-port 5000 --verbose
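
For readers unfamiliar with how such flags are typically wired up, the sketch below mirrors the four options used in the commands above with argparse. It is not the parser from main.py: the defaults and the mapping of --verbose to a logging level are assumptions made for illustration.

```python
import argparse
import logging


def build_parser():
    """Mirror the flags shown in the commands above; defaults are assumed."""
    parser = argparse.ArgumentParser(prog="pepper_wizard.main")
    parser.add_argument("--proxy-ip", default="host.docker.internal")
    parser.add_argument("--proxy-port", type=int, default=5000)
    parser.add_argument("--session-id", default=None)
    parser.add_argument("--verbose", action="store_true")
    return parser


def console_level(verbose):
    """Show everything when verbose, otherwise only warnings and errors."""
    return logging.DEBUG if verbose else logging.WARNING


args = build_parser().parse_args(["--session-id", "P01", "--verbose"])
print(args.session_id, args.proxy_port, console_level(args.verbose) == logging.DEBUG)
# P01 5000 True
```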

Testing

To verify PepperWizard end-to-end, run the automated integration test. This simulates a full user session (connecting to the robot, speaking, moving, etc.) and verifies the log output. Because the test moves the robot, running it in simulation is recommended so that a physical robot cannot drive into a wall.

docker compose run --rm pepper-wizard python3 tests/integration_test.py

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
