PRX Voice Engine

A real-time voice conversation orchestration engine built in Rust. PRX Voice manages the full lifecycle of AI-powered voice sessions — from speech recognition to agent reasoning to speech synthesis — with enterprise-grade observability, multi-tenancy, and compliance built in.

Architecture

┌─────────────────────────────────────────────────────┐
│                   prx-voice-bin                     │  ← Entry point
├─────────────────────────────────────────────────────┤
│              prx-voice-control                      │  ← REST / gRPC / WebSocket API
│         (auth, jwt, ratelimit, routing)              │
├──────────────┬──────────────┬───────────────────────┤
│  session     │   adapter    │   transport           │
│ (orchestrator│ (asr, tts,   │  (websocket,          │
│  manager,    │  agent, vad, │   media channel)      │
│  handoff,    │  fallback)   │                       │
│  recording)  │              │                       │
├──────────────┴──────────────┴───────────────────────┤
│   state (FSM)  │  event (bus, envelope, replay)     │
├────────────────┴────────────────────────────────────┤
│  policy    │  billing   │  audit     │  observe     │
│ (rbac,     │ (ledger,   │ (record,   │ (metrics,    │
│  tenant,   │  meter,    │  store,    │  slo,        │
│  quota)    │  pricing)  │  compliance│  degradation)│
├─────────────────────────────────────────────────────┤
│  storage (postgres, memory, object, migrations)     │
├─────────────────────────────────────────────────────┤
│  core (settings, flags, security, deploy)           │
├─────────────────────────────────────────────────────┤
│  types (ids, error, redact)                         │  ← Zero-dependency leaf
└─────────────────────────────────────────────────────┘

Crate Overview

Crate	Purpose
prx-voice-types	Shared IDs, error codes, enums. Zero external dependencies — leaf crate for the entire workspace.
prx-voice-state	12-state session FSM (Idle → Listening → UserSpeaking → AsrProcessing → Thinking → Speaking → …) with transition rules and interrupt semantics.
prx-voice-event	CloudEvents v1.0 event system with bus, envelope, payload definitions, and event replay.
prx-voice-adapter	Trait-based adapter interfaces for ASR, TTS, Agent, and VAD. Ships with mock, Deepgram, Azure, OpenAI, and local Sherpa/Ollama implementations.
prx-voice-session	Session orchestrator coordinating state machine, adapters, event emission, turn management, handoff, and recording.
prx-voice-transport	WebSocket and media channel abstraction.
prx-voice-control	HTTP/gRPC control plane — REST API, JWT authentication, rate limiting.
prx-voice-policy	Multi-tenant isolation, RBAC permission model, quota enforcement.
prx-voice-billing	Usage metering, ledger, and pricing engine.
prx-voice-audit	Audit logging, compliance checks, and audit record storage.
prx-voice-observe	Prometheus metrics, SLO monitoring, 4-level degradation strategy, incident management.
prx-voice-storage	PostgreSQL, in-memory, and object storage backends with migration support.
prx-voice-core	Global configuration, feature flags, security settings, deployment config.
prx-voice-bin	Server binary entry point with graceful shutdown.
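
The table describes prx-voice-state as a 12-state FSM with transition rules and interrupt semantics. A minimal sketch of such a transition function follows, limited to the six states this README names. The event names, the `SessionState` and `SessionEvent` types, and the `transition` function are hypothetical, and the interrupt rule is an assumption; the real crate defines six more states and may use different rules.

```rust
/// The six states named in this README; the real FSM has twelve.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    Listening,
    UserSpeaking,
    AsrProcessing,
    Thinking,
    Speaking,
}

/// Hypothetical events, chosen for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    Start,
    SpeechStarted,
    SpeechEnded,
    TranscriptReady,
    ReplyReady,
    PlaybackFinished,
    Interrupt,
}

/// Returns the next state, or `None` if `event` is not valid in `state`.
pub fn transition(state: SessionState, event: SessionEvent) -> Option<SessionState> {
    use SessionEvent::*;
    use SessionState::*;
    match (state, event) {
        (Idle, Start) => Some(Listening),
        (Listening, SpeechStarted) => Some(UserSpeaking),
        (UserSpeaking, SpeechEnded) => Some(AsrProcessing),
        (AsrProcessing, TranscriptReady) => Some(Thinking),
        (Thinking, ReplyReady) => Some(Speaking),
        (Speaking, PlaybackFinished) => Some(Listening),
        // Assumed interrupt semantics: barge-in stops playback and
        // returns the session to listening.
        (Speaking, Interrupt) => Some(Listening),
        // Every other pair is rejected rather than silently ignored.
        _ => None,
    }
}
```

Returning `Option` keeps invalid transitions explicit, so a caller can log or reject them instead of leaving the session in an undefined state.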

Getting Started

Prerequisites

Rust 1.85+ (Edition 2024)
PostgreSQL 15+ (optional, in-memory store available for development)

Build

cargo build --workspace

Run

# With default config (mock adapters, in-memory store)
cargo run -p prx-voice-bin

# With a custom host and port
PRX_VOICE_HOST=0.0.0.0 PRX_VOICE_PORT=3000 cargo run -p prx-voice-bin

# Override specific settings via environment variables
PRX_VOICE_SERVER__PORT=8080 cargo run -p prx-voice-bin

The server starts on http://localhost:3000 by default.

Docker

# Build and run
docker compose up --build

# Or build image directly
docker build -t prx-voice .
docker run -p 3000:3000 prx-voice

Test

# Run all tests
cargo test --workspace

# Run specific crate tests
cargo test -p prx-voice-state
cargo test -p prx-voice-session

# Integration tests
cargo test -p prx-voice-integration-tests

Configuration

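Configuration is loaded from config.yaml and can be overridden via environment variables with the PRX_VOICE_ prefix. For nested keys, the Run section uses a double underscore as the separator, as in PRX_VOICE_SERVER__PORT for server.port. The main settings are: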

server:
  host: "0.0.0.0"
  port: 3000

session:
  max_duration_sec: 1800
  max_turns: 100
  interrupt_enabled: true
  default_language: "en-US"

adapters:
  default_asr_provider: "mock"     # mock | deepgram | local
  default_agent_provider: "mock"   # mock | openai | local
  default_tts_provider: "mock"     # mock | azure | local
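
To illustrate how an override name could relate to a key in this file, here is a small sketch that maps an environment variable to a dotted config path. It assumes that `__` separates nesting levels, as in the `PRX_VOICE_SERVER__PORT` example from the Run section. The function name `env_to_config_path` is hypothetical, and the actual loader may behave differently.

```rust
/// Prefix stated in this README.
const PREFIX: &str = "PRX_VOICE_";

/// Maps an environment variable name to a dotted config path.
/// Assumes `__` separates nesting levels. Returns `None` for variables
/// that do not carry the prefix or that have nothing after it.
pub fn env_to_config_path(var: &str) -> Option<String> {
    let rest = var.strip_prefix(PREFIX)?;
    if rest.is_empty() {
        return None;
    }
    let path = rest
        .split("__")
        .map(|segment| segment.to_ascii_lowercase())
        .collect::<Vec<_>>()
        .join(".");
    Some(path)
}
```

Under this assumption, `PRX_VOICE_SESSION__MAX_TURNS` would address `session.max_turns`, while a single underscore stays part of the key name.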

Adapter Providers

Component	Provider	Description
ASR	`mock`	Returns canned transcriptions for testing
ASR	`deepgram`	Deepgram streaming ASR (`DEEPGRAM_API_KEY`)
ASR	`local`	Sherpa-ONNX offline ASR
Agent	`mock`	Echo agent for testing
Agent	`openai`	OpenAI GPT models (`OPENAI_API_KEY`)
Agent	`local`	Ollama local LLM
TTS	`mock`	Returns silence for testing
TTS	`azure`	Azure Cognitive Services TTS (`AZURE_SPEECH_KEY`, `AZURE_SPEECH_REGION`)
TTS	`local`	Sherpa-ONNX offline TTS
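
The mock providers in the table can be pictured with a simplified, synchronous sketch of trait-based adapters. The trait names, method signatures, and the `run_turn` helper are hypothetical and are not the prx-voice-adapter API; the real traits are probably async and streaming. The amount of silence produced per character is also an arbitrary choice made for this example.

```rust
/// Hypothetical adapter traits, one per component in the table.
pub trait AsrAdapter {
    fn transcribe(&mut self, pcm: &[i16]) -> String;
}
pub trait AgentAdapter {
    fn reply(&mut self, user_text: &str) -> String;
}
pub trait TtsAdapter {
    fn synthesize(&mut self, text: &str) -> Vec<i16>;
}

/// Mock ASR: ignores the audio and cycles through canned transcriptions.
pub struct MockAsr {
    canned: Vec<String>,
    next: usize,
}
impl MockAsr {
    pub fn new(canned: Vec<String>) -> Self {
        MockAsr { canned, next: 0 }
    }
}
impl AsrAdapter for MockAsr {
    fn transcribe(&mut self, _pcm: &[i16]) -> String {
        if self.canned.is_empty() {
            return String::new();
        }
        let text = self.canned[self.next % self.canned.len()].clone();
        self.next += 1;
        text
    }
}

/// Mock agent: echoes the user's text back.
pub struct EchoAgent;
impl AgentAdapter for EchoAgent {
    fn reply(&mut self, user_text: &str) -> String {
        user_text.to_string()
    }
}

/// Mock TTS: returns silence (all-zero samples).
pub struct SilentTts {
    pub sample_rate_hz: usize,
}
impl TtsAdapter for SilentTts {
    fn synthesize(&mut self, text: &str) -> Vec<i16> {
        // Arbitrary for the sketch: 50 ms of silence per character.
        vec![0; text.chars().count() * self.sample_rate_hz / 20]
    }
}

/// Runs one turn: audio in, reply text and reply audio out.
pub fn run_turn(
    asr: &mut dyn AsrAdapter,
    agent: &mut dyn AgentAdapter,
    tts: &mut dyn TtsAdapter,
    pcm: &[i16],
) -> (String, Vec<i16>) {
    let transcript = asr.transcribe(pcm);
    let reply = agent.reply(&transcript);
    let audio = tts.synthesize(&reply);
    (reply, audio)
}
```

Because `run_turn` only sees trait objects, swapping a mock for a Deepgram, OpenAI, or Azure implementation would not change the orchestration code, which is the point of a trait-based adapter layer.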

API

REST Endpoints

POST   /api/v1/sessions                 Create a new voice session
GET    /api/v1/sessions/:id             Get session details
POST   /api/v1/sessions/:id/end         End a session
POST   /api/v1/sessions/:id/interrupt   Interrupt current playback
GET    /api/v1/sessions/:id/events      SSE event stream
GET    /api/v1/health/live              Liveness probe
GET    /api/v1/health/ready             Readiness probe
GET    /api/v1/metrics                  Prometheus metrics

WebSocket

GET    /api/v1/ws/:session_id        Full-duplex audio + events
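
The endpoint list can be read as a routing table. The sketch below matches a method and path against the routes listed in this section using slice patterns. The `Route` enum and `match_route` function are hypothetical and do not reflect how prx-voice-control actually routes requests.

```rust
/// One variant per endpoint listed in this README.
#[derive(Debug, PartialEq, Eq)]
pub enum Route<'a> {
    CreateSession,
    GetSession(&'a str),
    EndSession(&'a str),
    InterruptSession(&'a str),
    SessionEvents(&'a str),
    WebSocket(&'a str),
    HealthLive,
    HealthReady,
    Metrics,
}

/// Resolves a method and path to a route, or `None` if nothing matches.
pub fn match_route<'a>(method: &str, path: &'a str) -> Option<Route<'a>> {
    let rest = path.strip_prefix("/api/v1/")?;
    let segments: Vec<&str> = rest.split('/').collect();
    match (method, segments.as_slice()) {
        ("POST", ["sessions"]) => Some(Route::CreateSession),
        ("GET", ["sessions", id]) => Some(Route::GetSession(*id)),
        ("POST", ["sessions", id, "end"]) => Some(Route::EndSession(*id)),
        ("POST", ["sessions", id, "interrupt"]) => Some(Route::InterruptSession(*id)),
        ("GET", ["sessions", id, "events"]) => Some(Route::SessionEvents(*id)),
        ("GET", ["ws", id]) => Some(Route::WebSocket(*id)),
        ("GET", ["health", "live"]) => Some(Route::HealthLive),
        ("GET", ["health", "ready"]) => Some(Route::HealthReady),
        ("GET", ["metrics"]) => Some(Route::Metrics),
        _ => None,
    }
}
```

For example, `match_route("POST", "/api/v1/sessions/abc/interrupt")` resolves to `Route::InterruptSession("abc")`, while an unlisted method such as `DELETE` resolves to `None`.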

All REST responses use a unified envelope:

{
  "request_id": "uuid",
  "timestamp": "2024-01-01T00:00:00Z",
  "data": { ... },
  "error": null
}
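
On the client side, the envelope above could be modeled roughly as follows. This is a sketch under assumptions: the README does not document the shape of a non-null `error`, so a plain string stands in for it, and the rule that exactly one of `data` and `error` is set is a guess. The `Envelope` type and its methods are hypothetical names.

```rust
/// Mirrors the four envelope fields shown above.
/// `error` is a plain string here because its real shape is not documented.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope<T> {
    pub request_id: String,
    pub timestamp: String,
    pub data: Option<T>,
    pub error: Option<String>,
}

impl<T> Envelope<T> {
    /// Builds an envelope from a handler result.
    /// Assumption: exactly one of `data` and `error` is populated.
    pub fn from_result(request_id: &str, timestamp: &str, result: Result<T, String>) -> Self {
        let (data, error) = match result {
            Ok(value) => (Some(value), None),
            Err(message) => (None, Some(message)),
        };
        Envelope {
            request_id: request_id.to_string(),
            timestamp: timestamp.to_string(),
            data,
            error,
        }
    }

    /// Converts the envelope back into a `Result`; an error takes precedence.
    pub fn into_result(self) -> Result<T, String> {
        match (self.data, self.error) {
            (_, Some(message)) => Err(message),
            (Some(value), None) => Ok(value),
            (None, None) => Err("envelope has neither data nor error".to_string()),
        }
    }
}
```

Converting to `Result` at the boundary lets calling code use `?` and ordinary error handling instead of checking both fields by hand.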

Deployment

Kubernetes (Helm)

helm install prx-voice deploy/helm/prx-voice \
  -f deploy/helm/values-us-east.yaml

Region-specific value files are provided for us-east, us-west, and eu-west.

Project Status

This project is in alpha. The core session lifecycle, state machine, event system, and adapter framework are functional. Enterprise features (billing, audit, RBAC) are structurally complete but not yet production-hardened.

License

MIT License — see LICENSE for details.
