VoxLedger

VoxLedger is a voice-first personal finance assistant that lets a user register, unlock the app with their voice, manage budgets, track expenses, read notifications, navigate pages, and ask finance-related questions using natural spoken English.

It combines a React + Vite frontend with a FastAPI backend, SQLite database, Whisper speech-to-text, MFCC-based voice authentication, and gTTS text-to-speech.

Project Overview

VoxLedger is designed as a hands-free finance assistant. The user can interact with the application through voice instead of depending only on buttons and text input.

The system supports:

user registration
voice sample recording
voice-based unlock/authentication
monthly income setup
monthly and category budget management
expense logging
transactions history
unread notifications
alert reading
page navigation by voice
dark mode voice control
finance-related queries and insights

The project aims to feel like a secure, practical assistant that understands the app and performs the correct action from spoken input.

Key Features

Voice Authentication

Register with a name, password, and voice sample
Unlock the app using the registered voice
Reject silence, random noise, and most invalid audio
Compare spoken input against stored voice embeddings

Voice-Driven Finance Actions

Set monthly income
Set monthly budget
Set category budgets
Add expenses using natural speech
Support both default and user-created categories

App Navigation by Voice

Open dashboard
Open budget page
Open transactions
Open notifications
Open alerts
Open profile
Open Add Voice Sample page
Open conversation page

Notifications and Alerts

Read unread notifications
Read all notifications
Read first or second notification
Mark all notifications as read
Read alerts, critical alerts, warnings, and informational alerts

Smart Queries

Check total spending
Check remaining balance
Ask for user details like name and income
Ask for insights and analytics
Query transactions by time/date where supported

Voice Assistant Controls

Stop speaking immediately with Stop
Ignore many non-app and hallucinated transcripts
Support short and long command styles
Work with simple English phrasing

UI Features

Dark mode / light mode switching by voice
Auto-lock after inactivity
Profile and voice sample management
Conversation history

Tech Stack

Frontend

React 18
TypeScript
Vite
Tailwind CSS
React Router
Radix UI
Framer Motion
TanStack React Query

Backend

FastAPI
Uvicorn
Pydantic
SQLite
NumPy / SciPy / scikit-learn

Audio / AI / Speech

OpenAI Whisper for speech-to-text
librosa for audio processing and MFCC features
gTTS for text-to-speech
ffmpeg for audio conversion and speed adjustment

How the Project Works

Complete Flow

The frontend records audio from the browser microphone.
Audio is sent to the backend.
The backend cleans and analyzes the audio.
If needed, voice authentication checks whether the speaker matches the registered user.
Whisper converts speech to text.
The backend detects the user’s intent.
The correct action is performed:
- navigate page
- update budget
- add expense
- read notifications
- answer a query
The backend generates a spoken response.
The frontend plays the assistant response.

Authentication Flow

User records a secure voice sample during registration.
Backend stores a voice embedding in the database.
At unlock time, spoken audio is converted into a probe embedding.
Cosine similarity is computed against stored embeddings.
Access is granted only if the match is strong enough.

Finance Flow

Example:

User says: I spent two hundred on food
Whisper transcribes the sentence
Intent parser identifies add expense
Amount = 200
Category = Food
Transaction is stored in SQLite
Assistant replies with confirmation

Project Structure

VoxLedger_v9_Final/
│
├── VoxLedger_backend/
│   ├── main.py
│   ├── config.py
│   ├── database.py
│   ├── reset_db.py
│   ├── requirements.txt
│   ├── start_backend.ps1
│   ├── start_backend.sh
│   ├── routes/
│   ├── services/
│   ├── utils/
│   ├── database/
│   ├── voice_samples/
│   └── tts_output/
│
├── VoxLedger_frontend/
│   ├── package.json
│   ├── vite.config.ts
│   ├── tailwind.config.ts
│   ├── start.ps1
│   ├── start_frontend.sh
│   ├── src/
│   └── public/
│
└── README.md

Important Files

Backend

main.py — starts the FastAPI application
database.py — creates and manages the SQLite tables
config.py — central configuration values
reset_db.py — clears existing user data and resets the app
routes/voice_routes.py — main voice command pipeline
services/voice_auth_service.py — voice authentication logic
services/whisper_service.py — speech-to-text logic
services/tts_service.py — text-to-speech generation
utils/intent_parser.py — identifies user intent from text

Frontend

package.json — frontend scripts and dependencies
src/ — application pages, components, and voice UI logic
start.ps1 — Windows helper script to start frontend quickly

Prerequisites

Install these before running the project:

Python 3.10 or above
Node.js 18 or above
npm
ffmpeg added to system PATH

Install ffmpeg

Windows

Use either:

winget install ffmpeg

Or download ffmpeg manually and add it to your PATH.

macOS

brew install ffmpeg

Ubuntu / Debian

sudo apt update
sudo apt install ffmpeg

How to Run the Project

You need two terminals:

one for backend
one for frontend

Backend Startup

Windows PowerShell (recommended)

Open PowerShell inside VoxLedger_backend and run:

.\start_backend.ps1

This script will:

create a virtual environment if missing
activate it
install requirements
start FastAPI on port 8000

Manual Backend Start (Windows / macOS / Linux)

cd VoxLedger_backend
python -m venv venv

Activate the virtual environment:

Windows

venv\Scripts\activate

macOS / Linux

source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Run the backend:

uvicorn main:app --reload --port 8000

Backend URL:

http://127.0.0.1:8000

API docs:

http://127.0.0.1:8000/docs

Frontend Startup

Windows PowerShell (recommended)

Open a new PowerShell window inside VoxLedger_frontend and run:

.\start.ps1

Manual Frontend Start

cd VoxLedger_frontend
npm install
npm run dev

Frontend URL:

http://localhost:5173

First-Time Usage

Start the backend.
Start the frontend.
Open http://localhost:5173 in your browser.
The splash screen checks whether a user already exists.
If no user exists, go to the registration page.
Register with:
- name
- password
- voice sample
Record the secure voice sample clearly.
After registration, use voice authentication to unlock the app.

Recommended Registration Voice Sample

Use a longer phrase like:

Hello VoxLedger, this is my secure voice sample for authentication. I will use this voice to access my finance assistant.

This improves the uniqueness of the stored voice pattern.

How to Use the Voice Assistant

Basic Use Pattern

Unlock the app with your voice
Open the page you want or directly speak a command
Wait for the assistant to process the request
Hear the response
Say Stop anytime to interrupt speaking

Good Speaking Tips

speak clearly
avoid fan or TV noise nearby
do not whisper during authentication
keep a small pause before speaking
use a normal speaking voice
stay reasonably close to the microphone

Supported Voice Commands

Authentication / Profile

Unlock VoxLedger
Authenticate me
I want to add another voice sample
I need to add one more voice profile
Go to profile page
What is my name?
What is my monthly income?
How many voice samples do I have?

Navigation

Open dashboard
Open budget page
Open transactions
Open notifications
Open alerts
Open profile
Open conversation page
Open add voice sample page

Budget Commands

Set my monthly budget to fifty thousand
Set food category budget to 2000
Set transport budget to 1000
Show my budget

Income Commands

Set my monthly income to one lakh
Update my monthly income
What is my monthly income?

Expense Commands

Add 200 food
Add two hundred rupees for food
Please log two hundred rupees for food expenses
I spent 500 on shopping
Spend 100 on skincare

Query Commands

How much did I spend?
What is my total spending?
How much money have I spent?
What is my remaining balance?
What balance do I have left?
Give me my spending insight
Show my insights

Notification Commands

Read notifications
Read unread notifications
Read first notification
Read second notification
Mark all notifications as read

Alert Commands

Read alerts
Read critical alerts
Read warnings
Read info alerts

Dark Mode Commands

Turn on dark mode
Enable dark mode
Switch to dark theme
Turn off dark mode
Switch to light mode

Delete / Dangerous Commands

Delete first transaction
Delete second transaction
Confirm delete

Stop Command

Stop
Stop speaking

How to Reset or Delete Existing User Data

If you want to remove all old users and register a fresh new user, use the backend reset script.

Method 1: Recommended

Go to the backend folder and run:

cd VoxLedger_backend
python reset_db.py

This clears user-related data from the database, including:

users
voice embeddings
notifications
transactions
budgets
conversation history

After running it, open the frontend again and register a new user.

Method 2: Manual File Reset

The SQLite database is stored at:

VoxLedger_backend/database/voxledger.db

If needed, you can stop the backend and delete this database file manually, then restart the backend so tables are recreated.

Troubleshooting

1. Backend not starting

Check:

Python is installed
virtual environment is activated
pip install -r requirements.txt completed successfully
port 8000 is free

2. Frontend not starting

Check:

Node.js is installed
npm install completed successfully
port 5173 is free

3. Voice commands not working

Check:

browser microphone permission is allowed
backend is running
frontend is running
you are speaking clearly
environment noise is low

4. Voice authentication failing

Check:

a voice sample was actually saved during registration
you are using the same voice/user
the voice sample was recorded clearly
you are not too far from the mic

If needed, reset the database and register again with a cleaner voice sample.

5. ffmpeg not found

If backend logs mention ffmpeg issues:

install ffmpeg
add it to PATH
restart terminal after installation

6. Whisper is slow

Whisper runs on CPU by default on many systems, so the first few requests may feel slower.

Backend and Frontend Ports

Service	Port	URL
Backend	8000	`http://127.0.0.1:8000`
Frontend	5173	`http://localhost:5173`

Security Notes

Passwords exist in the project flow for login/registration, but sensitive details should never be spoken back by the assistant.
Voice authentication should be used for access control, but it should always be tested carefully in noisy real-world environments.
Destructive actions such as transaction deletion should require confirmation.
If you plan to deploy this project publicly, improve:
- password storage security
- session management
- HTTPS usage
- production database setup
- production-grade authentication rules

Future Improvements

Possible future upgrades:

better speaker verification model
stronger noise rejection for cough and room sounds
multilingual support
real-time live VAD on frontend
cloud database instead of SQLite
downloadable reports
charts and richer analytics insights
mobile app version
role-based multi-user voice profiles

Summary

VoxLedger is a complete voice-first finance assistant project built for:

secure voice login
personal finance tracking
budget management
voice navigation
smart finance queries

To use it:

start backend
start frontend
register a user
record a voice sample
unlock with voice
manage finance operations using speech

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
voxledger-app		voxledger-app
.gitignore		.gitignore
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

VoxLedger

Table of Contents

Project Overview

Key Features

Voice Authentication

Voice-Driven Finance Actions

App Navigation by Voice

Notifications and Alerts

Smart Queries

Voice Assistant Controls

UI Features

Tech Stack

Frontend

Backend

Audio / AI / Speech

How the Project Works

Complete Flow

Authentication Flow

Finance Flow

Project Structure

Important Files

Backend

Frontend

Prerequisites

Install ffmpeg

Windows

macOS

Ubuntu / Debian

How to Run the Project

Backend Startup

Windows PowerShell (recommended)

Manual Backend Start (Windows / macOS / Linux)

Frontend Startup

Windows PowerShell (recommended)

Manual Frontend Start

First-Time Usage

Recommended Registration Voice Sample

How to Use the Voice Assistant

Basic Use Pattern

Good Speaking Tips

Supported Voice Commands

Authentication / Profile

Navigation

Budget Commands

Income Commands

Expense Commands

Query Commands

Notification Commands

Alert Commands

Dark Mode Commands

Delete / Dangerous Commands

Stop Command

How to Reset or Delete Existing User Data

Method 1: Recommended

Method 2: Manual File Reset

Troubleshooting

1. Backend not starting

2. Frontend not starting

3. Voice commands not working

4. Voice authentication failing

5. ffmpeg not found

6. Whisper is slow

Backend and Frontend Ports

Security Notes

Future Improvements

Summary

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Packages