NotesMaker is a real-time lecture note-making project built for turning system audio into usable study material.
It listens to audio playing on the computer, converts speech into text using Whisper, and stores that live transcript in raw.txt.
When recording is stopped, it automatically runs a second script that reads the transcript and generates summarized notes in notes.txt.
The project is designed for long lecture-style audio, not music, and uses VB-Cable to route system output into Python.
The workflow is simple: capture audio -> transcribe chunk by chunk -> stop recording -> summarize transcript into notes.
Before running the project, install these tools and packages:
- Python 3.10 or later
Install these in your terminal:

pip install openai-whisper sounddevice scipy requests numpy

- FFmpeg is required by Whisper for audio handling.
- Download and install FFmpeg, then make sure it is available in your system PATH.
- Install Ollama from its official website.
- Start Ollama on your machine.
After Ollama is installed, pull the model:
ollama pull tinyllama

- Install VB-Cable / VB-Audio Virtual Cable.
- This is what routes your system audio into Python so the project can hear whatever is playing on your computer.
This is the most important step. If the audio routing is wrong, Whisper will not receive the lecture audio properly.
Make sure Python, FFmpeg, Ollama, TinyLlama, and VB-Cable are installed first.
The project expects the computer audio to go into VB-Cable, and then Python listens to the VB-Cable recording side.
On Windows:
- Open Sound Settings
- Under Output, change the default output device to:
CABLE Input (VB-Audio Virtual Cable)
- This means your browser / video / lecture audio is now being sent into VB-Cable.
- Open the old-style Sound Control Panel
- Go to the Recording tab
- Find:
CABLE Output (VB-Audio Virtual Cable)
- This is the device Python should listen to.
The script records at 44100 Hz, so keep the VB-Cable device settings aligned with that.
In Sound Control Panel:
- Open Playback tab
- Open properties for
CABLE Input
- Go to Advanced
- Set a format close to 44100 Hz
Then:
- Open Recording tab
- Open properties for
CABLE Output
- Go to Advanced
- Set the same or a matching sample rate, preferably 44100 Hz
If you see sound issues, also try:
- turning off extra audio enhancements
- keeping the playback and recording formats the same
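If you would rather check the configured rate from Python than from the GUI, a minimal sketch like the following can help. It assumes the sounddevice package is installed; `describe_rate_mismatch` and `check_default_input_rate` are hypothetical helpers, not part of this repo:

```python
TARGET_RATE = 44100  # SpeechToText.py records at this rate

def describe_rate_mismatch(device_rate, target=TARGET_RATE):
    """Return a warning string if the device default differs from the target."""
    if int(device_rate) == target:
        return None
    return f"Device default is {int(device_rate)} Hz; expected {target} Hz"

def check_default_input_rate():
    """Query the default input device's sample rate (call this manually)."""
    import sounddevice as sd  # imported lazily; only needed for the live check
    info = sd.query_devices(kind="input")
    print(describe_rate_mismatch(info["default_samplerate"]) or "Sample rates match")
```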
Run:
py -3.10 list_devices.py

This prints all audio devices. In this project, the script should use the index for CABLE Output.
Open SpeechToText.py and set:
DEVICE_INDEX = 1

Change that number if your VB-Cable device appears at a different index on your machine.
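If you prefer not to scan the device list by eye, a helper along these lines could locate the index automatically. This is a sketch assuming the sounddevice package; `find_cable_index` is not part of the repo:

```python
def find_cable_index(devices, name="CABLE Output"):
    """Return the index of the first recording device whose name contains `name`."""
    for i, dev in enumerate(devices):
        if name.lower() in dev["name"].lower() and dev["max_input_channels"] > 0:
            return i
    return None

def print_cable_index():
    """Query the real devices and print the value to use for DEVICE_INDEX."""
    import sounddevice as sd
    idx = find_cable_index(sd.query_devices())
    print(f"DEVICE_INDEX = {idx}" if idx is not None else "VB-Cable not found")
```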
Before using the full live pipeline, first test whether Whisper is working.
This repo includes test_whisper.py, which transcribes the sample file test_sound.mp3.
Run:
py -3.10 test_whisper.py

If this works, Whisper is installed correctly and can convert audio to text.
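Conceptually, such a test boils down to a few lines (this is an assumption about what test_whisper.py does; check the actual file):

```python
def transcribe_file(path, model_name="base"):
    """Load a Whisper model and transcribe one audio file to text."""
    import whisper  # heavy import kept local so the helper is cheap to define
    model = whisper.load_model(model_name)  # downloads the model on first run
    result = model.transcribe(path)  # needs FFmpeg on PATH to decode mp3
    return result["text"]
```

For example, `transcribe_file("test_sound.mp3")` should return the spoken text from the sample file.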
Optional helper files:
- list_devices.py lists all available audio devices
- record_test.py records a short WAV file from the selected input device so you can check whether VB-Cable routing is working
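record_test.py presumably works along these lines (a sketch, not the repo's actual code; the output file name and duration are assumptions):

```python
SAMPLE_RATE = 44100  # matches the rate SpeechToText.py records at

def duration_to_frames(seconds, rate=SAMPLE_RATE):
    """Number of audio frames needed for a recording of the given length."""
    return int(seconds * rate)

def record_routing_check(device_index=1, seconds=3):
    """Record a short clip from the VB-Cable device and save it as a WAV file."""
    import sounddevice as sd
    from scipy.io import wavfile

    audio = sd.rec(duration_to_frames(seconds), samplerate=SAMPLE_RATE,
                   channels=1, dtype="int16", device=device_index)
    sd.wait()  # block until the recording finishes
    wavfile.write("routing_check.wav", SAMPLE_RATE, audio)
```

If the saved WAV plays back whatever audio was running on your machine, VB-Cable routing is working.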
If Step 2 fails, fix that first before moving to live recording.
Once setup and testing are done, start the main transcription script:
py -3.10 SpeechToText.py

What happens:
- the script first asks if you want to clear raw.txt
- then it asks if you want to clear notes.txt
- if both answers are yes, recording starts
- if either answer is no, the script exits
During recording:
- play your lecture / course / spoken audio on the computer
- the script captures system audio through VB-Cable
- audio is processed in chunk-sized blocks
- Whisper transcribes each chunk
- the transcript is appended continuously to raw.txt
You can open raw.txt while the script is running and see the transcript grow over time.
To stop recording:
Ctrl + C

The script does not abruptly stop and throw away the last part. It first flushes the final buffered audio, processes the last block, finishes transcription, and only then exits the capture stage.
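The shutdown behaviour can be pictured as a small pattern: keep buffering and processing, and on KeyboardInterrupt process whatever is still buffered before exiting. This is a sketch of the idea, not the repo's actual code:

```python
def flush_buffer(blocks):
    """Join the remaining buffered audio blocks into one final chunk."""
    return b"".join(blocks)

def capture_until_interrupt(read_block, process_chunk, blocks_per_chunk=430):
    """Run `read_block()` until Ctrl+C, then process the leftover audio once."""
    pending = []
    try:
        while True:
            pending.append(read_block())
            if len(pending) >= blocks_per_chunk:
                process_chunk(flush_buffer(pending))
                pending.clear()
    except KeyboardInterrupt:
        if pending:
            process_chunk(flush_buffer(pending))  # don't drop the tail
```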
After SpeechToText.py is stopped, it automatically launches summarizer.py.
What summarizer.py does:
- reads the full raw.txt
- splits large transcript text into manageable text chunks
- sends those chunks to TinyLlama through Ollama
- combines the partial summaries
- writes the final result to notes.txt
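The steps above can be sketched as follows. The Ollama HTTP endpoint (`/api/generate`) is the standard local API; the chunk size and prompt wording are assumptions, not the repo's exact values:

```python
def split_transcript(text, max_chars=4000):
    """Split on line boundaries, keeping each chunk under max_chars."""
    chunks, current = [], ""
    for para in text.split("\n"):
        if len(current) + len(para) + 1 > max_chars and current:
            chunks.append(current.strip())
            current = ""
        current += para + "\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

def summarize_chunks(chunks):
    """Summarize each chunk with TinyLlama through Ollama (call manually)."""
    import requests
    partials = []
    for chunk in chunks:
        resp = requests.post("http://localhost:11434/api/generate", json={
            "model": "tinyllama",
            "prompt": "Summarize this lecture excerpt as bullet points:\n" + chunk,
            "stream": False,  # return one JSON object instead of a token stream
        })
        partials.append(resp.json()["response"])
    return "\n\n".join(partials)
```

Chunking matters because TinyLlama's context window is small; sending the whole transcript at once would truncate long lectures.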
The final notes.txt is meant to contain:
- bullet-wise notes
- clearer main points
- condensed lecture context
- a cleaner summary than the raw transcript
- This project is meant for lecture-style or educational spoken content.
- If you feed it songs or other non-lecture audio, the summary may come out incoherent or low quality.
- raw.txt is the direct transcript.
- notes.txt is the summarized output.
- chunks/ contains temporary audio files during processing and is not meant to be kept.
- SpeechToText.py - live audio capture and transcription
- summarizer.py - transcript summarization using TinyLlama through Ollama
- list_devices.py - audio device listing helper
- record_test.py - basic input recording test
- test_whisper.py - test transcription on sample audio