
PrivateScribe

Private, on-device transcription & summarisation with a clean GOV.UK-style interface.
No servers. No cloud. Your audio and text stay on your machine.

  • 🎙️ Live microphone capture or upload recordings (mp3 / wav / m4a / ogg / webm)
  • 🔤 Whisper (Transformers.js, WASM) for speech-to-text — runs entirely in the browser
  • 🧠 WebLLM summariser & text-formatter — runs with WebGPU (WASM fallback)
  • 📝 Extractive summaries (no invention) + action-items scaffold
  • 🗂️ Local History, Markdown export, Print to PDF
  • 🧱 GOV.UK Design System look & feel (no government branding)

Privacy: All processing happens on your device. Nothing is uploaded.


Demo & features

  • Home
    Start/Stop recording → live transcript (fixed-height scroller), live extractive summary.
    Or Upload an audio file instead of the mic.
  • Models
    Choose your Whisper checkpoint (tiny / base / small) and WebLLM model; warm each once.
  • Formatter
    Paste messy text (no punctuation/casing) → get clean, properly formatted text.
  • History
    Save sessions locally and re-open later.
  • Export
    Markdown (.md) and Print to PDF.
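
The Markdown export can be sketched as a pure helper in the spirit of `src/utils/export.ts` (the interface and function names here are illustrative, not the real API):

```typescript
// Illustrative sketch only; the real helper lives in src/utils/export.ts.
interface SessionLike {
  title: string;
  transcript: string;
  summary: string[]; // extractive bullet points
}

// Build a Markdown document from a saved session.
function toMarkdown(session: SessionLike): string {
  return [
    `# ${session.title}`,
    "",
    "## Summary",
    ...session.summary.map((point) => `- ${point}`),
    "",
    "## Transcript",
    session.transcript,
    "",
  ].join("\n");
}
```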

Requirements

  • Node.js 18+ (tested with Node 22.19.0)
  • OS: Windows 10/11, macOS, or Linux
  • Browser: Chrome / Edge (WebGPU preferred)
    • Check WebGPU: open chrome://gpu and verify WebGPU is enabled.

Quick start (Windows / macOS / Linux)

# 1) Clone your repo
git clone https://github.com/<your-username>/privatescribe-web.git
cd privatescribe-web

# 2) Install dependencies
npm install

# 3) Copy GOV.UK dist (CSS/JS/assets) into /public/govuk
npm run copy:govuk

# 4) Start dev server
npm run dev
# → open http://localhost:3000


**Windows PowerShell note (if npm is blocked):**

```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

Run

Development

npm run dev          # Next dev server: http://localhost:3000

Production

npm run build        # Build production bundle
npm run copy:govuk   # Ensure govuk dist is present in /public/govuk
npm run start        # Start prod server (set PORT=xxxx to change)

Clear Next.js cache (if needed)

# macOS/Linux
rm -rf .next

# Windows PowerShell
Remove-Item -Recurse -Force .next

Usage

  1. Warm models (first run)

    • Go to /models
    • Pick Whisper: Xenova/whisper-base.en (recommended) or Xenova/whisper-small.en (best, heavier)
    • Pick WebLLM: e.g. Llama-3.2-1B-Instruct-q4f16_1-MLC or Phi-3-mini-4k-instruct-q4f16_1-MLC
    • Click Download / Warm up for each until the status shows Ready. (Choices persist in localStorage.)
  2. Record or upload

    • Start recording → allow mic → watch live transcript & summary
    • Or Upload recording (mp3/wav/m4a/ogg/webm) → decoded locally, transcribed in ~15s chunks
  3. Save & export

    • Save session to store locally (open via History)
    • Export Markdown or Export PDF (Print)
  4. Fix unformatted text

    • Paste raw text → Fix formatting (on-device via WebLLM/WASM)
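
The ~15 s chunking on the upload path can be sketched as a pure function over decoded mono PCM (a simplified sketch; the actual worker also overlaps chunks via `stride_length_s`):

```typescript
// Illustrative only: split decoded mono PCM (already at 16 kHz) into ~15 s
// chunks, as the upload path does before handing each chunk to the ASR worker.
const SAMPLE_RATE = 16_000;
const CHUNK_SECONDS = 15;

function chunkPcm(pcm: Float32Array, chunkSeconds: number = CHUNK_SECONDS): Float32Array[] {
  const chunkLen = chunkSeconds * SAMPLE_RATE;
  const chunks: Float32Array[] = [];
  for (let start = 0; start < pcm.length; start += chunkLen) {
    // subarray is a zero-copy view into the underlying buffer
    chunks.push(pcm.subarray(start, Math.min(start + chunkLen, pcm.length)));
  }
  return chunks;
}
```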

Model selection & performance tips

Whisper (ASR)

  • Recommended: Xenova/whisper-base.en
  • Best accuracy (heavier): Xenova/whisper-small.en
  • Fastest (least accurate): Xenova/whisper-tiny.en

ASR decoding (pre-tuned in worker):

  • temperature: 0 (deterministic)
  • num_beams: 5 (beam search helps names/rare words)
  • stride_length_s: 2 (overlap improves word boundaries)
  • language: 'en' (prevents language drift)
  • Cleans [BLANK_AUDIO] / [MUS_AUDIO] artifacts
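
The tuned options above correspond roughly to the options object passed to the Transformers.js pipeline (a sketch, not the worker's exact code); the artifact cleanup is a simple regex pass:

```typescript
// Sketch of the decoding options described above, in the shape the
// Transformers.js automatic-speech-recognition pipeline accepts.
// The worker's exact code may differ.
const asrOptions = {
  temperature: 0,      // deterministic decoding
  num_beams: 5,        // beam search helps names/rare words
  chunk_length_s: 15,  // assumed: matches the ~15 s upload chunking
  stride_length_s: 2,  // overlap improves word boundaries
  language: "en",      // prevents language drift
};

// Strip Whisper's non-speech markers such as [BLANK_AUDIO] / [MUS_AUDIO]
// and collapse the whitespace they leave behind.
function cleanTranscript(text: string): string {
  return text
    .replace(/\[(?:BLANK|MUS)_AUDIO\]/g, "")
    .replace(/\s{2,}/g, " ")
    .trim();
}
```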

Audio tips

  • Use a decent mic; keep close; reduce noise
  • Uploads are resampled to 16 kHz automatically
  • Long silences are skipped; keep recordings tidy
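
The 16 kHz resampling is a small linear-interpolation pass, in the spirit of `src/utils/resample.ts` (a minimal sketch; the real implementation may differ):

```typescript
// Minimal linear-interpolation resampler to 16 kHz. Illustrative sketch;
// the shipped version lives in src/utils/resample.ts.
function resampleTo16k(input: Float32Array, inputRate: number): Float32Array {
  const targetRate = 16_000;
  if (inputRate === targetRate) return input;
  const ratio = inputRate / targetRate;
  const outLen = Math.floor(input.length / ratio);
  const out = new Float32Array(outLen);
  for (let i = 0; i < outLen; i++) {
    const pos = i * ratio;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    // Interpolate between the two nearest source samples
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```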

WebLLM (summariser & formatter)

  • Good default: Llama-3.2-1B-Instruct-q4f16_1-MLC
  • Very fast on modest machines: Phi-3-mini-4k-instruct-q4f16_1-MLC
  • Temperature 0.0 & an extractive prompt prevent invention
  • WebGPU is much faster than WASM; Chrome/Edge recommended
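
Even at temperature 0, LLM output often wraps JSON in code fences or prose, so the summariser parses it tolerantly. A sketch in the spirit of `src/summarizer/schema.ts` (the real parser may differ):

```typescript
// Tolerant JSON extraction: pull out the first balanced {...} block from
// LLM output before parsing. Naive sketch: it ignores braces inside JSON
// strings, which the real parser may need to handle.
function parseTolerantJson(raw: string): unknown {
  const start = raw.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  for (let i = start; i < raw.length; i++) {
    if (raw[i] === "{") depth++;
    else if (raw[i] === "}" && --depth === 0) {
      try {
        return JSON.parse(raw.slice(start, i + 1));
      } catch {
        return null; // balanced braces but not valid JSON
      }
    }
  }
  return null; // never balanced
}
```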

Folder structure (high level)

app/
  layout.tsx            # GOV.UK shell (header/footer) + SW handling in dev/prod
  page.tsx              # Home: record/upload → transcript → extractive summary
  models/page.tsx       # Model selectors + "Download / Warm up"

public/
  govuk/                # govuk-frontend.min.(css|js) + assets (copied via script)
  # other static assets…

scripts/
  copy-govuk-assets.mjs # Copies node_modules/govuk-frontend/dist/govuk → public/govuk

src/
  asr/
    asr-loader.ts       # Worker bootstrap, chunk transcribe, 16kHz pipeline, warm/dispose
    asr.worker.ts       # Whisper via @xenova/transformers (beam search, stride, etc.)
  audio/
    mic.ts              # Mic capture → PCM frames
    file.ts             # File decode (AudioContext) → mono PCM
  llm/
    webllm.ts           # WebLLM init (old signature), warm, summarize, formatter, fallback
  summarizer/
    prompt.ts           # Strict extractive JSON prompt template
    schema.ts           # Parse/validate tolerant JSON from LLM
  store/
    db.ts               # Session save/load (local browser storage)
    settings.ts         # Persist chosen model IDs (localStorage)
  styles/
    globals.css         # GOV.UK tweaks + transcript scroller (.app-scroll)
  utils/
    export.ts           # Markdown/print helpers
    resample.ts         # Linear resample → 16 kHz

Architecture

Recording flow: Mic (WebAudio) → 16 kHz PCM frames → ASR worker (Whisper) → transcript segments → summariser (WebLLM with extractive prompt) → UI

Upload flow: File → decode (AudioContext) → resample to 16 kHz → chunked ASR → same summarisation flow

Why extractive? Prompt + temperature 0 ensure the model does not invent names/decisions/dates. Unknowns are omitted.


Configuration

  • Model IDs stored in localStorage:

    • asrModelId → Xenova/whisper-*.en
    • webllmModelId → one from WebLLM prebuiltAppConfig.model_list
  • Service Worker

    • Dev: unregistered to avoid stale Next.js chunks
    • Prod: you may register one (avoid caching /_next/* bundles)
  • Styling

    • public/govuk/govuk-frontend.min.css + .js linked in app/layout.tsx
    • Header shows PrivateScribe (no crown/wordmark)
    • Fonts: system-ui stack (Transport webfont is restricted)
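
The model-ID persistence can be sketched against a minimal Storage-like interface so it also runs outside the browser (key names match the Configuration section; the function names and defaults are illustrative, not the exact `src/store/settings.ts` API):

```typescript
// Illustrative sketch of src/store/settings.ts. In the browser, pass
// window.localStorage; the KVStore interface exists only to keep this runnable
// anywhere.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ASR_KEY = "asrModelId";
const LLM_KEY = "webllmModelId";

function saveModelChoices(store: KVStore, asrId: string, llmId: string): void {
  store.setItem(ASR_KEY, asrId);
  store.setItem(LLM_KEY, llmId);
}

function loadModelChoices(store: KVStore): { asrId: string; llmId: string } {
  return {
    // Defaults are the README's recommended models (assumed, not verified)
    asrId: store.getItem(ASR_KEY) ?? "Xenova/whisper-base.en",
    llmId: store.getItem(LLM_KEY) ?? "Llama-3.2-1B-Instruct-q4f16_1-MLC",
  };
}
```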

Browser support

  • Chrome 121+ / Edge 121+ — WebGPU recommended
  • Firefox — WASM path works; WebGPU varies by platform
  • Safari — WASM path works; WebGPU support varies by OS

If WebGPU isn’t available, WebLLM falls back to WASM (slower but functional).
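
The fallback decision amounts to a feature check on `navigator.gpu`. A sketch written against a navigator-like parameter so it runs anywhere (WebLLM's real check is more involved, e.g. it actually requests an adapter):

```typescript
// Feature-detect sketch: pick WebGPU when navigator.gpu exists, else WASM.
// Pass the real `navigator` in the browser.
type Backend = "webgpu" | "wasm";

function pickBackend(nav: { gpu?: unknown }): Backend {
  return nav.gpu ? "webgpu" : "wasm";
}
```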


Troubleshooting

PowerShell: npm.ps1 cannot be loaded → Use npm.cmd or run:

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass

Dev chunk error: “Loading chunk … failed” → Due to stale SW caching. We unregister SW in dev. If stuck:

  • Hard refresh Ctrl+F5
  • DevTools → Application → Service Workers → Unregister
  • DevTools → Application → Storage → Clear site data

GOV.UK styles not applied

  • Ensure:

    • public/govuk/govuk-frontend.min.css
    • public/govuk/govuk-frontend.min.js
  • In app/layout.tsx:

    <link rel="stylesheet" href="/govuk/govuk-frontend.min.css" />
    <script src="/govuk/govuk-frontend.min.js" defer></script>
  • Run npm run copy:govuk if missing

WebLLM error: Cannot find model record in appConfig for [object Object]

  • Use old signature:

    CreateMLCEngine(modelId, { appConfig })
  • Pick a model from /models (IDs must match exactly)

ASR accuracy is poor

  • Use whisper-base.en or whisper-small.en
  • Reduce background noise / speak closer
  • Ensure uploaded audio is clean; (re)warm model once

Clear Next.js cache

rm -rf .next            # macOS/Linux
Remove-Item -Recurse -Force .next   # Windows PowerShell

Security & privacy

  • ASR (Transformers.js WASM) and LLM (WebGPU/WASM) run entirely in browser
  • Audio/text never leave your device
  • Sessions saved to local browser storage
  • No analytics/telemetry by default

Branding & licenses

  • App name: PrivateScribe (non-government)
  • Uses GOV.UK Design System styles (MIT), without protected government branding
  • Fonts: system-ui stack (Transport is restricted)

License: MIT © 2025 Bashir Abubakar (see LICENSE)

Third-party notices

  • govuk-frontend — MIT
  • @xenova/transformers — MIT
  • @mlc-ai/web-llm — Apache-2.0

Contributing

  1. Fork & branch: feat/<short-name>
  2. npm i, npm run dev
  3. Keep PRs focused; attach before/after notes or screenshots
  4. Ensure no cloud calls; app must remain fully local by default

Roadmap

  • iOS app (native) using Apple Speech on-device + same extractive summariser
  • Speaker diarisation (labels), improved timestamps
  • PII redaction mode
  • Model cards with size & speed guidance on /models
  • Session export as JSON and SRT

Repository setup

Suggested repo name: privatescribe or privatescribe-web
Description: Private, on-device transcription and summarisation. GOV.UK-style UI.

First push

git init
git add .
git commit -m "feat: PrivateScribe on-device transcription + GOV.UK UI"
git branch -M main
git remote add origin https://github.com/<your-username>/<your-repo>.git
git push -u origin main

Scripts

npm run dev        # Start Next.js dev server (http://localhost:3000)
npm run build      # Production build
npm run start      # Start production server
npm run copy:govuk # Copy govuk-frontend dist → public/govuk

Credits

PrivateScribe — developed by Bashir Abubakar. UI built with the GOV.UK Design System. Not an official government service.

