shknth/idio
idio

Personalized text rewriting using on-device Small Language Models. idio learns your writing style and rewrites text in your tone — privately, on your machine.

Built as part of the UCD Machine Learning Systems Deployment course.


Overview

idio runs in the background and listens for a global hotkey. Select any text, press the shortcut, and a compact overlay appears with three rewrite modes:

  • Professional — formal, polished language
  • Friendly — warm, conversational tone
  • Mirror — rewrites in your personal style, learned from how you actually write

The model runs entirely on-device. No text is sent to any cloud service.


Architecture

┌─────────────────────────────┐
│   macOS (Swift)             │
│   NSPanel overlay           │
│   CGEvent hotkey            │
│   Accessibility API         │
└────────────┬────────────────┘
             │  POST /rephrase
             │
 ┌───────────▼───────────┐
 │   FastAPI Backend     │
 │   SmolLM2-360M GGUF   │
 │   llama-cpp-python    │
 │   SQLite training DB  │
 │   LoRA fine-tuning    │
 └───────────────────────┘
| Platform | Technology |
| --- | --- |
| macOS app | Swift 6, SwiftUI, NSPanel, CGEvent, Accessibility API |
| Backend | Python 3.11+, FastAPI, llama-cpp-python, PEFT, SQLite |
| Model | HuggingFaceTB/SmolLM2-360M-Instruct (GGUF Q4_K_M) |

Prerequisites

macOS app

  • macOS 14 (Sonoma) or later
  • Xcode 15+ with Swift 6 toolchain
  • Accessibility permissions granted to the terminal / IDE running the app

Backend

  • uv — Python package manager (pip install uv)
  • Python 3.11 or higher
  • Git

Installation

1. Clone the repository

git clone https://github.com/your-org/idio.git
cd idio

2. Backend setup

cd backend

# Install core dependencies
uv pip install -e .

# Install training dependencies (required for LoRA fine-tuning)
uv pip install -e ".[training]"

# Install inference engine — pick one based on your hardware:

# Apple Silicon (M1/M2/M3/M4) — Metal GPU
uv pip install -e ".[llm-metal]"

# NVIDIA GPU (CUDA 12.1)
uv pip install -e ".[llm-cuda]"

# NVIDIA GPU (CUDA 12.4)
uv pip install -e ".[llm-cuda124]"

# CPU only
uv pip install -e ".[llm-cpu]"

# Install SpaCy language model (required for PII scrubbing)
uv run python -m spacy download en_core_web_sm

3. Download the language model

The backend requires a GGUF quantized model. Download SmolLM2-360M-Instruct:

cd backend
bash scripts/download_models.sh

This saves the model to backend/models/base/. Alternatively, download the GGUF file manually from Hugging Face and set IDIO_MODEL_PATH to wherever you place it.

4. Configure environment

cd backend
cp .env.example .env

Edit backend/.env:

# Required — path to the GGUF model file
IDIO_MODEL_PATH=../models/base/smollm2-360m-instruct-q4_k_m.gguf

# Optional overrides (defaults shown)
HOST=127.0.0.1
PORT=45500
INTERNAL_API_KEY=dev-insecure-key
BASE_MODEL_HF=HuggingFaceTB/SmolLM2-360M-Instruct
TRAINING_ENABLED=false
TRAINING_MIN_EXAMPLES=50
TRAINING_INTERVAL_HOURS=24

Running the Backend

cd backend
uv run uvicorn idio.main:app --host 127.0.0.1 --port 45500 --reload

Verify the server is running:

curl http://127.0.0.1:45500/health

The training dashboard is available at:

http://127.0.0.1:45500/dashboard

Running the macOS App

# From the repo root
swift build -c release
.build/release/idio

Or create a standalone .app bundle:

mkdir -p idio.app/Contents/MacOS
cp .build/release/idio idio.app/Contents/MacOS/
cp frontend/macos/Resources/Info.plist idio.app/Contents/
open idio.app

On first launch, macOS will prompt for Accessibility permissions. Grant them in System Settings → Privacy & Security → Accessibility.

Hotkey: Cmd + Shift + R


API Reference

The backend exposes the following HTTP endpoints:

Public

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Server health, adapter status, unsynced count |
| POST | /rephrase | Rewrite text in a given mode |

Rephrase request:

POST /rephrase
{
  "text": "can u send me the doc when u get a chance",
  "mode": "professional"
}

Modes: professional, friendly, mirror

Response:

{
  "original": "can u send me the doc when u get a chance",
  "rephrased": "Could you please send me the document at your earliest convenience?",
  "same_as_input": false,
  "hint": "",
  "mode": "professional",
  "latency_ms": 312.4
}
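
A minimal client for this endpoint needs only the Python standard library. The sketch below builds the POST request and validates the mode locally; the helper name and structure are illustrative, not part of the backend:

```python
import json
import urllib.request

VALID_MODES = {"professional", "friendly", "mirror"}

def build_rephrase_request(text: str, mode: str,
                           base_url: str = "http://127.0.0.1:45500"):
    """Build a POST /rephrase request; rejects unknown modes before any network I/O."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    body = json.dumps({"text": text, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/rephrase",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it (requires the backend to be running):
# with urllib.request.urlopen(build_rephrase_request("can u send the doc", "professional")) as r:
#     print(json.load(r)["rephrased"])
```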

Internal (require X-API-Key header)

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/v1/internal/training-status | Dataset counts |
| GET | /api/v1/internal/sync-training | Pull unsynced training batch |
| POST | /api/v1/internal/mark-trained | Mark examples as trained |
| POST | /api/v1/internal/collect | Passively collect text |
| POST | /api/v1/internal/trigger-training | Manually trigger LoRA run |
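
Internal endpoints differ from public ones only in the required X-API-Key header, whose value is the INTERNAL_API_KEY from backend/.env (dev-insecure-key by default). A stdlib sketch, with the helper name being illustrative:

```python
import urllib.request

def internal_request(path: str,
                     api_key: str = "dev-insecure-key",
                     base_url: str = "http://127.0.0.1:45500"):
    """Build a GET request for an internal endpoint, attaching the shared-secret header."""
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"X-API-Key": api_key},
    )

# e.g. internal_request("/api/v1/internal/training-status")
```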

Dashboard

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /dashboard | Training dashboard UI |
| GET | /api/v1/dashboard/stats | Record counts (total, unsynced, trained, pinned) |
| GET | /api/v1/dashboard/training-data | Paginated training records |
| GET | /api/v1/dashboard/cycles | All training cycles |
| GET | /api/v1/dashboard/cycles/{id}/evaluations | Before/after baseline evaluations |
| GET | /api/v1/dashboard/metrics | Latest evaluation metrics report (JSON) |
| DELETE | /api/v1/dashboard/flush-records | Delete all training samples (preserves cycles and evaluations) |

Training the Mirror Model

idio learns your style by collecting the text you type and using it to fine-tune a LoRA adapter on top of SmolLM2.

How collection works

Text is collected from three sources:

  • Keyboard typing — sentences completed with ., !, ?, or Enter
  • Clipboard monitor — text copied from whitelisted apps (Messages, Notes)
  • Rephrase requests — the original text sent for rewriting

All collected text is scrubbed for PII (names, locations, phone numbers, API keys) before storage.
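
Captured sentences are also filtered by length and character makeup before they enter the database (the MIN_SENTENCE_WORDS and MIN_ALPHA_RATIO settings in the environment variables reference). A sketch of those two documented filters, with the function name and exact implementation being illustrative:

```python
def keep_sentence(sentence: str,
                  min_words: int = 2,            # MIN_SENTENCE_WORDS default
                  min_alpha_ratio: float = 0.4   # MIN_ALPHA_RATIO default
                  ) -> bool:
    """Return True if a captured sentence passes the word-count and alpha-ratio filters."""
    stripped = sentence.strip()
    if not stripped:
        return False
    if len(stripped.split()) < min_words:
        return False
    # Ratio of alphabetic characters filters out numeric noise, key mashing, etc.
    alpha = sum(ch.isalpha() for ch in stripped)
    return alpha / len(stripped) >= min_alpha_ratio
```

For example, "Thanks for the update!" passes both checks, while "ok" (one word) and "123 456 789" (no alphabetic characters) are dropped.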

Running a training cycle manually

cd backend
uv run python scripts/train_lora.py

# Options
uv run python scripts/train_lora.py --batch-size 200 --min-examples 30
uv run python scripts/train_lora.py --dry-run   # preview without training

Each training cycle:

  1. Evaluates 10 baseline sentences through the current model (before)
  2. Fine-tunes a LoRA adapter on your collected text
  3. Evaluates the same 10 sentences again (after)
  4. Registers and activates the new adapter

Results are visible in the dashboard at /dashboard under the Training Cycles & Evaluation tab.
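
To get a sense of why a LoRA adapter is cheap enough to retrain on-device: a rank-r adapter on a d_out × d_in weight matrix adds only an r × d_in matrix A and a d_out × r matrix B. The arithmetic below assumes a 960 × 960 projection (960 is SmolLM2-360M's hidden size, stated here as an assumption) at the default rank of 16; the total adapter size depends on which modules receive adapters:

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one rank-r LoRA adapter: A (r x d_in) + B (d_out x r)."""
    return rank * (d_in + d_out)

# One 960x960 projection at rank 16:
print(lora_param_count(960, 960, 16))  # 30720 — vs. 921,600 for the full matrix
```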

Running evaluation metrics

Generate a comprehensive metrics report covering semantic similarity, lexical similarity, inference performance, and training stats:

cd backend

# Install eval dependencies (first time only)
uv pip install -e ".[eval]"

# Run the evaluation script
uv run python scripts/eval_metrics.py

This creates timestamped reports in backend/reports/ (JSON + Markdown). View results in the dashboard under the Metrics tab.

Enabling automatic training

Set TRAINING_ENABLED=true in backend/.env. The scheduler will trigger training automatically every TRAINING_INTERVAL_HOURS hours (default: 24) when at least TRAINING_MIN_EXAMPLES (default: 50) new examples have been collected.
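
The scheduler's gating condition described above — enough new examples *and* enough elapsed time — can be sketched as a pure function (illustrative only; the backend's actual scheduler lives in src/idio/infrastructure/):

```python
def should_train(unsynced_examples: int,
                 last_run_ts: float,
                 now_ts: float,
                 min_examples: int = 50,       # TRAINING_MIN_EXAMPLES default
                 interval_hours: float = 24.0  # TRAINING_INTERVAL_HOURS default
                 ) -> bool:
    """Trigger a run only when both the data-volume and elapsed-time thresholds are met."""
    enough_data = unsynced_examples >= min_examples
    enough_time = (now_ts - last_run_ts) >= interval_hours * 3600
    return enough_data and enough_time
```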


Development

Running tests

cd backend
uv run pytest
uv run pytest --benchmark-only   # latency benchmarks only

Linting

cd backend
uv run ruff check .
uv run mypy src/

Project structure

idio/
├── Package.swift               # Swift package (macOS)
├── README.md
├── backend/
│   ├── pyproject.toml
│   ├── .env.example
│   ├── scripts/
│   │   ├── download_models.sh  # Fetch GGUF model
│   │   ├── install_engine.py   # Hardware-aware llama-cpp install
│   │   ├── train_lora.py       # LoRA training pipeline
│   │   └── eval_metrics.py     # Evaluation metrics report
│   ├── src/idio/
│   │   ├── api/routes/         # FastAPI route handlers
│   │   ├── domain/             # Business logic (collection, inference, training)
│   │   └── infrastructure/     # DB, LLM engine, scheduler, settings
│   ├── static/
│   │   └── dashboard.html      # Training dashboard UI
│   └── tests/
└── frontend/
    └── macos/                  # Swift overlay app

Environment Variables Reference

| Variable | Default | Description |
| --- | --- | --- |
| IDIO_MODEL_PATH | (required) | Absolute or relative path to the GGUF model file |
| IDIO_ADAPTER_PATH | — | Override path for LoRA adapter (optional) |
| HOST | 127.0.0.1 | Backend bind address |
| PORT | 45500 | Backend port |
| INTERNAL_API_KEY | dev-insecure-key | Shared secret for internal endpoints |
| BASE_MODEL_HF | HuggingFaceTB/SmolLM2-360M-Instruct | HuggingFace model ID for LoRA training |
| TRAINING_ENABLED | false | Enable automatic scheduled training |
| TRAINING_MIN_EXAMPLES | 50 | Minimum examples required before a training run |
| TRAINING_INTERVAL_HOURS | 24 | Hours between automatic training runs |
| TRAINING_SYNC_BATCH_SIZE | 100 | Max examples pulled per training run |
| TRAINING_EPOCHS | 2 | LoRA training epochs |
| TRAINING_LORA_RANK | 16 | LoRA adapter rank |
| MIN_SENTENCE_WORDS | 2 | Minimum words for a captured sentence to be kept |
| MIN_ALPHA_RATIO | 0.4 | Minimum alphabetic character ratio (filters noise) |
| DB_PATH | storage/user_data.db | SQLite database path |

License

Copyright 2024 idio Contributors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
