
AccessEye MVP

Hands-free, eye-tracking browser interface — deployed on Cloudflare Pages


What It Does

AccessEye lets you control a computer entirely with your eyes — no hands, no mouse, no touch. It runs 100% in the browser using your webcam. No data ever leaves the device; all processing is local.

Look at any button or link and dwell on it for ~350 ms to activate it. An optional voice layer lets you speak commands ("click", "scroll down", "show clickable items") as a parallel input channel. A gesture layer lets you trigger actions with facial movements (lip-tap, tongue-out, custom-trained gestures).
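Dwell activation is conceptually simple. The sketch below shows the core idea; the class and method names are illustrative, not the actual AccessEye API:

```javascript
// Minimal dwell-activation sketch. An element "activates" once gaze has
// stayed on it continuously for dwellMs milliseconds, and fires only once
// per dwell.
class DwellTimer {
  constructor(dwellMs = 350) {
    this.dwellMs = dwellMs;
    this.target = null;   // element currently under gaze
    this.enteredAt = 0;   // timestamp when gaze entered it
    this.fired = false;
  }
  // Call once per frame with the element under the gaze cursor (or null).
  // Returns the element when its dwell completes, otherwise null.
  update(element, now = Date.now()) {
    if (element !== this.target) {        // gaze moved to a new target
      this.target = element;
      this.enteredAt = now;
      this.fired = false;
      return null;
    }
    if (element && !this.fired && now - this.enteredAt >= this.dwellMs) {
      this.fired = true;                  // fire once per dwell
      return element;                     // caller dispatches 'activate'
    }
    return null;
  }
}
```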

Target users: people with motor impairments, ALS, spinal cord injury, or any condition that makes traditional input devices inaccessible.


Feature Summary

| Feature | Status |
| --- | --- |
| Real-time eye tracking (MediaPipe FaceMesh, 478 landmarks) | ✅ Live |
| Quality-gated 9-point calibration | ✅ Live |
| Phase 2 — binocular iris + head-pose + pupil fusion | ✅ Live |
| Phase 3 — One-Euro filter, IVT, PACE, adaptive dwell | ✅ Live |
| Snap-To engine — gravity-model target assist | ✅ Live |
| Accuracy Engine — gain remap, drift correction, center gravity | ✅ Live |
| Voice Navigation + Intent Fusion | ✅ Live |
| Gesture Studio — facial gesture detection + custom training | ✅ Live |
| Accessibility Control Mode (WCAG 2.1 AA / ADA / Section 508) | ✅ Live |
| AI Intent Prediction (OpenAI-backed, server-side) | ✅ Live |

Architecture Overview

Browser (all client-side)
├── app.js                Phase 1 — MediaPipe wiring, CalibrationEngine v8,
│                         CalibrationUI, GazeEngine, UIElementRegistry, dwell
├── phase2-engine.js      Phase 2 — HybridGazeEngine, HeadPoseEstimator,
│                         TemporalStabilizer, DynamicCalibrationEngine,
│                         MicroSaccadeFilter, GazeConfidenceScorer,
│                         IntentPredictionEngine, GazeBenchmark
├── phase2-init.js        Phase 2 bootstrap — camera patching, UI wiring,
│                         intent-engine toggle, benchmark controls
├── phase3-engine.js      Phase 3 — OneEuroFilter, IVTSaccadeDetector,
│                         AdaptiveDwellTimer, PACERecalibrator,
│                         SmoothPursuitCalibrator, CalibrationValidator,
│                         HeadFreeStabilizer
├── phase3-init.js        Phase 3 bootstrap — dwell presets, smooth-pursuit
│                         calibration UI, post-calib validation
├── snap-engine.js        Snap-To Engine — TargetPredictor, AdaptiveGazeLearner,
│                         SnapToEngine (gravity-model soft attractor)
├── accuracy-engine.js    Accuracy Engine v3 — GazeGainRemapper,
│                         CenterGravity, GravitySnapEngine,
│                         PerSessionDriftCorrector
├── voice-nav.js          Voice Navigation + Intent Fusion — Web Speech API,
│                         NavigationElementList, VoiceNavigationController,
│                         show-clickable overlay, focus-on, all 40+ commands
├── gesture-studio.js     Gesture Studio — FacialGestureEngine (lip-tap,
│                         tongue-out), CustomGestureRecorder/Recognizer,
│                         GestureStudio lifecycle manager
├── a11y-mode.js          Accessibility Control Mode (ACM) — WCAG 2.1 AA/AAA,
│                         ADA Title III, Section 508 — gaze-dwell focus loop,
│                         skip-nav, live region, dictation, hint overlay
└── a11y-logger.js        Interaction logger — timestamped CSV + PDF export
                          of all ACM modality events

Edge (Cloudflare Pages / Workers)
└── src/index.tsx         Hono app — serves HTML shell, /api/intent endpoint

Full Gaze Pipeline (per frame)

Webcam (30 fps, 640×480)
  ↓
MediaPipe FaceMesh (refineLandmarks = true, 478 pts)
  ↓
HeadPoseEstimator  ← 6-DOF Euler angles from 6 stable anchor points
  ↓
HybridGazeEngine
  ├── iris offset signal  (binocular centroid, EMA-stabilised eye span)
  ├── head-pose signal    (yaw/pitch compensation, gated on |headMag| > 15°)
  └── pupil boundary signal (centroid relative to eye centre)
       ↓ confidence-weighted fusion
  ↓
GazeConfidenceScorer  ← brightness, occlusion, glare, landmark quality
  ↓
[Phase 3] OneEuroFilter  (minCutoff 1 Hz, β 0.007)  ← inserted by P3 init
  ↓
TemporalStabilizer  (Kalman R=0.005, Q=0.0001 + EMA α=0.22 + trimmed mean)
  ↓
MicroSaccadeFilter  (12 px / 200 ms fixation stability window)
  ↓
DynamicCalibrationEngine.applyBiasCorrection  ← slow EMA drift offset (±12%)
  ↓
CalibrationEngine.mapGaze
  (degree-3 polynomial ridge regression, λ=0.01, 10 terms, iris-only model)
  ↓ [GazeGainRemapper patch, accuracy-engine]
  (power-law remap u′ = sign(u)×|u|^0.70 — expands centre, compresses edges)
  ↓
AccuracyEngine patches (_updateGazeCursor)
  ├── GravitySnapEngine  (Grossman & Balakrishnan gravity model)
  ├── CenterGravity      (idle return pull, 0.8%/frame)
  └── PerSessionDriftCorrector  (fixation-centroid comparison, ±2.5% max)
  ↓
app._updateGazeCursor  → pixel position (lastScreenX, lastScreenY)
  ↓
SnapToEngine.update  (optional) → snapped pixel position
  ↓
UIElementRegistry.updateGaze → dwell check → 'activate' event
  ↓
[Phase 3 AdaptiveDwellTimer] → IVT-gated dwell (Fast / Normal / Accessible / Extended)
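The Phase 3 One-Euro filter in the pipeline above is a published algorithm (Casiez et al., CHI 2012). Here is a minimal single-axis sketch using the listed parameters (minCutoff 1 Hz, β 0.007); it is an illustration, not the shipped phase3-engine.js code:

```javascript
// One-Euro filter: a speed-adaptive low-pass filter. Slow movement gets
// heavy smoothing (jitter removal); fast saccades raise the cutoff so
// they pass through with little lag.
class OneEuroFilter {
  constructor(minCutoff = 1.0, beta = 0.007, dCutoff = 1.0) {
    this.minCutoff = minCutoff; // Hz; smoothing at zero speed
    this.beta = beta;           // speed coefficient
    this.dCutoff = dCutoff;     // cutoff for the derivative estimate
    this.xPrev = null;
    this.dxPrev = 0;
  }
  _alpha(cutoff, dt) {
    const tau = 1 / (2 * Math.PI * cutoff);
    return 1 / (1 + tau / dt);
  }
  filter(x, dt) {
    if (this.xPrev === null) { this.xPrev = x; return x; }
    // Smoothed derivative drives an adaptive cutoff frequency.
    const dx = (x - this.xPrev) / dt;
    const aD = this._alpha(this.dCutoff, dt);
    this.dxPrev = aD * dx + (1 - aD) * this.dxPrev;
    const cutoff = this.minCutoff + this.beta * Math.abs(this.dxPrev);
    const a = this._alpha(cutoff, dt);
    this.xPrev = a * x + (1 - a) * this.xPrev;
    return this.xPrev;
  }
}
```

In the real pipeline one filter instance per axis (x and y) would be fed each frame with dt ≈ 1/30 s.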

Calibration (v8 — quality-gated)

How it works:

  1. Open the app → click the Calibrate tab in the top mode bar
  2. Click Start Calibration
  3. For each of the 9 dots (4 corners, 4 inner ring, centre):
    • 🔵 Blue — move your eyes to the dot
    • 🟠 Amber — gaze is locking on — hold still
    • 🟢 Green — locked, sampling — the ring fills automatically
    • Flash + shrink — point captured, advances automatically
  4. After all 9 points the polynomial model is built. The cursor appears.

Signal used for calibration: iris-only offset (head-pose and pupil signals deliberately excluded during calibration to avoid adding fusion noise while the user is stationary).

Calibration model specs:

| Property | Value |
| --- | --- |
| Points | 9 (4 corners @ 8%/92%, 4 inner @ 25%/75%, centre) |
| Regression | Degree-3 polynomial ridge, λ = 0.01, 10 basis terms |
| Corner weight | 4× vs 1× interior |
| Gaze normalisation | Observed range + 18% padding for edge extrapolation |
| Quality gate | 0.45 threshold; 8 consecutive lock-frames before sampling |
| Samples per point | 40 |
| Storage | localStorage key `accesseye_calib` (version 8) |
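The degree-3 ridge fit can be sketched as below. This is an illustrative reimplementation, not the CalibrationEngine source: 10 monomial basis terms, normal equations with an L2 penalty (λ = 0.01), solved by Gauss-Jordan elimination, with one model per screen axis:

```javascript
// 10 monomials of total degree <= 3 in the normalised iris offset (gx, gy).
function basis(gx, gy) {
  return [1, gx, gy, gx * gx, gx * gy, gy * gy,
          gx ** 3, gx * gx * gy, gx * gy * gy, gy ** 3];
}

// Ridge regression: solve (A^T A + lambda I) w = A^T b.
function ridgeFit(samples, targets, lambda = 0.01) {
  const n = 10;
  // Augmented normal-equation matrix [A^T A | A^T b].
  const M = Array.from({ length: n }, () => new Array(n + 1).fill(0));
  for (let s = 0; s < samples.length; s++) {
    const phi = basis(samples[s][0], samples[s][1]);
    for (let i = 0; i < n; i++) {
      for (let j = 0; j < n; j++) M[i][j] += phi[i] * phi[j];
      M[i][n] += phi[i] * targets[s];
    }
  }
  for (let i = 0; i < n; i++) M[i][i] += lambda;  // ridge penalty
  // Gauss-Jordan elimination with partial pivoting.
  for (let col = 0; col < n; col++) {
    let p = col;
    for (let r = col + 1; r < n; r++)
      if (Math.abs(M[r][col]) > Math.abs(M[p][col])) p = r;
    [M[col], M[p]] = [M[p], M[col]];
    for (let r = 0; r < n; r++) {
      if (r === col) continue;
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  return M.map((row, i) => row[n] / M[i][i]);  // weight vector
}

// Map one gaze sample through the fitted model (one axis).
function mapGaze(w, gx, gy) {
  return basis(gx, gy).reduce((acc, phi, i) => acc + w[i] * phi, 0);
}
```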

Tips:

  • Sit 50–70 cm from the screen, face roughly centred in the camera frame
  • For corner dots, move your eyes only — small head movement is fine
  • If a dot stays amber, blink once to reset then re-fixate
  • Recalibrate any time via the Calibrate tab or voice command "calibrate"

Snap-To Engine

Snap-To is an optional cursor-assist layer that pulls the gaze cursor toward the nearest interactive element when you look close enough.

Three modules:

| Module | Role |
| --- | --- |
| TargetPredictor | Scores candidates by distance, size, element type, and interaction history |
| AdaptiveGazeLearner | Builds a per-user profile from every dwell event; auto-tunes snap distance, dwell time, smoothing, and prediction weight |
| SnapToEngine | Smooth-interpolates the cursor to the winning target; emits snap/release events |

Controls (Live Demo → Snap-To panel):

  • On/Off toggle
  • Auto-dwell toggle
  • Snap threshold distance (px)
  • Dwell time (ms)
  • Cursor smoothing (%)
  • Prediction weight (%)
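A minimal sketch of the gravity-model idea behind SnapToEngine follows. The pull is proportional to target area over squared distance; the parameter values and target shapes here are assumptions, not the shipped code:

```javascript
// Gravity-model cursor assist: each target within snapRadius pulls the
// cursor with strength proportional to its area and inversely to squared
// distance; the strongest target wins and the cursor eases toward its
// centre. Targets: { cx, cy, w, h }.
function gravitySnap(cursor, targets, snapRadius = 120, smoothing = 0.35) {
  let best = null, bestPull = 0;
  for (const t of targets) {
    const dx = t.cx - cursor.x, dy = t.cy - cursor.y;
    const d = Math.hypot(dx, dy);
    if (d > snapRadius) continue;                   // out of reach
    const pull = (t.w * t.h) / Math.max(d * d, 1);  // area / distance^2
    if (pull > bestPull) { bestPull = pull; best = t; }
  }
  if (!best) return { x: cursor.x, y: cursor.y, target: null };
  return {  // soft attractor: interpolate, do not teleport
    x: cursor.x + (best.cx - cursor.x) * smoothing,
    y: cursor.y + (best.cy - cursor.y) * smoothing,
    target: best,
  };
}
```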

Voice Navigation

Voice navigation runs via the Web Speech API (Chrome/Edge). Enable it with the microphone toggle in the Phase 2 panel.

How it works:

  1. Utterance is transcribed
  2. Raw lowercase text is checked against multi-word ACTION_COMMANDS first (longest match wins)
  3. Tokenised words are checked for single-word action commands
  4. Embedded action + element combos are tried ("click home", "open demo")
  5. Named element matching against the live DOM scan
  6. If nothing matches — "No match" status displayed
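Steps 2–3 above can be sketched as follows. The command table and matching logic here are simplified illustrations, not the voice-nav.js source:

```javascript
// Tiny subset of a command table (the real ACTION_COMMANDS has 40+ entries).
const ACTION_COMMANDS = {
  'show clickable items': 'showClickable',
  'scroll to top': 'scrollTop',
  'scroll down': 'scrollDown',
  'scroll': 'scrollDown',
  'click': 'click',
};

function resolveCommand(utterance) {
  const text = utterance.toLowerCase().trim();
  // 1) Multi-word phrases first, longest match wins, against the raw text.
  const phrases = Object.keys(ACTION_COMMANDS)
    .sort((a, b) => b.length - a.length);
  for (const p of phrases) {
    if (text.includes(p)) return ACTION_COMMANDS[p];
  }
  // 2) Fall through to single-word tokens.
  for (const word of text.split(/\s+/)) {
    if (Object.prototype.hasOwnProperty.call(ACTION_COMMANDS, word)) {
      return ACTION_COMMANDS[word];
    }
  }
  return null;  // caller shows the "No match" status
}
```

Sorting longest-first is what makes "scroll to top" win over the bare "scroll" entry.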

Command categories:

| Category | Example commands |
| --- | --- |
| Discovery | "show clickable items", "hide clickable", "what can i click" |
| Focus | "focus on [name]", "next item", "previous item" |
| Navigation | "home", "demo", "architecture", "docs", "studio" |
| Mode | "calibrate", "gaze mode", "mouse mode" |
| Camera | "start camera", "restart", "stop" |
| Scrolling | "scroll up", "scroll down", "scroll to top", "stop scrolling" |
| Zoom | "zoom in", "zoom out", "reset zoom" |
| Interaction | "click", "press", "double click", "right click" |
| Control recovery | "pause control", "resume control", "reset cursor", "exit mode" |
| Editing | "select all", "copy", "paste", "cut", "focus search" |
| Voice control | "pause voice", "resume voice" |
| Media | "play", "pause", "stop" |

"Show Clickable Items" highlights every interactive element with a numbered cyan badge and outline ring. Auto-clears after 8 seconds or on "hide clickable items".


Gesture Studio

Facial gesture control, accessible from the Gesture Studio tab.

Built-in gestures:

| Gesture | Action |
| --- | --- |
| Lip-tap (open-close-open mouth quickly) | Scroll up |
| Tongue-out (hold) | Scroll down continuously |
| Smile (stretch ≥ 38% of face width) | Configurable |
| Blow | Configurable |

Custom gestures:

  1. Click New Gesture in Gesture Studio
  2. Name it and assign an action (click, scroll, navigate, run custom script, etc.)
  3. Click Record — hold your chosen face position for 2–4 seconds
  4. The profile is saved to localStorage and fires in real-time

Actions available for custom gestures: click focused element, scroll up/down, navigate to page, open settings, run arbitrary JavaScript.
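As an illustration of the threshold-based detection behind the built-in gestures, here is a sketch of the smile gesture (mouth stretch ≥ 38% of face width). The landmark roles are assumptions, not the shipped FacialGestureEngine:

```javascript
// Hypothetical landmark roles: mouth corners and face-outline extremes,
// e.g. picked out of the 478 FaceMesh points upstream.
function detectSmile(landmarks, threshold = 0.38) {
  const { mouthLeft, mouthRight, faceLeft, faceRight } = landmarks;
  const mouthWidth = Math.hypot(mouthRight.x - mouthLeft.x,
                                mouthRight.y - mouthLeft.y);
  const faceWidth = Math.hypot(faceRight.x - faceLeft.x,
                               faceRight.y - faceLeft.y);
  if (faceWidth === 0) return false;
  // Fire when the mouth spans at least `threshold` of the face width.
  return mouthWidth / faceWidth >= threshold;
}
```

Custom gestures generalise the same idea: record a reference landmark configuration, then fire when the live frame stays close enough to it.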


Accessibility Control Mode (ACM)

ACM is a WCAG 2.1 AA/AAA + ADA Title III + Section 508 compliance layer that sits entirely on top of the gaze pipeline without modifying it.

Enable: ACM toggle button in the Phase 2 panel.

Features:

  • Gaze-dwell focus loop (10 fps — CPU-lightweight)
  • Skip-navigation link (WCAG 2.4.1)
  • Screen-reader live region announcements (WCAG 4.1.3)
  • Focus outlines and ARIA role support
  • Voice command integration (speaks element labels aloud)
  • Keyboard Tab/Enter/Space event tracking
  • Dwell time control (300–2000 ms slider)
  • Real-time stat counters: gaze, voice, keyboard, intent events
  • CSV export — timestamped log of every interaction
  • PDF export — same log formatted for reporting/compliance
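The CSV export in the list above boils down to a timestamped event log. A sketch follows; the column names are assumptions, and the real a11y-logger.js may differ:

```javascript
// Timestamped multimodal interaction log with CSV serialisation.
class InteractionLogger {
  constructor() { this.events = []; }
  // modality: 'gaze' | 'voice' | 'keyboard' | 'intent' (per the ACM counters)
  log(modality, action, label) {
    this.events.push({ ts: new Date().toISOString(), modality, action, label });
  }
  toCsv() {
    // Quote every field; double any embedded quotes (RFC 4180 style).
    const esc = (v) => '"' + String(v).replace(/"/g, '""') + '"';
    const header = 'timestamp,modality,action,label';
    const rows = this.events.map(e =>
      [e.ts, e.modality, e.action, e.label].map(esc).join(','));
    return [header, ...rows].join('\n');
  }
}
```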

Accuracy Engine (v3)

Four additive post-processing patches applied after CalibrationEngine.mapGaze, before the cursor pixel position is written:

| Module | What it does |
| --- | --- |
| GazeGainRemapper (ACC.3) | Power-law remap (γ = 0.70) expands the centre region ~18% and compresses the edges; shrinks the centre dead-zone from ~20% to ~8% of the screen |
| CenterGravity (ACC.4) | Gentle 0.8%/frame return pull when the cursor is idle at a screen edge |
| GravitySnapEngine (ACC.1) | Gravity-model attractor (Grossman & Balakrishnan 2005) — 25–40% mis-selection reduction vs distance-only snap |
| PerSessionDriftCorrector (ACC.2) | Watches a 50-fixation sliding window; infers and corrects systematic drift up to ±2.5% of screen over ~60 s |
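The GazeGainRemapper remap, u′ = sign(u)×|u|^0.70, as a standalone sketch applied per axis on coordinates normalised to [−1, 1] around screen centre (the pixel conversion here is illustrative):

```javascript
// Power-law gain remap: exponents below 1 expand the centre (small |u|
// grows) while leaving the extremes fixed (|u| = 1 maps to 1).
function remapAxis(u, gamma = 0.70) {
  return Math.sign(u) * Math.pow(Math.abs(u), gamma);
}

// Apply the remap to a pixel-space gaze point on a width x height screen.
function remapGaze(x, y, width, height, gamma = 0.70) {
  const u = (x / width) * 2 - 1;    // pixels -> [-1, 1]
  const v = (y / height) * 2 - 1;
  return {
    x: ((remapAxis(u, gamma) + 1) / 2) * width,   // back to pixels
    y: ((remapAxis(v, gamma) + 1) / 2) * height,
  };
}
```

Screen centre (u = 0) and the exact edges (|u| = 1) are fixed points; everything in between is pushed outward, which is what shrinks the centre dead-zone.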

Dynamic Calibration (Micro-Recalibration)

The DynamicCalibrationEngine (Phase 2) continuously refines the calibration model during normal use — no explicit recalibration step needed.

Two mechanisms:

  1. Interaction-triggered (click-based): Every time a UI element is activated by dwell or gesture, the current raw iris position is recorded as a micro-sample paired with the known screen target. Weight-decay (0.995/frame, PACE-style) ages old samples out. The polynomial model is rebuilt and the bias is recalculated.

  2. Passive drift EMA: A slow EMA bias correction (clamped to ±12% of screen) is applied every frame during stable fixations, independently of explicit interactions.

Micro-samples are persisted to localStorage and reloaded on the next session.
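The passive drift correction can be sketched as a slow exponential moving average of the gaze-versus-target error, clamped to ±12% of the screen as the text states. The α value here is an assumption:

```javascript
// Slow EMA drift bias in normalised screen units (0..1 per axis).
class DriftBias {
  constructor(alpha = 0.02, clamp = 0.12) {
    this.alpha = alpha;   // EMA rate (assumed value)
    this.clamp = clamp;   // +/-12% of screen, per the text
    this.bx = 0; this.by = 0;
  }
  // Feed the observed error (gaze minus true target) during stable
  // fixations or confirmed activations.
  observe(errX, errY) {
    this.bx += this.alpha * (errX - this.bx);
    this.by += this.alpha * (errY - this.by);
    this.bx = Math.max(-this.clamp, Math.min(this.clamp, this.bx));
    this.by = Math.max(-this.clamp, Math.min(this.clamp, this.by));
  }
  // Subtract the learned bias from each mapped gaze point.
  apply(gx, gy) { return { x: gx - this.bx, y: gy - this.by }; }
}
```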


Tech Stack

| Layer | Technology |
| --- | --- |
| Edge framework | Hono 4.x on Cloudflare Pages/Workers |
| Build | Vite 6 + @hono/vite-cloudflare-pages |
| Eye tracking | MediaPipe FaceMesh + Hands (CDN, no install) |
| Styling | Tailwind CSS (CDN) + custom CSS |
| Icons | Font Awesome 6 (CDN) |
| Voice | Web Speech API (built-in, Chrome/Edge) |
| AI intent | OpenAI API (server-side Hono route; key never exposed) |
| Storage | localStorage (calibration, micro-samples, gesture profiles, PACE data) |
| Runtime | Cloudflare Workers (edge; no Node.js, no server) |

Deployment

```bash
# Build
npm run build

# Deploy to Cloudflare Pages
npx wrangler pages deploy dist --project-name accesseye-mvp

# Local development (sandbox)
npm run build
pm2 start ecosystem.config.cjs

# Test
curl http://localhost:3000
```

Production URL: https://accesseye-mvp.pages.dev
Platform: Cloudflare Pages
Status: ✅ Active


Module File Reference

| File | Size (approx) | Role |
| --- | --- | --- |
| app.js | ~2,800 lines | Phase 1 core — MediaPipe, calibration, UI registry, dwell |
| phase2-engine.js | ~2,400 lines | Phase 2 — hybrid gaze, head pose, dynamic calibration, intent |
| phase2-init.js | ~320 lines | Phase 2 bootstrap and UI wiring |
| phase3-engine.js | ~1,700 lines | Phase 3 — filtering, saccade, PACE, smooth pursuit, head-free |
| phase3-init.js | ~300 lines | Phase 3 bootstrap, dwell presets, validation UI |
| snap-engine.js | ~600 lines | Snap-To — target prediction, adaptive learner, engine |
| accuracy-engine.js | ~930 lines | Accuracy layer — gain remap, gravity snap, drift correction |
| voice-nav.js | ~990 lines | Voice navigation, intent fusion, show-clickable overlays |
| gesture-studio.js | ~780 lines | Facial gesture engine + custom gesture lifecycle |
| a11y-mode.js | ~530 lines | ACM — WCAG/ADA/508 compliance layer |
| a11y-logger.js | ~250 lines | Interaction logger, CSV/PDF export |
| src/index.tsx | ~1,600 lines | Hono edge server — HTML shell + /api/intent |

Known Bugs Fixed (This Session)

| ID | File | Description |
| --- | --- | --- |
| CALIB-BTN-1 | app.js | "Start Calibration" button re-opened the overlay instead of starting calibration — two-phase handler added |
| RESTART-5 | phase2-engine.js | Camera restart corrupted the gaze pipeline — hybridGaze.processResults was not restored on deactivation |
| VOICE-1 | voice-nav.js | Nav/mode/camera commands routed through intent fusion and silently failed — DIRECT_ID_ACTIONS set added |
| VOICE-2 | voice-nav.js | _extractEmbeddedAction treated nav words as verbs — skipAsVerb set added |
| VOICE-3 | voice-nav.js | "Reset cursor" targeted the wrong element ID — corrected to #global-gaze-cursor |
| VOICE-4 | voice-nav.js | "Show Clickable Items" and 16 other commands were never in ACTION_COMMANDS and had no implementation — full _showClickableOverlays, _hideClickableOverlays, _handleFocusOn, _extractMultiWordAction, and all missing switch cases added |

Development History (Key Milestones)

| Commit | Change |
| --- | --- |
| 8531b3f | VOICE-4: Show Clickable Items + 16 missing voice commands fully implemented |
| 74eafc9 | VOICE-1/2/3: nav commands, embedded-action parsing, reset-cursor target fixed |
| fabef82 | CALIB-BTN-1 + RESTART-5: calibration start button + camera restart gaze loss fixed |
| 53e9b8c | Fix dwell never completing — 5 targeted fixes |
| 585876d | Fix edge-bias / centre dead-zone — 4 additive accuracy improvements |
| 1a7f2b6 | Accuracy Engine v2: GravitySnap + SessionDriftCorrector |
| 82af62b | ACM live counters — event bus hooks for gaze/voice/keyboard/intent |
| 6dd602f | Re-add ACM: WCAG/ADA/508 layer, safe 10 fps dwell loop |
| b0cc00c | Voice Navigation + Intent Fusion system |
| 7a41806 | Camera restart, dwell toggle, smile gesture, P2 panel layout |
| b3f374c | Fix gaze tracking after restart — Phase 2 activates cleanly at 30 fps |
| 7de9a41 | Phase 3: ridge regression, 13-pt calib, IVT, PACE, head-free |
| 35319bd | 12 precision improvements — jitter + accuracy |
| 305740b | Calibration v8: quality-gated sampling, auto-advance on stable fixation |
| 0e218e8 | Fix calibration stuck at pt 1: separate recentFrames buffer for lock phase |

What Has Not Been Built Yet

The following techniques were evaluated but not yet implemented:

| # | Technique | Notes |
| --- | --- | --- |
| 3 | Breathing notch filter | Would remove the ~0.25 Hz physiological oscillation from the gaze signal |
| 5 | Iris ellipse aspect-ratio correction | Iris is currently tracked as a centroid only; shape is ignored |
| 8 | 3D corneal surface reconstruction | Would need a depth/stereo camera or glint-based PCCR |
| 9 | Zero-shot implicit calibration | Explicit 9-point calibration is always required |
| 10 | Multi-anchor ensemble gaze fusion | Single polynomial model; no mixture-of-experts |

License

MIT — built by mkbrown261
