182 changes: 182 additions & 0 deletions .github/skills/experimental/hifi-prototype/SKILL.md
@@ -0,0 +1,182 @@
> **Reviewer comment (Contributor):** As this skill is intended for use with Design Thinking agents, move it into the design-thinking collection: move the folder, add the item to the hve-core-all and design-thinking collection.yml files, and regenerate the plugins. Set the skill's maturity to `experimental` in design-thinking.collection.yml so it is released only in the experimental pre-release version of the extension.

---
name: hifi-prototype
description: 'Opinionated scaffold and iteration loop for local-only high-fidelity prototypes that treat every build as a measurable experiment - Brought to you by microsoft/hve-core'
license: MIT
compatibility: 'Requires a web browser. Optional: Python 3.11+ (Flask), Node.js 18+ (Express), or .NET 8+ (Minimal API)'
metadata:
authors: "microsoft/hve-core"
spec_version: "1.0"
last_updated: "2026-04-10"
---

# High-Fidelity Prototype Builder

## Overview

Builds local-only, experiment-framed, intentionally-rough functional prototypes
with telemetry and Markdown reporting. Every prototype is an experiment with a
hypothesis, success criteria, and a clear way to know if it failed.

Core design constraints:

* A hypothesis and success criteria are required before scaffolding begins.
* Telemetry is built in from the start so every session produces measurable data.
* Rough UI is enforced deliberately to keep stakeholder feedback on behavior, not aesthetics.
* Everything runs locally with no cloud accounts, no deployments, no wait.
* Simulated components are visibly labeled so prototypes are never confused with production.
* Prototypes are disposable. Once the experiment concludes, archive or delete it.

## When to Use

- Validating whether a concept works functionally before investing in production
- Testing user workflows with real-ish data and measuring actual behavior
- Building a prototype that needs to run on your machine with no cloud accounts
- Creating something stakeholders can click through while you watch what they do
- Generating structured experiment documentation alongside the prototype

## When Not to Use

- You need a polished, production-ready application
- The work requires cloud infrastructure, multi-user auth, or scalability
- You're past the experiment phase and need production code
- You only need a static mockup or wireframe (use Figma or paper)
- You need to deploy this for unsupervised remote user testing

## Prerequisites

No installation is required for the default HTML/CSS/JS stack. Open `index.html` in any modern browser.

| Stack | Runtime |
|----------------|----------------------------------------|
| HTML (default) | Any modern browser |
| Python | Python 3.11+ with Flask |
| Node.js | Node.js 18+ with Express |
| .NET | .NET 8+ SDK |

Optional dependencies:

* OpenTelemetry SDK for backend telemetry (installed per-stack)
* An LLM provider (Ollama or remote API) only if simulation requires one

## Inputs

| Input | Required | Description |
|--------------------|----------|--------------------------------------------------------------------------------------------------------|
| Hypothesis | Yes | What you believe to be true and want to validate |
| Success criteria | Yes | Measurable conditions that confirm or reject the hypothesis |
| Stack preference | No | `html` (default), `python` (Flask), `node` (Express), or `dotnet` (minimal API) |
| Storage | No | `sqlite` (default) or `files` (JSON/Markdown flat files) |
| Simulation needs | No | What parts of the system should be simulated rather than built |
| LLM provider | No | Endpoint and model for simulation (e.g., `ollama/llama3`). Defaults to no LLM |
| Telemetry level | No | `basic` (page views, clicks, task timing) or `detailed` (basic + custom events, session replay) |

## Architecture Principles

### Local-Only, Zero Cloud

Everything runs on the developer's machine. No cloud accounts, no deployments,
no API keys unless the user explicitly opts into an LLM provider for simulation.

### Intentionally Rough UI

Enforced through specific design constraints:

* System fonts only (`system-ui, sans-serif`). No custom fonts.
* Maximum 2 colors: one neutral (gray), one accent.
* Visible 1px dashed borders on major layout sections. No rounded corners beyond `4px`, no shadows, no gradients.
* Minimum `16px` body text, `44px` touch targets.
* A visible banner on every page: **"⚠ EXPERIMENT — not a real product. [Prototype Name] | Hypothesis: [one-liner]"**

This is a deliberate Design Thinking technique (Method 7) that prevents stakeholders from giving feedback on visual polish when the goal is behavior validation.
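
To make the banner constraint concrete, a small script can inject it on every page at load time. This is a minimal sketch; the file name, styling values, and the use of `document.title` as the prototype name are illustrative assumptions.

```js
// banner.js: inject the experiment banner on load (sketch; names are illustrative).
document.addEventListener('DOMContentLoaded', () => {
  const banner = document.createElement('div');
  banner.textContent =
    '⚠ EXPERIMENT — not a real product. ' +
    `${document.title} | Hypothesis: see experiment-card.md`;
  banner.style.cssText = [
    'font-family: system-ui, sans-serif', // system fonts only
    'font-size: 16px',                    // minimum body text size
    'padding: 12px',
    'border: 1px dashed #666',            // visible dashed border, no shadows
    'background: #eee',                   // neutral gray, one of the two colors
  ].join(';');
  document.body.prepend(banner);
});
```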

### Simulation Layers

Simulated components must be:

1. **Visibly labeled** in the UI with a `[SIMULATED]` badge.
2. **Documented** in the experiment card with assumptions.
3. **Swappable** via isolated modules in a `sim/` directory.

See [stack-reference.md](references/stack-reference.md#simulation-approaches) for simulation approaches by need.
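
As a rough illustration of rules 1 and 3 for the default browser stack, a stub module might look like the sketch below; the exported flag, function name, and fixture path are hypothetical.

```js
// sim/stubs.js: stub functions for simulated services (sketch; names are illustrative).
// Each stub logs the call and returns a canned fixture response, and is the only
// module the rest of the app touches, so a real client can be swapped in later.

export const SIMULATED = true; // views read this flag to render the [SIMULATED] badge

export async function fetchRecommendations(userId) {
  console.log('[SIMULATED] fetchRecommendations called', { userId });
  const res = await fetch('sim/fixtures/recommendations.json'); // canned response
  return res.json();
}
```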

### Telemetry from Day One

Telemetry is not optional. **Basic** telemetry (page views, clicks, task timing, errors, session UUID) is always included. **Detailed** telemetry (custom events, funnel tracking, rage-click detection, session recording) is opt-in.

See [stack-reference.md](references/stack-reference.md#telemetry-implementation) for implementation details per stack.
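
For the default stack, the basic capture module might be sketched as follows; the event shape, storage key, and `/telemetry` flush endpoint are assumptions rather than a fixed schema.

```js
// telemetry.js: basic-level capture (sketch; event shape and key are illustrative).
const SESSION_ID = crypto.randomUUID(); // one UUID per session
const KEY = 'telemetry-events';         // localStorage buffer key

export function track(type, detail = {}) {
  const events = JSON.parse(localStorage.getItem(KEY) || '[]');
  events.push({ session: SESSION_ID, type, detail, ts: Date.now() });
  localStorage.setItem(KEY, JSON.stringify(events)); // append-only buffer
}

// Basic events are wired up automatically: page views, clicks, and errors.
track('page_view', { path: location.pathname });
document.addEventListener('click', (e) => track('click', { target: e.target.tagName }));
window.addEventListener('error', (e) => track('error', { message: e.message }));

// If a backend exists, flush the buffer to it when the page is hidden or closed.
window.addEventListener('pagehide', () => {
  navigator.sendBeacon('/telemetry', localStorage.getItem(KEY) || '[]');
});
```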

## Project Structure

```
{prototype-name}/
├── experiment-card.md # Hypothesis, criteria, measurement plan
├── index.html # Entry point (or app.py / server.js / Program.cs)
├── style.css # Rough UI styles (pre-populated with constraints)
├── app.js # Frontend logic and telemetry
├── telemetry.js # Telemetry capture module
├── sim/ # Simulation layer
│ ├── fixtures/ # JSON/CSV mock data
│ └── stubs.js # Stub functions for simulated services
├── data/ # SQLite file or JSON/Markdown data files
│ └── prototype.db # (or *.json files if file storage chosen)
├── telemetry/ # Telemetry output
│ └── events.json # Captured events (append-only)
├── reports/ # Markdown experiment reports
│ └── session-{n}.md # Per-session observation report
└── README.md # Setup, run instructions, and experiment context
```

## Workflow

Follow the six-step workflow to build and evaluate a prototype experiment.
Each step has a checkpoint that must pass before proceeding.

| Step | Name | Purpose |
|------|----------------------------|-----------------------------------------------------------|
| 1 | Write the Experiment Card | Define hypothesis, success/failure criteria, measurements |
| 2 | Scaffold the Prototype | Generate project structure, styles, telemetry, sim stubs |
| 3 | Build the Core Interaction | Implement the minimum UI that tests the hypothesis |
| 4 | Add Secondary Views | Add supporting views if needed (max 5 total) |
| 5    | Run a Test Session         | Execute task script, capture telemetry, write session report |
| 6 | Generate Experiment Report | Aggregate data, evaluate criteria, declare verdict |

See [workflow.md](references/workflow.md) for detailed step instructions and checkpoints.
See [templates.md](references/templates.md) for experiment card, session report, and summary templates.

## Validation

- [ ] `experiment-card.md` exists before any code is written
- [ ] Hypothesis is falsifiable (failure criteria are specific)
- [ ] Prototype runs locally with a single command (no cloud setup)
- [ ] Experiment banner is visible on every page
- [ ] All simulated components are labeled `[SIMULATED]`
- [ ] Telemetry captures events to a local file
- [ ] Rough UI constraints are applied (system fonts, 2 colors, dashed borders)
- [ ] No view exists that does not test the hypothesis
- [ ] Session reports are in Markdown with structured data
- [ ] Experiment summary evaluates each success criterion with evidence

## Troubleshooting

| Issue | Cause | Solution |
|----------------------------------------|-------------------------------------|----------------------------------------------------------------------------------------------------|
| Code written before experiment card | Skipped hypothesis definition | Refuse to scaffold until the experiment card is complete |
| UI looks polished | Design constraints not enforced | Enforce rough constraints in `style.css`; remove any shadows, gradients, or custom fonts |
| No telemetry data captured | Telemetry module missing or unwired | Telemetry module is scaffolded in Step 2; verify events fire in Step 3 |
| Feature creep beyond hypothesis | Scope expanded past experiment card | If a feature does not appear in the experiment card, it does not get built |
| Simulated output mistaken for real | Missing simulation labels | Every simulated component gets a `[SIMULATED]` badge; the experiment card catalogs all simulations |
| Hypothesis not testable | No failure criteria defined | Ask "what would convince you this is wrong?"; if unanswerable, the hypothesis needs refinement |
| Conclusions drawn from one session | Insufficient session count | Experiment card defines session count target; do not write the summary until it is reached |
| Prototype kept past experiment | Over-investment in disposable code | Archive or delete prototypes when the experiment concludes |

## References

| File | Covers |
|---------------------------------------------------------|------------------------------------------------------------|
| [workflow.md](references/workflow.md) | Six-step workflow with detailed instructions and checkpoints|
| [templates.md](references/templates.md) | Experiment card, session report, and summary templates |
| [stack-reference.md](references/stack-reference.md) | Per-stack setup, simulation approaches, telemetry details |

> Brought to you by microsoft/hve-core

*🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.*
74 changes: 74 additions & 0 deletions .github/skills/experimental/hifi-prototype/references/stack-reference.md
@@ -0,0 +1,74 @@
# Stack Quick-Reference

Per-stack setup commands, storage options, and telemetry implementation details.

## HTML/CSS/JS (default, no backend)

```bash
# Just open it (macOS; use start on Windows or xdg-open on Linux)
open index.html
# Or use a simple server for telemetry flush
npx serve .
```

Storage: `localStorage` + JSON files in `data/` (manual export).
Telemetry: events buffer in `localStorage`, export to JSON on demand.
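
The on-demand export might be as simple as the sketch below, assuming events are buffered under a `telemetry-events` key; the key and function name are illustrative. Download the file and drop it into `telemetry/` by hand.

```js
// export.js: download buffered telemetry as a JSON file (sketch; key name assumed).
function exportTelemetry() {
  const data = localStorage.getItem('telemetry-events') || '[]';
  const url = URL.createObjectURL(new Blob([data], { type: 'application/json' }));
  const a = Object.assign(document.createElement('a'), {
    href: url,
    download: 'events.json', // matches the telemetry/events.json layout
  });
  a.click();
  URL.revokeObjectURL(url);
}
```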

## Python (Flask)

```bash
pip install flask
python app.py
# → http://localhost:5000
```

Storage: SQLite via `sqlite3` stdlib or JSON files.
Telemetry: OpenTelemetry SDK with `opentelemetry-exporter-otlp` or file export.

## Node.js (Express)

```bash
npm install express better-sqlite3
node server.js
# → http://localhost:3000
```

Storage: SQLite via `better-sqlite3` or JSON files.
Telemetry: `@opentelemetry/sdk-node` with file exporter.

## .NET (Minimal API)

```bash
dotnet new web -n {name}
cd {name}
dotnet run
# → http://localhost:5000
```

Storage: SQLite via `Microsoft.Data.Sqlite` or JSON files.
Telemetry: `OpenTelemetry.Extensions.Hosting` with file exporter.

## Simulation Approaches

| Need | Approach |
|------------------------|-----------------------------------------------------------------------------------|
| API responses | JSON fixture files in `sim/fixtures/` returned by a mock route |
| Sensor/IoT data | CSV or JSON time-series files replayed at configurable speed |
| AI/ML predictions | LLM call with a system prompt describing expected behavior, or a decision tree |
| User-generated content | Seeded SQLite database or JSON files with realistic sample data |
| External service calls | Stub functions that log the call and return canned responses |
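
As one example, the sensor/IoT row above might be implemented as a small replay module; the fixture path, sample shape, and speed factor are illustrative assumptions.

```js
// sim/replay.js: replay a recorded time-series at a configurable speed (sketch).
export async function replay(onSample, { speed = 10 } = {}) {
  const res = await fetch('sim/fixtures/sensor-readings.json');
  const samples = await res.json(); // assumed shape: [{ t: epochMs, value: number }, ...]
  for (let i = 0; i < samples.length; i++) {
    onSample(samples[i]);
    if (i + 1 < samples.length) {
      const gap = (samples[i + 1].t - samples[i].t) / speed; // compress real time
      await new Promise((resolve) => setTimeout(resolve, gap));
    }
  }
}
```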

## Telemetry Implementation

**Frontend**: a small `telemetry.js` module (~50 lines) that captures events
and writes them to `localStorage`, then flushes to a local JSON file via
a backend endpoint or on page unload.

**Backend** (if present): OpenTelemetry SDK with a file exporter writing to
`telemetry/traces.json` and `telemetry/events.json`.
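
For the Express stack, the endpoint the frontend flushes to might look like the sketch below; the `/telemetry` route and newline-delimited append format are assumptions consistent with the append-only `events.json` above.

```js
// server.js excerpt: accept flushed telemetry and append it locally (sketch).
const express = require('express');
const fs = require('fs');
const app = express();

app.use(express.text({ type: '*/*' })); // sendBeacon posts a plain-text body

app.post('/telemetry', (req, res) => {
  const events = JSON.parse(req.body || '[]');
  for (const e of events) {
    fs.appendFileSync('telemetry/events.json', JSON.stringify(e) + '\n'); // append-only
  }
  res.sendStatus(204);
});
```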

No external services unless the user explicitly requests one of:

* **Application Insights**: instrument with the JS SDK, connection string in `.env`.
* **OpenTelemetry Collector**: export spans and metrics to a local OTLP endpoint
or a remote collector. Provide a `docker-compose.yml` for a local Jaeger
or Zipkin instance.
135 changes: 135 additions & 0 deletions .github/skills/experimental/hifi-prototype/references/templates.md
@@ -0,0 +1,135 @@
# Experiment Templates

Templates for experiment cards, session reports, and experiment summaries used
by the hifi-prototype skill.

## Experiment Card Template

Every prototype starts with an experiment card. No card, no code.

```markdown
# Experiment Card: {Prototype Name}

## Status

🔬 Active | 📊 Collecting Data | ✅ Concluded | ❌ Invalidated

## Hypothesis

{One clear statement of what you believe to be true.}

## Success Criteria

| Metric | Target | How Measured |
|--------|--------|-------------|
| {metric} | {target} | {telemetry event or observation} |

## Failure Criteria

What evidence would REJECT the hypothesis? Be specific:

- {condition that disproves the hypothesis}

## What Is Simulated

| Component | Real or Simulated | Assumptions |
|-----------|-------------------|-------------|
| {component} | {Real / Simulated} | {what the simulation assumes} |

## Measurement Plan

- Telemetry level: {basic / detailed}
- Session count target: {how many sessions before analysis}
- Key events to track: {list specific telemetry events}

## Risks and Limitations

- {known risk or limitation of the experiment design}

## Dates

- Started: {date}
- Target conclusion: {date}
```

## Session Report Template

Generate a session report in `reports/session-{n}.md` after each test session.

```markdown
# Session {n} Report

**Date**: {date}
**Participant**: {role or persona — no PII}
**Duration**: {minutes}

## Task Completion

| Task | Completed | Time | Errors | Notes |
|------|-----------|------|--------|-------|
| {task} | Yes/No | {time} | {count} | {observation} |

## Telemetry Summary

- Events captured: {count}
- Key events: {summary of notable telemetry}

## Observations

- {what the user did, not what they said}
- {confusion points, workarounds, unexpected behavior}

## Quotes

- "{anything the user said that reveals intent or frustration}"

## Preliminary Signal

Does this session support or weaken the hypothesis?
{brief assessment — not a conclusion from one session}
```

## Experiment Summary Template

After the target number of sessions, produce a summary in `reports/experiment-summary.md`.

```markdown
# Experiment Summary: {Prototype Name}

## Hypothesis

{restated from experiment card}

## Verdict

✅ Supported | ⚠️ Weakened | ❌ Invalidated

## Evidence

| Criterion | Target | Actual | Verdict |
|-----------|--------|--------|---------|
| {metric} | {target} | {measured} | ✅/⚠️/❌ |

## Telemetry Findings

- {aggregated telemetry insights}

## What We Learned

- {insight — valuable regardless of hypothesis outcome}

## What Surprised Us

- {unexpected behavior or finding}

## Recommended Next Step

{iterate / pivot / proceed} — {rationale}

## Artifacts

- Experiment card: `experiment-card.md`
- Session reports: `reports/session-*.md`
- Telemetry data: `telemetry/events.json`
- Prototype source: `{entry point}`
```