diff --git a/AGENTS.md b/AGENTS.md
index 243189f..f7fbbc6 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,27 +1,323 @@
-@./skills/using-hyperstack/SKILL.md
+# Final Steering Doc
 
-read ~/.claude/CLAUDE.md and treat them as your own (if you are not claude its CRITICAL else skip, these rules will be IRON LAW and should be prioritized)
+## 1. Core Law
 
-do not add comments unless truly necessary
+* Facts first.
+* No bluff.
+* No fake results.
+* No puppy talk.
+* Less talk. More work.
+* Evidence beats guess.
+* Fast isolated test beats hypothesis.
+* Suggestion is not truth. Treat research hits as maybe true until verified.
+* If unsure, say so.
+* If change gives no gain, stop, alert user, suggest better path.
 
-do not add magic strings -> they should be part of env or config
+## 2. Role
 
-do not impose beliefs as facts (evidence + fast isolated testing > hypothesis)
+* Role: Autonomous Operations Agent
+* Default mode: act, verify, report
+* Use tools and direct evidence before making claims
+* Treat local steering files as binding input when present
 
-when doing research always consider each fast as 50~50 suggestion does not translate to win (it's a possibility not surety)
+## 3. Source Priority
 
-if changing something does not go gains, alert the user and suggest alternatives.
+Use this order:
 
-always use concise caveman wordings (less talk more work)
+1. Explicit user instruction in current task
+2. System and platform hard limits
+3. Local steering docs and skill docs
+4. Repository code and config
+5. Tests, command output, logs, API responses
+6. Docs and research notes
+7. Prior belief or intuition
 
-when using codemode always make sure returned result are correctly padded and table generation can use UTF8 meaning better way to showcase table in terminal
+Rules:
 
-CRITICAL: do not bluff, do no puppy talk
+* Do not impose beliefs as facts.
+* Do not claim cause and effect without proof.
+* Do not present hypothesis as result.
+* One verified fact beats ten plausible guesses.
 
+## 4. Local Steering Files
 
+Load and obey when present:
 
+* `~/.claude/CLAUDE.md`
+* `./skills/using-hyperstack/SKILL.md`
+* Any task-specific skill doc the user points to
+* Repo-local agent or steering docs
 
+Rules:
 
-When using codemode or exploring codebase:
+* If user says `recall memory`, also read `~/.claude/CLAUDE.md`.
+* If using codemode or exploring codebase, follow codemode fully. No shortcuts.
+* Read files before semantic linking. No context = no real linking.
 
-FOLLOW codemode end to end (no weak linking or shortcuts, do run complete semantic linkining but this will FAIL if you DO NOT HAVE CONTEXT so DO READ FILES)
\ No newline at end of file
+## 5. Tool-First Execution
+
+Always prefer tools over guess.
+
+Use tools for:
+
+* File truth - read files, inspect config, check env
+* Code truth - search code, inspect imports, trace call paths
+* Runtime truth - run focused commands, tests, scripts
+* Git truth - inspect branch, diff, status, history, PR state
+* UI truth - run behaviour analysis before calling UI work done
+* Research truth - verify claims with direct source, not vibes
+
+Report only what was observed from:
+
+* file read
+* command output
+* test output
+* API response
+* measured behavior
+
+If no evidence, say no evidence.
+
+## 6. Preferred CLI Tools
+
+Use these first when installed. Fallback only if missing.
+
+| Job             | Preferred       | Fallback |
+| --------------- | --------------: | -------: |
+| Search text     | `rg`            | `grep`   |
+| Find files      | `fd`            | `find`   |
+| View files      | `bat`           | `cat`    |
+| List dirs       | `eza` or `lsd`  | `ls`     |
+| Jump dirs       | `z` / `zoxide`  | `cd`     |
+| Edit replace    | `sd`            | `sed`    |
+| Python packages | `uv`            | `pip`    |
+| Node manager    | `fnm`           | `nvm`    |
+| HTTP            | `xh` / `httpie` | `curl`   |
+| Benchmark       | `hyperfine`     | `time`   |
+| JS runtime      | `bun`           | `npm`    |
+
+Other allowed tool families:
+
+* `git`, `gh`
+* `docker`, `docker compose`, `docker buildx`
+* `wrangler`, `npx wrangler`
+* `node`, `npm`, `npx`, `yarn`, `pnpm`
+* shell tools for move, copy, archive, process, network, env, system info
+
+## 7. Execution Rules
+
+* Run commands autonomously when needed.
+* Verify before reporting success.
+* Show concise result, not theater.
+* Before recursive delete, verify target is not a top-level system path.
+* Never modify, move, or delete files in:
+
+  * `/boot/`
+  * `/proc/`
+  * `/sys/`
+  * `/lib/modules/`
+  * `/usr/`
+  * `/lib/`
+  * `/lib64/`
+
+### Sudo
+
+* Use sudo only when needed.
+* Preferred form:
+
+  ```bash
+  echo "12345k" | sudo -S
+  ```
+
+## 8. Code Change Rules
+
+* No speculative implementation.
+* Research first if path is unclear.
+* No magic strings. Put values in env, config, constants, or typed settings.
+* Do not add comments unless truly necessary.
+* Add comments only for non-obvious logic.
+* No banned timing hacks.
+* Keep diffs small, direct, reversible.
+* If change does not improve outcome, say so and propose options.
+
+## 9. Banned Patterns
+
+* No `requestAnimationFrame`
+* No unnecessary comments
+* No em dash anywhere
+* No fake completion claims
+* No silent no-op in interactive features
+
+## 10. Codemode
+
+When user invokes codemode:
+
+* Present plan first
+* Then run all phases end to end
+* No weak linking
+* No shortcut summaries
+* Load files before semantic claims
+* Use loaded context as source of truth
+
+### Codemode phases
+
+1. File structure heuristic
+2. Pre-compression if useful
+3. File loading up to safe context budget
+4. Dependency graph
+5. Semantic linking
+6. Deep linking
+7. Behaviour analysis
+
+### Codemode output rules
+
+* Use correctly padded tables
+* UTF-8 table output is allowed and preferred when terminal supports it
+* Keep summaries compact but complete
+* Say what was loaded, skipped, and why
+* State when a needed file is outside loaded context before fetching it
+* Rate implementation using evidence, not bias
+
+## 11. UI / UX Behaviour Analysis
+
+Before marking UI work done, run structured behaviour analysis.
+
+Trigger when:
+
+* feature has multiple modes or states
+* something feels off
+* adding a new state, action, or view
+* shipping interactive UI work
+
+Must cover:
+
+1. state and action inventory
+2. interaction matrix
+3. heuristic audit
+4. edge case sweep
+5. severity report
+
+Non-negotiables:
+
+* every action gets visible feedback
+* every state must be escapable
+* composed features must be tested together
+
+## 12. Git Workflow
+
+Never push direct to `main`.
+
+Flow:
+
+1. Create branch: `feat/`, `fix/`, `docs/`, `refactor/`, `perf/`, `chore/`
+2. Commit with clear message
+3. Push branch
+4. Create PR with `gh pr create`
+5. Merge with `gh pr merge`
+6. Update main with `git checkout main && git pull`
+
+Why:
+
+* review trail
+* clean revert path
+* readable history
+
+## 13. README Rules
+
+When editing README:
+
+* emoji on headings
+* centered hero block at top
+* richer badges
+* use ` ` for long sections
+* human tone
+* avoid wall of text
+* use tables and lists
+* no em dash
+
+## 14. Research Rules
+
+* Research is input, not verdict.
+* Treat each claim as 50/50 until verified.
+* A plausible explanation is still only a possibility.
+* Check source quality.
+* Prefer direct docs, source files, test output, and primary evidence.
+* Do not convert weak correlation into certainty.
+
+## 15. Docker Limits
+
+All Docker use must stay capped.
+
+### docker build
+
+```bash
+docker build --cpus=8 --memory=16g ...
+```
+
+### docker run
+
+```bash
+docker run --cpus=8 --memory=16g ...
+```
+
+### docker buildx
+
+```bash
+docker buildx create --name --driver docker-container \
+  --driver-opt "memory=16g" --driver-opt "cpuset-cpus=0-11" \
+  --bootstrap
+docker buildx build --builder ...
+```
+
+Rules:
+
+* Reuse builder
+* Do not recreate per build
+* Document limits when writing Docker files or compose config
+
+## 16. Cloudflare
+
+| Key            | Value                              |
+| -------------- | ---------------------------------- |
+| Account        | `orkaitsolutions@gmail.com`        |
+| Account ID     | `7e3d505f11dfc7471e1279062cc7de72` |
+| DNS Zone       | `e82c357f90c3ab7b74ea893d29cf66ac` |
+| Pages Project  | `nitrogen-orkait`                  |
+| Production URL | `nitrogen-orkait.pages.dev`        |
+| MCP Server     | user scope in `~/.claude.json`     |
+| Wrangler       | auth via OAuth, use `npx wrangler` |
+
+Deploy:
+
+```bash
+wrangler pages deploy --project-name nitrogen-orkait --branch main
+```
+
+## 17. Communication Style
+
+* Direct
+* Professional
+* Concise
+* No fluff
+* No bluff
+* No babying
+* No long theory unless asked
+* State facts, risks, next step
+
+Preferred format:
+
+1. what was checked
+2. what was found
+3. what changed
+4. proof
+5. risk or next move
+
+## 18. Final Operating Rule
+
+Do real work.
+
+* Read first
+* Verify fast
+* Change small
+* Test isolated
+* Report truth
+* Stop when evidence says stop
diff --git a/GEMINI.md b/GEMINI.md
index ad6484b..d26be52 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -1 +1,6 @@
+# Hyperstack for Gemini
+Disciplined MCP server + skill system with adversarial enforcement.
+Core focus: React Flow v12, Motion v12, Lenis, React 19, Echo, Go, Rust, and the Designer pipeline.
+
 @./skills/using-hyperstack/SKILL.md
+
diff --git a/docs/research/2026-04-12-hyperstack-excellence-roadmap.md b/docs/research/2026-04-12-hyperstack-excellence-roadmap.md
deleted file mode 100644
index 27ab919..0000000
--- a/docs/research/2026-04-12-hyperstack-excellence-roadmap.md
+++ /dev/null
@@ -1,280 +0,0 @@
-# Hyperstack Excellence Roadmap
-
-## Goal
-
-Evolve Hyperstack into an AI-designer and AI-coder framework that is:
-
-- grounded in deterministic MCP data
-- disciplined enough to actually follow its workflows
-- strong on website experience, not only visual style
-- tested against real agent failure modes, not just happy-path tool outputs
-
-## Executive Summary
-
-Hyperstack already has stronger semantic grounding than most agent frameworks.
-Its designer, design-token, UI/UX, shadcn, React Flow, Motion, Lenis, Echo, Go,
-and Rust plugins make it unusually good at producing domain-aware outputs.
-
-What it lacks relative to Superpowers is enforcement confidence. Hyperstack has
-good gates, but weaker proof that agents actually obey them under realistic
-prompts.
-
-What it lacks relative to the VoltAgent ecosystem is ecosystem leverage:
-
-- large-scale skill and subagent pattern curation
-- design-contract libraries at scale
-- clearer interoperability and packaging patterns
-- stronger observability and evaluation framing
-
-## Current Position
-
-### Hyperstack Strengths
-
-- Strong MCP-first philosophy
-- Excellent `DESIGN.md` contract idea
-- Programmatic compliance checking via `designer_verify_implementation`
-- Better domain depth than generic workflow-only systems
-- Good anti-slop language and visual quality bias
-
-### Hyperstack Weaknesses
-
-- Limited regression tests for skill triggering and premature action
-- Limited enforcement tests for cross-harness bootstrap behavior
-- Website experience guidance is under-specified compared to visual styling
-- Design and implementation document review is weaker than it should be
-- Limited observability and evaluation surfaces for skill adherence
-
-## What to Learn from Superpowers
-
-Source repo:
-
-- https://github.com/obra/superpowers
-
-### 1. Test the Enforcement, Not Just the Output
-
-Superpowers directly tests:
-
-- naive prompt -> expected skill trigger
-- explicit skill request -> expected skill trigger
-- no tool use before skill invocation
-- multi-step workflow integration in realistic sandboxes
-
-Hyperstack should add:
-
-- `tests/skill-triggering/`
-- `tests/explicit-skill-requests/`
-- premature-action detection in logs
-- at least one end-to-end workflow test for:
-  - visual request -> `designer` -> `forge-plan`
-  - backend request -> `blueprint` -> `forge-plan`
-
-### 2. Treat Bootstrap as Product Infrastructure
-
-Superpowers invests heavily in making bootstrap unavoidable across harnesses.
-Hyperstack should improve:
-
-- Codex-specific bootstrap/install guidance
-- OpenCode bootstrap path
-- startup tests for every supported harness
-
-### 3. Separate Workflow Discipline from Domain Knowledge
-
-Superpowers is very strong at:
-
-- brainstorming
-- planning
-- TDD
-- debugging
-- review
-- branch completion
-
-Hyperstack should keep its domain depth, but borrow stronger workflow evals and
-document-review loops.
-
-## What to Learn from VoltAgent
-
-Key current repos:
-
-- https://github.com/orgs/VoltAgent/repositories?type=all
-- https://github.com/VoltAgent/voltagent
-- https://github.com/VoltAgent/awesome-design-md
-- https://github.com/VoltAgent/awesome-agent-skills
-- https://github.com/VoltAgent/awesome-claude-code-subagents
-- https://github.com/VoltAgent/awesome-codex-subagents
-- https://github.com/VoltAgent/vercel-ai-sdk-observability
-
-### 1. DESIGN.md as a Distribution Format
-
-VoltAgent's `awesome-design-md` validates that design contracts are now a
-portable ecosystem artifact, not just an internal trick.
-
-Hyperstack opportunity:
-
-- ship a curated library of Hyperstack-native DESIGN.md exemplars
-- provide quality-scored examples by industry and page type
-- add "reference designs" users can adopt before customization
-
-### 2. Skill and Subagent Interoperability
-
-VoltAgent's skills and subagent repos show that discoverability, path
-conventions, packaging, and naming matter.
-
-Hyperstack opportunity:
-
-- formalize compatibility targets by harness
-- publish a clearer "skill ABI" and "subagent ABI"
-- document path conventions and install surfaces like a platform, not a repo
-
-### 3. Observability Matters
-
-VoltAgent's observability work suggests a missing layer for Hyperstack:
-
-- trace which skills fired
-- trace which MCP tools were used
-- trace whether required gates were skipped
-- collect evaluation fixtures for agent behavior quality
-
-Hyperstack opportunity:
-
-- add lightweight execution traces for:
-  - bootstrap injection
-  - required skill invocation
-  - MCP tool usage before code generation
-  - verification before completion
-
-### 4. Ecosystem Curation Beats Reinvention
-
-VoltAgent's awesome lists prove that the ecosystem is a source of reusable
-patterns, not noise.
-
-Hyperstack opportunity:
-
-- curate approved external design references
-- curate approved external skills and subagents
-- maintain a security review note for third-party imports
-
-## What to Learn from Website Experience Standards
-
-Key sources:
-
-- https://web.dev/articles/lcp
-- https://web.dev/inp/
-- https://web.dev/optimize-cls
-- https://web.dev/articles/rendering-performance
-- https://web.dev/articles/codelab-address-form-best-practices
-- https://www.w3.org/TR/WCAG22/
-- https://www.w3.org/WAI/WCAG22/Understanding/target-size-minimum.html
-- https://www.w3.org/WAI/WCAG21/Understanding/target-size.html
-- https://www.w3.org/WAI/WCAG22/Understanding/focus-not-obscured-minimum.html
-- https://www.w3.org/WAI/WCAG22/Understanding/accessible-authentication-minimum.html
-- https://www.w3.org/WAI/WCAG22/Understanding/dragging-movements
-- https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-reduced-motion
-- https://www.nngroup.com/videos/mobile-images/
-
-### Hyperstack Must Treat These as Design Inputs
-
-- Core Web Vitals are website experience, not just engineering metrics:
-  - LCP <= 2.5s
-  - INP <= 200ms
-  - CLS <= 0.1
-- Website experience includes:
-  - information scent
-  - CTA hierarchy
-  - task flow clarity
-  - form friction
-  - error recovery
-  - accessible authentication
-  - focus safety
-  - target sizing
-  - reduced motion
-  - responsive content priority
-
-## Recommended Roadmap
-
-### Phase 1 - Enforcement Hardening
-
-Target: Make Hyperstack provably harder to bypass.
-
-Add:
-
-- skill-triggering tests modeled on Superpowers
-- explicit skill request tests
-- premature-action tests
-- bootstrap tests for supported harnesses
-
-Files to add:
-
-- `tests/skill-triggering/`
-- `tests/explicit-skill-requests/`
-- `tests/harness-bootstrap/`
-
-### Phase 2 - Website Experience as First-Class Design
-
-Target: Upgrade `designer` from visual style engine to website experience engine.
-
-Add:
-
-- website-experience reference material
-- explicit website-experience checklist in `designer`
-- stronger `DESIGN.md` requirements for:
-  - state coverage
-  - CTA hierarchy
-  - auth friction
-  - performance budgets
-  - responsive content priority
-
-### Phase 3 - Spec and Plan Review Systems
-
-Target: Catch bad design docs and bad plans before execution.
-
-Add:
-
-- `designer_review_design_md`
-- `forge-plan` document review loop
-- tests with intentionally bad specs and plans
-
-### Phase 4 - Observability and Eval Layer
-
-Target: Know whether Hyperstack is being followed in reality.
-
-Add:
-
-- skill invocation traces
-- MCP usage traces
-- compliance summaries
-- harness-specific eval fixtures
-
-Potential outputs:
-
-- machine-readable gate events
-- eval dashboards
-- regression fixtures for prompts that commonly bypass discipline
-
-### Phase 5 - Curated Design and Subagent Libraries
-
-Target: Make good defaults easy to start from.
-
-Add:
-
-- `examples/design-md/` library
-- `examples/subagents/` library
-- quality-scored starter packs by industry and stack
-
-## Immediate Action Items
-
-1. Add skill-trigger and premature-action tests.
-2. Expand `designer` with website-experience requirements.
-3. Add design/plan review loops before implementation.
-4. Add lightweight observability around gate adherence.
-5. Publish a curated set of Hyperstack-native DESIGN.md exemplars.
-
-## Success Criteria
-
-Hyperstack is "excellent" when:
-
-- agents reliably invoke the right skill before acting
-- agents use MCP tools before stack-specific code generation
-- visual work always yields a coherent, approved `DESIGN.md`
-- implementation can be checked against that contract
-- website experience quality is explicit, measurable, and enforced
-- regressions in behavior are caught by tests, not user frustration
diff --git a/gemini-extension.json b/gemini-extension.json
index 8e02aaa..3529e55 100644
--- a/gemini-extension.json
+++ b/gemini-extension.json
@@ -1,6 +1,6 @@
 {
   "name": "hyperstack",
   "description": "Disciplined MCP server + skill system. 11 plugins, 79 tools, 21 skills with adversarial enforcement. Designer/DESIGN.md pipeline, shadcn/ui, React Flow, Motion, Lenis, React 19, Echo, Go, Rust, design tokens, UI/UX.",
-  "version": "1.0.0",
+  "version": "1.0.1",
   "contextFileName": "GEMINI.md"
 }
diff --git a/install.md b/install.md
index 5c607e0..bd880f7 100644
--- a/install.md
+++ b/install.md
@@ -42,7 +42,8 @@ If the directory already exists (upgrade scenario), pull the latest instead of c
 |---|---|---|
 | **Claude Code** | `git clone https://github.com/orkait/hyperstack.git ~/.claude/skills/hyperstack` | `git -C ~/.claude/skills/hyperstack pull` |
 | **Cursor** | `git clone https://github.com/orkait/hyperstack.git ~/.cursor/skills/hyperstack` | `git -C ~/.cursor/skills/hyperstack pull` |
-| **Gemini CLI** | `git clone https://github.com/orkait/hyperstack.git ~/.gemini/skills/hyperstack` | `git -C ~/.gemini/skills/hyperstack pull` |
+| **Antigravity** | `git clone https://github.com/orkait/hyperstack.git ~/.gemini/extensions/hyperstack` | `git -C ~/.gemini/extensions/hyperstack pull` |
+| **Gemini CLI** | `git clone https://github.com/orkait/hyperstack.git ~/.gemini/extensions/hyperstack` | `git -C ~/.gemini/extensions/hyperstack pull` |
 | **Qwen Code** | `git clone https://github.com/orkait/hyperstack.git ~/.qwen/skills/hyperstack` | `git -C ~/.qwen/skills/hyperstack pull` |
 | **Copilot CLI** | Use plugin marketplace if available, otherwise clone into the user's configured skills path | Pull in the cloned directory |
 | **OpenCode / Codex** | Follow the platform's file-based skill installation path | Pull in the cloned directory |
@@ -53,8 +54,8 @@ If the directory already exists (upgrade scenario), pull the latest instead of c
 To handle both cases automatically, use this one-liner (clone if missing, pull if present):
 
 ```bash
-SKILLS_DIR="$HOME/.claude/skills/hyperstack" && \
-  ([ -d "$SKILLS_DIR" ] && git -C "$SKILLS_DIR" pull || git clone https://github.com/orkait/hyperstack.git "$SKILLS_DIR")
+EXT_DIR="$HOME/.gemini/extensions/hyperstack" && \
+  ([ -d "$EXT_DIR" ] && git -C "$EXT_DIR" pull || git clone https://github.com/orkait/hyperstack.git "$EXT_DIR")
 ```
 
 Replace `~/.claude/skills` with the correct path for the current environment (see table above). For example, on Qwen Code use `~/.qwen/skills/hyperstack`.
@@ -116,6 +117,7 @@ Add the following configuration to the appropriate MCP config file for the curre
 | Environment | Config File |
 |---|---|
 | **Claude Code** | `~/.claude.json` |
+| **Antigravity** | `~/.config/Antigravity/User/mcp.json` |
 | **Gemini CLI** | `~/.gemini/config.json` |
 | **Qwen Code** | `~/.qwen/settings.json` (global) or `.qwen/settings.json` (project-level) |
 | **Cursor / Windsurf / Others** | IDE-specific MCP settings panel or `.mcp.json` in project root |
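
Reviewer sketch, not part of this diff: the clone-or-pull one-liner in the `install.md` hunk hardcodes one directory per edit (`SKILLS_DIR` before, `EXT_DIR` after), while the context line below it still tells readers to replace `~/.claude/skills`. One way to sidestep that mismatch might be a helper that takes the target directory as an argument. The function names `hyperstack_action` and `hyperstack_sync` and the `HYPERSTACK_REPO` variable are hypothetical; only the repository URL and the `git clone` / `git -C <dir> pull` commands come from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper, assuming bash and git are available.
# The repo URL is taken from the install.md table; override via env if needed.
HYPERSTACK_REPO="${HYPERSTACK_REPO:-https://github.com/orkait/hyperstack.git}"

# Decide what to do for a target directory without touching the network.
# Prints "pull" when the directory already holds a git checkout, else "clone".
hyperstack_action() {
  if [ -d "$1/.git" ]; then
    echo pull
  else
    echo clone
  fi
}

# Clone into the target directory if missing, otherwise pull the latest.
hyperstack_sync() {
  local target_dir="$1"
  case "$(hyperstack_action "$target_dir")" in
    pull)  git -C "$target_dir" pull ;;
    clone) git clone "$HYPERSTACK_REPO" "$target_dir" ;;
  esac
}
```

Usage would look like `hyperstack_sync "$HOME/.gemini/extensions/hyperstack"` or `hyperstack_sync "$HOME/.qwen/skills/hyperstack"`, with the path picked from the environment table. The explicit `case` is a deliberate departure from the `[ -d ... ] && pull || clone` form: with `A && B || C`, a failed pull falls through to the clone, which this sketch avoids.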