A systematic Claude Code configuration for spec-driven development (SDD) with quality gates, custom agents, automated hooks, security guardrails, and a full specification pipeline. Balances security, performance, maintainability, and efficacy at every step.
# Clone the dotfiles repo (or your own fork)
git clone git@github.com:joaoariedi/dotfiles.git ~/dotfiles
# Symlink Claude Code config into ~/.claude/
cd ~/dotfiles && stow claude
# Verify symlinks
ls -la ~/.claude/
# CLAUDE.md -> ../dotfiles/claude/.claude/CLAUDE.md
# commands/ -> ../dotfiles/claude/.claude/commands/
# agents/ -> ../dotfiles/claude/.claude/agents/
# hooks/ -> ../dotfiles/claude/.claude/hooks/
# skills/ -> ../dotfiles/claude/.claude/skills/
# rules/ -> ../dotfiles/claude/.claude/rules/
# mcp.json -> ../dotfiles/claude/.claude/mcp.json

settings.json contains hooks and env vars. It's machine-specific and not managed by stow:
cat > ~/.claude/settings.json << 'EOF'
{
"permissions": {
"allow": [
"Bash($HOME/.claude/hooks/speckit-helper.sh:*)"
]
},
"env": {
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
},
"hooks": {
"PreToolUse": [
{ "matcher": "Bash", "hooks": [{ "type": "command", "command": "~/.claude/hooks/quality-before-commit.sh", "timeout": 120 }] },
{ "matcher": "Edit|Write", "hooks": [{ "type": "command", "command": "~/.claude/hooks/block-sensitive-files.sh" }] }
],
"PostToolUse": [
{ "matcher": "Edit|Write", "hooks": [
{ "type": "command", "command": "~/.claude/hooks/format-after-edit.sh", "timeout": 15 },
{ "type": "command", "command": "~/.claude/hooks/run-tests-after-edit.sh", "timeout": 30 }
]}
],
"Notification": [
{ "matcher": "", "hooks": [{ "type": "command", "command": "~/.claude/hooks/notify-on-block.sh", "timeout": 5 }] }
],
"Stop": [
{ "matcher": "", "hooks": [{ "type": "command", "command": "~/.claude/hooks/stop-quality-check.sh", "timeout": 10 }] }
]
}
}
EOF
Important: The `speckit-helper.sh` permission is required for all spec-kit commands to work. Claude Code blocks `$()`, `||`, and `|` operators in pre-flight commands, so all logic is routed through the helper script.
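As a rough illustration of that routing, the sketch below keeps substitution and fallback logic inside a single function, so the pre-flight command itself stays one plain invocation. The function name `speckit_helper` and the subcommands `branch` and `spec-dir` are hypothetical; they are not taken from the real speckit-helper.sh.

```shell
#!/bin/sh
# Hypothetical sketch: keep $(), || and | out of the pre-flight command
# by moving them into a helper. Subcommand names are illustrative only.

speckit_helper() {
  case "$1" in
    branch)
      # Fallback logic (||) lives here, not in the pre-flight command.
      b=$(git rev-parse --abbrev-ref HEAD 2>/dev/null) || b="no-git"
      echo "$b"
      ;;
    spec-dir)
      # Command substitution lives here too.
      branch=$(speckit_helper branch)
      echo ".specify/specs/$branch"
      ;;
    *)
      echo "usage: speckit_helper {branch|spec-dir}" >&2
      return 2
      ;;
  esac
}

# In a standalone script, the last line would be:
#   speckit_helper "$@"
```

A pre-flight line would then be a single call such as `~/.claude/hooks/speckit-helper.sh spec-dir`, which is the kind of invocation the `Bash($HOME/.claude/hooks/speckit-helper.sh:*)` permission entry is meant to allow.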
cd ~/any-project
claude
> /context # should detect tech stack and structure
> /speckit.init # bootstraps .specify/ for spec-driven dev

After stow claude, every new Claude Code session loads the rules, agents, commands, and skills automatically. If you have an existing ~/.claude/ config, back it up first (mv ~/.claude ~/.claude.backup); to pick up newly added files later, restow with stow -R claude.
This is the topology I run the framework on. The framework itself is host-agnostic; this section just documents one tested setup with explicit trust boundaries.
| Role | OS | Production Access | Always-On | Used For |
|---|---|---|---|---|
| Primary laptop | Manjaro Linux | Full | No | Day-to-day dev, production deploys, attended sessions |
| Always-on remote workstation | Arch Linux | GitHub only | Yes | Long-running tasks, mobile resume target, off-hours work |
Both machines share the same dotfiles (stow claude), so Claude Code behaviour is identical on each: same agents, hooks, skills, rules, MCPs. Only the per-host settings.json (env vars, hook timeouts) differs.
The blast-radius asymmetry is deliberate:
- Production credentials live only on the laptop. It is offline most of the time and physically attended.
- The always-on workstation can reach GitHub but not production. A compromise of the higher-exposure host (always online) cannot pivot into production systems.
- Network is Tailscale-only. Strict ACLs constrain which hosts can reach which services: no public IPs, no port-forwarding, no inbound exposure.
- Wazuh monitors the whole stack (file-integrity monitoring, auth events, command auditing) across both machines and any production hosts.
Claude Code's session-portability features pair naturally with this setup:
- Start a long-running task on the always-on workstation before stepping away
- Resume from the laptop later via `/teleport` (see Multi-Environment Workflows)
- Or start on mobile (claude.ai/code), then pull into the laptop terminal when home
Critically, the always-on workstation can autonomously work on GitHub repos (review PR feedback, run CI, commit fixes) without ever holding production credentials. The laptop holds the keys; the always-on host holds the time.
The framework provides three workflow paths. Choose the one that matches your task:
For typos, config updates, formatting fixes, and dependency bumps:
/speckit.fix "fix typo in error message" → apply fix → verify → commit
The triviality gate ensures only genuinely trivial changes bypass the full pipeline. If your change modifies logic, APIs, or schemas, you'll be redirected to the Standard path.
The full spec-driven development pipeline from idea to implementation:
/context              → orient (detect stack, tools, structure)
/speckit.init         → bootstrap (once per project)
/speckit.constitution → define principles (once per project)
/speckit.brainstorm   → Socratic design exploration (refine the idea) [NEW]
/speckit.specify      → write spec (scenarios, requirements, criteria)
/speckit.clarify      → resolve ambiguities (optional)
/speckit.plan         → design (affected files, data model, API contracts)
/speckit.review       → challenge the plan (scope, architecture, tests) [NEW]
/speckit.tasks        → generate task list (phased, with dependencies)
/speckit.checklist    → pre-implementation gate (optional)
/speckit.analyze      → consistency check (optional)
/speckit.implement    → TDD execution (red-green cycle)
/quality              → final quality gate
Specifications live in .specify/specs/<branch>/ and are committed to version control. A constitution in .specify/memory/constitution.md defines project-level governance principles that every plan is validated against.
For projects with existing code that lack formal specifications:
/speckit.baseline  → reverse-engineer spec from code [NEW]
/speckit.review    → review the inferred spec/plan
/speckit.tasks     → generate tasks for enhancements
/speckit.implement → execute with quality gates
Every decision in the framework balances four concerns:
| Concern | How the Framework Addresses It |
|---|---|
| Security | Hooks block sensitive files, gitleaks secrets scanning, OWASP LLM rules, forensic-specialist agent |
| Performance | performance-audit skill, quality-guardian benchmarks, CI optimization (cancel-in-progress, staged-files-only lint) |
| Maintainability | SOLID principle checks, code quality limits, systematic-debugging skill, cross-cutting change maps |
| Efficacy | Iron Laws prevent false completions, spec compliance gates, verification-before-completion skill |
The framework composes 5 layers: methodology (spec-kit), agent runtime (Claude Code), specialised sub-agents, integrations (MCP, hooks, rtk, security CLIs), and models (Opus 4.7 / Sonnet 4.6 / Haiku 4.5), with cross-cutting governance for quality, security, context, and memory. A single SDD request traverses every layer:
sequenceDiagram
autonumber
actor Dev as Developer
participant FW as L1 · Methodology<br/>(spec-kit)
participant CC as L2 · Claude Code<br/>(main agent)
participant Sub as L2 · Sub-Agent<br/>(test-specialist)
participant MCP as L3 · MCP / Hooks
participant RTK as L3 · rtk proxy
participant Mod as L4 · Opus 4.7
Dev->>FW: /speckit.brainstorm "user auth idea"
FW->>CC: socratic exploration
CC->>Mod: refine concept (Q&A)
Mod-->>CC: refined direction
CC-->>Dev: confirmed concept
Dev->>FW: /speckit.specify "user auth"
FW->>CC: invoke pipeline (spec → plan → tasks)
CC->>Mod: reason about spec
Mod-->>CC: spec draft + plan
CC->>Sub: dispatch (one-shot, isolated ctx)
Sub->>RTK: rtk pytest -q
RTK-->>Sub: compressed digest (≈10% tokens)
Sub->>Mod: analyse failing tests
Mod-->>Sub: fix proposal
Sub-->>CC: digest only (200 tok vs 5 000)
CC->>MCP: PreToolUse hook (gitleaks, sensitive-file block)
MCP-->>CC: safe to write
CC->>Mod: synthesise final patch
Mod-->>CC: code + tests
CC-->>Dev: spec + tests + commit ready
- L1 (methodology) shapes thinking, not state. spec-kit / `/speckit.brainstorm` defines structure but holds no conversation context.
- Sub-agents isolate context. Dispatched in fresh contexts and discarded; only the digest returns. Primary defence against the >40% "Dumb Zone".
- rtk compresses CLI output (60–90%) before it reaches the main context, making it the highest-leverage token optimisation in the framework.
- MCP / Hooks enforce safety boundaries the model cannot bypass (gitleaks, sensitive-file block, format-after-edit).
- Models are stateless; every layer above exists to give them the right context and route their output safely.
| Component | Status | Notes |
|---|---|---|
| spec-kit (SDD) | active | Full pipeline incl. /speckit.brainstorm → specify → plan → tasks → implement |
| OpenSpec | not adopted | Alternative spec workflow |
| Superpowers | pattern reference | Skill-pack architecture is the influence |
| Claude Code | primary runtime | Opus 4.7 / 1M ctx default |
| Codex · Opencode · Cursor · Aider | alternatives | Same methodology layer would still apply |
| MCP: github, voicemode | active | See ~/.claude/mcp.json |
| MCP: Semgrep, Snyk, SonarQube | optional | Add only when CLI scans aren't enough |
| rtk | available (auto-detected per machine) | 60–90% token reduction on common dev commands |
| Fabric | pattern reference | Reusable prompt-pattern library |
| gitleaks · semgrep · trivy · ruff · gosec | via Bash | Quality / security CLIs |
| Opus 4.7 / Sonnet 4.6 / Haiku 4.5 | via Anthropic | Model selection per task |
| GPT · Gemini · Qwen · Llama | alternatives | Foundation models from other providers |
Seven hooks enforce quality automatically, with no manual intervention needed:
- Pre-commit: secrets detection (gitleaks) + language-specific linting blocks the commit on errors
- File protection: writes to `.env`, `*.key`, `*.pem`, credentials, and `.git/` internals are blocked
- Auto-format: formatters run after every edit (ruff, biome, gofmt, rustfmt)
- Auto-test: test suite runs after source file edits (throttled 15s, non-blocking)
- Reminders: alerts if source files were edited but tests weren't run
- Notifications: desktop alerts when the agent needs human input (Linux/macOS)
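The 15-second throttle on auto-tests could be built on a timestamp file. The following is a minimal sketch under that assumption; the function name `should_run` and the stamp-file location are illustrative, and the actual hook scripts may work differently.

```shell
#!/bin/sh
# Sketch of a timestamp-file throttle (assumed technique, hypothetical names).

# should_run STAMP_FILE MIN_INTERVAL_SECONDS
# Returns 0 and refreshes the stamp if at least MIN_INTERVAL_SECONDS have
# passed since the last recorded run; returns 1 otherwise.
should_run() {
  stamp=$1
  interval=$2
  now=$(date +%s)
  last=0
  if [ -f "$stamp" ]; then
    last=$(cat "$stamp")
  fi
  if [ $((now - last)) -lt "$interval" ]; then
    return 1
  fi
  echo "$now" > "$stamp"
  return 0
}

# Example use inside a hook (not executed here):
#   should_run "/tmp/claude-tests.stamp" 15 || exit 0   # skip quietly
#   ...run the test suite...
```

Exiting with status 0 when throttled keeps the hook non-blocking: a skipped run is not reported as a failure.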
The quality-guardian agent validates before commit/PR/merge with secrets scanning, SAST, supply chain checks, SOLID architectural analysis, performance validation, and Iron Law enforcement.
The framework implements layered defenses against OWASP LLM vulnerabilities:
| Layer | Mechanism | Covers |
|---|---|---|
| Enforcement | Hooks | Sensitive file blocking, secrets detection, pre-commit quality |
| Guidance | Rules | OWASP LLM Top 10, MCP security, code quality, SOLID principles |
| Analysis | Skills & Agents | Security review, forensic investigation, quality gates |
| Efficacy | Iron Laws | Verification-before-completion, systematic-debugging |
MCP servers follow a strict security posture: OAuth 2.1 for production, least privilege, input validation, and human-in-the-loop for high-impact actions.
Type ultrathink in any prompt to bump that turn to high reasoning effort. Use it for:
- Complex architectural analysis or ambiguous requirements
- Security-sensitive code reviews
- Debugging race conditions or subtle bugs
- Multi-file refactoring decisions
The effort boost reverts after the response, so no persistent mode change is needed.
| Mode | How | Best For |
|---|---|---|
| Opus (default) | Standard mode | Complex architecture, security reviews |
| Fast Mode | Toggle with `/fast` | Quick iterations, bug fixes, exploration |
| Ultrathink | Add `ultrathink` to prompt | Deep reasoning on single turn |
| Haiku | Agent frontmatter `model: haiku` | Lightweight search, simple edits |
| Sonnet | Agent frontmatter `model: sonnet` | Standard agent work |
Effort levels: max (via /model only) > high (ultrathink) > medium (default) > low
For long-running sessions, the framework uses the Document & Clear pattern:
- Checkpoint: write session state to a progress file (decisions, files changed, next steps)
- Clear: run `/clear` to reset the context window
- Resume: read the progress file and continue from "Next steps"
See context-management.md rule for detailed guidance and project scaling strategies (small/medium/large).
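A checkpoint can be as simple as a small Markdown file. The sketch below assumes a hypothetical `write_checkpoint` function and section headings; the rule file may prescribe a different layout.

```shell
#!/bin/sh
# Sketch: write a session checkpoint before running /clear.
# The function name and the section headings are assumptions.

# write_checkpoint FILE DECISIONS FILES_CHANGED NEXT_STEPS
write_checkpoint() {
  {
    printf '# Session checkpoint\n\n'
    printf '## Decisions\n%s\n\n' "$2"
    printf '## Files changed\n%s\n\n' "$3"
    printf '## Next steps\n%s\n' "$4"
  } > "$1"
}

# Example (not executed here):
#   write_checkpoint progress.md "Use JWT sessions" "src/auth.py" "Add refresh-token tests"
```

After `/clear`, the resume step only needs to read this one file and pick up from the "Next steps" section.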
- Remote Control: start work locally with `claude`, resume from any device via claude.ai/code
- Teleport: pull cloud/web sessions into local terminal with `/teleport`
- Sessions maintain full context across terminal, IDE, web, and mobile
Five specialized agents with no built-in equivalent:
Creates comprehensive test suites after implementation. Analyzes existing test patterns, designs unit/integration/E2E tests, runs coverage analysis. Supports Jest, Vitest, pytest, cargo test, go test, and more.
When to use: After implementing a feature, when you need thorough test coverage.
The quality gate for all code changes. Runs a 7-step validation pipeline:
- Tool discovery and configuration
- Secrets detection (mandatory, blocks on findings)
- Code quality validation (lint, types, formatting)
- Security assessment (delegates to `security-review` skill + LLM security rules)
- Performance validation (delegates to `performance-audit` skill)
- Architectural pattern validation: SOLID principle checks with chain-of-thought for OCP and DIP
- Regression prevention (full test suite)
Enforces Iron Laws from verification-before-completion and systematic-debugging skills.
When to use: Before any commit, PR, or merge. Spawned automatically by /quality.
Two-stage code review specialist. Stage 1 validates spec compliance (implementation matches plan, all FRs addressed, no scope creep). Stage 2 checks code quality (SOLID, architecture, error handling, security, naming, test coverage).
Produces a structured review report with APPROVE / REQUEST_CHANGES / NEEDS_DISCUSSION verdict.
When to use: Before PR creation, after implementation. Distinct from review-coordinator (which manages the PR lifecycle).
Manages the PR lifecycle: creation, review coordination, feedback integration, and merge. Generates comprehensive PR descriptions with quality metrics. Supports GitHub and GitLab.
When to use: When creating PRs or managing review workflows.
Cybersecurity specialist for defensive forensics. Handles incident response, threat hunting, malware analysis, IOC generation with STIX/TAXII format, and MITRE ATT&CK mapping. Maintains proper chain of custody documentation.
When to use: When a system may be compromised, for security audits, or proactive threat hunting.
Built-in agents handle general tasks: `Explore` (codebase search), `Plan` (architecture), `general-purpose` (implementation).
For parallel work across services or modules, Agent Teams provide peer-to-peer mesh orchestration:
TeamCreate → TaskCreate → spawn teammates → SendMessage → shutdown → TeamDelete
| Pattern | Use Case |
|---|---|
| Parallel impl | Multi-service feature (API + frontend + worker) |
| Test-driven | TDD with parallel test writing |
| Full pipeline | End-to-end: impl + tests + quality + review + PR |
| Research + build | Deep codebase research while implementing |
Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 (included in settings.json above).
Six skills used by agents and commands internally:
| Skill | Purpose | Auto-invoked? |
|---|---|---|
| `context-analysis` | Project structure detection, tech stack analysis | Yes, proactively on new codebases |
| `security-review` | Code security checklist (secrets, SAST, SQLi, XSS, auth, supply chain SCA) | Yes, proactively on PRs |
| `verification-before-completion` | Evidence-first completion gate with Iron Law: must run proof commands before claiming done | Yes, proactively before completion |
| `systematic-debugging` | 4-phase root cause investigation (read → reproduce → evidence → fix) with Iron Law | Yes, proactively on bugs |
| `performance-audit` | N+1 queries, blocking I/O, memory leaks, algorithm complexity | No, explicit only |
| `spec-template` | Structured Given/When/Then specification generation | No, speckit pipeline only |
The two new skills (verification-before-completion and systematic-debugging) enforce Iron Laws: non-negotiable rules that prevent false completion claims and shotgun debugging. Each includes a rationalization prevention table to counter common excuses.
| Command | Args | Description |
|---|---|---|
| `/agent` | `<task>` | Start full development workflow with planning and task tracking |
| `/context` | - | Analyze project tech stack, tools, and structure |
| `/pr-summary` | - | Generate PR description from current branch diff |
| `/quality` | - | Run comprehensive quality checks (spawns quality-guardian) |
| `/security-scan` | - | Scan staged changes for secrets, SQLi, XSS |
| `/speckit.init` | - | Bootstrap .specify/ directory in current project |
| `/speckit.constitution` | - | Create/update project governance principles |
| `/speckit.brainstorm` | `<idea>` | Socratic design exploration before specification |
| `/speckit.specify` | `<feature>` | Generate spec with scenarios, requirements, criteria |
| `/speckit.clarify` | - | Scan spec for ambiguities, ask targeted questions |
| `/speckit.plan` | - | Generate implementation plan from spec |
| `/speckit.review` | - | Read-only plan review gate (scope, architecture, tests) |
| `/speckit.tasks` | - | Generate phased task list from plan and spec |
| `/speckit.checklist` | - | Generate requirement quality checklists |
| `/speckit.analyze` | - | Read-only cross-artifact consistency analysis |
| `/speckit.implement` | - | Execute TDD implementation with quality gates |
| `/speckit.baseline` | `<module>` | Reverse-engineer spec from existing code (brownfield) |
| `/speckit.fix` | `<description>` | Quick-fix bypass for trivial changes |
Shell-script hooks run automatically via settings.json:
| Hook | Trigger | What It Does |
|---|---|---|
| `quality-before-commit.sh` | PreToolUse on `Bash` | Intercepts `git commit`, runs gitleaks + language-specific linters, blocks on errors |
| `block-sensitive-files.sh` | PreToolUse on `Edit\|Write` | Blocks writes to `.env*`, `*.key`, `*.pem`, `credentials*`, `.git/*`, `secrets/` |
| `format-after-edit.sh` | PostToolUse on `Edit\|Write` | Auto-formats edited files (ruff, biome/prettier, gofmt, rustfmt), 10s throttle |
| `run-tests-after-edit.sh` | PostToolUse on `Edit\|Write` | Auto-runs test suite after source edits, 15s throttle, non-blocking |
| `notify-on-block.sh` | Notification | Desktop alert when agent needs attention (notify-send / osascript) |
| `stop-quality-check.sh` | Stop event | Reminds if source files were edited but tests not run |
| `speckit-helper.sh` | Pre-flight commands | Routes backtick logic to avoid Claude Code permission errors |
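To make the file-protection patterns concrete, here is a minimal sketch of the path check that a hook like block-sensitive-files.sh might perform. The function name `is_sensitive_path` is hypothetical and the real script may match differently; only the patterns come from the table above.

```shell
#!/bin/sh
# Sketch of a sensitive-path check (hypothetical function name).
# Patterns mirror the hooks table: .env*, *.key, *.pem, credentials*,
# .git/*, secrets/

# is_sensitive_path PATH -> returns 0 if the path should be blocked.
is_sensitive_path() {
  path=$1
  base=${path##*/}   # file name without leading directories
  case "$base" in
    .env*|*.key|*.pem|credentials*) return 0 ;;
  esac
  # Prefix a slash so top-level .git/ and secrets/ match the same patterns.
  case "/$path" in
    */.git/*|*/secrets/*) return 0 ;;
  esac
  return 1
}
```

In a PreToolUse hook, the target path would typically be read from the JSON payload on stdin, and a match would exit with a blocking status and a message on stderr. Check the Claude Code hooks reference for the exact input fields and exit-code semantics before relying on this.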
Modular policies loaded into every session automatically:
| Rule | Covers |
|---|---|
| `code-quality.md` | Function/file size limits, SOLID principles, testing, verification-before-completion, Iron Laws, security test files |
| `git-workflow.md` | Commit format, branch naming, co-authoring, staging |
| `agent-workflow.md` | 4-phase workflow, Agent Teams, CLAUDE.md template guidance (change maps, guardrails, trust boundaries) |
| `quality-tooling.md` | Per-language tools, tiered validation, lefthook, pre-commit/pre-push separation, CI best practices |
| `pipeline-security.md` | ASPM services, open-source security tools, strategic selection guide |
| `llm-security.md` | OWASP LLM Top 10 mitigations (prompt injection, excessive agency, data leakage, supply chain) |
| `mcp-security.md` | MCP server auth, input validation, tool poisoning defense, server curation |
| `context-management.md` | Document & Clear pattern, compact context priorities, project scaling by size |
Functions: < 50 lines
Files: < 500 lines
Complexity: < 10 (cyclomatic)
SOLID: OCP + DIP violations flagged in changed code
Iron Laws: verification-before-completion + systematic-debugging
Enforced by code-quality.md rule and quality-guardian agent. Test coverage follows project-configured thresholds.
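Of these limits, the file-size one is the easiest to check mechanically. The sketch below is an assumption about how such a check could look, not the quality-guardian implementation; `check_file_length` is a hypothetical name, and the function-length and complexity limits need language-aware tooling that is out of scope here.

```shell
#!/bin/sh
# Sketch: flag files that reach the "< 500 lines" limit stated above.

# check_file_length FILE [MAX_LINES]
# Returns 1 if FILE has MAX_LINES or more lines (default 500).
check_file_length() {
  max=${2:-500}
  # Arithmetic expansion strips the padding some wc implementations add.
  lines=$(( $(wc -l < "$1") ))
  if [ "$lines" -ge "$max" ]; then
    echo "FAIL: $1 has $lines lines (limit: < $max)"
    return 1
  fi
  echo "ok: $1 ($lines lines)"
}

# Example (not executed here):
#   check_file_length src/app.py || exit 1
```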
Each feature generates artifacts in .specify/specs/<branch>/:
| Artifact | Generated By | Purpose |
|---|---|---|
| `spec.md` | `/speckit.specify` or `/speckit.baseline` | User scenarios, functional requirements, success criteria |
| `plan.md` | `/speckit.plan` | Design, affected files, constitution compliance |
| `tasks.md` | `/speckit.tasks` | Phased task list with IDs and dependencies |
| `research.md` | `/speckit.plan` | Resolved clarifications |
| `checklists/*.md` | `/speckit.checklist` | Requirement quality checklists |
| `data-model.md` | `/speckit.plan` | Schema changes (if applicable) |
| `contracts/` | `/speckit.plan` | API contracts (if applicable) |
.specify/
├── memory/
│   └── constitution.md      # project governance principles
├── templates/
│   ├── spec.md              # specification template
│   ├── plan.md              # plan template
│   ├── tasks.md             # task list template
│   └── checklist.md         # checklist template
└── specs/
    └── feature-name/        # one directory per feature (kebab-case)
        ├── spec.md
        ├── plan.md
        ├── tasks.md
        ├── research.md
        └── checklists/
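For orientation, the layout above can be reproduced by hand with a few commands. This sketch is not what `/speckit.init` actually runs; `scaffold_specify` is a hypothetical helper that creates empty placeholders and omits the template files.

```shell
#!/bin/sh
# Sketch: create the .specify/ skeleton for one feature (hypothetical helper).

# scaffold_specify ROOT FEATURE_NAME
scaffold_specify() {
  root=$1
  feature=$2
  mkdir -p "$root/.specify/memory" \
           "$root/.specify/templates" \
           "$root/.specify/specs/$feature/checklists"
  # Create empty placeholders only where no file exists yet,
  # so re-running never overwrites existing artifacts.
  for f in memory/constitution.md \
           "specs/$feature/spec.md" \
           "specs/$feature/plan.md" \
           "specs/$feature/tasks.md" \
           "specs/$feature/research.md"; do
    [ -e "$root/.specify/$f" ] || : > "$root/.specify/$f"
  done
}

# Example (not executed here):
#   scaffold_specify . user-auth
```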
The framework uses Claude Code's built-in task tracker:
| Tool | Usage |
|---|---|
| `TaskCreate` | Mandatory for any task with >2 steps |
| `TaskUpdate` | Mark ONE task `in_progress` at a time; `completed` immediately after |
| `TaskGet` | Read full task details before starting work |
| `TaskList` | Check progress and find next available tasks |
{
"mcpServers": {
"github": {
"transport": "http",
"url": "https://api.githubcopilot.com/mcp/v1",
"scope": "user",
"description": "GitHub PR/Issue automation for review-coordinator agent"
}
}
}

Add security MCP servers only when CLI tools are insufficient, since each server adds context overhead. See mcp-security.md rule for evaluation criteria.
~/dotfiles/claude/
├── .claude/
│   ├── CLAUDE.md            # core config (loaded into system prompt)
│   ├── mcp.json             # MCP servers (GitHub)
│   ├── agents/              # 5 custom agents
│   ├── commands/            # 18 slash commands (5 standard + 13 speckit)
│   ├── hooks/               # 7 lifecycle hooks (shell scripts)
│   ├── rules/               # 8 modular policy files
│   └── skills/              # 6 internal skills
└── .stow-local-ignore       # excludes README from stow
This framework was shaped by patterns observed in several projects:
| Project | Author | Key Contributions |
|---|---|---|
| superpowers | Jesse Vincent | Iron Laws, rationalization prevention tables, Socratic brainstorming with hard gate, verification-before-completion, systematic debugging methodology, evidence-first workflow |
| spec-kit | GitHub | Spec-driven development (SDD) pipeline: the specify → plan → tasks → implement workflow that forms the framework's core |
| FrankYomik | Fabio Akita | Cross-cutting change maps, trust boundary documentation, domain-first CLAUDE.md structure |
| FrankSherlock | Fabio Akita | "What NOT to change" guardrails, architecture-as-constraints pattern, _-prefixed research directories |
| FrankMD | Fabio Akita | AGENTS.md as tool-agnostic contributor guide, concise do/don't lists |
| FrankMega | Fabio Akita | Lefthook parallel hooks, security-specific test files, pre-commit vs pre-push separation, staged-files-only linting |
| speckit-agent-skills | dceoy | speckit.baseline concept: reverse-engineering specs from existing code |
| speckit-wiggum-toolkit | leonardoFu | speckit.research and speckit.reflect concepts: formalized research and retrospective phases |
MIT License. See LICENSE.
Framework Version: 4.3.0 | Last Updated: 2026-03-31 | Compatibility: Claude Code with sub-agents, hooks, skills, MCP, spec-kit, Agent Teams