The skill that writes skills.
Skill-Forge is a secure, self-improving compiler pipeline for OpenClaw. It observes how you use your AI assistant, detects repeated patterns, and automatically generates new skills to handle those tasks — all behind a security scanner and human approval gate.
Every generated skill passes through a 6-stage pipeline before it can run. Every deployment is reversible in one command.
observe → crystallize → generate → scan → review → deploy
| Stage | What it does |
|---|---|
| Observe | Scans session logs for repeated user request patterns |
| Crystallize | Stores qualifying patterns as reusable "genes" (building blocks for future skills) |
| Generate | Creates skill skeletons from constrained templates — no open-ended code generation |
| Scan | Security gate with hard blocks (eval, exec, subprocess, raw sockets) and soft scoring |
| Review | Machine gate (schema + syntax + scan pass) then human gate (you approve or reject) |
| Deploy | Promotes from staging to active skills with a rollback manifest |
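The Scan stage's hard blocks can be illustrated with Python's standard `ast` module. This is a minimal sketch, not the logic in `scan.py`: the rule sets and the `find_hard_blocks` helper are hypothetical names chosen for illustration, and it assumes the generated skill is Python source that parses cleanly.

```python
import ast

# Illustrative rule sets only (assumed); the project keeps its real rules in policy.json.
BLOCKED_CALLS = {"eval", "exec", "__import__"}
BLOCKED_MODULES = {"subprocess", "socket"}
BLOCKED_ATTR_CALLS = {("os", "system")}


def find_hard_blocks(source: str) -> list[tuple[int, str]]:
    """Return (line number, construct) pairs for every hard-blocked construct."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                findings.append((node.lineno, f"from {node.module} import"))
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in BLOCKED_CALLS:
                findings.append((node.lineno, f"{func.id}()"))
            elif (
                isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) in BLOCKED_ATTR_CALLS
            ):
                findings.append((node.lineno, f"{func.value.id}.{func.attr}()"))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(findings)


sample = "import os\nos.system('ls')\nx = eval('1 + 1')\nprint('ok')\n"
for lineno, construct in find_hard_blocks(sample):
    print(f"hard block at line {lineno}: {construct}")
```

Walking the syntax tree rather than searching text avoids flagging the word `eval` inside a string or comment. A sketch like this does not catch aliased or indirect calls, so it would be one layer of a scanner rather than a complete one.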
Skill-Forge assumes generated code is untrusted until proven otherwise:
- Deny-by-default permissions: filesystem, network, process, and secrets access are all denied unless explicitly granted
- Hard blocks are non-negotiable: `eval()`, `exec()`, `subprocess`, `os.system()`, raw sockets, and dynamic imports cause immediate scan failure. No override.
- Staging isolation: generated skills never touch the active `skills/` directory until they pass scan + review + deploy
- Append-only audit trail: every pipeline action is logged to `events.jsonl` with a SHA-256 hash chain for tamper evidence
- Rollback manifest: file hashes are recorded at promotion time; rollback removes or restores in one command
- SKILL.md scanning: documentation is scanned for dangerous instructions, but fenced code blocks (examples) are excluded to prevent false positives
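Excluding fenced code blocks from the documentation scan could work roughly as follows. This is a sketch under assumptions: `strip_fenced_blocks` is a hypothetical helper, not a function from `scan.py`, and it handles only backtick and tilde fences that open and close on their own lines.

```python
import re

# An opening fence: three or more backticks or tildes at the start of the line.
FENCE_OPEN = re.compile(r"^(`{3,}|~{3,})")


def strip_fenced_blocks(markdown: str) -> str:
    """Return the markdown text with fenced code blocks (and their fences) removed."""
    kept = []
    fence = None  # the opening fence marker while inside a block, else None
    for line in markdown.splitlines():
        stripped = line.strip()
        if fence is None:
            match = FENCE_OPEN.match(stripped)
            if match:
                fence = match.group(1)
            else:
                kept.append(line)
        elif stripped and set(stripped) == {fence[0]} and len(stripped) >= len(fence):
            # A closing fence uses the same character and is at least as long.
            fence = None
    return "\n".join(kept)
```

Only the text that survives stripping would then be checked for dangerous instructions, so an example that mentions `eval()` inside a fence does not fail the scan, while the same instruction in running prose still does.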
```bash
cd ~/.openclaw/workspace/skills/skill-forge

# Check pipeline status
python scripts/forge.py --status

# Observe patterns from the last 7 days of session logs
python scripts/observe.py --days 7

# Run the full pipeline (stops before deploy for human approval)
python scripts/forge.py --full --name my-skill --description "Does something useful"

# Review, approve, and deploy
python scripts/review.py --skill my-skill
python scripts/review.py --skill my-skill --approve
python scripts/deploy.py --skill my-skill

# Rollback if needed
python scripts/rollback.py --skill my-skill

# Rollback and restore previous version from backup
python scripts/rollback.py --skill my-skill --restore
```

| Script | Purpose |
|---|---|
| `forge.py` | Orchestrator: run full or partial pipeline, check status |
| `observe.py` | Detect repeated patterns from session logs |
| `crystallize.py` | Store qualifying patterns as reusable genes |
| `generate.py` | Template-based skill generation to staging |
| `scan.py` | Security scanner with hard blocks + soft scoring |
| `review.py` | Machine gate + human review summary |
| `deploy.py` | Promote from staging to active skills |
| `rollback.py` | Undo deployment (with optional `--restore`) |
| `_common.py` | Shared utilities, constants, event logging, policy loader |
All runtime data is stored locally in `~/.openclaw/workspace/memory/skill-forge/`:

| File | Purpose |
|---|---|
| `genes.json` | Reusable capability patterns extracted from observations |
| `capsules.json` | Generated skill records with lifecycle status tracking |
| `events.jsonl` | Append-only audit trail with SHA-256 hash chain |
| `rollbacks.json` | Promotion snapshots for one-command rollback |
| `policy.json` | Security rules: hard blocks + soft warnings (editable) |
| `staging/` | Generated skills are staged here before promotion |
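A promotion snapshot of the kind kept in `rollbacks.json` could be built from per-file SHA-256 hashes. The sketch below is an assumption about the general approach, not the actual format written by `deploy.py`; `snapshot_hashes` and `drifted_files` are hypothetical names.

```python
import hashlib
from pathlib import Path


def snapshot_hashes(skill_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to skill_dir) to its SHA-256 hex digest."""
    return {
        path.relative_to(skill_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(skill_dir.rglob("*"))
        if path.is_file()
    }


def drifted_files(manifest: dict[str, str], skill_dir: Path) -> list[str]:
    """List files added, removed, or modified since the manifest was taken."""
    current = snapshot_hashes(skill_dir)
    return sorted(
        name for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

Recording hashes at promotion time lets a rollback tell which files belong to the deployed skill and whether any of them changed afterwards, before it removes or restores anything.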
```
skill-forge/
├── SKILL.md                  # Skill manifest (frontmatter + docs)
├── skill.json                # Machine-readable metadata
├── README.md                 # This file
├── CHANGELOG.md              # Version history
├── scripts/
│   ├── _common.py            # Shared utilities + constants
│   ├── observe.py            # Stage 1: pattern detection
│   ├── crystallize.py        # Stage 2: gene extraction
│   ├── generate.py           # Stage 3: template-based generation
│   ├── scan.py               # Stage 4: security scanning
│   ├── review.py             # Stage 5: machine + human gates
│   ├── deploy.py             # Stage 6: promotion to active
│   ├── rollback.py           # Reverse any deployment
│   └── forge.py              # Pipeline orchestrator
└── templates/
    ├── basic.py.tmpl         # Python script template
    └── basic-skill.md.tmpl   # SKILL.md template
```
A gene is a reusable pattern extracted from user behavior. When Skill-Forge observes that you keep asking for similar things (e.g., "summarize my emails", "check the weather"), it crystallizes those into genes. Genes have confidence scores based on use count and success rate, and can be composed into new skills.
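The README does not give the scoring formula, so the following is only one plausible way to combine use count and success rate. The `Gene` class, its field names, and the `SATURATION_USES` constant are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed constant: the number of uses after which use count stops adding confidence.
SATURATION_USES = 10


@dataclass
class Gene:
    pattern: str
    use_count: int = 0
    success_count: int = 0

    def record(self, success: bool) -> None:
        """Register one use of the gene and whether it succeeded."""
        self.use_count += 1
        if success:
            self.success_count += 1

    @property
    def confidence(self) -> float:
        """Success rate, scaled down while the gene has few recorded uses."""
        if self.use_count == 0:
            return 0.0
        success_rate = self.success_count / self.use_count
        maturity = min(1.0, self.use_count / SATURATION_USES)
        return success_rate * maturity


gene = Gene("summarize my emails")
for outcome in (True, True, True, True, False):
    gene.record(outcome)
print(round(gene.confidence, 2))
```

Scaling by a maturity factor keeps a pattern that succeeded once from outranking one that has succeeded most of the time over many uses.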
A capsule is the record of a generated skill as it moves through the pipeline. It tracks the skill's status (staged → promoted → rolled-back), scan results, review gates, file hashes, and timestamps. Think of it as a shipping container that wraps the skill from creation to deployment.
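The status lifecycle can be sketched as a small state machine. Only the three statuses named above come from this README; the `Status` enum, the `ALLOWED` table, and the `advance` helper are hypothetical, and the real pipeline may permit other transitions (for example, after a restore).

```python
from enum import Enum


class Status(str, Enum):
    STAGED = "staged"
    PROMOTED = "promoted"
    ROLLED_BACK = "rolled-back"


# Assumed transition table: each status maps to the statuses it may move to.
ALLOWED = {
    Status.STAGED: {Status.PROMOTED},
    Status.PROMOTED: {Status.ROLLED_BACK},
    Status.ROLLED_BACK: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


status = advance(Status.STAGED, Status.PROMOTED)
print(status.value)
```

Rejecting unlisted transitions means a capsule cannot be marked rolled back without first having been promoted.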
Every action in the pipeline is logged as a JSON line in `events.jsonl`. Each event includes a SHA-256 checksum and a reference to the previous event's checksum, forming a hash chain. Altering an earlier event breaks the chain at that point, so tampering with the audit log can be detected.
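A hash chain of this kind can be sketched in a few lines. The event layout here (`payload`, `prev`, `checksum`) and the all-zero genesis value are assumptions for illustration, not the actual schema of `events.jsonl`; the log is held in a list of JSON strings rather than a file to keep the example self-contained.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed "previous checksum" for the first event


def _digest(payload: dict, prev: str) -> str:
    """SHA-256 over the payload and the previous checksum, serialized deterministically."""
    body = json.dumps({"payload": payload, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(log: list[str], payload: dict) -> None:
    """Append one JSON line whose checksum covers the previous event's checksum."""
    prev = json.loads(log[-1])["checksum"] if log else GENESIS
    event = {"payload": payload, "prev": prev, "checksum": _digest(payload, prev)}
    log.append(json.dumps(event, sort_keys=True))


def verify_chain(log: list[str]) -> bool:
    """Recompute every checksum and check each link back to the genesis value."""
    prev = GENESIS
    for line in log:
        event = json.loads(line)
        if event["prev"] != prev or event["checksum"] != _digest(event["payload"], prev):
            return False
        prev = event["checksum"]
    return True


log: list[str] = []
append_event(log, {"action": "scan", "skill": "my-skill"})
append_event(log, {"action": "deploy", "skill": "my-skill"})
print(verify_chain(log))
```

Editing the payload of any stored line makes its recomputed checksum differ from the recorded one, so `verify_chain` returns `False`. A chain like this does not by itself reveal that the most recent events were truncated.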
- Python 3.10+
- OpenClaw workspace at `~/.openclaw/`
License: MIT