
pingchesu/hermes-curator-evolver


🧬 Hermes Curator Evolver

Make Hermes skills improve from real usage — with evidence, review, and rollback.

Inspired by SkillClaw, adapted for Hermes Agent as a local-first plugin:
session evidence in, safer skill updates out.


| 📚 Session evidence | 📥 Backfill today | 🧠 Optional semantic search | 🛡️ Guarded automation |
| --- | --- | --- | --- |
| Learn from real Hermes work | Import old session_*.json history | Embedding + rerank only when selected | Append-only notes, backups, rollback |


Quick start: install, backfill, autorun

Copy, paste, done. bootstrap handles the noisy parts: backfill old sessions + enable daily safe autorun.

hermes plugins install pingchesu/hermes-curator-evolver --enable
uv pip install --python ~/.hermes/hermes-agent/venv/bin/python -e ~/.hermes/plugins/curator-evolver
hermes-curator-evolver bootstrap

That is the default, model-free path. It writes only low-risk append-only notes to local agent-created skills. Official/bundled, hub-installed, plugin-provided, skills.external_dirs, pinned, and unknown-source skills are skipped.

Want multilingual semantic/rerank ordering? Make the opt-in explicit:

uv pip install --python ~/.hermes/hermes-agent/venv/bin/python -e "$HOME/.hermes/plugins/curator-evolver[semantic]"
hermes-curator-evolver bootstrap --semantic

Quick checks:

hermes-curator-evolver status
systemctl --user list-timers 'hermes-curator-evolver*' --all --no-pager

If the Hermes gateway was already running, restart it once so the plugin hooks are loaded. For health checks, timer logs, model details, and uninstall steps, see docs/after-install.md.

At a glance

| 1. Collect | 2. Rank | 3. Improve | 4. Protect |
| --- | --- | --- | --- |
| Tool calls + skill loads + old sessions | Evidence counts; optional Qwen + bge rerank | Daily append-only notes | Only local agent-created skills are writable |

flowchart LR
    S[Hermes sessions + tool calls] --> DB[(SQLite evidence)]
    DB --> T[daily bootstrap timer]
    T --> A[append notes to local agent-created skills]
    T -. skip .-> P[official / hub / external / pinned skills]
    A --> B[backup + rollback manifest]
| User concern | Short answer |
| --- | --- |
| Will it run by itself? | Yes. bootstrap enables a daily user-level timer. |
| Will it rewrite my skills? | No. Autorun only updates a managed append-only block. |
| Will it touch official/team skills? | No. Provenance gate skips bundled, hub, plugin, and external_dirs skills. |
| Can I inspect first? | Yes. auto-run --format json is dry-run by default. |
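
To make "managed append-only block" concrete, here is a minimal Python sketch of the idea. The marker strings and the function name `append_managed_note` are assumptions chosen for illustration; the plugin's real block format may differ.

```python
# Hypothetical markers delimiting the region automation is allowed to touch.
BEGIN = "<!-- curator-evolver:begin -->"
END = "<!-- curator-evolver:end -->"


def append_managed_note(skill_text: str, note: str) -> str:
    """Return skill_text with `note` appended inside the managed block.

    Text outside the markers is never modified, and existing notes are kept.
    """
    line = f"- {note.strip()}"
    if BEGIN in skill_text and END in skill_text:
        head, rest = skill_text.split(BEGIN, 1)
        body, tail = rest.split(END, 1)
        if line in body.splitlines():
            return skill_text  # idempotent: the same note is not added twice
        body = body.rstrip("\n") + "\n" + line + "\n"
        return head + BEGIN + body + END + tail
    # No managed block yet: create one at the end of the file.
    base = skill_text if skill_text.endswith("\n") else skill_text + "\n"
    return f"{base}\n{BEGIN}\n{line}\n{END}\n"
```

Because the function only ever adds lines between the two markers, everything a human wrote above the block stays byte-for-byte identical across runs.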

Why this exists

Hermes skills are operational memory. They capture how an agent should debug, deploy, research, and communicate in a real environment. But memory decays: stale commands, duplicated workflows, missing caveats, weak trigger descriptions, and hard-won lessons trapped in old session logs.

Hermes Curator Evolver closes that loop: session evidence in, safer skill updates out — without patching Hermes core or silently rewriting your skill library.

Inspired by SkillClaw, made Hermes-native

SkillClaw showed the right idea: agents should evolve skills from session trajectories. Hermes Curator Evolver adapts that idea to a local-first Hermes plugin.

| SkillClaw lesson | Hermes-native adaptation |
| --- | --- |
| Learn from sessions. | Runtime hooks + historical backfill feed local SQLite evidence. |
| Retrieve similar skills before editing. | Lexical search by default; optional Qwen embeddings + bge reranking. |
| Verify skill changes. | Dry-run proposals, verifier gates, exact SHA match, backups, rollback. |
| Avoid uncontrolled mutation. | No Hermes core patches, pinned skills are skipped, official/hub/external/plugin skills are protected from unattended writes, autorun is append-only. |
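
As a rough picture of how historical backfill can feed local SQLite evidence, the sketch below records recent `session_*.json` files in a table. The table name `sessions`, its columns, and the function `backfill_sessions` are hypothetical; the plugin's real schema and the session file layout are not documented here.

```python
import json
import sqlite3
import time
from pathlib import Path


def backfill_sessions(sessions_dir: Path, db_path: str, days: int = 30) -> int:
    """Record recent session_*.json files in a local SQLite table.

    Returns the number of newly recorded sessions.
    """
    cutoff = time.time() - days * 86400
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per session file, keyed by path.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "path TEXT PRIMARY KEY, mtime REAL, payload TEXT)"
        )
        added = 0
        for path in sorted(sessions_dir.glob("session_*.json")):
            mtime = path.stat().st_mtime
            if mtime < cutoff:
                continue  # older than the requested window
            try:
                payload = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, json.JSONDecodeError):
                continue  # skip unreadable or malformed files
            cur = conn.execute(
                "INSERT OR IGNORE INTO sessions VALUES (?, ?, ?)",
                (str(path), mtime, json.dumps(payload)),
            )
            added += cur.rowcount  # 0 when the path was already recorded
        conn.commit()
        return added
    finally:
        conn.close()
```

Keying on the file path makes a repeated backfill a no-op, which is the property a daily timer needs.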

Architecture

See docs/architecture.md for the one-page architecture diagram, model usage plan, and safety boundary. See docs/after-install.md for the post-install autorun guide, health checks, uninstall path, and supported models.

flowchart LR
    H[Hermes runtime] --> P[curator-evolver plugin]
    P --> DB[(local SQLite evidence)]
    DB --> R[reports]
    R --> Proposal[dry-run proposals]
    Proposal --> Verify[verifier gate]
    Verify --> Human[human approval]
    Human --> Apply[guarded apply + rollback]
    DB --> Auto[auto-run low-risk append]
    Auto --> Apply

Model usage plan

| Phase | Model | Purpose | Default |
| --- | --- | --- | --- |
| v0.1 | None | Evidence collection and report aggregation. | Local/read-only. |
| v0.2 | Hermes configured chat model | Draft improvement proposals from evidence + skill text. | Optional --draft-with-model; dry-run artifact; no skill writes. |
| v0.2 | Deterministic verifier + future verifier prompt | Check grounding, safety, and non-destructive behavior. | Blocks mutation by default. |
| v0.3/v0.5 | Qwen/Qwen3-Embedding-0.6B | Candidate skill/evidence/user-correction search. | Optional --execute-semantic; no default download. |
| v0.3/v0.5 | BAAI/bge-reranker-v2-m3 | Re-rank candidates, especially for mixed Chinese/English agent workflows. | Optional --rerank; no default download. |
| v0.4 | Verifier + local validation command | Guard final reviewed content before apply. | Requires approval, backup, verification, rollback. |
| v0.6 | None by default | Automatic low-risk append-only skill updates from observed evidence. | Optional install-auto; no Hermes core modification. |
| v0.7 | Qwen/Qwen3-Embedding-0.6B + BAAI/bge-reranker-v2-m3 | Optional model-assisted autorun candidate ordering. | Explicit --semantic-candidates --rerank-candidates; models only reorder evidence-eligible candidates. |
| v0.9 | None | Provenance-safe unattended auto-apply. | Writes only local agent-created skills; skips bundled, hub, plugin, external, pinned, and unknown sources. |
| v0.10 | None by default | One-command setup and clearer public README. | bootstrap backfills sessions and installs/enables autorun; bootstrap --semantic is explicit model opt-in. |
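
The v0.7 rule that models only reorder evidence-eligible candidates can be sketched as a filter followed by a sort. This is an illustration under assumptions: `order_candidates`, `min_evidence`, and the `score` callable are hypothetical names, and the real threshold and scoring are not specified in this README.

```python
from typing import Callable, Dict, List, Optional


def order_candidates(
    evidence_counts: Dict[str, int],
    min_evidence: int,
    score: Optional[Callable[[str], float]] = None,
) -> List[str]:
    """Return skills that pass the evidence threshold, in priority order.

    `score` stands in for an embedding or reranker score (an assumption).
    It can change the order of eligible skills but never add new ones.
    """
    eligible = [s for s, n in evidence_counts.items() if n >= min_evidence]
    # Default ordering: most evidence first, name as a stable tie-breaker.
    eligible.sort(key=lambda s: (-evidence_counts[s], s))
    if score is not None:
        # Model scores reorder the list; membership is already fixed.
        eligible.sort(key=score, reverse=True)
    return eligible
```

Applying the threshold before any model is consulted means a high semantic score alone can never make a skill writable.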

Safety model

The guarded path requires:

  1. evidence report,
  2. dry-run proposal,
  3. verifier pass,
  4. human-reviewed content,
  5. exact target SHA256 match,
  6. explicit --approve,
  7. backup manifest,
  8. optional validation command,
  9. rollback path.
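
The write-side steps of this list (SHA256 match, explicit approval, backup manifest, optional validation, rollback) can be sketched in Python as follows. This is not the plugin's implementation: `guarded_apply`, `rollback`, and the manifest fields are hypothetical, and the evidence, proposal, and verifier steps are assumed to have happened earlier.

```python
import hashlib
import json
import shutil
import subprocess
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def guarded_apply(
    target: Path,
    content: str,
    expected_sha256: str,
    backup_dir: Path,
    approve: bool = False,
    verify_command: Optional[List[str]] = None,
) -> Path:
    """Write reviewed content to target and return the backup manifest path."""
    if not approve:
        raise PermissionError("refusing to write without explicit approval")
    if sha256_of(target) != expected_sha256:
        raise ValueError("target changed since review: SHA256 mismatch")
    # Back up before writing; each run gets its own timestamped directory.
    backup_dir.mkdir(parents=True, exist_ok=True)
    prefix = time.strftime("%Y%m%dT%H%M%S-")
    run_dir = Path(tempfile.mkdtemp(prefix=prefix, dir=backup_dir))
    backup = run_dir / target.name
    shutil.copy2(target, backup)
    manifest = run_dir / "manifest.json"
    # Hypothetical manifest layout, just enough to undo the write.
    manifest.write_text(
        json.dumps({"target": str(target), "backup": str(backup)}, indent=2),
        encoding="utf-8",
    )
    target.write_text(content, encoding="utf-8")
    if verify_command is not None:
        if subprocess.run(verify_command).returncode != 0:
            shutil.copy2(backup, target)  # failed validation restores the backup
            raise RuntimeError("validation failed: backup restored")
    return manifest


def rollback(manifest: Path) -> None:
    """Restore the backed-up file recorded in a manifest."""
    data = json.loads(manifest.read_text(encoding="utf-8"))
    shutil.copy2(data["backup"], data["target"])
```

Checking the hash before creating the backup keeps a stale review from producing any side effects at all, and writing the manifest before the target means a rollback path exists even if the write is interrupted.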

Hard defaults:

  • ✅ Evidence/report/proposal/candidate commands do not mutate skills.
  • ✅ Semantic mode does not download models by default; --execute-semantic / --rerank are explicit opt-ins.
  • ✅ Apply refuses to run without --approve.
  • ✅ Apply refuses if the target SHA256 changed.
  • ✅ Apply creates a backup before writing.
  • ✅ Failed validation auto-restores the backup.
  • ✅ auto-run writes only managed append-only blocks and still requires both --apply-low-risk and --approve-auto-apply before mutation.
  • ✅ Even with both write flags, unattended auto-apply writes only local agent-created skills. Official/bundled skills (.bundled_manifest), hub-installed skills (.hub/lock.json), plugin-provided skills, skills.external_dirs, pinned skills, and unknown sources are skipped.
  • ✅ --semantic-candidates / --rerank-candidates are explicit opt-ins and only reorder skills that already passed the evidence threshold.
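
The provenance rule above is a default-deny check: a skill is writable only when it is positively identified as local and agent-created. A minimal Python sketch, assuming the source of each skill has already been resolved into a label; `SkillInfo`, the label strings, and `is_auto_writable` are hypothetical, and how the plugin reads `.bundled_manifest` or `.hub/lock.json` is not shown.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable

WRITABLE_SOURCE = "agent-created"  # hypothetical label for local agent-created skills


@dataclass(frozen=True)
class SkillInfo:
    name: str
    source: str  # e.g. "bundled", "hub", "plugin", "external", "unknown"
    pinned: bool = False


def is_auto_writable(skill: SkillInfo, block: Iterable[str] = ()) -> bool:
    """Default-deny gate for unattended writes.

    Only one source label is accepted, so bundled, hub, plugin, external,
    and unknown sources are all rejected without being listed one by one.
    """
    if skill.pinned or skill.source != WRITABLE_SOURCE:
        return False
    # Block patterns (in the spirit of --block-auto-apply-skill 'github-*')
    # can only narrow the writable set further.
    return not any(fnmatchcase(skill.name, pattern) for pattern in block)
```

Comparing against a single allowed label, rather than a list of forbidden ones, means any new or unrecognized source is skipped automatically.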

CLI reference

# One-command bootstrap
hermes-curator-evolver bootstrap
hermes-curator-evolver bootstrap --semantic
hermes-curator-evolver bootstrap --format json

# Evidence
hermes-curator-evolver status
hermes-curator-evolver report --days 7 --format json
hermes-curator-evolver backfill-sessions --sessions-dir ~/.hermes/sessions --days 30 --format json
hermes-curator-evolver analyze --skill hermes-agent --days 30

# Proposal + verifier
hermes-curator-evolver propose --skill hermes-agent --skill-file ./SKILL.md --format json --output proposal.json
hermes-curator-evolver propose --skill hermes-agent --skill-file ./SKILL.md --draft-with-model --model-timeout 180
hermes-curator-evolver verify --proposal-file proposal.json --skill hermes-agent --format json

# Candidate generation
hermes-curator-evolver candidates --query "gateway restart plugin cli" --skills-dir ~/.hermes/skills
hermes-curator-evolver candidates --query "中文 mixed agent skill" --skills-dir ~/.hermes/skills --semantic --format json       # plan only
hermes-curator-evolver candidates --query "中文 mixed agent skill" --skills-dir ~/.hermes/skills --execute-semantic --format json
hermes-curator-evolver candidates --query "中文 mixed agent skill" --skills-dir ~/.hermes/skills --execute-semantic --rerank --format json

# Guarded apply
sha256sum ./SKILL.md
hermes-curator-evolver apply \
  --target ./SKILL.md \
  --content-file ./reviewed-SKILL.md \
  --expected-sha256 <current-sha256> \
  --backup-dir .curator-evolver-backups \
  --verify-command "python -m pytest -q" \
  --approve

# Rollback
hermes-curator-evolver rollback --manifest .curator-evolver-backups/<timestamp>/manifest.json

# Automatic evolution
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --format json                  # dry-run
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --semantic-candidates --rerank-candidates --format json
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --semantic-candidates --rerank-candidates --apply-low-risk --approve-auto-apply
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --block-auto-apply-skill 'github-*'
hermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --allow-auto-apply-skill store-playbook  # only within local agent-created source boundary
hermes-curator-evolver install-auto --schedule daily --enable
hermes-curator-evolver install-auto --schedule daily --enable --semantic-candidates --rerank-candidates
hermes-curator-evolver uninstall-auto

Contributing

Contributions are welcome. See CONTRIBUTING.md for local setup, TDD expectations, PR checklist, smoke tests, and CI behavior.

Credits and inspiration

Inspired by SkillClaw — especially the idea that agent skills should evolve from real session evidence, not only from hand-written maintenance. Hermes Curator Evolver keeps that inspiration, but applies it through Hermes-native plugin hooks, local SQLite evidence, explicit model opt-ins, and conservative guarded writes.

Uninstall

Hermes already provides plugin removal. If you enabled the optional auto-evolve timer, remove it first:

hermes-curator-evolver uninstall-auto

Then disable and uninstall the plugin:

hermes plugins disable curator-evolver
hermes plugins uninstall curator-evolver   # alias: remove/rm

Plugin removal does not delete historical evidence by default. Remove it manually only if you want a clean slate:

rm -rf ~/.hermes/plugins/curator-evolver/data ~/.hermes/plugins/curator-evolver/backups

Agent tool

When enabled, Hermes can call:

curator_evidence_report

to retrieve a JSON evidence report.

Install from source

git clone https://github.com/pingchesu/hermes-curator-evolver.git
cd hermes-curator-evolver
python -m pip install -e .
hermes plugins enable curator-evolver

If your Hermes environment does not provide pip, use:

uv pip install -e .

Directory-plugin install

You can also symlink this repository into the Hermes plugin directory:

mkdir -p ~/.hermes/plugins
ln -s /path/to/hermes-curator-evolver ~/.hermes/plugins/curator-evolver
hermes plugins enable curator-evolver

Data location

Default:

~/.hermes/plugins/curator-evolver/data/evidence.sqlite

Override:

export HERMES_CURATOR_EVOLVER_DB=/custom/path.sqlite
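
For scripts that need to open the same database, the lookup order described above (environment override first, then the default path) can be mirrored in a few lines of Python. `resolve_db_path` is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "HERMES_CURATOR_EVOLVER_DB"
DEFAULT_DB = Path("~/.hermes/plugins/curator-evolver/data/evidence.sqlite")


def resolve_db_path(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the evidence database path, honoring the environment override."""
    env = os.environ if environ is None else environ
    override = env.get(ENV_VAR, "").strip()
    # Assumption: an empty or whitespace-only value falls back to the default.
    return Path(override).expanduser() if override else DEFAULT_DB.expanduser()
```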

Roadmap status

  • v0.1 — evidence/report plugin.
  • v0.2 — proposal generation + verifier gate, dry-run by default.
  • v0.3 — candidate generation interface with optional embedding/reranker model plan.
  • v0.4 — guarded apply with explicit approval, backup, verification, and rollback.
  • v0.5 — explicit model execution paths: Hermes chat-model drafts, Qwen embedding candidate ranking, and bge reranking.
  • v0.6 — plug-and-play auto-run + optional systemd timer for low-risk append-only skill improvements without Hermes core changes.
  • v0.7 — explicit --semantic-candidates / --rerank-candidates for model-assisted autorun candidate ordering.
  • v0.8 — backfill-sessions for existing Hermes history, CONTRIBUTING.md, and GitHub Actions CI.
  • v0.9 — provenance-safe autorun: only local agent-created skills can be auto-applied; bundled, hub, plugin, external, pinned, and unknown sources are skipped.
  • v0.10 — bootstrap one-command setup plus a shorter, visual quick start.

Built for people who want agent skills to improve — without letting automation silently rewrite the library.
