Releases: intertwine/security-verifiers

v0.3.0 — Prime Lab Integration

17 Feb 14:31

Highlights

  • Prime Lab Integration (WP2.5): Hosted training and evaluation scaffolding with prime lab / prime env CLI support

    • TOML configs for hosted RL and eval (configs/rl/, configs/eval/, configs/endpoints.toml)
    • Makefile targets: lab-check, lab-run-e1/e2, lab-eval-e1/e2, env-eval-e1/e2
    • Compatibility gate script (scripts/prime_lab_check.py) with Rich output parsing fix
    • Metadata normalizer for hosted → SV-Bench schema (scripts/normalize_hosted_eval.py)
  • WP2 Baselines & Public Mini Sets (#61): Baseline evaluation framework on public mini datasets

  • Metrics & Reporting (#56, #58): Metrics contracts, E1/E2 report generators, summary.json auto-write

  • Research: SOAR investigation (#63), MaxRL investigation for future experiments

Breaking Changes

None. All new functionality is additive.

Dependency Updates

  • prime CLI: requires ≥0.5.37 for lab/env subcommands
  • prime-rl v0.4.0 available as optional dependency (uv sync --extra lab)
  • security-verifiers-utils bumped to 0.3.0
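A minimum-version requirement like the prime CLI one above can be checked with a numeric comparison. The sketch below is illustrative only: `parse_version` and `meets_minimum` are hypothetical helpers, not the logic of scripts/prime_lab_check.py, and they assume plain dotted numeric versions with no pre-release suffixes.

```python
# Hypothetical helpers for illustration; not taken from the repository.
MIN_PRIME_CLI = "0.5.37"  # minimum stated in the release notes


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version such as '0.5.37' into integers."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(installed: str, minimum: str = MIN_PRIME_CLI) -> bool:
    """Return True when the installed version is at least the minimum."""
    # Tuples compare element by element, so 0.5.4 sorts below 0.5.37.
    return parse_version(installed) >= parse_version(minimum)
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.5.4" would sort above "0.5.37" and wrongly pass the check.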

Full Changelog

v0.2.3...v0.3.0

v0.2.3

24 Jan 21:10

What's Changed

Fixes

  • Handle markdown code blocks in model JSON outputs (models such as qwen and llama wrap JSON in ```json ... ``` fences)
  • Fix E1/E2 report scripts to find OpenRouter nested model directories
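The first fix above concerns model replies that wrap JSON in a markdown code fence. As a rough sketch of the general technique, assuming the reply contains at most one fenced block, the fence can be stripped before parsing. `parse_model_json` is a hypothetical name used for illustration and is not the repository's actual parser.

```python
import json
import re

# Build the fence marker programmatically so this example nests cleanly.
FENCE = "`" * 3

# Opening fence, optional language tag (e.g. "json"), body, closing fence.
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"(?:[A-Za-z0-9_-]+)?\s*(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def parse_model_json(text: str):
    """Parse JSON from a model reply, tolerating a markdown code fence.

    Hypothetical helper: if a fenced block is present, its body is parsed;
    otherwise the whole reply is treated as JSON.
    """
    match = _FENCE_RE.search(text)
    candidate = match.group(1) if match else text
    return json.loads(candidate.strip())
```

For example, a reply consisting of a fenced `{"label": "malicious"}` and a bare `{"label": "malicious"}` both parse to the same dictionary. Anything that is still not valid JSON raises `json.JSONDecodeError`, which the caller would need to handle.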

Improvements

  • Add agent skills for common repo workflows (sv-eval, sv-data, sv-deploy, sv-hf, sv-dev)
  • Update skill documentation with troubleshooting and workflows

Deployments

  • security-verifiers-utils v0.2.3 → PyPI
  • sv-env-network-logs v0.2.11 → Hub
  • sv-env-config-verification v0.2.12 → Hub
  • sv-env-code-vulnerability v0.2.5 → Hub

v0.2.2

23 Jan 15:31
cd03cbc

What's Changed

  • Prepare E3 vulnerability environment for production by @intertwine in #34
  • Plan E4 phishing detection production deployment by @intertwine in #35
  • Plan E5 production environment setup by @intertwine in #36
  • Plan E6 Red-Team Defense production environment by @intertwine in #37
  • Fix: robust response parsing, JSON patching, Weave init side-effects, E2 packaging, reward & tempfile fixes by @intertwine in #52
  • chore: bump verifiers to >=0.1.9, sv_shared to 0.2.2 by @intertwine in #53
  • chore: bump versions to v0.2.2+, require security-verifiers-utils>=0.2.2 by @intertwine in #54

Full Changelog: v0.2.1...v0.2.2

v0.2.1

05 Nov 04:09

Full Changelog: v0.2.0...v0.2.1

v0.2.0

30 Oct 15:00

Full Changelog: v0.1.5...v0.2.0

v0.1.5

24 Oct 21:14

What's Changed

  • docs: add Prime Intellect Environments Hub compatibility analysis by @intertwine in #32
  • feat: implement Prime Intellect Environments Hub compatibility by @intertwine in #33

Full Changelog: v0.1.4...v0.1.5

v0.1.4

23 Oct 06:39

Full Changelog: v0.1.3...v0.1.4

v0.1.3

19 Oct 03:57

Full Changelog: v0.1.2...v0.1.3

v0.1.2

09 Oct 19:34

Full Changelog: v0.1.1...v0.1.2

v0.1.1

01 Oct 15:14

Full Changelog: v0.1.0...v0.1.1