safe-mini

License: MIT · Status: alpha · Python 3.11+

A safe-by-construction local-execution substrate for mini-swe-agent–style bash-action coding agents.

Sandboxed boundary, generous capability inside. Designed to be auditable in one focused sitting.


What this is

safe-mini is the load-bearing runtime substrate for mini-swe-agent–style coding agents that decide one bash command at a time within a budget.

Multiple peer consumers depend on it:

| Consumer | Role |
| --- | --- |
| JustAi | Project orchestrator: goal decomposition, chunk sizing, dashboards. |
| local-resident | Researcher harness: benchmark corpus + 54-trial calibration matrix. |

safe-mini is intentionally generic: it does not know about its consumers' domain models. Future projects can ship on top of the same substrate.
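To make the "generic substrate" idea concrete, here is a minimal sketch of how a consumer might depend on a structural contract rather than on a concrete runner. Only the names Chunk, Budget, RunResult and AgentRunner come from this README; every field, the `run` method signature, `NoOpRunner` and `dispatch` are assumptions made for illustration and may not match the real package.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Chunk:
    """One unit of work for a runner. Field names are assumptions."""
    goal: str
    repo_path: str


@dataclass(frozen=True)
class Budget:
    """Caps on bash actions and on observation characters (assumed fields)."""
    max_moves: int
    max_observation_chars: int


@dataclass(frozen=True)
class RunResult:
    """Outcome of one run. Field names are assumptions."""
    solved: bool
    moves_used: int


@runtime_checkable
class AgentRunner(Protocol):
    """Hypothetical shape of the contract consumers depend on."""

    def run(self, chunk: Chunk, budget: Budget) -> RunResult: ...


class NoOpRunner:
    """Toy runner: satisfies the protocol structurally, executes nothing."""

    def run(self, chunk: Chunk, budget: Budget) -> RunResult:
        return RunResult(solved=False, moves_used=0)


def dispatch(runner: AgentRunner, chunk: Chunk, budget: Budget) -> RunResult:
    # A consumer depends only on the protocol, never on a concrete runner class.
    return runner.run(chunk, budget)
```

Because `typing.Protocol` is structural, `NoOpRunner` needs no inheritance from `AgentRunner`; an orchestrator or a benchmark harness could each supply its own domain objects and still call the same `run` entry point.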

What lives here

  • Runner loop — prompt → one bash action → observation → repeat
  • Action protocol parsers — fenced bash block, JSON action object
  • Executor policies — open / safe / allowlist (and future variants)
  • Observation policies — full / tail / headtail / structured / structured+raw-tail
  • Worktree provisioner — fresh copy of repo per run, scoped HOME, sanitized PATH
  • Command/path guard — configurable denylist + sensitive-path patterns
  • Incident artifact emission — full transcript saved per run
  • Failure classifier — 7-class taxonomy
  • Trajectory recording — per-step timeline with phase markers
  • Ledger — per-run cost/token/latency capture
  • Canonical types — Chunk, Budget, RunResult, FailureClass, ObservationPolicy, ExecutorPolicy
  • AgentRunner Protocol — the contract consumers depend on
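The two action protocols named in the list above (fenced bash block and JSON action object) can be sketched as a pair of small parsers. This is an illustrative sketch, not the package's parser: the function names, the `ActionParseError` exception and the `"command"` JSON key are all assumptions.

```python
import json
import re

# Built programmatically so this example does not contain a literal fence.
FENCE = "`" * 3

# First block tagged bash or sh; DOTALL lets the body span several lines.
_FENCE_RE = re.compile(FENCE + r"(?:bash|sh)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


class ActionParseError(ValueError):
    """Raised when model output contains no valid action (hypothetical name)."""


def parse_fenced_bash(text: str) -> str:
    """Return the command inside the first fenced bash block."""
    match = _FENCE_RE.search(text)
    if match is None or not match.group(1).strip():
        raise ActionParseError("no non-empty fenced bash block found")
    return match.group(1).strip()


def parse_json_action(text: str) -> str:
    """Return the command from a JSON object; the 'command' key is assumed."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ActionParseError(f"invalid JSON: {exc}") from exc
    command = obj.get("command") if isinstance(obj, dict) else None
    if not isinstance(command, str) or not command.strip():
        raise ActionParseError("JSON action needs a non-empty 'command' string")
    return command.strip()
```

In a runner loop, a parse failure like this would plausibly map to the action-protocol-violation failure class rather than crash the run.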

Failure taxonomy

| Class | Meaning |
| --- | --- |
| safety-violation | Agent attempted an action the executor policy denied. |
| action-protocol-violation | Output didn't parse as a valid action. |
| exhausted-ideas | Budget remained but the loop converged without progress. |
| budget-exhausted | Move or observation budget hit the cap. |
| context-starvation | Observations truncated below decision-relevant detail. |
| reward-hacking | Test passed by means unrelated to the requested change. |
| embodiment-failure | Action ran but didn't produce the expected world-state change. |
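One way the seven classes above could be represented is a string-valued enum. The class name FailureClass and the seven string values come from this README; the member names other than BUDGET_EXHAUSTED, and the choice of `Enum` as the representation, are assumptions.

```python
from enum import Enum


class FailureClass(str, Enum):
    """Sketch of the 7-class taxonomy; values mirror the table's class names."""

    SAFETY_VIOLATION = "safety-violation"
    ACTION_PROTOCOL_VIOLATION = "action-protocol-violation"
    EXHAUSTED_IDEAS = "exhausted-ideas"
    BUDGET_EXHAUSTED = "budget-exhausted"
    CONTEXT_STARVATION = "context-starvation"
    REWARD_HACKING = "reward-hacking"
    EMBODIMENT_FAILURE = "embodiment-failure"


def classify_label(label: str) -> FailureClass:
    """Look up a class by its string value; raises ValueError if unknown."""
    return FailureClass(label)
```

Keeping the hyphenated strings as values means a recorded label such as "budget-exhausted" round-trips to the enum member without a separate mapping table.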

Two-axis budget

Every run is bounded on TWO axes:

  • Move budget — how many bash actions the agent can execute.
  • Observation budget — how many characters of output the agent can read.

Both are enforced inside the runner. Either axis can independently fire BUDGET_EXHAUSTED.

Installation

Note: safe-mini is in alpha and not yet on PyPI. Pinning options are listed below.

From git (Phase A — current)

pip install 'safe-mini @ git+https://github.com/JustinJLeopard/safe-mini.git@v0.1.0'

From source (development)

git clone https://github.com/JustinJLeopard/safe-mini.git
cd safe-mini
python3 -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'

pytest -q
ruff check safe_mini tests
mypy safe_mini

From PyPI (Phase B — pending stabilization)

pip install safe-mini  # not yet published

Status

Current state: initial scaffolding (v0.1.0, alpha). The reference implementation from the pre-public 54-trial calibration study is preserved under reference/ for transparency. The production package under safe_mini/ is currently a typed scaffold, with the canonical types and the AgentRunner Protocol locked in via tests. Porting the concrete SafeMiniRunner implementation is the next pass.

Empirical baseline (from the lab study, 6 task families × 9 configs = 54 trials):

  • "Open" executor leaked a fake credential 6 / 6 probe runs while still solving the task.
  • "Safe" executor blocked 6 / 6 probes and still solved 6 / 6 tasks.
  • The reproduce_first workflow averaged 2 steps, versus 3 for inspect_first.
  • headtail and structured observations beat pure tail on noisy output (tail dropped early failure clues).
  • JSON and fenced-bash action protocols were equivalent in deterministic tests; the malformed-action rate with live models remains an open question.
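The tail versus headtail finding above can be illustrated with two truncation helpers. These are simplified, character-based stand-ins written for this example; the function names, the truncation marker and the even head/tail split are assumptions, and the sample output is synthetic rather than taken from the lab study.

```python
def tail(text: str, limit: int) -> str:
    """Keep only the last `limit` characters."""
    if limit <= 0:
        return ""
    return text if len(text) <= limit else text[-limit:]


def headtail(text: str, limit: int, marker: str = "\n...[truncated]...\n") -> str:
    """Keep the first and last halves of the limit, joined by a marker.

    The marker is added on top of `limit`, so the result can be slightly longer.
    """
    if limit <= 0:
        return ""
    if len(text) <= limit:
        return text
    head_len = limit // 2
    tail_len = limit - head_len  # at least 1 whenever limit >= 1
    return text[:head_len] + marker + text[-tail_len:]


# Synthetic noisy output: the decisive clue is on the first line.
output = (
    "ERROR: import failed in conftest.py\n"
    + "noise line\n" * 200
    + "1 failed in 0.3s\n"
)

tail_view = tail(output, 80)
headtail_view = headtail(output, 80)
```

Here `tail_view` keeps the final summary but loses the leading ERROR line, while `headtail_view` keeps both ends, which matches the kind of early failure clue the bullet describes.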

The lab study artifacts:

  • reference/lab_safe_mini_agent.py — the original 270-line single-file agent loop
  • reference/lab_benchmark_tasks.py — the 6-task corpus
  • reference/lab_benchmark_safe_mini.py — the matrix-runner

These will be factored into the production package, and the benchmark harness will move to the local-resident repo.

Three-repo architecture

safe-mini is one of three repos; JustAi and local-resident both depend on it:

        ┌─────────────────────────┐         ┌────────────────────────────┐
        │      JustAi             │         │      local-resident        │
        │  (orchestrator)         │         │  (researcher harness)      │
        └────────────┬────────────┘         └──────────────┬─────────────┘
                     │                                     │
                     └──────────────┬──────────────────────┘
                                    ▼
                            ┌──────────────┐
                            │  safe-mini   │
                            │ (this repo)  │
                            └──────────────┘

Contributing

This is currently a personal-research-stage project. Issue reports and design discussions are welcome via GitHub Issues. PRs for non-trivial changes are accepted after an issue-first design review.

License

MIT — see LICENSE.
