safe-exec

Safe tool execution for LLM agents.
Prevents catastrophic mistakes by detecting when a session is "tired" and automatically gating dangerous tools.

pip install safe-exec

Zero dependencies. Python 3.10+. Works with any LLM framework.


The problem

You gave your LLM agent access to run_bash. It works great — until session 3, when context is bloated, the model has already made two errors, and it decides to rm -rf the wrong directory. You had no warning. There was no checkpoint.

safe-exec puts a gate between the LLM and every tool it can call. The gate uses three observable signals — context size, session duration, and recent error rate — to compute a fatigue score. When fatigue is high, dangerous tools require confirmation. When fatigue is critical, they're denied outright.

Every decision is written to an append-only witness.log.


Quick start

from safe_exec import SafeExecutor
from pathlib import Path

executor = SafeExecutor(workspace=Path("/tmp/my_workspace"))

@executor.register("read_file", risk="low")
def read_file(path: str) -> str:
    return open(path).read()

@executor.register("write_file", risk="medium")
def write_file(path: str, content: str) -> str:
    open(path, "w").write(content)
    return f"Written {len(content)} chars"

@executor.register("run_bash", risk="high")
def run_bash(cmd: str) -> str:
    import subprocess
    return subprocess.check_output(cmd, shell=True, text=True)

# Execute — the gate decides automatically
result = executor.execute("run_bash", cmd="ls -la")

if result.ok:
    print(result.output)
elif result.skipped:
    print(f"Skipped: {result.decision.value}")  # ask or deny

Built-in tools

A standard set of file and shell tools, all workspace-sandboxed:

from safe_exec import SafeExecutor
from safe_exec.built_in_tools import register_defaults
from pathlib import Path

executor = SafeExecutor(workspace=Path("/tmp/workspace"))
register_defaults(executor)

# Now you have: read_file, write_file, append_file, list_dir,
# delete_file, copy_file, run_bash, run_python, delete_dir

All file operations are restricted to the workspace. Path traversal attempts raise ValueError.
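
For instance, a traversal attempt is rejected before the tool body runs. A quick sketch, assuming the raised ValueError surfaces on the result object rather than propagating (the error wording below is illustrative):

# Hypothetical illustration: the resolved path falls outside /tmp/workspace.
result = executor.execute("read_file", path="../../etc/passwd")
if not result.ok:
    print(result.error)   # ValueError message; exact wording depends on the library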


Gate policy

The default policy table:

| Risk   | Fatigue < 0.35 | 0.35 ≤ Fatigue < 0.70 | Fatigue ≥ 0.70 |
|--------|----------------|-----------------------|----------------|
| low    | ALLOW          | ALLOW                 | ASK            |
| medium | ALLOW          | ASK                   | DENY           |
| high   | ASK            | ASK                   | DENY           |

High-risk tools always require confirmation by default (always_ask_high=True).

Fatigue score

Three signals, fully tuneable:

from safe_exec import SystemMetrics

metrics = SystemMetrics(
    max_context  = 4000,    # tokens → score 1.0
    max_duration = 1800.0,  # seconds → score 1.0
    max_errors   = 5,       # error count → score 1.0
    weight_context  = 0.40,
    weight_duration = 0.30,
    weight_errors   = 0.30,
)
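
The combined score is presumably a weighted sum of each signal normalised against its maximum and capped at 1.0. The sketch below is an assumption for intuition, not the library's implementation:

# Hypothetical sketch of the fatigue computation (not the library's source);
# assumes SystemMetrics exposes its constructor arguments as attributes and
# that each signal is scaled to [0, 1] against its max, then weighted.
def fatigue(context_tokens: int, duration_s: float, errors: int, m=metrics) -> float:
    c = min(context_tokens / m.max_context, 1.0)
    d = min(duration_s / m.max_duration, 1.0)
    e = min(errors / m.max_errors, 1.0)
    return m.weight_context * c + m.weight_duration * d + m.weight_errors * e

# Worked example: 1800 tokens, 10 minutes, 1 error
# 0.40 * 0.45 + 0.30 * 0.33 + 0.30 * 0.20 ≈ 0.34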

Update metrics after each LLM turn:

executor.update_metrics(context_tokens=1800)          # after each response
executor.update_metrics(context_tokens=2400, error_occurred=True)  # on error
executor.reset_metrics()                              # after context flush

Custom gate policy

from safe_exec import SafeExecutor, GateDecision, RiskLevel, SystemMetrics
from safe_exec.gate import GatePolicy

class StrictPolicy(GatePolicy):
    def decide(self, risk: RiskLevel, metrics: SystemMetrics) -> GateDecision:
        # High risk is always denied; everything else asks once fatigue exceeds 0.2
        if risk == RiskLevel.HIGH:
            return GateDecision.DENY
        if metrics.fatigue_score() > 0.2:
            return GateDecision.ASK
        return GateDecision.ALLOW

executor = SafeExecutor(policy=StrictPolicy())

Custom confirmation handler

# Non-interactive: always deny ASK (useful in CI)
executor = SafeExecutor(confirm_fn=lambda tool, rationale: False)

# Slack webhook, Telegram bot, etc.
def notify_and_wait(tool_name: str, rationale: str) -> bool:
    send_slack_message(f"Agent wants to run `{tool_name}`: {rationale}")
    return wait_for_approval(tool_name)

executor = SafeExecutor(confirm_fn=notify_and_wait)
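
For interactive runs, a plain terminal prompt works too (a hypothetical helper, not part of the library):

def confirm_in_terminal(tool_name: str, rationale: str) -> bool:
    # Anything other than an explicit "y" denies the call.
    answer = input(f"Allow `{tool_name}`? ({rationale}) [y/N] ")
    return answer.strip().lower() == "y"

executor = SafeExecutor(confirm_fn=confirm_in_terminal)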

Witness log

Every gate decision is recorded:

{"ts":"2026-04-21T10:23:01Z","event":"gate","tool":"run_bash","decision":"ask","details":{"rationale":"risk=high fatigue=0.41 ..."}}
{"ts":"2026-04-21T10:23:04Z","event":"execute","tool":"run_bash","decision":"allow","details":{...}}
{"ts":"2026-04-21T10:31:12Z","event":"gate","tool":"delete_dir","decision":"deny","details":{"rationale":"risk=high fatigue=0.78 ..."}}

Query it with standard tools:

# All denials
grep '"decision":"deny"' witness.log | jq .

# Stats
python3 -c "from safe_exec import SafeExecutor; e=SafeExecutor(); print(e.log_stats())"

Integration examples

OpenAI tool calling

import openai, json
from safe_exec import SafeExecutor
from safe_exec.built_in_tools import register_defaults

executor = SafeExecutor()
register_defaults(executor)

def handle_tool_call(tool_name: str, args: dict):
    result = executor.execute(tool_name, **args)
    if result.ok:
        return str(result.output)
    return f"Tool {result.decision.value}: {result.error or 'gated'}"

# Use handle_tool_call as your tool_call dispatcher in the response loop.
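
To keep the fatigue score current, feed the reported context size back after each completion. A sketch assuming the current openai Python SDK; the model name and the messages/tools variables are placeholders from your own loop:

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",            # placeholder model name
    messages=messages,
    tools=tools,
)
# total_tokens approximates the context the model is carrying this turn.
executor.update_metrics(context_tokens=response.usage.total_tokens)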

Ollama (local LLMs)

from safe_exec import SafeExecutor
from safe_exec.built_in_tools import register_defaults
import urllib.request, json

executor = SafeExecutor()
register_defaults(executor)

# Update fatigue after each Ollama response
def ollama_turn(prompt: str, context_tokens: int) -> str:
    executor.update_metrics(context_tokens=context_tokens)
    # ... call Ollama, parse tool calls, dispatch through executor ...
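
A minimal sketch of the elided call, assuming a local Ollama server on its default port and the /api/generate endpoint; the model name is a placeholder:

def call_ollama(prompt: str, model: str = "llama3") -> dict:
    # POST to the local Ollama HTTP API with streaming disabled.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())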

Status

| Feature                      | Status     |
|------------------------------|------------|
| Gate policy (ALLOW/ASK/DENY) | ✅         |
| Fatigue scoring              | ✅         |
| Witness log                  | ✅         |
| Built-in tools (file + shell)| ✅         |
| Path traversal protection    | ✅         |
| Custom policy                | ✅         |
| Custom confirm handler       | ✅         |
| Async support                | 🔧 planned |
| Token counter integration    | 🔧 planned |

License

MIT — use it, modify it, ship it.
