Human-in-the-loop execution for LLM agents
Python · Updated Jan 11, 2026
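A minimal sketch of what human-in-the-loop execution can look like: every tool call the agent proposes is shown to a human operator, who must approve it before it runs. All names here (ToolCall, human_gate, the example tool) are illustrative, not the API of any listed project.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ToolCall:
    name: str
    args: dict[str, Any]

def human_gate(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Execute `call` only after an explicit yes from the operator."""
    answer = input(f"Agent wants to run {call.name}({call.args}). Approve? [y/N] ")
    if answer.strip().lower() != "y":
        raise PermissionError(f"Operator rejected tool call: {call.name}")
    return tools[call.name](**call.args)

if __name__ == "__main__":
    # Hypothetical tool registry with one destructive-looking example tool.
    tools = {"delete_file": lambda path: print(f"(pretend) deleting {path}")}
    human_gate(ToolCall("delete_file", {"path": "/tmp/scratch.txt"}), tools)
```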
Guardrails for LLMs: detect and block hallucinated tool calls to improve safety and reliability.
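One common way to detect hallucinated tool calls, sketched under assumed names (the registry layout and validate_call are illustrative): check the model's proposed call against a registry of real tools and their declared parameters, and block anything unknown instead of executing it.

```python
import inspect
from typing import Any, Callable

REGISTRY: dict[str, Callable[..., Any]] = {}

def register(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record a real tool so proposed calls can be checked against it."""
    REGISTRY[fn.__name__] = fn
    return fn

@register
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def validate_call(name: str, args: dict[str, Any]) -> None:
    if name not in REGISTRY:
        raise ValueError(f"Hallucinated tool: {name!r} is not registered")
    allowed = set(inspect.signature(REGISTRY[name]).parameters)
    unknown = set(args) - allowed
    if unknown:
        raise ValueError(f"Hallucinated arguments for {name}: {sorted(unknown)}")

validate_call("read_file", {"path": "notes.txt"})   # passes
# validate_call("format_disk", {})                  # raises ValueError
```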
🛡️ Safe AI agents through an action classifier.
The missing safety layer for AI agents: adaptive high-friction guardrails (time locks, biometrics) for critical operations to prevent catastrophic errors.
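A sketch of the time-lock idea, under assumed names (the TimeLock class is hypothetical): a critical action is scheduled and can only execute after a mandatory delay, leaving a window in which a human can cancel it.

```python
import time

class TimeLock:
    def __init__(self, delay_seconds: float):
        self.delay = delay_seconds
        self.unlock_at: float | None = None
        self.cancelled = False

    def request(self) -> None:
        """Start the lock window for a pending critical action."""
        self.unlock_at = time.monotonic() + self.delay

    def cancel(self) -> None:
        self.cancelled = True

    def execute(self, action) -> None:
        if self.cancelled:
            raise PermissionError("Action was cancelled during the lock window")
        if self.unlock_at is None or time.monotonic() < self.unlock_at:
            raise PermissionError("Time lock has not elapsed yet")
        action()

lock = TimeLock(delay_seconds=2.0)
lock.request()
time.sleep(2.1)    # the review window; calling lock.cancel() here would abort
lock.execute(lambda: print("wiping staging database (pretend)"))
```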
Runtime detector for reward hacking and misalignment in LLM agents (89.7% F1 on 5,391 trajectories).
A runtime authorization layer for LLM tool calls: policy, approval, and audit logs.
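A minimal sketch of such an authorization layer, with assumed policy keys and file names: a static policy decides allow/deny/ask, "ask" escalates to an approver, and every decision is appended to an audit log.

```python
import json
import time

POLICY = {"read_file": "allow", "send_email": "ask", "drop_table": "deny"}
AUDIT_LOG = "audit.jsonl"

def authorize(tool: str, args: dict, approver=lambda t, a: False) -> bool:
    decision = POLICY.get(tool, "deny")        # default-deny for unknown tools
    if decision == "ask":
        decision = "allow" if approver(tool, args) else "deny"
    with open(AUDIT_LOG, "a") as log:          # append-only audit trail
        log.write(json.dumps({"ts": time.time(), "tool": tool,
                              "args": args, "decision": decision}) + "\n")
    return decision == "allow"

print(authorize("read_file", {"path": "a.txt"}))    # True, and logged
print(authorize("drop_table", {"name": "users"}))   # False, and logged
```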
Safety-first agentic toolkit: 10 packages for collapse detection, governance, and reproducible runs.
An A2A version of Agent Action Guard: safe AI agents through an action classifier.
An open-source engineering blueprint for defining and designing the core capabilities, boundaries, and ethics of any AI agent.
An energy-based legality-gating SDK for AI reasoning: predicts, repairs, and audits collapse before it happens; reduces hallucinations and provides numeric audit logs.
A hierarchical AI safety architecture with asymmetric supervisory control.
Canonical texts and implementation primitives for the Safe Superintelligence Framework (v1.2.1): Constitution, Minimum Rescue Protocol, system prompt, decision matrix.
🛡️ Safeguard AI agents from harmful actions with A2A-Agent-Action-Guard, ensuring safe tool usage through effective action classification.
A security-first control plane for autonomous AI code agents: sandboxed execution, hash grounding, diff validation, verification, and full auditability.
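One way to read "hash grounding", sketched with illustrative function names rather than that repo's actual API: an edit proposed by the agent carries the SHA-256 of the file version it was generated against, and the control plane refuses to apply it if the file has since changed.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def apply_edit(path: Path, expected_hash: str, new_text: str) -> None:
    # Refuse to write if the file no longer matches the agent's snapshot.
    if file_hash(path) != expected_hash:
        raise RuntimeError(f"{path} changed since the agent read it; refusing to write")
    path.write_text(new_text)

target = Path("example.txt")
target.write_text("original contents\n")
snapshot = file_hash(target)                # hash the agent's view of the file
apply_edit(target, snapshot, "edited by agent\n")   # succeeds: hashes match
```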
Production-ready safety framework preventing identity fusion, synthetic intimacy, and unbounded behavior in AI agent systems. Machine-readable contracts and verse-lang primitives for immediate deployment.
PULSE • Deterministic, fail-closed release gates for Safe & Useful AI: CI-enforced, audit-ready (status.json + Quality Ledger).
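A sketch of a deterministic, fail-closed CI gate in the spirit of the status.json idea above: the gate passes only if the status file exists, parses, and every check reports "pass"; any missing or malformed input blocks the release. The field names are assumptions, not PULSE's real schema.

```python
import json
import sys
from pathlib import Path

def gate(status_path: str = "status.json") -> int:
    try:
        status = json.loads(Path(status_path).read_text())
        checks = status["checks"]              # e.g. {"tests": "pass", ...}
    except (OSError, ValueError, KeyError) as exc:
        print(f"GATE BLOCKED (fail-closed): {exc}")   # any error blocks release
        return 1
    failing = [name for name, state in checks.items() if state != "pass"]
    if failing:
        print(f"GATE BLOCKED: failing checks: {failing}")
        return 1
    print("GATE OPEN: all checks pass")
    return 0

if __name__ == "__main__":
    sys.exit(gate())
```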
🌌 Unify and enhance simulations with Negentropy Constellation, a monorepo of ten robust packages designed for reproducibility and real-world insight.
External kill switch for autonomous runtimes. Validate at enforcement boundaries. Revoke to halt execution.
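A sketch of the external-kill-switch pattern, with a hypothetical file path and helper names: the runtime checks a revocation marker (which could live on storage outside the agent's control) at every enforcement boundary, and halts as soon as the switch is flipped.

```python
from pathlib import Path

KILL_FILE = Path("/tmp/agent-revoked")   # an operator creates this file to halt

def enforce_boundary() -> None:
    """Call at each step/tool-call boundary; raises once revoked."""
    if KILL_FILE.exists():
        raise SystemExit("Kill switch engaged: halting agent run")

def run_agent(steps) -> None:
    for step in steps:
        enforce_boundary()               # validate before every action
        step()

run_agent([lambda: print("step 1"), lambda: print("step 2")])
```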