Human-in-the-loop execution for LLM agents
Python · Updated Jan 11, 2026
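A minimal sketch of what human-in-the-loop execution can look like: every tool call the agent proposes is shown to a human operator, who must approve it before it runs. All names here (ToolCall, human_gate, the example tool) are illustrative, not the API of any listed project.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ToolCall:
    name: str
    args: dict[str, Any]

def human_gate(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Execute `call` only after an explicit yes from the operator."""
    answer = input(f"Agent wants to run {call.name}({call.args}). Approve? [y/N] ")
    if answer.strip().lower() != "y":
        raise PermissionError(f"Operator rejected tool call: {call.name}")
    return tools[call.name](**call.args)

if __name__ == "__main__":
    # Hypothetical tool registry with one destructive-looking example tool.
    tools = {"delete_file": lambda path: print(f"(pretend) deleting {path}")}
    human_gate(ToolCall("delete_file", {"path": "/tmp/scratch.txt"}), tools)
```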
Guardrails for LLMs: detect and block hallucinated tool calls to improve safety and reliability.
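One common way to detect hallucinated tool calls, sketched under assumed names (the registry layout and validate_call are illustrative): check the model's proposed call against a registry of real tools and their declared parameters, and block anything unknown instead of executing it.

```python
import inspect
from typing import Any, Callable

REGISTRY: dict[str, Callable[..., Any]] = {}

def register(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record a real tool so proposed calls can be checked against it."""
    REGISTRY[fn.__name__] = fn
    return fn

@register
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def validate_call(name: str, args: dict[str, Any]) -> None:
    if name not in REGISTRY:
        raise ValueError(f"Hallucinated tool: {name!r} is not registered")
    allowed = set(inspect.signature(REGISTRY[name]).parameters)
    unknown = set(args) - allowed
    if unknown:
        raise ValueError(f"Hallucinated arguments for {name}: {sorted(unknown)}")

validate_call("read_file", {"path": "notes.txt"})   # passes
# validate_call("format_disk", {})                  # raises ValueError
```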
🛡️ Safe AI agents through an action classifier.
The missing safety layer for AI agents: adaptive high-friction guardrails (time locks, biometrics) for critical operations to prevent catastrophic errors.
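A sketch of the time-lock idea, under assumed names (the TimeLock class is hypothetical): a critical action is scheduled and can only execute after a mandatory delay, leaving a window in which a human can cancel it.

```python
import time

class TimeLock:
    def __init__(self, delay_seconds: float):
        self.delay = delay_seconds
        self.unlock_at: float | None = None
        self.cancelled = False

    def request(self) -> None:
        """Start the lock window for a pending critical action."""
        self.unlock_at = time.monotonic() + self.delay

    def cancel(self) -> None:
        self.cancelled = True

    def execute(self, action) -> None:
        if self.cancelled:
            raise PermissionError("Action was cancelled during the lock window")
        if self.unlock_at is None or time.monotonic() < self.unlock_at:
            raise PermissionError("Time lock has not elapsed yet")
        action()

lock = TimeLock(delay_seconds=2.0)
lock.request()
time.sleep(2.1)    # the review window; calling lock.cancel() here would abort
lock.execute(lambda: print("wiping staging database (pretend)"))
```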
Runtime detector for reward hacking and misalignment in LLM agents (89.7% F1 on 5,391 trajectories).
A runtime authorization layer for LLM tool calls: policy, approval, and audit logs.
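A minimal sketch of such an authorization layer, with assumed policy keys and file names: a static policy decides allow/deny/ask, "ask" escalates to an approver, and every decision is appended to an audit log.

```python
import json
import time

POLICY = {"read_file": "allow", "send_email": "ask", "drop_table": "deny"}
AUDIT_LOG = "audit.jsonl"

def authorize(tool: str, args: dict, approver=lambda t, a: False) -> bool:
    decision = POLICY.get(tool, "deny")        # default-deny for unknown tools
    if decision == "ask":
        decision = "allow" if approver(tool, args) else "deny"
    with open(AUDIT_LOG, "a") as log:          # append-only audit trail
        log.write(json.dumps({"ts": time.time(), "tool": tool,
                              "args": args, "decision": decision}) + "\n")
    return decision == "allow"

print(authorize("read_file", {"path": "a.txt"}))    # True, and logged
print(authorize("drop_table", {"name": "users"}))   # False, and logged
```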
Safety-first agentic toolkit: 10 packages for collapse detection, governance, and reproducible runs.
An A2A version of Agent Action Guard: safe AI agents through an action classifier.
An open-source engineering blueprint for defining and designing the core capabilities, boundaries, and ethics of any AI agent.
An energy-based legality-gating SDK for AI reasoning: predicts, repairs, and audits collapse before it happens; reduces hallucinations and provides numeric audit logs.
A hierarchical AI safety architecture with asymmetric supervisory control.
Canonical texts and implementation primitives for the Safe Superintelligence Framework (v1.2.1): Constitution, Minimum Rescue Protocol, system prompt, decision matrix.
🛡️ Safeguard AI agents from harmful actions with A2A-Agent-Action-Guard, ensuring safe tool usage through effective action classification.
A security-first control plane for autonomous AI code agents: sandboxed execution, hash grounding, diff validation, verification, and full auditability.
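One way to read "hash grounding", sketched with illustrative function names rather than that repo's actual API: an edit proposed by the agent carries the SHA-256 of the file version it was generated against, and the control plane refuses to apply it if the file has since changed.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def apply_edit(path: Path, expected_hash: str, new_text: str) -> None:
    # Refuse to write if the file no longer matches the agent's snapshot.
    if file_hash(path) != expected_hash:
        raise RuntimeError(f"{path} changed since the agent read it; refusing to write")
    path.write_text(new_text)

target = Path("example.txt")
target.write_text("original contents\n")
snapshot = file_hash(target)                # hash the agent's view of the file
apply_edit(target, snapshot, "edited by agent\n")   # succeeds: hashes match
```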
Production-ready safety framework preventing identity fusion, synthetic intimacy, and unbounded behavior in AI agent systems. Machine-readable contracts and verse-lang primitives for immediate deployment.
PULSE • Deterministic, fail-closed release gates for Safe & Useful AI: CI-enforced, audit-ready (status.json + Quality Ledger).
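A sketch of a deterministic, fail-closed CI gate in the spirit of the status.json idea above: the gate passes only if the status file exists, parses, and every check reports "pass"; any missing or malformed input blocks the release. The field names are assumptions, not PULSE's real schema.

```python
import json
import sys
from pathlib import Path

def gate(status_path: str = "status.json") -> int:
    try:
        status = json.loads(Path(status_path).read_text())
        checks = status["checks"]              # e.g. {"tests": "pass", ...}
    except (OSError, ValueError, KeyError) as exc:
        print(f"GATE BLOCKED (fail-closed): {exc}")   # any error blocks release
        return 1
    failing = [name for name, state in checks.items() if state != "pass"]
    if failing:
        print(f"GATE BLOCKED: failing checks: {failing}")
        return 1
    print("GATE OPEN: all checks pass")
    return 0

if __name__ == "__main__":
    sys.exit(gate())
```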
🌌 Unify and enhance simulations with Negentropy Constellation, a monorepo of ten robust packages designed for reproducibility and real-world insight.
External kill switch for autonomous runtimes. Validate at enforcement boundaries. Revoke to halt execution.
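A sketch of the external-kill-switch pattern, with a hypothetical file path and helper names: the runtime checks a revocation marker (which could live on storage outside the agent's control) at every enforcement boundary, and halts as soon as the switch is flipped.

```python
from pathlib import Path

KILL_FILE = Path("/tmp/agent-revoked")   # an operator creates this file to halt

def enforce_boundary() -> None:
    """Call at each step/tool-call boundary; raises once revoked."""
    if KILL_FILE.exists():
        raise SystemExit("Kill switch engaged: halting agent run")

def run_agent(steps) -> None:
    for step in steps:
        enforce_boundary()               # validate before every action
        step()

run_agent([lambda: print("step 1"), lambda: print("step 2")])
```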