Skip to content

dacuma-labs/agent-security-audit

Repository files navigation

🛡️ Agent Security Audit

Security audits for the AI agents you build.

Building agents with LangChain, CrewAI, OpenAI Agents, MCP, or any other framework? This skill lets your AI coding assistant audit the agent code you're developing — catching prompt injection risks, tool permission issues, memory poisoning, RAG vulnerabilities, and more — before they reach production.

Just ask your assistant to audit your project. It runs a static scanner on your codebase, interprets the findings in context, and produces a structured security report with actionable mitigations.

This skill follows the Agent Skills format.

This is a skill, not a standalone tool. Your AI coding assistant (GitHub Copilot, Claude, Cursor, etc.) consumes the skill instructions and drives the audit for you — you just ask in natural language.


Installation

npx skills add dacuma-labs/agent-security-audit

Or clone manually:

git clone https://github.com/dacuma-labs/agent-security-audit.git

The scanner requires Python 3.10+ with zero external dependencies (stdlib only).


Usage

Once installed, the skill is automatically available to your AI coding assistant. While you're developing your agent, ask it to run a security audit in natural language:

Example prompts:

Audit the security of my agent project
Check this codebase for prompt injection risks
Review tool permissions and human-in-the-loop controls
Harden my MCP server before deployment
Are there memory poisoning risks in this agent?

Use when you are developing:

  • AI agents with any SDK or framework (LangChain, CrewAI, OpenAI, MCP, etc.)
  • Tool-calling systems that need permission and validation checks
  • RAG pipelines that could be vulnerable to injection
  • Multi-agent architectures with trust boundary concerns
  • MCP servers or agentic APIs heading toward deployment

Do NOT use when:

  • Your project has no agentic components (no LLM calls, no tools, no prompts)
  • You need runtime testing or red teaming (this is static analysis only)
  • You need exploit generation or offensive security testing

Why this exists

If you're building AI agents, you're dealing with a new class of security risks that traditional linters and SAST tools don't catch: prompt injection, tool abuse, memory poisoning, RAG injection, permission sprawl, and multi-agent trust boundaries. These are risks unique to agentic code, and they need specific detection.

This skill bridges that gap. Install it, ask your assistant to audit, and get a professional security report without leaving your editor. It provides:

  1. A static scanner — a Python script that detects agentic security patterns across 20+ file types and 13 agent frameworks.
  2. Workflow instructions — a structured 5-step process that guides an AI agent from raw scan results to a professional security report.
  3. A classification taxonomy — severity levels, attack surfaces, and categories aligned with the OWASP Top 10 for LLM Applications (2025).

The scanner generates signals. The AI agent interprets them in context, filters false positives, and writes the report. This division keeps the scanner simple and deterministic while leveraging the agent's reasoning for nuanced judgment.


What it detects

19 agent-specific rules (AGT) covering prompt injection, tool abuse, memory poisoning, RAG injection, multi-agent trust, unbounded loops, and more. 8 general security rules (SEC) for secrets, deserialization, SSRF, and logging leaks. 6 heuristic checks (H) for TODOs, timeouts, unpinned dependencies, and skipped validation.

Full rule catalog (33 rules)

Agent-specific risks (AGT rules)

Rule Risk Surface
AGT001 Unsafe prompt concatenation with external input prompt
AGT002 Tool definition without parameter validation tools
AGT003 Destructive operation without human confirmation tools
AGT004 Model output consumed as instructions without validation prompt
AGT005 Memory write without sanitization memory
AGT006 Retrieved context injected without delimiters rag
AGT007 Agent with excessive tool scope tools
AGT008 Multi-agent delegation without boundary checks tools
AGT009 System message without trust boundary separation prompt
AGT010 File operation with externally influenced path filesystem
AGT011 Path construction with variable input filesystem
AGT012 Unbounded agent loop config
AGT013 Agent recursion without depth limit config
AGT014 Missing token or cost budget config
AGT015 Code execution without sandbox tools
AGT016 MCP/tool server without authentication tools
AGT017 Agent output to downstream system without guardrails prompt
AGT018 Agent HTTP endpoint without rate limiting network
AGT019 LLM model version not pinned config

General security rules (SEC rules)

Rule Risk Surface
SEC001 Hardcoded secret or credential in source secrets
SEC002 Dynamic code or command execution tools
SEC003 Unsafe YAML deserialization config
SEC004 Unsafe deserialization of untrusted data config
SEC005 HTTP request without URL restriction (SSRF) network
SEC006 Sensitive data leaking through logging logs
SEC007 Debug mode enabled config
SEC008 SSL/TLS verification disabled config

Heuristic checks (H rules)

Rule Risk Surface
H001 Unresolved security TODO config
H002 Timeout disabled or set to zero network
H003 Validation explicitly skipped tools
H004 Secret value in .env file secrets
H005 Unpinned dependency version dependencies
H006 Wildcard dependency in package.json dependencies

Supported frameworks

The scanner automatically detects and profiles projects using:

LangChain · LangGraph · OpenAI Agents · CrewAI · AutoGen · LlamaIndex · Semantic Kernel · MCP · Vercel AI SDK · Anthropic SDK · Google ADK · Pydantic AI · Mastra

Detection details
Framework Detection
LangChain langchain imports and usage
LangGraph langgraph imports
OpenAI Agents tool_call, function_call, OpenAI + tools patterns
CrewAI crewai imports
AutoGen autogen imports
LlamaIndex llamaindex imports
Semantic Kernel semantic_kernel imports
MCP (Model Context Protocol) McpServer, @modelcontextprotocol/, MCP transport patterns
Vercel AI SDK @ai-sdk/, createAgent
Anthropic SDK Anthropic + tool patterns, Claude tool_use
Google ADK google.adk, Gemini + tool patterns
Pydantic AI pydantic_ai imports
Mastra mastra, @mastra/ imports

Detection feeds the project profile, which determines the agentic_level (high / medium / low) and informs the agent's risk assessment.


Supported languages

Python, TypeScript, JavaScript, Go, Java, Ruby, PHP, C#, Rust, Kotlin, Swift, Shell, plus config formats: JSON, YAML, TOML, INI, Markdown, .env.

Comment stripping is applied for Python, C-family languages (TS, JS, Go, Java, C#, Rust, Kotlin, Swift, PHP), and Shell scripts to reduce false positives from commented-out code.


Skill structure

Each skill contains:

├── SKILL.md                    # Workflow instructions for the AI agent
├── report-template.md          # Report structure the agent must follow
├── scripts/
│   └── agent_security_audit.py # Static scanner (zero dependencies, stdlib only)
├── reference/
│   ├── taxonomy.md             # Severity, surfaces, categories, JSON schema
│   └── owasp-llm-top-10.md    # OWASP LLM Top 10 (2025) mapping
└── examples/
    └── sample-output.json      # Calibration reference for report generation

How it works

┌──────────────┐     JSON      ┌──────────────┐      Report     ┌────────────┐
│ Static       │──────────────▶│ AI Agent     │────────────────▶│ Security   │
│ Scanner      │  (signals)    │ (SKILL.md)   │  (structured)   │ Report     │
│ (Python)     │               │              │                 │ (Markdown) │
└──────────────┘               └──────────────┘                 └────────────┘
       │                              │
   Detects:                     Interprets:
   • Framework patterns         • Filters false positives
   • Security signals           • Classifies real risks
   • Agentic indicators         • Maps to OWASP LLM Top 10
   • Comment-aware scan         • Writes actionable report

Step 1 — The agent runs the scanner on the target project. The scanner walks the file tree, applies 25+ regex rules and heuristics, builds a project profile, and returns structured JSON with findings, summaries, and recommendations.

Step 2 — The agent reviews the project profile: what frameworks are used, what agentic patterns exist, and how autonomous the system is.

Step 3 — The agent filters false positives using project context. The scanner is intentionally sensitive — it flags signals, not verdicts.

Step 4 — The agent classifies each remaining finding as a real risk, potential concern, or weak signal, considering blast radius and existing mitigations.

Step 5 — The agent writes the final report following the report template, including executive summary, findings by severity, mitigations, and verification steps.


OWASP LLM Top 10 alignment

Every finding category maps to the OWASP Top 10 for LLM Applications (2025):

Audit category OWASP LLM risk
prompt_injection, prompt_concatenation LLM01 — Prompt Injection
rag_injection LLM01 — Prompt Injection, LLM08 — Vector Weaknesses
secret_exposure, logging_leak LLM02 — Sensitive Information Disclosure
memory_poisoning LLM04 — Data and Model Poisoning
unsafe_execution, tool_abuse, output_trust LLM05 — Improper Output Handling
permission_sprawl, missing_confirmation, tool_no_schema, multi_agent_trust LLM06 — Excessive Agency
hardening_gap LLM10 — Unbounded Consumption

Full mapping in reference/owasp-llm-top-10.md.


Scanner details

CLI options and exit codes
# JSON output (for agent consumption)
python scripts/agent_security_audit.py <project-path> --format json

# Human-readable text output
python scripts/agent_security_audit.py <project-path> --format text

# Save JSON to file
python scripts/agent_security_audit.py <project-path> --json report.json

# Exclude directories
python scripts/agent_security_audit.py <project-path> --exclude node_modules,.venv,dist

# Scan a single file
python scripts/agent_security_audit.py <project-path>/src/agent.py --format json
Exit code Meaning
0 No critical or high severity findings
1 Critical or high severity findings detected
2 Error (e.g. target path does not exist)

Design principles

  • Defensive only — no red teaming, no exploit generation, no code execution, no network access.
  • Signals, not verdicts — the scanner is intentionally sensitive. Contextual judgment is delegated to the AI agent.
  • Agent-first output — structured JSON designed for LLM consumption, not human dashboards.
  • Zero dependencies — the scanner runs anywhere Python 3.10+ is available, with no pip install needed.
  • OWASP-aligned — every category maps to the LLM Top 10 (2025) for industry-standard risk communication.

References

Resource Description
SKILL.md Full workflow instructions for the AI agent
report-template.md Report structure to follow
reference/taxonomy.md Classification system: severity, surfaces, categories
reference/owasp-llm-top-10.md OWASP LLM Top 10 (2025) mapping
examples/sample-output.json Sample scanner output for calibration
Agent Skills format Skill packaging standard

Contributing

Contributions are welcome! This skill is only as good as its rule catalog. If you have ideas for new AGT or SEC rules, please open an issue or a PR.

Check out our Contributing Guide to get started.


License

MIT

Releases

No releases published

Packages

 
 
 

Contributors

Languages