Commits
21 commits
* `d103e53` feat(skill): introduce `owasp-ml` (JasonTheDeveloper, Mar 27, 2026)
* `2d60264` feat(agents): add reference to `owasp-ml` skill (JasonTheDeveloper, Mar 27, 2026)
* `d9f7f15` feat(agent): update `CUSTOM-AGENTS.md` to add reference to `owasp-ml` (JasonTheDeveloper, Mar 27, 2026)
* `8b2e26a` feat(prompt): update `targetSkill` to allow `owasp-ml` as an option (JasonTheDeveloper, Mar 27, 2026)
* `54ed854` feat(collections): add `owasp-ml` (JasonTheDeveloper, Mar 27, 2026)
* `327b6e5` feat(plugins): add `owasp-ml` (JasonTheDeveloper, Mar 27, 2026)
* `a2ae835` style(format): replace `-` with `*` and add newline under heading (JasonTheDeveloper, Mar 28, 2026)
* `896e81f` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Mar 31, 2026)
* `1ed8db9` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Mar 31, 2026)
* `7c487a7` feat(skills): set `user-invocable: false` (JasonTheDeveloper, Apr 1, 2026)
* `bbd34f5` feat(skills): append description with `Brought to you by microsoft/hv… (JasonTheDeveloper, Apr 1, 2026)
* `4a3ed55` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Apr 2, 2026)
* `e097c54` fix(docs): update `collectionCards.ts` with updated artifact count (JasonTheDeveloper, Apr 2, 2026)
* `f760a0a` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Apr 2, 2026)
* `ce46b6a` docs(collections): include `owasp-ml` skill in `security.collection.md` (JasonTheDeveloper, Apr 2, 2026)
* `3b509df` docs(plugin): update readme (JasonTheDeveloper, Apr 2, 2026)
* `75de93a` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Apr 2, 2026)
* `212e998` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Apr 2, 2026)
* `840c337` Merge branch 'main' into feat/1205 (WilliamBerryiii, Apr 2, 2026)
* `37ac2de` fix(skill): update license to reflect source material license (JasonTheDeveloper, Apr 3, 2026)
* `988e0b9` Merge branch 'main' into feat/1205 (JasonTheDeveloper, Apr 7, 2026)
2 changes: 1 addition & 1 deletion .github/CUSTOM-AGENTS.md
@@ -355,7 +355,7 @@ Users are responsible for verifying their repository's `.gitignore` configuratio

**Subagents:** Codebase Profiler, Skill Assessor, Finding Deep Verifier, Report Generator

**Critical:** Orchestrator-only pattern. Delegates codebase profiling, skill assessment, adversarial finding verification, and report generation to specialized subagents. Uses OWASP skills (`owasp-agentic`, `owasp-llm`, `owasp-top-10`) for vulnerability references. Supports incremental comparison with prior scan reports.
**Critical:** Orchestrator-only pattern. Delegates codebase profiling, skill assessment, adversarial finding verification, and report generation to specialized subagents. Uses OWASP skills (`owasp-agentic`, `owasp-llm`, `owasp-top-10`, `owasp-ml`) for vulnerability references. Supports incremental comparison with prior scan reports.

### code-review-functional

3 changes: 2 additions & 1 deletion .github/agents/security/security-reviewer.agent.md
@@ -96,7 +96,7 @@ Report path pattern (plan): `.copilot-tracking/security/{{YYYY-MM-DD}}/plan-risk

Sequence number resolution: Determine `{{NNN}}` by listing existing reports in the date directory, extracting the highest sequence number, incrementing by one, and zero-padding to three digits. Start at `001` when no reports exist.
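For illustration, a minimal sketch of this resolution logic in Python, assuming report filenames end in a `-NNN.md` suffix (the exact naming layout here is an assumption):

```python
import re
from pathlib import Path

def next_sequence(date_dir: Path) -> str:
    """Return the next zero-padded three-digit report sequence number."""
    pattern = re.compile(r"-(\d{3})\.md$")  # assumed filename suffix; adjust to the real layout
    highest = 0
    for report in date_dir.glob("*.md"):
        match = pattern.search(report.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"{highest + 1:03d}"  # yields "001" when no reports exist yet
```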

Skill resolution: Read the applicable OWASP skill (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`) to access vulnerability references. Follow the skill's normative reference links to load vulnerability reference documents.
Skill resolution: Read the applicable OWASP skill (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`, `owasp-ml`) to access vulnerability references. Follow the skill's normative reference links to load vulnerability reference documents.

### Subagents

@@ -112,6 +112,7 @@ Skill resolution: Read the applicable OWASP skill (e.g., `owasp-top-10`, `owasp-
* owasp-agentic
* owasp-llm
* owasp-top-10
* owasp-ml

## Subagent Prompt Templates

@@ -31,7 +31,7 @@ Scan the repository to identify its technology stack and determine which OWASP s

## Constants

Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`).
Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`, `owasp-ml`).

### Technology Signals

@@ -51,6 +51,9 @@ owasp-top-10:
- "REST API endpoints"
- "Server-side templates"
- "Web framework config (Express, Django, Flask, Rails, Spring)"
owasp-ml:
- "ML training/inference code"
- "Model files (.pt, .h5, .onnx, .pkl)"
Comment on lines +54 to +56
RI-04 — Technology signals may be too narrow (Medium)

Other skill signal blocks have 3–5 entries for broader detection coverage. ML codebases commonly include framework imports (torch, tensorflow, sklearn, keras, transformers) and additional model formats (.safetensors). Adding at least one framework-import signal would improve the codebase profiler's detection accuracy.

Consider expanding to:

```yaml
owasp-ml:
  - "ML training/inference code"
  - "Model files (.pt, .h5, .onnx, .pkl, .safetensors)"
  - "ML framework imports (torch, tensorflow, sklearn, keras, transformers)"
```

## Codebase Profile Format
@@ -31,7 +31,7 @@ Perform deep adversarial verification of all FAIL and PARTIAL findings for a sin

## Constants

Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`). Follow the skill's normative reference links to access vulnerability references.
Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`, `owasp-ml`). Follow the skill's normative reference links to access vulnerability references.

Verdict values: CONFIRMED, DISPROVED, DOWNGRADED.

@@ -30,7 +30,7 @@ Assess exactly one OWASP skill per invocation. Read all vulnerability references

## Constants

Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`). Follow the skill's normative reference links to access the vulnerability index and individual vulnerability references.
Skill resolution: Read the applicable OWASP skill by name (e.g., `owasp-top-10`, `owasp-llm`, `owasp-agentic`, `owasp-ml`). Follow the skill's normative reference links to access the vulnerability index and individual vulnerability references.

### Status Values

4 changes: 2 additions & 2 deletions .github/prompts/security/security-review.prompt.md
@@ -2,7 +2,7 @@
name: security-review
agent: Security Reviewer
description: "Runs an OWASP vulnerability assessment against the current codebase - Brought to you by microsoft/hve-core"
argument-hint: "[scope=path/to/dir] [mode={audit|diff|plan}] [targetSkill={owasp-top-10|owasp-llm|owasp-agentic}]"
argument-hint: "[scope=path/to/dir] [mode={audit|diff|plan}] [targetSkill={owasp-top-10|owasp-llm|owasp-agentic|owasp-ml}]"
---

# Vulnerability Scan
@@ -13,7 +13,7 @@ argument-hint: "[scope=path/to/dir] [mode={audit|diff|plan}] [targetSkill={owasp
## Inputs

* ${input:mode:audit}: (Optional, defaults to audit) Scanning mode: `audit`, `diff`, or `plan`.
* ${input:targetSkill}: (Optional) Single OWASP skill to assess. Bypasses codebase profiling when provided. Available skills: `owasp-agentic`, `owasp-llm`, `owasp-top-10`.
* ${input:targetSkill}: (Optional) Single OWASP skill to assess. Bypasses codebase profiling when provided. Available skills: `owasp-agentic`, `owasp-llm`, `owasp-top-10`, `owasp-ml`.
* ${input:scope}: (Optional) Specific directories or paths to focus on. When omitted, assesses the full codebase.
* ${input:plan}: (Optional) Implementation plan document path. Inferred from attached files or conversation context when not provided.

45 changes: 45 additions & 0 deletions .github/skills/security/owasp-ml/SKILL.md
@@ -0,0 +1,45 @@
---
name: owasp-ml
description: OWASP Machine Learning Top 10 (2023) vulnerability knowledge base for identifying, assessing, and remediating security risks in machine learning systems - Brought to you by microsoft/hve-core.
license: CC-BY-SA-4.0
user-invocable: false
metadata:
authors: "OWASP Machine Learning Security Project"
spec_version: "1.0"
framework_revision: "1.0.0"
last_updated: "2026-02-16"
skill_based_on: "https://github.com/chris-buckley/agnostic-prompt-standard"
content_based_on: "https://owasp.org/www-project-machine-learning-security-top-10/"
---

# OWASP ML Top 10 — Skill Entry

This `SKILL.md` is the **entrypoint** for the OWASP ML Top 10 skill.

The skill encodes the **OWASP Machine Learning Security Top 10** as structured, machine-readable references
that an agent can query to identify, assess, and remediate machine learning security risks.

## Normative references (ML Top 10)

1. [00 Vulnerability Index](references/00-vulnerability-index.md)
2. [01 Input Manipulation Attack](references/01-input-manipulation-attack.md)
3. [02 Data Poisoning Attack](references/02-data-poisoning-attack.md)
4. [03 Model Inversion Attack](references/03-model-inversion-attack.md)
5. [04 Membership Inference Attack](references/04-membership-inference-attack.md)
6. [05 Model Theft](references/05-model-theft.md)
7. [06 AI Supply Chain Attacks](references/06-ai-supply-chain-attacks.md)
8. [07 Transfer Learning Attack](references/07-transfer-learning-attack.md)
9. [08 Model Skewing](references/08-model-skewing.md)
10. [09 Output Integrity Attack](references/09-output-integrity-attack.md)
11. [10 Model Poisoning](references/10-model-poisoning.md)

## Skill layout

* `SKILL.md` — this file (skill entrypoint).
* `references/` — the ML Top 10 normative documents.
* `00-vulnerability-index.md` — index of all vulnerability identifiers, categories, and cross-references.

RI-03 — "index" vs "master index" (Low)

All three sibling OWASP skills say "master index of all vulnerability identifiers" here. Minor wording inconsistency.

Suggested change
* `00-vulnerability-index.md` — index of all vulnerability identifiers, categories, and cross-references.
* `00-vulnerability-index.md` — master index of all vulnerability identifiers, categories, and cross-references.

* `01` through `10` — one document per vulnerability aligned with OWASP ML Security Top 10 numbering.

---
RI-01 — Missing Third-Party Attribution (High)

All sibling OWASP skills (owasp-top-10, owasp-llm, owasp-agentic) include a ## Third-Party Attribution section acknowledging OWASP Foundation copyright and CC-BY-SA-4.0 license terms. This section is missing from owasp-ml.

Since the content is derived from the OWASP ML Security Top 10, proper attribution is required by the CC-BY-SA-4.0 license.

Suggested addition (before the closing ---):

## Third-Party Attribution

Copyright © OWASP Foundation.
OWASP® Machine Learning Security Top 10 (2023) content is derived from works by the
OWASP Foundation, licensed under CC BY-SA 4.0
(<https://creativecommons.org/licenses/by-sa/4.0/>).
Source: <https://owasp.org/www-project-machine-learning-security-top-10/>
Modifications: Vulnerability descriptions restructured into agent-consumable reference
documents with added detection and remediation guidance.
OWASP® is a registered trademark of the OWASP Foundation. Use does not imply endorsement.


*🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.*
@@ -0,0 +1,73 @@
---
title: OWASP ML Top 10 Vulnerability Index
description: Index of OWASP Machine Learning Security Top 10 (2023) vulnerability identifiers, categories, and cross-references
---

# 00 Vulnerability Index

This document provides the index for the OWASP Machine Learning Security Top 10 vulnerabilities.
Each entry includes its identifier, title, and primary category.

## Vulnerability catalog

| ID | Title | Category |
|---|---|---|
| ML01:2023 | Input Manipulation Attack | Input Security |
| ML02:2023 | Data Poisoning Attack | Data Integrity |
| ML03:2023 | Model Inversion Attack | Privacy |
| ML04:2023 | Membership Inference Attack | Privacy |
| ML05:2023 | Model Theft | Intellectual Property |
| ML06:2023 | AI Supply Chain Attacks | Supply Chain |
| ML07:2023 | Transfer Learning Attack | Model Integrity |
| ML08:2023 | Model Skewing | Data Integrity |
| ML09:2023 | Output Integrity Attack | Output Security |
| ML10:2023 | Model Poisoning | Model Integrity |

## Cross-reference matrix

Each vulnerability document follows a consistent structure:

1. Description — what the vulnerability is and how it manifests in machine learning systems.
2. Risk — concrete consequences of exploitation and business impact.
3. Vulnerability checklist — indicators that the system is exposed.
4. Prevention controls — defensive measures and rectification steps.
5. Example attack scenarios — realistic exploitation narratives.
6. Detection guidance — signals and methods to identify exposure.
7. Remediation — immediate and long-term actions to contain and resolve.

## Category groupings

### Input Security

* ML01:2023 Input Manipulation Attack

### Data Integrity

* ML02:2023 Data Poisoning Attack
* ML08:2023 Model Skewing

### Privacy

* ML03:2023 Model Inversion Attack
* ML04:2023 Membership Inference Attack

### Intellectual Property

* ML05:2023 Model Theft

### Supply Chain

* ML06:2023 AI Supply Chain Attacks

### Model Integrity

* ML07:2023 Transfer Learning Attack
* ML10:2023 Model Poisoning

### Output Security

* ML09:2023 Output Integrity Attack

---

*🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.*
@@ -0,0 +1,95 @@
---
title: 'ML01: Input Manipulation Attack'
description: OWASP ML Top 10 reference for input manipulation and adversarial attack vulnerabilities including crafted perturbations that cause incorrect model outputs
---

# 01 Input Manipulation Attack

Identifier: ML01:2023
Category: Input Security

## Description

Input Manipulation Attack is an umbrella term that includes Adversarial Attacks, a type of attack
in which an attacker deliberately alters input data to mislead the model. The attacker crafts
inputs with small, carefully designed perturbations that cause the model to produce incorrect
outputs while appearing legitimate to human observers. This category affects any machine learning
system that accepts external input, including image classifiers, intrusion detection systems, and
natural language processing models.

## Risk

* Misclassification of inputs leading to security bypass or harm to the system.
* The manipulated input may not be noticeable to the naked eye, making the attack difficult to
detect.
* Exploitation requires technical knowledge of deep learning and input processing techniques.
* Attackers with knowledge of the model's architecture can craft targeted perturbations.
* Cascading failures when misclassified inputs trigger downstream actions in automated pipelines.

## Vulnerability checklist

* The model lacks adversarial training and has not been exposed to adversarial examples during
training.
* No input validation is performed to detect anomalies, unexpected values, or patterns.
* The model is not designed with robust architectures or defense mechanisms against manipulative
inputs.
* Model predictions are consumed directly without downstream verification or confidence
thresholding.
* No monitoring is in place to detect distribution shifts or anomalous input patterns at inference
time.
* The model's architecture and parameters are accessible to potential attackers.

## Prevention controls

1. Train the model on adversarial examples to improve robustness against manipulated inputs.
2. Use models designed with robust architectures and activation functions that incorporate defense
mechanisms against adversarial perturbations.
3. Implement input validation to check input data for anomalies such as unexpected values or
patterns and reject inputs that are likely to be malicious.
4. Apply confidence thresholding to flag or reject predictions below a confidence threshold (see the sketch after this list).
5. Use ensemble methods that combine multiple models to reduce the likelihood that a single
adversarial perturbation fools all models.
6. Restrict access to model internals to prevent attackers from crafting targeted attacks.
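
A minimal sketch of control 4, assuming a PyTorch classifier scored one sample at a time (the threshold value is illustrative):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def gated_predict(model: torch.nn.Module, x: torch.Tensor, threshold: float = 0.9):
    """Return (label, confidence), or (None, confidence) when the gate rejects."""
    probs = F.softmax(model(x), dim=-1)        # class probabilities for a single input
    confidence, label = probs.max(dim=-1)
    if confidence.item() < threshold:
        return None, confidence.item()         # route to human review or reject outright
    return int(label.item()), confidence.item()
```

Rejected inputs can be logged to support the adversarial-probing analysis described under detection guidance below.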

## Example attack scenarios

### Scenario A — Image classification bypass

A deep learning model is trained to classify images into categories such as dogs and cats. An
attacker manipulates an image that is visually similar to a legitimate image of a cat but contains
small, carefully crafted perturbations that cause the model to misclassify it as a dog. When the
model is deployed in a real-world setting, the attacker uses the manipulated image to bypass
security measures or cause harm to the system.

### Scenario B — Network intrusion detection evasion

A deep learning model is trained to detect intrusions in a network. An attacker manipulates
network traffic by carefully crafting packets that evade the model's intrusion detection system.
The attacker alters features of the network traffic such as the source IP address, destination IP
address, or payload in a way that avoids detection. The attacker may hide their source IP address
behind a proxy server or encrypt the payload. This leads to data theft, system compromise, or
other forms of damage.

## Detection guidance

* Monitor input distributions at inference time for statistical anomalies or distribution shifts
compared to training data.
* Log all inputs and outputs to detect patterns of adversarial probing.
* Implement anomaly detection on incoming data to flag inputs that deviate significantly from expected distributions (see the sketch below).
* Compare model confidence scores over time to detect sudden drops or unusual prediction patterns.
* Use gradient-based detection methods to identify inputs that produce unusually large gradients.
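
A minimal sketch of the anomaly-flagging idea, assuming tabular feature vectors and simple per-feature z-scores against training statistics (the threshold is illustrative):

```python
import numpy as np

class InputDriftMonitor:
    """Flag inference inputs whose features sit far outside training statistics."""

    def __init__(self, training_data: np.ndarray, z_threshold: float = 4.0):
        self.mean = training_data.mean(axis=0)
        self.std = training_data.std(axis=0) + 1e-8  # guard against zero variance
        self.z_threshold = z_threshold

    def is_anomalous(self, x: np.ndarray) -> bool:
        z = np.abs((x - self.mean) / self.std)       # per-feature deviation score
        return bool(z.max() > self.z_threshold)
```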

## Remediation

* Retrain the model with adversarial examples incorporated into the training dataset.
* Deploy input validation filters to reject or quarantine suspicious inputs before inference.
* Implement robust model architectures that are inherently resistant to small perturbations.
* Add confidence-based gating to suppress low-confidence predictions.
* Restrict public access to model APIs and internals to limit attacker reconnaissance.
* Continuously monitor model performance in production for accuracy degradation that may indicate
ongoing adversarial attacks.

---

*🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.*
@@ -0,0 +1,97 @@
---
title: 'ML02: Data Poisoning Attack'
description: OWASP ML Top 10 reference for data poisoning vulnerabilities including training data manipulation, label corruption, and pipeline compromise
---

# 02 Data Poisoning Attack

Identifier: ML02:2023
Category: Data Integrity

## Description

Data poisoning attacks occur when an attacker manipulates the training data to cause the model to
behave in an undesirable way. The attacker injects malicious data into the training dataset by
compromising the data storage system, exploiting vulnerabilities in data collection pipelines, or
corrupting the data labeling process. The poisoned data causes the model to learn incorrect
patterns, leading to unreliable predictions when deployed. This attack is particularly dangerous
because manipulated training data may be difficult to detect and can persist through multiple
retraining cycles.

## Risk

* The model will make incorrect predictions based on the poisoned data, leading to false decisions
and potentially serious consequences.
* The attack has moderate exploitability and is difficult to detect.
* Attackers who have access to the training data or the data collection pipeline can execute the
attack.
* Lack of data validation and insufficient monitoring of the training data increase exposure.
* Poisoned data may persist across retraining cycles if not identified and removed.

## Vulnerability checklist

* Training data is not thoroughly validated or verified before use.
* No data integrity checks such as checksums or digital signatures are applied to training
datasets.
* Training data is stored without encryption or secure transfer protocols.
* Training data is not separated from production data.
* Access controls do not restrict who can access or modify the training data.
* No anomaly detection is applied to training data to detect sudden distribution changes or
labeling inconsistencies.
* Multiple independent data labelers are not used to cross-validate labeling accuracy.
* No separate validation set is used to verify model behavior after training.

## Prevention controls

1. Ensure that training data is thoroughly validated and verified before use by implementing data
validation checks and employing multiple data labelers to validate labeling accuracy.
2. Store training data securely using encryption, secure data transfer protocols, and firewalls.
3. Separate training data from production data to reduce the risk of training data compromise.
4. Implement access controls to limit who can access the training data and when.
5. Regularly monitor the training data for anomalies and conduct audits to detect data tampering (a hashing sketch follows this list).
6. Validate the model using a separate validation set that was not used during training to detect
poisoning attacks that may have affected the training data.
7. Train multiple models using different subsets of the training data and use an ensemble to make
predictions, reducing the impact of poisoning attacks.
8. Use anomaly detection techniques to detect abnormal behavior in the training data such as
sudden changes in data distribution or data labeling.
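
A minimal sketch of control 5's tamper auditing, assuming the training data is stored as files on disk (the manifest format is illustrative):

```python
import hashlib
import json
from pathlib import Path

def fingerprint_dataset(data_dir: Path) -> dict:
    """Record a SHA-256 digest for every file under the dataset directory."""
    return {
        str(p.relative_to(data_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(data_dir.rglob("*")) if p.is_file()
    }

def verify_dataset(data_dir: Path, manifest_path: Path) -> list:
    """Return paths whose contents no longer match the recorded manifest."""
    manifest = json.loads(manifest_path.read_text())
    current = fingerprint_dataset(data_dir)
    return [path for path, digest in manifest.items() if current.get(path) != digest]
```

Writing the manifest at data-collection time and verifying it before each training run turns silent tampering into a detectable integrity failure.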

## Example attack scenarios

### Scenario A — Poisoning a spam classifier

An attacker poisons the training data for a deep learning model that classifies emails as spam or
not spam. The attacker injects maliciously labeled spam emails into the training dataset by
compromising the data storage system, hacking into the network, or exploiting a vulnerability in
the data storage software. The attacker also manipulates the data labeling process by falsifying
labels or bribing data labelers to provide incorrect labels.

### Scenario B — Poisoning a network traffic classifier

An attacker poisons the training data for a deep learning model used to classify network traffic
into categories such as email, web browsing, and video streaming. The attacker introduces a large
number of examples of network traffic incorrectly labeled as a different type of traffic, causing
the model to make incorrect traffic classifications when deployed. This leads to misallocation of
network resources or degradation of network performance.

## Detection guidance

* Apply statistical analysis to training datasets to detect sudden distribution shifts or anomalous labeling patterns (see the sketch below).
* Use holdout validation sets to compare model behavior against known-clean baselines.
* Monitor model accuracy over retraining cycles for unexpected degradation.
* Cross-validate data labels using multiple independent labelers or automated consistency checks.
* Audit data pipeline access logs for unauthorized modifications to training datasets.
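
A minimal sketch of the statistical-analysis idea, assuming numeric feature matrices and a two-sample Kolmogorov-Smirnov test per feature (the significance level is illustrative):

```python
import numpy as np
from scipy.stats import ks_2samp

def shifted_features(reference: np.ndarray, incoming: np.ndarray, alpha: float = 0.01):
    """Return indices of features whose incoming distribution differs from the reference."""
    shifted = []
    for j in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, j], incoming[:, j])  # two-sample KS test
        if p_value < alpha:
            shifted.append(j)
    return shifted
```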

## Remediation

* Remove identified poisoned data from the training dataset and retrain the model.
* Implement data provenance tracking to trace the origin of all training data.
* Enforce strict access controls and audit logging on data storage and labeling systems.
* Deploy anomaly detection on data ingestion pipelines to catch future poisoning attempts.
* Use ensemble models trained on different data subsets to reduce single-point-of-failure risk.
* Conduct periodic audits of data labeling quality and consistency.

---

*🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.*