Merged
8 changes: 7 additions & 1 deletion .github/workflows/ci.yml
@@ -153,11 +153,17 @@ jobs:
run: apm audit --ci

# Gate B: regeneration drift (producer-side).
# NOTE: Once `apm-action` ships a CLI version that includes the
# default-on `apm audit` drift detection (issue #1071), this entire
# step becomes redundant -- Gate A above already catches the same
# divergence via install-replay. Keep this bash check until then as
# a defense-in-depth fallback.
#
# The action's `apm install` step re-integrated local .apm/ into
# .github/ via target auto-detection. If anything in the governed
# integration directories changed, someone edited the regenerated
# output without updating the canonical .apm/ source.
- name: Check APM integration drift
- name: Check APM integration drift (legacy bash fallback, see #1071)
run: |
if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/ .opencode/)" ]; then
echo "::error::APM integration files are out of date."
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- **`apm audit` now catches forgotten installs and hand-edits by default.** No more shipping stale `.github/instructions/` because someone forgot to re-run `apm install`, no more silent hand-edits to regenerated content. Opt out with `--no-drift`. See the [Drift Detection guide](https://microsoft.github.io/apm/guides/drift-detection/). (#1137, closes #1071, supersedes scope of #898)

### Fixed

- **Parallel subdir install race.** `apm install` no longer intermittently fails with `RuntimeError: Subdirectory '<path>' not found in repository` when multiple dependencies resolve to different subdirectories of the same `repo@ref`. The shared clone cache now stores subdir-agnostic bare clones and each consumer materializes its own working tree (mirrors the WS3 `GitCache` pattern). (#1135, fixes #1126)
1 change: 1 addition & 0 deletions README.md
@@ -81,6 +81,7 @@ Agent context is executable in effect — a prompt is a program for an LLM. APM

- **[Content security](https://microsoft.github.io/apm/enterprise/security/)** — `apm install` blocks compromised packages before agents read them; `apm audit` runs the same checks on demand
- **[Lockfile integrity](https://microsoft.github.io/apm/enterprise/governance/)** — `apm.lock` records resolved sources and content hashes for full provenance
- **[Drift detection](https://microsoft.github.io/apm/guides/drift-detection/)** — `apm audit` rebuilds your agent context in a scratch directory and diffs it against your working tree to catch hand-edits before they ship
- **[MCP trust boundaries](https://microsoft.github.io/apm/guides/mcp-servers/)** — transitive MCP servers require explicit consent

### 3. Governed by policy
6 changes: 3 additions & 3 deletions docs/src/content/docs/enterprise/governance-guide.md
@@ -274,8 +274,8 @@ This is the certitude section. Read it twice if you are deciding whether `apm au

| Surface | What it bypasses LOCALLY | What it CANNOT bypass | Reviewable in |
|---|---|---|---|
| `apm install --no-policy` | All 17 policy checks at install (incl. transitive MCP, hash pin) | The 7 baseline checks in `apm audit --ci` | git diff of `apm.lock.yaml` in PR |
| `APM_POLICY_DISABLE=1` env | Same as `--no-policy` plus the 17 audit policy checks | The 7 baseline checks in `apm audit --ci` | PR diff; CI env vars in Actions logs |
| `apm install --no-policy` | All 17 policy checks at install (incl. transitive MCP, hash pin) | The 7 baseline checks plus integration drift detection in `apm audit --ci` | git diff of `apm.lock.yaml` in PR |
| `APM_POLICY_DISABLE=1` env | Same as `--no-policy` plus the 17 audit policy checks | The 7 baseline checks plus integration drift detection in `apm audit --ci` | PR diff; CI env vars in Actions logs |
| Manual edit to `apm.lock.yaml` | Nothing; install regenerates the file each run | Audit baseline `ref-consistency` and `deployed-files-present` | git diff |
| Manual edit to deployed file post-install | Local file content until next audit | Audit baseline `content-integrity` (re-hashes deployed files); hidden-Unicode scan in `apm audit` content mode | git diff of the deployed file in PR |
| Direct `git clone` of an APM package, bypassing install | Everything; nothing detects out-of-band file drops | Audit baseline `no-orphaned-packages` and audit-only `unmanaged-files` | git diff |
@@ -287,7 +287,7 @@ This is the certitude section. Read it twice if you are deciding whether `apm au
Notes on specific rows:

- **`apm install --no-policy`** also bypasses the `apm install --mcp` preflight, the transitive-MCP preflight, and any project-side `policy.hash` pin.
- **`APM_POLICY_DISABLE=1`** short-circuits discovery to `outcome="disabled"` everywhere -- including `apm audit --ci`, where the 17 policy checks are skipped (the 7 baseline checks still run).
- **`APM_POLICY_DISABLE=1`** short-circuits discovery to `outcome="disabled"` everywhere -- including `apm audit --ci`, where the 17 policy checks are skipped (the 7 baseline checks and integration drift detection still run).
- **Manual lockfile edits**: `content_hash` mismatch on registry-proxy deps is caught at the next install when downloads resume.
- **Direct `git clone`**: `unmanaged-files` only flags governed dirs and only when configured to `warn` / `deny`.
- **Fork-to-personal-org**: discovery resolves via `git remote get-url origin`; branch protection on the upstream repo is the trust boundary.
2 changes: 1 addition & 1 deletion docs/src/content/docs/getting-started/quick-start.md
@@ -156,7 +156,7 @@ apm install github/awesome-copilot/skills/review-and-refactor
- `apm_modules/` -- add to `.gitignore`. Rebuilt from the lockfile on install.

:::tip[Keeping deployed files in sync]
When you update `apm.yml`, re-run `apm install` and commit the changed `.github/`, `.claude/`, `.cursor/`, and `.gemini/` files. A [CI drift check](../../integrations/ci-cd/#verify-deployed-primitives) catches stale files automatically.
When you update `apm.yml`, re-run `apm install` and commit the changed `.github/`, `.claude/`, `.cursor/`, and `.gemini/` files. A [CI drift check](../../guides/drift-detection/) catches stale files automatically.
:::

:::note[Using Codex or Gemini?]
152 changes: 152 additions & 0 deletions docs/src/content/docs/guides/drift-detection.md
@@ -0,0 +1,152 @@
---
title: Drift Detection
sidebar:
  order: 7
---

`apm audit` runs **drift detection by default** so a stale working tree
cannot ship to production unnoticed. This page explains what drift means,
how the check works, and the escape hatch when you need to disable it.

## Try it now

```bash
cd <your-apm-project>
apm audit
```

If you have any `.apm/` sources or installed dependencies, the audit
will replay your install into a scratch tmpdir and report any drift.
No writes to your working tree, no network, no MCP calls.

Common first-run results:

- **Clean tree, no drift** -- exit 0, no output beyond the standard
audit summary.
- **Forgot to re-run `apm install`** -- drift findings under kind
`unintegrated` for every `.apm/` source whose deployed counterpart
is missing.
- **First run on a pre-marker cache** -- a one-line warning asking
you to run `apm install` once so cache pin markers are written.

## What is integration drift?

Integration drift is any divergence between what `apm install` would
deploy from your locked dependencies and what is actually on disk.
Three kinds matter:

| Kind | Meaning | Typical cause |
|---|---|---|
| `unintegrated` | A `.apm/` source file is committed but its deployed counterpart is missing | Forgot to re-run `apm install` after adding/editing local primitives |
| `modified` | A deployed file's content differs from what install would produce | Hand-edit to a regenerated file under `.github/`, `.claude/`, `.cursor/`, etc. |
| `orphaned` | A deployed file exists with no current source backing it | Removed a dependency or local primitive without re-running install |

Previously, catching drift meant wiring ad-hoc `git status --porcelain`
scripts into CI. With drift detection, `apm audit` catches every case in
one read-only command -- nothing in your project, lockfile, or
`apm_modules/` is mutated.

## How it works

```mermaid
flowchart LR
A[apm audit] --> B[Read apm.lock.yaml<br/>+ cache contents]
B --> C[Replay install<br/>into scratch tmpdir]
C --> D[Diff scratch tree<br/>vs project]
D --> E[Render findings<br/>text / JSON / SARIF]
```

The replay is **cache-only** -- no network, no git fetch, no MCP
registry call. It will fail fast with `CacheMissError` if the lockfile
references content not present in the persistent cache (run
`apm install` once first).
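
The diff step can be pictured with a minimal Python sketch. It is not APM's implementation: `relative_files` and `classify_drift` are hypothetical names, and it assumes both directories hold only governed integration files and that files are compared byte-for-byte with no noise filtering. The mapping of tree differences onto the three kinds follows the table above.

```python
from pathlib import Path


def relative_files(root: Path) -> dict[str, bytes]:
    """Map each file's POSIX-style relative path to its raw bytes."""
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def classify_drift(scratch: Path, project: Path) -> list[tuple[str, str]]:
    """Compare a replayed scratch tree against the project tree.

    Hypothetical helper: returns (path, kind) pairs sorted by path.
    """
    expected = relative_files(scratch)  # what install would deploy
    actual = relative_files(project)    # what is on disk
    findings = []
    for path in sorted(expected.keys() | actual.keys()):
        if path not in actual:
            # Install would deploy it, but it is missing on disk.
            findings.append((path, "unintegrated"))
        elif path not in expected:
            # On disk, but no current source produces it.
            findings.append((path, "orphaned"))
        elif expected[path] != actual[path]:
            # Present on both sides with different content.
            findings.append((path, "modified"))
    return findings
```

Because both trees are only read, a check of this shape never mutates the project, which matches the read-only contract described earlier.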

False-positive guards normalize:

- Build-ID lines (e.g. APM-generated `<!-- Build ID: ... -->` markers).
- CRLF -> LF line endings (Windows checkouts of LF-canonical sources).
- UTF-8 BOM byte-order marks.
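
As an illustration of these guards, here is a small Python sketch. It assumes the Build-ID marker occupies its own line in the `<!-- Build ID: ... -->` form shown above; the `normalize` function and the regular expression are hypothetical and not taken from APM's source.

```python
import re

# Assumption: the marker sits alone on a line, as in the example above.
BUILD_ID_LINE = re.compile(rb"^[ \t]*<!-- Build ID: .*? -->[ \t]*\n?", re.MULTILINE)
UTF8_BOM = b"\xef\xbb\xbf"


def normalize(content: bytes) -> bytes:
    """Strip noise that should not count as drift (illustrative only)."""
    # 1. Drop a leading UTF-8 byte-order mark.
    if content.startswith(UTF8_BOM):
        content = content[len(UTF8_BOM):]
    # 2. Canonicalize Windows line endings to LF.
    content = content.replace(b"\r\n", b"\n")
    # 3. Remove Build-ID marker lines, whose value changes on every build.
    return BUILD_ID_LINE.sub(b"", content)
```

Two files would then be compared as `normalize(a) == normalize(b)`, so a Windows checkout with a different Build ID still matches its LF-canonical source while a real content change does not.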

## Default behaviour and exit codes

| Mode | Drift findings | Exit code |
|---|---|---|
| `apm audit` | Reported in stdout | 0 (advisory only) |
| `apm audit --ci` | Reported and counted as failure | 1 when any drift is found |
| `apm audit --no-drift` | Skipped entirely | governed only by other checks |

In `--ci` mode drift findings are pooled with the seven baseline lockfile
checks (`lockfile-exists`, `ref-consistency`, etc.) -- a single non-zero
exit covers all of them.
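
The table can be restated as a small decision function. This is a simplified Python model rather than APM's code: `audit_exit_code` and its parameters are hypothetical, and it assumes every non-drift failure maps to exit code 1.

```python
def audit_exit_code(
    *,
    ci: bool,
    no_drift: bool,
    drift_findings: int,
    other_checks_failed: bool,
) -> int:
    """Return 0 or 1 following the table above (illustrative model only)."""
    # Drift affects the exit code only in --ci mode, and only if it ran.
    drift_fails = ci and not no_drift and drift_findings > 0
    return 1 if (other_checks_failed or drift_fails) else 0
```

For example, `audit_exit_code(ci=False, no_drift=False, drift_findings=3, other_checks_failed=False)` returns 0 because plain `apm audit` treats drift as advisory, while the same call with `ci=True` returns 1.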

## When to use `--no-drift`

The escape hatch exists for two legitimate cases:

1. **Tight inner loops** where you intentionally have local edits and
just want a content-only safety scan (`apm audit --no-drift -v`).
2. **Performance budgets** in matrix CI where you've already covered
drift in a single non-matrix job upstream.

Drift detection is also auto-skipped when `--strip` or `--file` is used;
both target a single payload and have nothing to diff against. Combining
`--no-drift` with `--strip` or `--file` is rejected with a usage error
(rather than silently picking one).

## Output formats

**Text (TTY default)** -- color-coded, one finding per line, grouped by kind.

**JSON** -- the audit report gains a top-level `drift` key:

```json
{
  "report_format_version": "1.0",
  "checks": [...],
  "drift": [
    {
      "path": ".github/instructions/foo.md",
      "kind": "modified",
      "package": "<local>",
      "inline_diff": "..."
    }
  ]
}
```

**SARIF** -- findings are appended to `runs[0].results` with rule IDs
`apm/drift/modified`, `apm/drift/unintegrated`, `apm/drift/orphaned`,
ready to surface in GitHub code-scanning.
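
A downstream script could consume the JSON report roughly as follows. This Python sketch relies only on the top-level `drift` key, the `kind` field, and the `apm/drift/<kind>` rule-ID scheme described above; the sample report values and the `drift_rule_ids` helper are made up for illustration.

```python
import json
from collections import Counter

# Hypothetical report, shaped like the JSON example above.
SAMPLE_REPORT = """
{
  "report_format_version": "1.0",
  "checks": [],
  "drift": [
    {"path": ".github/instructions/foo.md", "kind": "modified",
     "package": "<local>", "inline_diff": "..."},
    {"path": ".github/instructions/bar.md", "kind": "orphaned",
     "package": "<local>", "inline_diff": ""},
    {"path": ".claude/rules/baz.md", "kind": "modified",
     "package": "<local>", "inline_diff": "..."}
  ]
}
"""


def drift_rule_ids(report: dict) -> Counter:
    """Count drift findings per SARIF-style rule ID (apm/drift/<kind>)."""
    # A report without a `drift` key yields an empty Counter.
    return Counter(f"apm/drift/{f['kind']}" for f in report.get("drift", []))


counts = drift_rule_ids(json.loads(SAMPLE_REPORT))
for rule_id, count in sorted(counts.items()):
    print(f"{rule_id}: {count}")
```

Grouping by rule ID keeps the summary aligned with what code scanning would display for the same findings.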

## CI integration

The recommended CI gate is now a single line:

```yaml
- run: apm audit --ci
```

### Before vs after: the legacy bash workaround

Previously, CI pipelines had to inspect `git status --porcelain` output to
catch un-installed or hand-edited deployed files. That workaround is no
longer needed:

```yaml
# Legacy -- no longer needed once apm-action ships with drift support
- run: |
    if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/)" ]; then
      exit 1
    fi
```

`apm audit --ci` subsumes this entirely and also catches three classes
of drift the bash workaround missed: `unintegrated` source files that
were never deployed, `orphaned` files left behind by removed
dependencies, and `modified` files compared after normalizing build-ID,
line-ending, and BOM noise.

For org-policy enforcement, combine with `--policy org` -- drift
detection composes orthogonally with the 17 audit-only policy checks.

See also: [CI Policy Enforcement](../ci-policy-setup/),
[Governance Guide](../../enterprise/governance-guide/).
23 changes: 12 additions & 11 deletions docs/src/content/docs/integrations/ci-cd.md
@@ -60,22 +60,23 @@ This step is not needed if your team only uses GitHub Copilot and Claude, which

### Verify Deployed Primitives

To ensure `.github/`, `.claude/`, `.cursor/`, `.opencode/`, and `.gemini/` integration files stay in sync with `apm.yml`, add a drift check:
`apm audit --ci` catches integration drift by default -- no separate
`git status` step required:

```yaml
- name: Check APM integration drift
run: |
apm install
if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/ .opencode/ .gemini/)" ]; then
echo "APM integration files are out of date. Run 'apm install' and commit."
exit 1
fi
- name: Audit + drift check
run: apm audit --ci
```

This catches cases where a developer updates `apm.yml` but forgets to re-run `apm install`.
This single command runs the seven baseline lockfile checks plus integration
drift detection (on by default), which replays the install pipeline into a
scratch tree to detect missed `apm install` runs, hand-edited deployed files,
and orphaned files. See the
[Drift Detection guide](../../guides/drift-detection/) for details and
opt-out (`--no-drift`).

:::tip[We dogfood this]
APM's own repo uses the `APM Self-Check` job in [`microsoft/apm`'s `ci.yml`](https://github.com/microsoft/apm/blob/main/.github/workflows/ci.yml) as a reference implementation for installing APM, running CI validation commands such as `apm audit --ci`, and checking for drift with `git status --porcelain`. Use it as a practical example when wiring these checks into your own workflow.
APM's own repo uses the `APM Self-Check` job in [`microsoft/apm`'s `ci.yml`](https://github.com/microsoft/apm/blob/main/.github/workflows/ci.yml) as a reference implementation for installing APM and running `apm audit --ci`. Use it as a practical example when wiring these checks into your own workflow.
:::

## Azure Pipelines
@@ -147,7 +148,7 @@ apm install

## Governance with `apm audit`

`apm audit --ci` verifies lockfile consistency in CI (7 baseline checks, no configuration). Add `--policy org` to enforce organizational rules (17 additional checks). For full setup including SARIF integration and GitHub Code Scanning, see the [CI Policy Enforcement guide](../../guides/ci-policy-setup/).
`apm audit --ci` verifies lockfile consistency in CI (7 baseline checks plus integration drift detection, no configuration). Add `--policy org` to enforce organizational rules (17 additional checks). For full setup including SARIF integration and GitHub Code Scanning, see the [CI Policy Enforcement guide](../../guides/ci-policy-setup/).

For content scanning and hidden Unicode detection, `apm install` automatically blocks critical findings. Run `apm audit` for on-demand reporting. See [Governance](../../enterprise/governance-guide/) for the full governance model.

3 changes: 2 additions & 1 deletion docs/src/content/docs/reference/cli-commands.md
@@ -459,11 +459,12 @@ apm audit [PACKAGE] [OPTIONS]
- `-v, --verbose` - Show info-level findings and file details
- `-f, --format [text|json|sarif|markdown]` - Output format: `text` (default), `json` (machine-readable), `sarif` (GitHub Code Scanning), `markdown` (step summaries). Cannot be combined with `--strip` or `--dry-run`.
- `-o, --output PATH` - Write report to file. Auto-detects format from extension (`.sarif`, `.sarif.json` → SARIF; `.json` → JSON; `.md` → Markdown) when `--format` is not specified.
- `--ci` - Run lockfile consistency checks for CI/CD gates. Exit 0 if clean, 1 if violations found. Auto-discovers org policy from the org `.github` repo unless `--no-policy` is set. Runs the 7 baseline checks: lockfile presence, ref consistency, deployed files present, no orphaned packages, MCP config consistency, content integrity (Unicode + hash drift on every deployed file including local content), includes consent (advisory).
- `--ci` - Run lockfile consistency checks for CI/CD gates. Exit 0 if clean, 1 if violations found. Auto-discovers org policy from the org `.github` repo unless `--no-policy` is set. Runs the 7 baseline checks: lockfile presence, ref consistency, deployed files present, no orphaned packages, MCP config consistency, content integrity (Unicode + hash drift on every deployed file including local content), includes consent (advisory). Integration drift detection runs by default alongside the baseline checks and contributes to the exit code (use `--no-drift` to opt out).
- `--policy SOURCE` - *(Experimental)* Policy source. Accepts: `org` (auto-discover from your project's git remote), `owner/repo` (defaults to github.com), an `https://` URL, or a local file path. Used with `--ci` for policy checks. Without this flag, `--ci` auto-discovers.
- `--no-policy` - Skip policy discovery and enforcement entirely. Equivalent to `APM_POLICY_DISABLE=1`.
- `--no-cache` - Force fresh policy fetch (skip cache). Only relevant with policy discovery active.
- `--no-fail-fast` - Run all checks even after a failure. By default, CI mode stops at the first failing check to save time.
- `--no-drift` - Skip integration drift detection. Drift detection is on by default (whole-project audit only) and replays the install pipeline into a scratch tree to catch missed `apm install` runs, hand-edited deployed files, and orphaned files. Mutually exclusive with `--strip`/`--file`. See the [Drift Detection guide](../../guides/drift-detection/).

**Examples:**
```bash
4 changes: 3 additions & 1 deletion packages/apm-guide/.apm/skills/apm-usage/commands.md
@@ -43,7 +43,9 @@

| Command | Purpose | Key flags |
|---------|---------|-----------|
| `apm audit [PKG]` | Scan for security issues | `--file PATH`, `--strip`, `--dry-run`, `-v`, `-f [text\|json\|sarif\|md]`, `-o PATH`, `--ci`, `--policy SOURCE`, `--no-cache`, `--no-fail-fast` |
| `apm audit [PKG]` | Scan for security issues + detect integration drift | `--file PATH`, `--strip`, `--dry-run`, `-v`, `-f [text\|json\|sarif\|md]`, `-o PATH`, `--ci`, `--policy SOURCE`, `--no-cache`, `--no-fail-fast`, `--no-drift` |

`apm audit` runs **drift detection by default** (issue #1071). It replays `apm install` cache-only into a temporary scratch tree and diffs the result against your working tree. Catches three failure modes: (1) `.apm/` source added without re-running `apm install`, (2) hand-edits to deployed files that diverge from canonical source, (3) orphan files left after their source was removed. The scan is read-only -- never writes to your project, lockfile, or `apm_modules/`. Build IDs, CRLF line endings, and BOMs are normalized away so they cannot trigger false positives. Use `--no-drift` to opt out (e.g. fast inner loops); the flag is mutually exclusive with `--strip`/`--file`. In `--ci` mode drift findings produce exit code 1 alongside the seven baseline lockfile checks. Drift output is integrated into JSON (top-level `drift` key) and SARIF (rule IDs `apm/drift/<kind>` where kind is `modified`/`unintegrated`/`orphaned`).

## Distribution

28 changes: 28 additions & 0 deletions scripts/test-integration.sh
@@ -524,6 +524,34 @@ run_e2e_tests() {
exit 1
fi

# Run drift-detection integration tests -- offline, no tokens needed
# Guards `apm audit` drift replay (Phase D) across all 9 drift cases,
# multi-target, --no-drift opt-out, and false-positive guards
# (CRLF, BOM, Build ID line). Pinning these tests prevents silent
# regression of the drift contract.
log_info "Running drift detection integration tests..."
echo "Command: pytest tests/integration/test_drift_check.py -v -s --tb=short"

if pytest tests/integration/test_drift_check.py -v -s --tb=short; then
log_success "Drift detection integration tests passed!"
else
log_error "Drift detection integration tests failed!"
exit 1
fi

# Run drift-detection E2E tests -- offline, no tokens needed
# Verifies the no-write contract, air-gap proof, performance smoke,
# and JSON/SARIF output shapes for the `apm audit` drift surface.
log_info "Running drift detection E2E tests..."
echo "Command: pytest tests/integration/test_drift_check_e2e.py -v -s --tb=short"

if pytest tests/integration/test_drift_check_e2e.py -v -s --tb=short; then
log_success "Drift detection E2E tests passed!"
else
log_error "Drift detection E2E tests failed!"
exit 1
fi

log_success "All integration test suites completed successfully!"

