chore(rc1): harden preflight binding, acceptance metrics, and pages publish contract #17

Merged
joy7758 merged 3 commits into main from fix/pages-artifact-tar on Mar 18, 2026

@joy7758 (Owner) commented on Mar 18, 2026

Scope

  • Bind preflight output to run evidence via preflight_digest + run_binding_digest
  • Harden acceptance matrix with verification_coverage_rate and verified_acceptance_rate
  • Add Pages publish manifest allowlist + JSON top-level schema checks
  • Upgrade task033 sensitivity probe to multi-trial study and refresh RC1 docs

Why

Close protocol exploit surfaces identified in RC1 review:

  • preflight as convention vs bound invariant
  • compliance inflation when verification signal is missing
  • pages artifact/schema drift
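
To illustrate the first point, here is a minimal sketch of what binding preflight output to run evidence could look like. Only the field names `preflight_digest` and `run_binding_digest` come from this PR; the hashing scheme (SHA-256 over canonical JSON), the `run_id` parameter, and the helper names `canonical_digest`, `bind_run`, and `verify_binding` are assumptions for illustration and may differ from the repository's actual code.

```python
import hashlib
import json
from typing import Any, Dict, Mapping


def canonical_digest(payload: Mapping[str, Any]) -> str:
    """Hash a JSON-serializable mapping in a key-order-independent way."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def bind_run(preflight_output: Mapping[str, Any], run_id: str) -> Dict[str, str]:
    """Hypothetical binding: derive both digests and attach them to run evidence."""
    preflight_digest = canonical_digest(preflight_output)
    run_binding_digest = canonical_digest(
        {"preflight_digest": preflight_digest, "run_id": run_id}
    )
    return {
        "run_id": run_id,
        "preflight_digest": preflight_digest,
        "run_binding_digest": run_binding_digest,
    }


def verify_binding(
    preflight_output: Mapping[str, Any], run_id: str, evidence: Mapping[str, str]
) -> bool:
    """Recompute the digests and compare them with the recorded evidence."""
    return bind_run(preflight_output, run_id) == dict(evidence)
```

With a check of this shape, a run whose preflight output was altered or replaced after the fact no longer verifies, which is the difference between a convention and a bound invariant.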

Validation

  • venv/bin/python -m unittest discover -s tests (19 passed)
  • py_compile for changed modules
  • local manifest required-file + schema check passed

@joy7758 merged commit 3f8ad34 into main on Mar 18, 2026
2 checks passed
@joy7758 deleted the fix/pages-artifact-tar branch on March 18, 2026 at 03:20

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 457302aad0

Comment on lines +138 to +139
"preflight_bound_count": float(preflight_bound_count),
"preflight_bound_rate": (preflight_bound_count / effective_total) if effective_total else 0.0,

[P2] Exclude infra-invalid rows from preflight_bound_rate

If any task row is classified as INFRA_INVALID after the run-level preflight succeeds (for example, a provider/auth failure during execution), that row still carries preflight_digest and is counted in preflight_bound_count, but effective_total removes it from the denominator. In that case summarize_acceptance() can emit impossible values like preflight_bound_rate == 2.0 for one valid row plus one infra-invalid row, which then propagates into MetricsTracker.summary() and the generated reports. Either filter infra-invalid rows out of the numerator or divide by total instead.
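
One way to apply the first suggested fix (filtering the numerator) is sketched below. It assumes each row is a dict with a `classification` field and a `preflight_digest` field; the row shape, the `classification` key, and the function name `summarize_preflight_binding` are hypothetical, and the real `summarize_acceptance()` may be structured differently.

```python
from typing import Any, Dict, Iterable, Mapping

INFRA_INVALID = "INFRA_INVALID"


def summarize_preflight_binding(rows: Iterable[Mapping[str, Any]]) -> Dict[str, float]:
    """Compute preflight binding metrics over valid rows only.

    Numerator and denominator are drawn from the same filtered set,
    so the rate cannot exceed 1.0.
    """
    # Assumed row shape: {"classification": str, "preflight_digest": str or None}
    valid_rows = [r for r in rows if r.get("classification") != INFRA_INVALID]
    effective_total = len(valid_rows)
    preflight_bound_count = sum(1 for r in valid_rows if r.get("preflight_digest"))
    return {
        "preflight_bound_count": float(preflight_bound_count),
        "preflight_bound_rate": (
            (preflight_bound_count / effective_total) if effective_total else 0.0
        ),
    }
```

For the case described above (one valid row plus one infra-invalid row, both carrying a digest), this yields a count of 1 and a rate of 1.0 instead of 2.0. Dividing by the unfiltered total would also keep the rate bounded, but it would let infrastructure failures lower the reported binding rate.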

Comment on lines +22 to +23
"governor_file",
"governor",

[P2] Drop unconditional governor keys from the Pages schema

metrics/report.py only adds the top-level governor_file/governor fields when the CLI was actually given a governor run (if "governor" in modes at metrics/report.py:425-427), but this manifest now requires those keys on every published comparison.json. A valid report such as --baseline ... --eco ... will therefore fail the new schema check in the Pages workflow even though governor-less mode sets are explicitly supported.
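
A possible shape for the relaxed check is sketched below: the governor keys form an optional group that must be either fully present or fully absent. The function name `check_top_level_keys`, its parameters, and the `baseline`/`eco` key names in the usage example are assumptions for illustration; only `governor_file` and `governor` come from the manifest under review.

```python
from typing import Any, List, Mapping, Sequence


def check_top_level_keys(
    report: Mapping[str, Any],
    required: Sequence[str],
    optional_groups: Sequence[Sequence[str]],
) -> List[str]:
    """Return a list of schema errors; an empty list means the report passes."""
    errors = [f"missing required key: {key}" for key in required if key not in report]
    for group in optional_groups:
        present = [key for key in group if key in report]
        missing = [key for key in group if key not in report]
        # An optional group is valid when all of its keys are present or none are.
        if present and missing:
            errors.append(f"incomplete optional group, missing: {', '.join(missing)}")
    return errors


# Hypothetical usage: a governor-less report passes, a half-populated one fails.
GOVERNOR_GROUP = ("governor_file", "governor")
governorless = {"baseline": {}, "eco": {}}
partial = {"baseline": {}, "eco": {}, "governor_file": "gov.json"}

print(check_top_level_keys(governorless, ("baseline", "eco"), [GOVERNOR_GROUP]))
print(check_top_level_keys(partial, ("baseline", "eco"), [GOVERNOR_GROUP]))
```

This keeps the schema strict for governor runs while allowing reports generated from mode sets that omit the governor.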
