diff --git a/.github/workflows/scientific-validation.yml b/.github/workflows/scientific-validation.yml
new file mode 100644
index 0000000..b953d34
--- /dev/null
+++ b/.github/workflows/scientific-validation.yml
@@ -0,0 +1,82 @@
+name: Scientific Validation
+
+on:
+  workflow_dispatch:
+  schedule:
+    - cron: "15 2 * * 2"
+
+permissions:
+  contents: read
+
+concurrency:
+  group: scientific-validation
+  cancel-in-progress: false
+
+jobs:
+  offline-scientific-validation:
+    name: Offline Scientific Validation
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+
+      - name: Set up Python
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
+        with:
+          python-version: "3.11"
+
+      - name: Install project
+        run: |
+          python -m pip install --upgrade pip
+          pip install -e ".[dev]"
+
+      - name: Generate offline scientific validation reports
+        run: |
+          mkdir -p artifacts/scientific-validation/offline
+          python scripts/scientific_validation_report.py \
+            --json \
+            --persistence-dir artifacts/scientific-validation/offline/bundles \
+            --output-json artifacts/scientific-validation/offline/report.json \
+            --output-markdown artifacts/scientific-validation/offline/report.md
+
+      - name: Upload offline scientific validation artifacts
+        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
+        with:
+          name: scientific-validation-offline
+          path: artifacts/scientific-validation/offline/**
+
+  live-concordance-panel:
+    name: Live Concordance Panel
+    if: ${{ secrets.CTX_API_KEY != '' }}
+    runs-on: ubuntu-latest
+    env:
+      CTX_API_KEY: ${{ secrets.CTX_API_KEY }}
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+
+      - name: Set up Python
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
+        with:
+          python-version: "3.11"
+
+      - name: Install project
+        run: |
+          python -m pip install --upgrade pip
+          pip install -e ".[dev]"
+
+      - name: Generate live concordance panel reports
+        run: |
+          mkdir -p artifacts/scientific-validation/live
+          python scripts/live_concordance_panel.py \
+            --json \
+            --output-json artifacts/scientific-validation/live/report.json \
+            --output-markdown artifacts/scientific-validation/live/report.md
+
+      - name: Upload live concordance artifacts
+        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
+        with:
+          name: scientific-validation-live
+          path: artifacts/scientific-validation/live/**
diff --git a/docs/testing_matrix.md b/docs/testing_matrix.md
index 2fa4d75..bd12769 100644
--- a/docs/testing_matrix.md
+++ b/docs/testing_matrix.md
@@ -7,8 +7,8 @@
 | Cross-transport parity | Tool catalog/metadata consistency, shared audit logs | `pytest tests/test_mcp_conformance_suite.py` |
 | Interop – live MCP | Public interop builders over HTTP transport, partial-data tolerance, handoff object shape, and explicit live fixture refresh workflow | `python scripts/mcp_interop_smoke.py --endpoint http://127.0.0.1:8000/mcp --json`, `python scripts/mcp_interop_smoke.py --endpoint http://127.0.0.1:8000/mcp --capture-dir tests/golden/interop_live --refresh-live-fixtures --json`, and `.github/workflows/live-interop-smoke.yml` |
 | Predictive guardrails | Guardrail enforcement, audit bundle persistence | `pytest tests/test_predictive_regression.py` |
-| Scientific validation | Offline orchestrator scenario scorecards, evidence-assessment rollups, interop attachment coverage | `python scripts/scientific_validation_report.py --json` and `pytest tests/workflows/test_scientific_validation_report.py` |
-| Live concordance panel | CTX-backed ToxVal reference cases with expected observed-concordance outcomes and drift detection | `python scripts/live_concordance_panel.py --json` and `pytest tests/workflows/test_live_concordance_panel.py` |
+| Scientific validation | Offline orchestrator scenario scorecards, evidence-assessment rollups, interop attachment coverage | `python scripts/scientific_validation_report.py --json`, `pytest tests/workflows/test_scientific_validation_report.py`, and `.github/workflows/scientific-validation.yml` |
+| Live concordance panel | CTX-backed ToxVal reference cases with expected observed-concordance outcomes and drift detection | `python scripts/live_concordance_panel.py --json`, `pytest tests/workflows/test_live_concordance_panel.py`, and `.github/workflows/scientific-validation.yml` |
 | CTX connectivity | Live API health, credential validation | `scripts/smoke_ctx.sh` |
 | Agent integration | Codex/Gemini/Claude CLI flows | Follow `docs/integration_guides/mcp_integration.md` |
 
diff --git a/docs/workflow_testing_strategy.md b/docs/workflow_testing_strategy.md
index 098be8f..040dc03 100644
--- a/docs/workflow_testing_strategy.md
+++ b/docs/workflow_testing_strategy.md
@@ -27,6 +27,7 @@
 | Pipeline | Trigger | Suites | Reporting |
 | --- | --- | --- | --- |
 | PR Gate | Every pull request | Unit + contract + fast orchestrator smoke scenarios | Pytest JUnit + summary comment |
+| Scientific validation automation | Weekly on Tuesday plus manual dispatch | Offline scenario scorecards and live concordance drift checks | JSON/Markdown artifacts uploaded by `.github/workflows/scientific-validation.yml` |
 | Nightly Sandbox | 01:00 UTC daily | Full orchestrator scenarios, predictive regression, metadata validation | HTML bundle reports, JSON provenance summaries, Slack/email alerts |
 | Weekly Load | Off-peak (e.g., Sunday) | Locust/k6 runs at 10× expected load | Aggregated latency/throughput dashboards, CSV export |
 
@@ -54,8 +55,8 @@
    - Author Locust file targeting websocket endpoints; parameterize chemical lists and concurrency.
    - Provide k6 script for REST-based predictive services.
 5. **Automation**
-   - Create GitHub Actions workflows (`ci-workflows.yml`, `nightly-workflows.yml`) wiring environment secrets, persistence directories, and artifact uploads.
-   - Ensure nightly job stores bundle metadata + reports in `artifacts/workflows//`.
+   - `.github/workflows/scientific-validation.yml` now runs the offline report on every scheduled/manual execution and runs the live concordance panel when `CTX_API_KEY` is available, uploading JSON/Markdown artifacts for both.
+   - Extend this into a broader nightly workflow once orchestrator scenarios beyond the offline suite are stable enough for routine CI execution.
 
 ## Risks & Mitigations
 - **External API instability:** add sandbox fallback configuration and skip markers when endpoints are offline; ensure nightly job reports skips.
diff --git a/tests/test_workflow_hardening.py b/tests/test_workflow_hardening.py
index 79088bf..b917de0 100644
--- a/tests/test_workflow_hardening.py
+++ b/tests/test_workflow_hardening.py
@@ -71,3 +71,19 @@ def test_live_interop_smoke_workflow_exists_with_pinned_tooling() -> None:
     assert "actions/setup-python@" in text
     assert "uvicorn epacomp_tox.transport.websocket:app" in text
     assert "scripts/mcp_interop_smoke.py" in text
+
+
+def test_scientific_validation_workflow_exists_with_artifact_capture() -> None:
+    text = _workflow_text("scientific-validation.yml")
+    assert "name: Scientific Validation" in text
+    assert "workflow_dispatch:" in text
+    assert 'cron: "15 2 * * 2"' in text
+    assert "secrets.CTX_API_KEY" in text
+    assert "actions/checkout@" in text
+    assert "actions/setup-python@" in text
+    assert "actions/upload-artifact@" in text
+    assert 'pip install -e ".[dev]"' in text
+    assert "scripts/scientific_validation_report.py" in text
+    assert "scripts/live_concordance_panel.py" in text
+    assert "artifacts/scientific-validation/offline/report.json" in text
+    assert "artifacts/scientific-validation/live/report.json" in text
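
The strategy table describes the `cron: "15 2 * * 2"` trigger as "Weekly on Tuesday". As a rough cross-check, the sketch below decodes a simple weekly five-field cron expression into a weekday and a time of day (GitHub Actions evaluates schedules in UTC). `describe_weekly_cron` and `WEEKDAYS` are hypothetical names introduced for illustration and are not part of the repository; the parser is deliberately minimal and assumes literal minute, hour, and day-of-week values.

```python
# Hypothetical helper for illustration only; not part of the repository.
# Cron day-of-week numbering: 0 = Sunday through 6 = Saturday.
WEEKDAYS = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def describe_weekly_cron(expr: str) -> tuple[str, str]:
    """Return (weekday, "HH:MM") for a simple weekly cron expression.

    Only literal minute, hour, and day-of-week values are handled, with "*"
    for day-of-month and month. Ranges, lists, and steps raise ValueError.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")

    minute, hour, day_of_month, month, day_of_week = fields
    if day_of_month != "*" or month != "*":
        raise ValueError("this sketch only supports weekly schedules")
    if not (minute.isdigit() and hour.isdigit() and day_of_week.isdigit()):
        raise ValueError("this sketch only supports literal numeric fields")

    m, h, d = int(minute), int(hour), int(day_of_week)
    if not (0 <= m <= 59 and 0 <= h <= 23 and 0 <= d <= 6):
        raise ValueError("cron field out of range")

    return WEEKDAYS[d], f"{h:02d}:{m:02d}"


if __name__ == "__main__":
    # The expression used by scientific-validation.yml.
    print(describe_weekly_cron("15 2 * * 2"))  # ('Tuesday', '02:15')
```

For the workflow's expression this gives Tuesday at 02:15 UTC, which is 75 minutes after the 01:00 UTC start listed for the Nightly Sandbox pipeline in the same table.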