Feat/open source governance #3
Merged
maliah1010 merged 14 commits into dev on Mar 29, 2026
- CODEOWNERS: auto-assign @antnewman as reviewer on all PRs
- ISSUE_TEMPLATE: structured bug report and feature request forms (yml), blank issues disabled
- PULL_REQUEST_TEMPLATE: contributor checklist with test/lint/docs gates
- workflows/ci.yml: GitHub Actions CI running pytest + ruff for all 3 packages on push to dev and PRs targeting main/dev
- CODE_OF_CONDUCT.md: Contributor Covenant at repo root (GitHub surfaces this in the Community Standards checklist)
- SECURITY.md: centralised security policy at repo root covering all packages, private reporting via GitHub advisories
P2 - NISTA Confidence Score Evolution:
- Add ConfidenceScoreRecord, NISTAThresholdConfig, TrendDirection, ThresholdBreach, NISTAScoreHistory to schemas/nista/history.py
- Extend NISTAValidator.validate() with optional history parameter (return signature unchanged; persistence is a side effect)
- Add confidence_scores table to shared SQLite store

P3 - Assurance Recommendation Tracker:
- Add pm_data_tools/assurance/ module with Recommendation models, RecommendationExtractor (wraps ConfidenceExtractor), and RecurrenceDetector (sentence-transformers with graceful fallback)
- Add recommendations table to shared SQLite store

Infrastructure:
- Add pm_data_tools/db/store.py: shared AssuranceStore (SQLite)
- Add pm-assure MCP server with nista_score_trend, track_recommendations, recommendation_status tools
- Add tests/test_assurance/ with conftest, P2 and P3 test suites
- Add docs/assurance.md and docs/assurance-for-practitioners.md
- Update README features table, pyproject.toml optional extras
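To illustrate the "optional history parameter, persistence as a side effect" idea described above, here is a minimal sketch using only the standard library. The column names, the `open_store`, `validate`, and `trend` helpers, the 0 to 100 score range, and the trend labels are all assumptions made for illustration; they are not taken from `NISTAValidator` or `AssuranceStore`.

```python
import sqlite3
from datetime import date
from typing import Optional

# Hypothetical schema: only the table name comes from the commit message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS confidence_scores (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    project_id  TEXT NOT NULL,
    recorded_on TEXT NOT NULL,   -- ISO-8601 date
    score       REAL NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the SQLite store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def validate(project_id: str, score: float, recorded_on: date,
             history: Optional[sqlite3.Connection] = None) -> bool:
    """Return the same result with or without `history`; saving is a side effect."""
    is_valid = 0.0 <= score <= 100.0  # assumed score range
    if history is not None and is_valid:
        history.execute(
            "INSERT INTO confidence_scores (project_id, recorded_on, score) "
            "VALUES (?, ?, ?)",
            (project_id, recorded_on.isoformat(), score),
        )
        history.commit()
    return is_valid


def trend(conn: sqlite3.Connection, project_id: str, tolerance: float = 1.0) -> str:
    """Compare the first and last recorded scores (labels are assumptions)."""
    rows = conn.execute(
        "SELECT score FROM confidence_scores WHERE project_id = ? "
        "ORDER BY recorded_on, id",
        (project_id,),
    ).fetchall()
    if len(rows) < 2:
        return "STABLE"
    delta = rows[-1][0] - rows[0][0]
    if delta > tolerance:
        return "IMPROVING"
    if delta < -tolerance:
        return "DEGRADING"
    return "STABLE"
```

Callers that never pass `history` see no change in behaviour, which is the property the commit message highlights.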
P1 - Artefact Currency Validator (documented, v0.4.0 planned):
- Introduces ArtefactCurrencyValidator, CurrencyConfig, CurrencyScore
- Status flags: CURRENT / OUTDATED / ANOMALOUS_UPDATE
- Detects genuinely stale artefacts and last-minute compliance updates

P2 - Longitudinal Compliance Tracker (was: NISTA Score History):
- NISTAScoreHistory → LongitudinalComplianceTracker
- NISTAThresholdConfig → ComplianceThresholdConfig
- MCP tool nista_score_trend → nista_longitudinal_trend
- history.py retained as backward-compat shim; new canonical: longitudinal.py
- test_nista_history.py → test_longitudinal_compliance.py (stub retained)

P3 - Cross-Cycle Finding Analyzer (was: Recommendation Tracker):
- RecommendationExtractor → FindingAnalyzer
- Recommendation → ReviewAction
- RecommendationStatus → ReviewActionStatus
- RecommendationExtractionResult → FindingAnalysisResult
- MCP tool track_recommendations → track_review_actions
- MCP tool recommendation_status → review_action_status
- extractor.py retained as backward-compat shim; new canonical: analyzer.py
- test_recommendation_tracker.py → test_finding_analyzer.py (stub retained)

Docs: full rewrite of assurance.md (technical) and assurance-for-practitioners.md (PM audience) covering all three features. Backward-compat alias table added to developer reference.
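One common way to keep an old class name working after a rename like those listed above is a thin alias that warns on use. The sketch below is a generic pattern, not the contents of history.py or extractor.py; the `deprecated_alias` helper is hypothetical, the class body is a stand-in, and whether the real shims emit warnings is an assumption.

```python
import warnings


class LongitudinalComplianceTracker:
    """Stand-in for the new canonical class (real implementation not shown)."""

    def __init__(self, project_id: str = "demo") -> None:
        self.project_id = project_id


def deprecated_alias(new_cls: type, old_name: str) -> type:
    """Build a subclass that behaves like `new_cls` but warns when instantiated."""

    class _Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_name} is deprecated; use {new_cls.__name__} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)

    _Alias.__name__ = old_name
    _Alias.__qualname__ = old_name
    return _Alias


# Old name kept importable for existing callers.
NISTAScoreHistory = deprecated_alias(LongitudinalComplianceTracker, "NISTAScoreHistory")
```

Because the alias subclasses the new class, `isinstance` checks against `LongitudinalComplianceTracker` keep passing for objects created through the old name.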
- packages/pm-data-tools/README.md: full rewrite covering LongitudinalComplianceTracker, FindingAnalyzer, ArtefactCurrencyValidator, and updated architecture diagram
- packages/pm-data-tools/CHANGELOG.md: add v0.3.0 entry with all renamed classes, deprecation notices, and new MCP tool names
- packages/pm-data-tools/docs/nista/README.md: add LongitudinalComplianceTracker usage section with config example
- packages/pm-mcp-servers/README.md: full rewrite; add pm-assure server with nista_longitudinal_trend, track_review_actions, review_action_status tools
- packages/pm-mcp-servers/examples/sample_queries.md: add P2/P3 assurance query examples and full assurance review workflow
- packages/pm-mcp-servers/examples/claude_integration.md: add pm-assure server config block with env var guidance
- packages/pm-mcp-servers/IMPLEMENTATION_ROADMAP.md: mark Phase 3 complete; add P1 Artefact Currency Validator as next roadmap item
- docs/getting-started.md: add quick-start examples for P2 and P3; add pm-assure to MCP server config block
- docs/architecture-overview.md: add pm-assure server to diagram and server list; bump version to 1.1, update maintainer and date
…rgence Monitor

P1 - ArtefactCurrencyValidator (assurance/currency.py):
- CurrencyStatus enum: CURRENT / OUTDATED / ANOMALOUS_UPDATE
- CurrencyConfig: max_staleness_days (90), anomaly_window_days (3)
- CurrencyScore result model with staleness_days and message
- check_artefact_currency() single-artefact check
- check_batch() batch processing with ISO-8601 string support
- 14 tests in test_currency.py covering all branches and edge cases

P4 - DivergenceMonitor (assurance/divergence.py):
- SignalType enum: STABLE / HIGH_DIVERGENCE / LOW_CONSENSUS / DEGRADING_CONFIDENCE
- DivergenceConfig with divergence_threshold, min_consensus, degradation_window
- DivergenceSnapshot and DivergenceResult Pydantic models
- DivergenceMonitor.check() classifies, persists snapshot, returns result
- 17 tests in test_divergence.py
- divergence_snapshots table added to AssuranceStore
- insert_divergence_snapshot() and get_divergence_history() store methods

MCP server (pm-assure/server.py):
- check_artefact_currency tool (4th tool)
- check_confidence_divergence tool (5th tool)

assurance/__init__.py: exports all P1 and P4 public API
conftest.py: currency_validator, divergence_monitor, divergence_monitor_strict fixtures
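The three currency statuses and the two config defaults above suggest a simple date-based classification. The sketch below is one plausible reading, not the actual `check_artefact_currency()` logic: the `classify_currency` function, the use of a review gate date as the reference point, and the rule that an edit inside the anomaly window counts as a last-minute update are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Union


class CurrencyStatus(str, Enum):
    CURRENT = "CURRENT"
    OUTDATED = "OUTDATED"
    ANOMALOUS_UPDATE = "ANOMALOUS_UPDATE"


@dataclass(frozen=True)
class CurrencyConfig:
    max_staleness_days: int = 90   # default quoted in the commit message
    anomaly_window_days: int = 3   # default quoted in the commit message


def _as_date(value: Union[date, str]) -> date:
    """Accept a date or an ISO-8601 date string such as '2026-03-01'."""
    return value if isinstance(value, date) else date.fromisoformat(value)


def classify_currency(last_modified: Union[date, str],
                      gate_date: Union[date, str],
                      config: CurrencyConfig = CurrencyConfig()) -> CurrencyStatus:
    """Hypothetical rule: stale if too old, anomalous if edited just before the gate."""
    staleness_days = (_as_date(gate_date) - _as_date(last_modified)).days
    if staleness_days > config.max_staleness_days:
        return CurrencyStatus.OUTDATED
    if 0 <= staleness_days <= config.anomaly_window_days:
        return CurrencyStatus.ANOMALOUS_UPDATE
    return CurrencyStatus.CURRENT
```

For example, `classify_currency("2026-02-27", "2026-03-01")` falls inside the three-day window under this reading, while an artefact last touched in early January would still count as current.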
… logger

P5 (AdaptiveReviewScheduler): analyses P1–P4 signals to recommend optimal review timing. Composite scoring with configurable per-source weights, IMMEDIATE/EXPEDITED/STANDARD/DEFERRED urgency classification, date clamping, and SQLite persistence. MCP tool: recommend_review_schedule.

P6 (OverrideDecisionLogger): structured logging and post-outcome tracking for governance override decisions. Pattern analysis computes impact rate and top authorisers. MCP tools: log_override_decision, analyse_override_patterns.

- scheduler.py: AdaptiveReviewScheduler, SchedulerConfig, SchedulerRecommendation, SchedulerSignal, ReviewUrgency
- overrides.py: OverrideDecisionLogger, OverrideDecision, OverridePatternSummary, OverrideType, OverrideOutcome
- store.py: review_schedule_recommendations and override_decisions tables with insert_schedule_recommendation, get_schedule_history, upsert_override_decision, get_override_decisions, update_override_outcome
- server.py: 3 new MCP tools (tools 6–8); server total now 8 tools
- test_scheduler.py: 18 tests; test_overrides.py: 15 tests
- docs/assurance.md: full P4–P6 reference sections; logging and table inventory
- IMPLEMENTATION_ROADMAP.md: updated to reflect P1–P6 complete
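The P5 description (weighted composite score, four urgency levels, date clamping) can be sketched roughly as follows. The urgency names come from the commit message; the thresholds, day offsets, clamp bounds, and the `composite_score`, `classify_urgency`, and `recommend_date` helpers are invented for illustration and are not `AdaptiveReviewScheduler` itself.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Mapping


class ReviewUrgency(str, Enum):
    IMMEDIATE = "IMMEDIATE"
    EXPEDITED = "EXPEDITED"
    STANDARD = "STANDARD"
    DEFERRED = "DEFERRED"


# Assumed mapping from urgency to a proposed gap (in days) before the next review.
DAYS_BY_URGENCY = {
    ReviewUrgency.IMMEDIATE: 0,
    ReviewUrgency.EXPEDITED: 30,
    ReviewUrgency.STANDARD: 60,
    ReviewUrgency.DEFERRED: 120,
}


def composite_score(signals: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of per-source severities in [0, 1]; unweighted sources are ignored."""
    total_weight = sum(weights.get(source, 0.0) for source in signals)
    if total_weight == 0:
        return 0.0
    weighted = sum(value * weights.get(source, 0.0) for source, value in signals.items())
    return weighted / total_weight


def classify_urgency(score: float) -> ReviewUrgency:
    """Assumed thresholds; the real cut-offs are not stated in the source."""
    if score >= 0.8:
        return ReviewUrgency.IMMEDIATE
    if score >= 0.6:
        return ReviewUrgency.EXPEDITED
    if score >= 0.3:
        return ReviewUrgency.STANDARD
    return ReviewUrgency.DEFERRED


def recommend_date(urgency: ReviewUrgency, today: date,
                   min_days: int = 7, max_days: int = 90) -> date:
    """Clamp the proposed review date into [today + min_days, today + max_days]."""
    days = max(min_days, min(max_days, DAYS_BY_URGENCY[urgency]))
    return today + timedelta(days=days)
```

Under these assumed numbers, an IMMEDIATE recommendation is pulled out to the seven-day minimum and a DEFERRED one is pulled in to the 90-day maximum, which is what clamping is meant to guarantee.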
…Overhead Optimiser

Completes Horizon 2 of the assurance feature roadmap.

P7 - Lessons Learned Knowledge Engine (assurance/lessons.py):
- Ingest, search, and analyse structured lessons with contextual metadata
- Keyword and semantic search (sentence-transformers optional, graceful fallback)
- Pattern analysis across the lessons corpus
- MCP tools: ingest_lesson, search_lessons

P8 - Assurance Overhead Optimiser (assurance/overhead.py):
- Track assurance activity effort and correlate with confidence outcomes
- Detect duplicate/overlapping checks (same_artefact, same_type_same_week, no_findings_repeat)
- Classify efficiency (OPTIMAL/UNDER_INVESTED/OVER_INVESTED/MISALLOCATED)
- Generate human-readable optimisation recommendations
- MCP tools: log_assurance_activity, analyse_assurance_overhead

Store: 3 new tables (lessons_learned, assurance_activities, overhead_analyses)

Tests: 21 tests for P7, 20 tests for P8 (131 total, all passing)
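Of the duplicate-check rules named above, `same_type_same_week` is the easiest to picture. The sketch below assumes it means "two or more activities of the same type within one ISO calendar week"; that interpretation, the `Activity` fields, and the function itself are illustrative guesses rather than the logic in assurance/overhead.py.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Activity:
    """Hypothetical record of one assurance activity."""
    activity_id: str
    activity_type: str
    performed_on: date


def same_type_same_week(activities: Iterable[Activity]) -> List[List[str]]:
    """Group activity ids that share a type and an ISO (year, week)."""
    groups: Dict[Tuple[str, int, int], List[str]] = defaultdict(list)
    for activity in activities:
        iso_year, iso_week = activity.performed_on.isocalendar()[:2]
        groups[(activity.activity_type, iso_year, iso_week)].append(activity.activity_id)
    # Only groups with more than one member indicate potential duplication.
    return [ids for ids in groups.values() if len(ids) > 1]
```

Grouping on the ISO year as well as the week number avoids false matches between, say, week 10 of two different years.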
…Domain Classifier

Completes the full 10-feature assurance roadmap (all three horizons).

P9 - Agentic Assurance Workflow Engine (assurance/workflows.py):
- 5 workflow types: FULL_ASSURANCE, COMPLIANCE_FOCUS, CURRENCY_FOCUS, TREND_ANALYSIS, RISK_ASSESSMENT
- 8 step executors (one per P1-P8) with fail-safe error handling
- Inter-step data flow: P1/P2/P3/P4 outputs passed to P5 scheduler
- Health classification: HEALTHY / ATTENTION_NEEDED / AT_RISK / CRITICAL
- Aggregated risk signals, recommended actions, executive summary
- MCP tools 13-14: run_assurance_workflow, get_workflow_history

P10 - Project Domain Classifier (assurance/classifier.py):
- 4 complexity domains: CLEAR / COMPLICATED / COMPLEX / CHAOTIC
- 7 explicit indicators with inversion for clarity/track-record fields
- 4 store-derived signals from P2, P3, P6, P8
- Weighted composite score (explicit 0.70, derived 0.30) with renormalisation
- Domain assurance profiles with tailored review cadence (14-90 days)
- MCP tools 15-16: classify_project_domain, reclassify_from_store

Store additions (db/store.py):
- workflow_executions table + insert_workflow_execution, get_workflow_history
- domain_classifications table + insert_domain_classification, get_domain_classifications

Tests: 35 new tests for P9 (test_workflows.py), 35 for P10 (test_classifier.py)

All 218 assurance tests passing.
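The P10 scoring described above (explicit indicators weighted 0.70, store-derived signals weighted 0.30, inverted fields, renormalisation when a group is missing) can be sketched like this. The 0.70/0.30 split and the four domain names are from the commit message; the indicator names, the simple mean within each group, and the domain cut-offs are assumptions.

```python
from typing import Mapping

EXPLICIT_WEIGHT = 0.70
DERIVED_WEIGHT = 0.30
# Higher clarity or track record means lower complexity, so these are inverted.
INVERTED_FIELDS = frozenset({"clarity", "track_record"})  # hypothetical field names


def composite_complexity(explicit: Mapping[str, float],
                         derived: Mapping[str, float]) -> float:
    """Blend the two groups (values in [0, 1]); renormalise if one group is empty."""
    parts = []
    if explicit:
        values = [1.0 - v if name in INVERTED_FIELDS else v
                  for name, v in explicit.items()]
        parts.append((EXPLICIT_WEIGHT, sum(values) / len(values)))
    if derived:
        parts.append((DERIVED_WEIGHT, sum(derived.values()) / len(derived)))
    if not parts:
        raise ValueError("at least one indicator or signal is required")
    total_weight = sum(weight for weight, _ in parts)
    return sum(weight * score for weight, score in parts) / total_weight


def classify_domain(score: float) -> str:
    """Assumed equal-width bands; the real thresholds are not given in the source."""
    if score < 0.25:
        return "CLEAR"
    if score < 0.50:
        return "COMPLICATED"
    if score < 0.75:
        return "COMPLEX"
    return "CHAOTIC"
```

Renormalising by the weights actually present means a project with no store history is still scored on a 0 to 1 scale rather than being capped at 0.70.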
- Extend assurance-for-practitioners.md with P4-P10 practitioner guides: Confidence Divergence Monitor, Adaptive Review Scheduler, Override Decision Logger, Lessons Learned Knowledge Engine, Assurance Overhead Optimiser, Assurance Workflow Engine, and Project Domain Classifier, plus a combined usage guide and extended FAQ section (file grows from 246 to 754 lines)
- Add docs/database-schema.md: full schema reference for all 10 SQLite tables with column types, nullability, constraints, and example queries
- Add docs/data-model-reference.md: canonical Pydantic model field reference for all major models across P2-P10 including serialisation notes
- Update barrier-mapping.md: replace placeholder [INSERT QUOTE] template markers with substantive barrier summaries; clean up structure
- Update architecture-overview.md: fix pm-assure server description to reflect 16 tools; update maintainer to maliah1010
- Update getting-started.md: fix GitHub URLs from antnewman to maliah1010
Replace placeholder-based structure with content grounded in the PDATF Green Paper and Newman (2026) "From Policy to Practice". Six barrier themes (Leadership, Data/Interop, Digital/Tech, Skills, Procurement, Risk/Ethics) replace the previous eight-barrier approximation. Direct quotes throughout. Platform mapping adapted from the paper's Table 11, extended to include pm-assure P1-P10. Indicative Principles table added. References updated to match the paper.
Add dedicated section for the interactive Project Delivery Toolkit (projects-toolkit.netlify.app) with feature table, barrier mapping, and guidance on how to use it alongside the document. Also add to summary table.
Three Universal Dashboard Specification (UDS) v0.1.0 definitions covering all ten assurance features (P1-P10):
- assurance-overview.uds.yaml: executive SRO single-screen health summary
- assurance-deep-dive.uds.yaml: PMO analyst five-tab operational view
- assurance-portfolio.uds.yaml: portfolio director cross-project comparison
Generates a realistic portfolio of 15 UK government projects with 12 months of assurance history, populating all 10 AssuranceStore SQLite tables. Covers CLEAR/COMPLICATED/COMPLEX/CHAOTIC complexity domains with internally consistent narrative arcs per project archetype. Supports --output and --verify CLI flags; deterministic via random.seed(42).
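Seeding the random number generator is what makes a generated data set like this reproducible. The sketch below shows the idea on a much smaller scale; it uses a local `random.Random(seed)` instance rather than the global `random.seed(42)` call mentioned above, and the project ids, score ranges, and `generate_portfolio` function are hypothetical.

```python
import random
from typing import Dict, List

DOMAINS = ["CLEAR", "COMPLICATED", "COMPLEX", "CHAOTIC"]


def generate_portfolio(n_projects: int = 15, months: int = 12,
                       seed: int = 42) -> List[Dict]:
    """Produce the same synthetic portfolio every time for a given seed."""
    rng = random.Random(seed)  # local generator, so other code cannot disturb it
    portfolio = []
    for index in range(n_projects):
        score = rng.uniform(50.0, 90.0)  # assumed starting confidence range
        history = []
        for _ in range(months):
            # Random walk, kept inside the 0-100 range.
            score = min(100.0, max(0.0, score + rng.uniform(-5.0, 5.0)))
            history.append(round(score, 1))
        portfolio.append({
            "project_id": f"PRJ-{index + 1:03d}",
            "domain": DOMAINS[index % len(DOMAINS)],
            "scores": history,
        })
    return portfolio
```

Two calls with the same seed return identical data, which makes it practical to check generated output against fixed expectations in tests.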
maliah1010 pushed a commit that referenced this pull request on Mar 29, 2026
docs: update DOI and add Limitations section
Merging the open source governance branch into dev, as it now generates artificial data.