
Refresh ROADMAP to drop phase numbering and reflect shipped state #313

Merged
igerber merged 5 commits into main from roadmap-review
Apr 18, 2026


@igerber
Owner

@igerber igerber commented Apr 18, 2026

Summary

  • Rewrite ROADMAP.md with new structure: Current State / Recently Shipped / Shipping Next / Under Consideration / AI-Agent Track / Long-term. No top-level phase numbering, no version stamps in headings, no third-party product names.
  • Absorb dCDH (shipped end-to-end across PRs #290, #294, #300, #302, #303, #307) into the Current State estimator list and the Recently Shipped summary. Remove the separate "Future Estimators / Phase 1-3" dCDH roadmap block.
  • Add research-informed candidates to Under Consideration: DiD with no untreated group (de Chaisemartin et al. 2024, inverse-SDiD), efficient staggered DiD (Chen-Sant'Anna-Xie 2025), distributional DiD for staggered (Ciaccio 2024), LP-DiD (Dube et al., JAE 2025 published), few-treated-units inference (Alvarez-Ferman-Wuthrich 2025), Riesz-representation sensitivity (Bach et al. 2025), compositional-change inference (Sant'Anna-Xu 2025), triple-difference identification audit (Ortiz-Villavicencio-Sant'Anna 2025).
  • Promote the AI-agent track to a named long-term arc with a stated vision and named building blocks (BusinessReport, DiagnosticReport, context-aware practitioner_next_steps, AI-legible diagnostic surfaces, scenario-to-estimator guidance).
  • Update companion docs (docs/business-strategy.md Section 8, docs/survey-roadmap.md, docs/practitioner_decision_tree.rst, docs/choosing_estimator.rst, docs/api/chaisemartin_dhaultfoeuille.rst, docs/api/efficient_did.rst, README.md, diff_diff/guides/llms-full.txt) to remove stale "deferred to Phase N" language now that the deferred items have shipped.
  • Correct EfficientDiD covariate-path documentation: the doubly-robust sieve-based covariate path is shipped; several docs still said "not yet supported."
  • Fix broken anchor docs/survey-roadmap.md#deferred-work-consolidated in TODO.md to point at the existing #current-limitations section.

Methodology references (required if estimator / math changes)

  • Method name(s): N/A - no methodology changes. Content-only docs refresh.
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None

Validation

  • Tests added/updated: No test changes (content-only).
  • Backtest / simulation / notebook evidence (if applicable): N/A
  • Verified ROADMAP.md has no em-dashes, no named third-party products, no version numbers in headings, and no stale top-level phase numbering.
  • Confirmed remaining Phase N strings across the repo are all in whitelisted build-milestone history docs (docs/methodology/REGISTRY.md, docs/methodology/continuous-did.md, docs/performance-plan.md, docs/survey-roadmap.md historical sections) or internal TODO.md tech-debt entries that are out of scope.
  • EfficientDiD covariate-path correction verified against diff_diff/efficient_did.py (covariates parameter on fit() is shipped with doubly-robust sieve-based path).
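The style checks listed above (no em-dashes, no version numbers in headings, no top-level phase numbering) could be automated along these lines. This is a minimal sketch, not the check actually used for this PR: the function name `lint_roadmap` and both regular expressions are assumptions chosen for illustration, and the third-party-product check is omitted because it would need a project-specific word list.

```python
import re

EM_DASH = "\u2014"
# Assumed patterns; the real verification steps for this PR are not shown.
VERSION_IN_HEADING = re.compile(r"^#+\s.*\bv?\d+\.\d+(\.\d+)?\b")
PHASE_IN_HEADING = re.compile(r"^#{1,2}\s.*\bPhase\s+\d+", re.IGNORECASE)


def lint_roadmap(text: str) -> list[str]:
    """Return human-readable problems found in a Markdown roadmap."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if EM_DASH in line:
            problems.append(f"line {lineno}: em-dash")
        if VERSION_IN_HEADING.match(line):
            problems.append(f"line {lineno}: version number in heading")
        if PHASE_IN_HEADING.match(line):
            problems.append(f"line {lineno}: phase numbering in top-level heading")
    return problems
```

Calling `lint_roadmap(Path("ROADMAP.md").read_text(encoding="utf-8"))` in a doc-smoke test and asserting an empty list would turn the manual verification into a regression guard.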

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Rewrites ROADMAP.md to organize work as Current State / Recently Shipped /
Shipping Next / Under Consideration / AI-Agent Track / Long-term. Drops
top-level phase numbering. Absorbs dCDH into the shipped estimator list.
Promotes future-estimator candidates from the 2025-26 methodology research
(including the inverse-SDiD "no untreated group" estimator, efficient
staggered DiD, distributional DiD for staggered, LP-DiD, few-treated-units
inference, Riesz-representation sensitivity, compositional-change inference,
and the triple-difference covariate audit). Promotes the AI-agent track to
a named long-term arc.

Updates companion docs (docs/business-strategy.md Section 8, README.md
dCDH section, docs/practitioner_decision_tree.rst, docs/choosing_estimator.rst,
docs/api/chaisemartin_dhaultfoeuille.rst, docs/api/efficient_did.rst,
docs/survey-roadmap.md, diff_diff/guides/llms-full.txt) to remove stale
phase-deferral language now that the deferred items have shipped. Corrects
EfficientDiD covariate-path documentation (the doubly-robust path is
shipped; several docs still said "not yet supported"). Fixes the broken
TODO.md anchor link into survey-roadmap.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

⚠️ Needs changes

Executive Summary

  • EfficientDiD docs now advertise shipped covariate support, but the refreshed text still presents the estimator as unconditionally efficiency-bound-attaining. That conflicts with the registry and class docstring for the shipped covariate path.
  • dCDH placebo inference is now described inconsistently across touched surfaces. Some docs still say all placebo SEs are NaN, while the code and registry say only single-period DID_M^pl is NaN; multi-horizon DID^{pl}_l has inference.
  • ROADMAP.md now lists “Efficient staggered DiD” as future work even though shipped EfficientDiD is already the staggered-adoption estimator.
  • I did not find code, performance, or security regressions in the touched diff; the issues are methodology/documentation accuracy.

Methodology

Code Quality
No findings.

Performance
No findings.

Maintainability

  • Severity: P3. Impact: The same estimator contract is duplicated across README, API docs, estimator-choice docs, roadmap text, and runtime AI guides, and this PR already left those surfaces out of sync. Concrete fix: centralize the sensitive estimator snippets or add a lightweight consistency check for key claims such as the EfficientDiD covariate-efficiency caveat and the dCDH single-lag vs dynamic placebo inference split.
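One possible shape for the lightweight consistency check suggested above is a rule table that says "if a doc mentions X, it must also mention Y". The sketch below is illustrative only: `ClaimRule`, `check_claims`, and the trigger/required phrases are hypothetical and not quoted from the repository's docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimRule:
    """If `trigger` appears in a doc, `required` must appear too."""
    name: str
    trigger: str
    required: str


# Hypothetical rules; the phrases are illustrative, not taken from the repo.
RULES = [
    ClaimRule("EfficientDiD efficiency scope", "efficiency bound", "no-covariate"),
    ClaimRule("dCDH placebo SE split", "placebo", "multi-horizon"),
]


def check_claims(docs: dict[str, str], rules=RULES) -> list[str]:
    """Return one message per (document, rule) pair that violates a rule."""
    failures = []
    for path, text in sorted(docs.items()):
        lowered = text.lower()
        for rule in rules:
            if rule.trigger in lowered and rule.required not in lowered:
                failures.append(f"{path}: {rule.name}")
    return failures
```

A whole-file substring match is deliberately crude: it catches a surface that drops the caveat entirely, but not one that states the caveat far from the claim. Centralizing the snippets, the other option named above, avoids that weakness.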

Tech Debt
No deferrable blocker here. The P1 items above are current methodology/SE documentation inaccuracies, not TODO-grade follow-ups.

Security
No findings.

Documentation/Tests

  • Severity: P3, tracked in TODO. Impact: .txt AI guides are outside current doc-smoke coverage, which helps explain the llms-full.txt drift; this is already tracked at TODO.md:L95-L95. Concrete fix: no approval blocker, but extending doc-smoke coverage to .txt guides would reduce recurrence.

Path to Approval

  1. Update EfficientDiD docs in README.md, docs/api/efficient_did.rst, docs/choosing_estimator.rst, and diff_diff/guides/llms-full.txt so the no-covariate path is the only unqualified efficiency-bound claim, and the shipped covariate path is described with the registry’s DR / linear-OR caveat.
  2. Update all touched dCDH docs/guides so they say: single-period DID_M^pl has NaN inference; multi-horizon DID^{pl}_l has analytical SE and bootstrap override when n_bootstrap > 0.
  3. Rewrite or drop the ROADMAP.md “Efficient staggered DiD” candidate so it no longer describes the already-shipped staggered EfficientDiD as future work.

…acebo SE contract; rescope Under-Consideration entry

EfficientDiD efficiency claim: scope the unqualified "achieves the
semiparametric efficiency bound" language to the no-covariate path in
README.md, docs/api/efficient_did.rst, and diff_diff/guides/llms-full.txt.
The doubly-robust covariate path uses a linear OLS outcome regression; per
docs/methodology/REGISTRY.md L899-L903 and the class docstring in
diff_diff/efficient_did.py L139-L146, this preserves DR consistency but
does not generically attain the efficiency bound unless the conditional
mean is linear in the covariates.

dCDH placebo SE contract: correct the inconsistent placebo-NaN statements
in README.md and docs/choosing_estimator.rst; tighten diff_diff/guides/
llms-full.txt. The actual contract (per docs/methodology/REGISTRY.md L546
and diff_diff/chaisemartin_dhaultfoeuille.py): single-period DID_M^pl
(L_max=None) has NaN SE because the per-period aggregation path has no
influence-function derivation, with NaN-consistent inference even under
n_bootstrap > 0; multi-horizon dynamic placebos DID^{pl}_l (L_max >= 1)
have valid analytical SE via the placebo IF and bootstrap SE when
n_bootstrap > 0.

ROADMAP.md rescope: the "Efficient staggered DiD" Under-Consideration
entry incorrectly framed shipped staggered EfficientDiD as future work.
Rewrite as "Nonparametric / flexible outcome regression for EfficientDiD
DR covariate path" - the actual gap vs the Chen-Sant'Anna-Xie 2025 Section
4 proposal, where nonparametric OR (sieve / kernel / ML) would close the
efficiency gap the current linear OLS OR leaves open on the covariate
path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: dbe1472433fc8abfbaed133fd1bbaa02900271a6


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Re-review: the prior dCDH placebo-inference mismatch appears resolved in the touched docs; the updated README/guide/chooser text now matches the registry’s split between single-period DID_M^pl (NaN SE) and multi-horizon DID^{pl}_l (analytical/bootstrap SE).
  • Re-review: the prior roadmap issue that treated shipped staggered EfficientDiD support as future work is resolved; the future item is now correctly framed as a nonparametric outcome-regression extension.
  • One P1 remains: the EfficientDiD covariate-path caveat was fixed in some touched surfaces, but not all. docs/choosing_estimator.rst:L363-L374 and ROADMAP.md:L18 still overstate the shipped covariate path as efficiency-optimal, contrary to the registry and class docstring.
  • [Newly identified] The refreshed dCDH API page still pulls in a stale Phase-1 class docstring through autodoc, so the built page will continue to show contradictory support status unless the in-code docstring is updated.
  • I did not find code, performance, or security regressions in the changed diff.

Methodology

  • Severity: P1. Impact: EfficientDiD’s shipped covariate contract is still not documented consistently after this refresh. docs/choosing_estimator.rst:L363-L374 still presents EfficientDiD as the generic “maximum statistical efficiency” choice while advertising covariate support, and ROADMAP.md:L18 says it is “semiparametrically efficient with doubly robust covariates.” That conflicts with the source-of-truth contract in docs/methodology/REGISTRY.md:L899-L903 and diff_diff/efficient_did.py:L135-L146, which state that the no-covariate path attains the bound, while the shipped covariate path is doubly robust but does not generically attain the semiparametric efficiency bound under the linear OLS outcome regression. The same contradiction now appears inside ROADMAP.md itself because ROADMAP.md:L104 correctly describes the covariate-path efficiency gap. Concrete fix: align both touched surfaces to the registry/docstring language by reserving the unqualified efficiency-bound claim for the no-covariate path and describing the current covariate path as DR-consistent with the linear-OR caveat.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No new deferrable tech-debt items. The remaining P1 is a current methodology-claim mismatch, not TODO-grade deferred work.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: docs/api/chaisemartin_dhaultfoeuille.rst:L98-L116 now has a refreshed top-of-page description, but it still renders .. autoclass:: diff_diff.ChaisemartinDHaultfoeuille, and the underlying class docstring at diff_diff/chaisemartin_dhaultfoeuille.py:L294-L326 still says “Phase 1” and future-phase pointers. The built API page will therefore continue to show stale phase-deferral language despite this PR’s stated goal of removing it. Concrete fix: update the class docstring to the shipped contract, or suppress the autogenerated class prose until the docstring is aligned.

Path to Approval

  1. Update the EfficientDiD section in docs/choosing_estimator.rst:L363-L374 so “maximum statistical efficiency” is explicitly scoped to the no-covariate path, and add the registry’s linear-OLS covariate-path caveat.
  2. Rewrite the EfficientDiD bullet in ROADMAP.md:L18 so it matches the shipped contract and no longer contradicts ROADMAP.md:L104.

…ing surfaces; update dCDH class docstring

Remaining EfficientDiD surfaces the prior pass missed:

- ROADMAP.md line 18 (Current State estimator list) previously described
  EfficientDiD as "semiparametrically efficient with doubly robust
  covariates" - unqualified, and contradicted ROADMAP.md line 104 which
  correctly describes the covariate-path efficiency gap. Rewritten to
  reserve the unqualified bound claim for the no-covariate path and
  describe the DR covariate path with the linear-OR caveat.
- docs/choosing_estimator.rst lines 363-374 previously said "maximum
  statistical efficiency" while advertising covariate support. Scoped
  the efficiency claim to the no-covariate path and expanded the
  covariate-path note with the DR / linear-OR caveat from REGISTRY.md
  L899-L903.

dCDH class docstring (autodoc source):

- diff_diff/chaisemartin_dhaultfoeuille.py lines 294-326 still opened
  with "de Chaisemartin-D'Haultfoeuille (dCDH) estimator - Phase 1." and
  listed "Phase 1 deliverables" / "Phase 1 contract" / "reserved for a
  future phase." Because docs/api/chaisemartin_dhaultfoeuille.rst renders
  `.. autoclass::` on this class, the built API page was still showing
  stale phase-deferral language. Rewrote the docstring to describe the
  shipped contract (headline DID_M plus multi-horizon DID_l, covariate
  adjustment, trends, non-binary treatment, HonestDiD integration, and
  survey support - only `aggregate` on fit() still raises
  NotImplementedError). Did not touch the other Phase-N references in
  the file (internal code comments and runtime-visible error messages),
  which are out of scope per the refresh plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 6ba9fbfff804c5fc7946fb886e25ddfe320e6d7b


Overall Assessment

✅ Looks good

Executive Summary

Methodology

  • Severity: P3. Impact: docs/api/efficient_did.rst:L147-L153 still says EfficientDiD “Achieves semiparametric bound” in the comparison table, which overreads the actual shipped contract for the covariate path. The caveat is already documented in the registry and surrounding note, so this is informational rather than blocking. Concrete fix: change the table cell to “Achieves semiparametric bound on the no-covariate path” and, if desired, mirror the DR caveat in the covariates row.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. I did not see new deferred-work items that need TODO.md tracking.

Security

No findings.

Documentation/Tests

  • Severity: P3. Impact: diff_diff/guides/llms-full.txt:L643-L647 still advertises EfficientDiD clustering as unavailable, which is stale against the estimator contract in diff_diff/efficient_did.py:L156-L160. This does not affect library behavior, but it can mislead agent-facing guidance. Concrete fix: update that constructor comment to describe cluster-robust analytical SEs and cluster-level multiplier bootstrap instead of “Not yet implemented.”
  • No test-gap finding. This remains a docs-focused PR, and the touched Python change is docstring-only.

…orrect stale EfficientDiD cluster comment

- docs/api/efficient_did.rst comparison table: the "Efficiency" row cell
  still read "Achieves semiparametric bound" unqualified, even though
  the surrounding page note, registry, and class docstring scope that
  claim to the no-covariate path. Updated to match those sources and
  mirrored the DR caveat in the Covariates row.
- diff_diff/guides/llms-full.txt EfficientDiD constructor comment: the
  `cluster: str | None = None` line was still annotated "Not yet
  implemented," contrary to diff_diff/efficient_did.py L154-L160 which
  documents shipped cluster-robust SEs (Liang-Zeger on EIF values) and
  cluster-level multiplier bootstrap. Updated the comment to describe
  the actual shipped behavior so the runtime LLM guide no longer
  misleads agent-facing consumers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 7cd9e08bb98ea23c0397c941bade31bb5b6812e3


Overall Assessment

✅ Looks good

Executive Summary

Methodology

No findings. I cross-checked the changed EfficientDiD and dCDH claims against the Methodology Registry and live class docstrings, including the EfficientDiD DR-efficiency caveat and the dCDH single-period versus dynamic placebo SE contract.

Code Quality

No findings. The touched Python change in diff_diff/chaisemartin_dhaultfoeuille.py:L294-L337 is docstring-only.

Performance

No findings. No runtime-affecting code paths changed.

Maintainability

No findings.

Tech Debt

No findings. I did not see new deferred work that needed separate tracking in TODO.md.

Security

No findings.

Documentation/Tests

No findings. For a docs refresh plus one docstring-only Python touch, the lack of test changes is appropriate, and the previously flagged EfficientDiD doc drift is now addressed.

@igerber igerber added the ready-for-ci Triggers CI test workflows label Apr 18, 2026
…oe-ligature

The UTF-8 roundtrip test in tests/test_guides.py used '\u2014' (em-dash)
as the non-ASCII fingerprint in llms-full.txt. The roadmap-refresh pass
replaced em-dashes with hyphens throughout the guide (per the project's
hyphens-over-em-dashes style), which removed the fingerprint and broke
the test.

The guide still contains '\u0153' (oe-ligature, from the author name
"D'Haultfœuille" in the dCDH section), which is the remaining non-ASCII
character and a more stable fingerprint since it's tied to a paper
citation rather than stylistic punctuation. Updated the test accordingly.

No behavior change; this is a test-side assertion update to match the
guide's current character set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
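
For readers unfamiliar with the pattern, a UTF-8 roundtrip check with a non-ASCII sentinel might look roughly like the sketch below. The real assertion in tests/test_guides.py is not shown in this thread, so the helper name `roundtrips_utf8` and the temporary file are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

SENTINEL = "\u0153"  # oe-ligature, as in "D'Haultfœuille"


def roundtrips_utf8(path: Path, sentinel: str = SENTINEL) -> bool:
    """True if the file decodes as UTF-8, contains the sentinel,
    and re-encodes to byte-identical content."""
    raw = path.read_bytes()
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    return sentinel in text and text.encode("utf-8") == raw


# Demonstration on a throwaway file standing in for the guide.
with tempfile.TemporaryDirectory() as tmp:
    guide = Path(tmp) / "guide.txt"
    guide.write_text("dCDH: de Chaisemartin-D'Haultf\u0153uille\n", encoding="utf-8")
    ok = roundtrips_utf8(guide)
```

The sentinel matters because a pure-ASCII file roundtrips identically under almost any encoding, so without a non-ASCII character the test would pass even if the file were read or packaged with the wrong codec.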
@igerber
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 13e08debfb0ab0566c32b4c9261725baf8625a5f


Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

  • Severity: none. Impact: docs/test/docstring-only changes do not affect runtime paths or estimator performance. Concrete fix: none.

Maintainability

Tech Debt

Security

  • Severity: none. Impact: docstring/docs/test-only changes add no secret exposure or executable attack surface. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: the changed documentation is internally consistent with the shipped contracts, and the updated UTF-8 sentinel in tests/test_guides.py:L45-L49 matches llms-full.txt:L233-L233. I could not run the touched test in this sandbox because pytest is not installed and numpy is unavailable for import-time package initialization. Concrete fix: none.

@igerber igerber merged commit 475f4e3 into main Apr 18, 2026
19 checks passed
@igerber igerber deleted the roadmap-review branch April 18, 2026 15:25
igerber added a commit that referenced this pull request Apr 18, 2026
Absorbs #312 (sdid scale fix), #313 (roadmap refresh + dCDH docstring
rewrite), and #314 (within_transform convergence warnings) from main.

Conflicts resolved in:
- diff_diff/chaisemartin_dhaultfoeuille.py: took main's comprehensive
  Phase 1-3 feature list in the class docstring but merged in the
  PR #311 group-vs-PSU bootstrap-clustering framing and the
  replicate-weight survey-support line. Kept the PR #311
  'user-specified cluster= not supported + automatic PSU-level under
  survey_design' wording for the cluster= parameter docstring (strictly
  more accurate than main's 'always clusters at the group level' text).
- diff_diff/guides/llms-full.txt: kept main's more detailed placebo SE
  contract paragraph (which already distinguishes single-period NaN
  from multi-horizon analytical/bootstrap) and appended the sup-t /
  shared-weights / cross-horizon coverage details from the PR #311
  update. Kept the PR #311 survey_design signature comment that
  mentions TSL + replicate + PSU bootstrap.

Full regression across touched areas: 336 + 324 passing.
igerber added a commit that referenced this pull request Apr 18, 2026
… remove deprecated SyntheticDiD params

Package four merged PRs (#312 SDID catastrophic cancellation at extreme Y scale,
#313 roadmap refresh, #314 FE imputation non-convergence signaling, #315 Frank-Wolfe
SC weight solver non-convergence signaling) as 3.1.2. Also remove the
SyntheticDiD(lambda_reg=...) and SyntheticDiD(zeta=...) kwargs, which have been
deprecated with DeprecationWarning since v2.3.1 (2026-02-10) in favor of
zeta_omega / zeta_lambda; their warning messages announced removal in v3.1. Passing
the old kwargs now raises TypeError at __init__ and ValueError: Unknown parameter
at set_params. Internal ridge-regression helpers that accept a lambda_reg parameter
(compute_synthetic_weights, rank_control_units, Rust FFI bindings) are unaffected.

Version strings bumped in diff_diff/__init__.py, pyproject.toml, rust/Cargo.toml,
and diff_diff/guides/llms-full.txt. CHANGELOG populated with Fixed / Changed /
Removed sections and comparison-link footer. TODO.md's "Deprecated Code" entry
removed now that the task is done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 18, 2026
Per CI review feedback (#316): removing public kwargs under a patch version
violates Semantic Versioning, which CHANGELOG.md explicitly claims to adhere to.
Restore lambda_reg and zeta handling in SyntheticDiD.__init__ and set_params
as warning-only, and bump the removal target in the DeprecationWarning text
from "v3.1" to "v4.0.0". The 3.1.2 release now carries only the four fix/doc
PRs (#312 SDID scale, #313 roadmap, #314 FE imputation convergence, #315
Frank-Wolfe convergence) with no breaking changes.

- diff_diff/synthetic_did.py: restore deprecated kwargs + warnings (v4.0.0 text)
- tests/test_methodology_sdid.py: restore TestDeprecatedParams class + set_params deprecation test
- tests/test_estimators.py: restore test_deprecated_params
- CHANGELOG.md: drop Removed section; add Changed entry documenting the v3.1 -> v4.0.0 bump in the removal target
- TODO.md: restore Deprecated Code section with v4.0.0 removal target and SemVer rationale

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
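
The "warning-only" handling described in the commit above can be illustrated with a small stand-in class. This is a hedged sketch: `ExampleEstimator` is hypothetical and is not the real SyntheticDiD implementation, and it simply ignores the old values because the thread does not say how the real class treats them.

```python
import warnings


class ExampleEstimator:
    """Hypothetical stand-in; not the real SyntheticDiD class."""

    _DEPRECATED = ("lambda_reg", "zeta")

    def __init__(self, zeta_omega=None, zeta_lambda=None, **kwargs):
        self.zeta_omega = zeta_omega
        self.zeta_lambda = zeta_lambda
        for name in list(kwargs):
            if name in self._DEPRECATED:
                warnings.warn(
                    f"{name} is deprecated and will be removed in v4.0.0; "
                    "use zeta_omega / zeta_lambda instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
                kwargs.pop(name)  # warning-only: the old value is ignored here
        if kwargs:
            # Genuinely unknown keywords still fail loudly.
            raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
```

Keeping the old keywords accepted until the next major version is what makes a patch release such as 3.1.2 non-breaking under Semantic Versioning; only the text of the warning changes.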