Skip to content

Signal non-convergence in FE imputation alternating-projection solvers#314

Merged
igerber merged 6 commits intomainfrom
fix/fe-imputation-convergence-warnings
Apr 18, 2026
Merged

Signal non-convergence in FE imputation alternating-projection solvers#314
igerber merged 6 commits intomainfrom
fix/fe-imputation-convergence-warnings

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Apr 18, 2026

Summary

  • Add warn_if_not_converged helper in diff_diff.utils (reusable across axis-B fixes).
  • Thread a converged flag through the four alternating-projection loops in TwoStageDiD._iterative_fe, TwoStageDiD._iterative_demean, ImputationDiD._iterative_fe, ImputationDiD._iterative_demean; call the helper after the loop.
  • Pattern matches the existing logistic + Poisson IRLS warnings at linalg.py:1329-1376.
  • Warning-only: no new public parameter, no behavior change on inputs that already converge.

Methodology references (required if estimator / math changes)

  • Methods: Borusyak-Jaravel-Spiess (2024) imputation DiD; Gardner (2022) two-stage DiD.
  • Intentional deviation from R: none. R's didimputation and did2s do not emit a non-convergence signal either, but this library already emits convergence warnings in its other IRLS paths; extending that convention to alternating projection is an additive signal, not a methodology change.
  • REGISTRY.md updated under both ImputationDiD and TwoStageDiD with **Note:** labels describing the new signal.

Validation

  • New tests in tests/test_imputation.py::TestImputationEdgeCases (3) and tests/test_two_stage.py::TestSilentWarningAudit (3) exercise non-convergence under tight tol/max_iter and assert the warning fires, plus one negative-control test per estimator.
  • 156 tests pass across test_imputation.py + test_two_stage.py (150 pre-existing + 6 new). Zero regressions.
  • No tutorial or notebook changes.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

TwoStageDiD and ImputationDiD each run two iterative alternating-projection
solvers (_iterative_fe, _iterative_demean) whose convergence loop exited
silently on max_iter exhaustion, returning the current iterate as if
converged. This matches the silent-failure pattern audited under axis B of
the silent-failures initiative (findings #2-#5).

Adds a shared warn_if_not_converged helper in diff_diff.utils and calls it
from all four alternating-projection loops on non-convergence. Pattern
mirrors the existing logistic + Poisson IRLS convergence warnings in
linalg.py (lines 1329-1376). Warning-only: no new public parameter, no
behavior change on inputs that already converge.

Updates REGISTRY.md entries for ImputationDiD and TwoStageDiD with Note
labels describing the new signal.

Axis-B regression-lint baseline: 10 silent range(max_iter) loops -> 6
remaining (Frank-Wolfe and TROP addressed in follow-up PRs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Overall assessment

✅ Looks good

Executive summary

Methodology

Code Quality

  • Severity: none. Impact: no code-quality defects identified in the changed lines; extracting diff_diff/utils.py:L68-L88 removes duplication cleanly. Concrete fix: none.

Performance

  • Severity: none. Impact: no meaningful performance risk; the added boolean flag and post-loop helper call are negligible relative to the iterative groupby passes. Concrete fix: none.

Maintainability

  • Severity: none. Impact: no maintainability issue identified; the shared helper makes future convergence-warning changes less error-prone across estimators. Concrete fix: none.

Tech Debt

  • Severity: none. Impact: no new deferred work needs TODO.md tracking from this diff. Concrete fix: none.

Security

  • Severity: none. Impact: no security issues identified; the PR only adds warning plumbing, docs, and tests. Concrete fix: none.

Documentation/Tests

  • Severity: P3. Impact: _iterative_demean now has a positive warning test, but there is no matching convergent negative-control test in either estimator, so an accidental always-warn regression on the demean path would not be caught. Concrete fix: add warnings.catch_warnings(record=True) tests mirroring the FE negative control for _iterative_demean in tests/test_imputation.py:L2119-L2129 and tests/test_two_stage.py:L1356-L1366.

Review note: this was a static review only. I could not execute the tests in this environment because pytest and numpy are unavailable.

Mirrors the positive + negative coverage already present for _iterative_fe.
Ensures an accidental always-warn regression on the alternating-projection
demean path would fail the test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 5fcb63d375b55a47d5c527ed46b7f554914ee363


Overall Assessment

Needs changes.

Executive Summary

  • Re-review scope: the prior P3 about missing convergent negative-control coverage for _iterative_demean is addressed by the new no-warning tests in tests/test_imputation.py:L2131 and tests/test_two_stage.py:L1368.
  • The affected methods are ImputationDiD and TwoStageDiD. The new warning hooks are confined to the untreated-observation FE / residualization solvers and are documented in docs/methodology/REGISTRY.md:L1086 and docs/methodology/REGISTRY.md:L1164; they do not change the source-material estimator equations or variance formulas. (academic.oup.com)
  • Severity P1 [Newly identified]: the same silent non-convergence pattern still exists in the weighted alternating-projection loops inside diff_diff/utils.py:L1865, even though this PR adds a shared warning helper in the same file.
  • That gap is live in production code: diff_diff/wooldridge.py:L651 always passes a weight vector into diff_diff/wooldridge.py:L667, so Wooldridge’s within-transformation still has the silent-failure pattern this PR is standardizing away.
  • No P0 findings.
  • Static review only: I could not execute pytest here because pytest is not installed.

Methodology

  • Severity: P3 informational. Impact: the changed estimator paths are the alternating-projection FE / demeaning routines in diff_diff/imputation.py:L872, diff_diff/imputation.py:L956, diff_diff/two_stage.py:L855, and diff_diff/two_stage.py:L933. Borusyak-Jaravel-Spiess derive the imputation-form estimator, and Gardner’s two-stage procedure identifies fixed effects from untreated observations before estimating treatment effects on residualized outcomes; this PR only adds post-loop signaling and leaves those method definitions intact. The additive behavior change is documented in the Methodology Registry notes cited above. Concrete fix: none. (academic.oup.com)

Code Quality

  • Severity: P1 [Newly identified]. Impact: this PR fixes the silent non-convergence pattern in four alternating-projection loops, but diff_diff/utils.py:L1882 and diff_diff/utils.py:L1893 still run weighted alternating projections up to a hard-coded 100 iterations and then silently return the last iterate with no warn_if_not_converged() call. That is exactly the checklist’s “partial pattern fix” case. Because diff_diff/wooldridge.py:L651 supplies either survey weights or np.ones(...) and diff_diff/wooldridge.py:L667 routes through within_transform(), the unresolved silent-failure path is not dead code. Concrete fix: add converged/for...else handling to both weighted loops in within_transform() using diff_diff/utils.py:L68, or refactor those loops into a shared private helper that emits the warning on exhaustion.

Performance

No findings.

Maintainability

No findings. The shared helper in diff_diff/utils.py:L68 is a maintainability improvement for the four modified estimator loops.

Tech Debt

No separate finding. TODO.md:L51 does not already track the within_transform() silent non-convergence path, so the P1 above is unmitigated.

Security

No findings.

Documentation/Tests

No unmitigated findings. The prior re-review concern about missing convergent _iterative_demean negative controls is resolved by tests/test_imputation.py:L2131 and tests/test_two_stage.py:L1368. I could not run the tests in this environment because pytest is unavailable.

Path to Approval

  1. Apply the same non-convergence warning pattern to the weighted alternating-projection loops in diff_diff/utils.py:L1882 and diff_diff/utils.py:L1893, so exhausting the hard-coded 100 iterations is no longer silent.
  2. Add regression tests for that weighted path: one forced non-convergence case that asserts the warning, and one convergent negative control. Covering the production caller via diff_diff/wooldridge.py:L667 would be sufficient.

…ction

AI review on PR #314 surfaced two more silent alternating-projection loops
in diff_diff.utils.within_transform (previously hard-coded range(100) with
tol=1e-8). The weighted path is exercised on every WooldridgeDiD fit via
wooldridge.py:667, so leaving it silent would have left this PR as a
"partial pattern fix" for axis B.

- Expose max_iter and tol as optional kwargs (defaults preserved: 100, 1e-8).
- Track non-converged variables across the per-variable inner loops and
  emit a single consolidated UserWarning naming which variables failed,
  reusing the warn_if_not_converged helper already added in this PR.
- Add a positive warning test and a convergent negative control in
  TestWithinTransformationAlgebra.

Axis-B regression-lint baseline: 10 -> 4 silent range(max_iter) loops
remaining (Frank-Wolfe, TROP global outer+inner, TROP local).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 4a7fee3785b3a59f6cde665bc289e5a8c9b9070f


Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

  • No findings.

Maintainability

  • No findings. Centralizing warning emission in diff_diff/utils.py:L68-L88 reduces duplicated convergence-warning code across the estimator loops and the weighted utility path.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

Both parameters apply only when weights is not None.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: ef388a044463d3c7e4a333a74f6cc1192b9f41e4


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review scope: the prior documentation gap on within_transform(max_iter, tol) is resolved in diff_diff/utils.py:L1816.
  • Methodology-affected estimators are ImputationDiD and TwoStageDiD. The new warn_if_not_converged() calls are post-loop only in diff_diff/imputation.py:L872 and diff_diff/two_stage.py:L855; they do not change the BJS/Gardner estimating equations, weighting, identification logic, or variance formulas.
  • The warning-only behavior is documented in the Methodology Registry for those two estimators at docs/methodology/REGISTRY.md:L1086 and docs/methodology/REGISTRY.md:L1164.
  • No unmitigated P0/P1 findings.
  • P3 only: the shared within_transform() warning path also affects WooldridgeDiD and survey-weighted TWFE, SunAbraham, and BaconDecomposition, but the registry updates only describe ImputationDiD and TwoStageDiD.
  • Static review only: I could not execute the new tests because pytest is not installed in this environment.

Methodology

Code Quality

  • No findings. Centralizing the warning emission in diff_diff/utils.py:L68 removes duplicated warning text and keeps the silent-failure fix consistent across the touched loops.

Performance

  • No findings. The added work is a constant-time post-loop check and warning call.

Maintainability

  • No findings.

Tech Debt

  • No findings. I did not see a new silent-correctness issue that would need TODO tracking.

Security

  • No findings.

Documentation/Tests

…tors

AI review surfaced that the shared within_transform weighted path also
affects WooldridgeDiD and survey-weighted TwoWayFixedEffects, SunAbraham,
and BaconDecomposition, but the prior notes only covered ImputationDiD
and TwoStageDiD.

Adds one shared note under 'Absorbed Fixed Effects with Survey Weights'
covering the cross-estimator behavior, plus a per-estimator note in the
WooldridgeDiD section (since its path always routes through the weighted
helper, regardless of whether survey weights are provided).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 48622a3fb01357d825e4645d9dc61b6ac8f76b1f


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review scope: the prior P3 documentation gap around shared weighted within_transform() consumers appears resolved by the new absorbed-FE registry note and the Wooldridge-specific note at docs/methodology/REGISTRY.md:2525 and docs/methodology/REGISTRY.md:1304.
  • Methodology-affected paths are warning-only post-loop checks in ImputationDiD and TwoStageDiD, plus the shared weighted within_transform() path at diff_diff/imputation.py:949, diff_diff/imputation.py:1011, diff_diff/two_stage.py:926, diff_diff/two_stage.py:984, and diff_diff/utils.py:1922.
  • I did not find any change to estimating equations, control/comparison-group logic, weighting, identification assumptions, or variance/SE computation.
  • No unmitigated P0/P1 findings.
  • One P3 informational item remains: the estimator registry notes document the iterative FE warning but not the newly added iterative demeaning warning paths.
  • Static review only: pytest is not installed here, and import-time runtime deps are missing, so I could not execute the added tests.

Methodology

Code Quality

  • No findings. Centralizing the warning text in diff_diff/utils.py:68 keeps the new signal consistent across the touched alternating-projection loops.

Performance

  • No findings. The added work is a constant-time convergence-flag check and an optional warning after the loop.

Maintainability

  • No findings. The shared-helper documentation coverage is materially better now that the absorbed-FE registry note explicitly names TwoWayFixedEffects, SunAbraham, BaconDecomposition, and WooldridgeDiD at docs/methodology/REGISTRY.md:2529.

Tech Debt

  • No findings. I did not identify a new correctness issue that should be deferred into TODO.md.

Security

  • No findings.

Documentation/Tests

…mean

The warning behavior added in this PR also fires from the iterative
demeaning helper (covariate residualization in both estimators, plus the
ImputationDiD pre-trend test). Earlier notes named only the FE solver.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: f74941661a3d116fe2f37c713d6214024be1b7df


Overall Assessment

✅ Looks good

No unmitigated P0/P1 findings.

Executive Summary

Methodology

Code Quality

Performance

  • No findings. The change adds only constant-time convergence bookkeeping and optional warning emission after loop completion; it does not add iterations, change solvers, or alter asymptotic cost.

Maintainability

  • No findings. The shared helper removes duplicated warning strings, and the registry updates now cover all touched warning paths, including the previously missing _iterative_demean notes.

Tech Debt

  • No findings. I did not identify a new silent-correctness issue or a deferrable gap that needs a TODO.md entry.

Security

  • No findings. The diff is limited to warning emission, docs, and tests; there is no new secrets surface, external I/O, or trust-boundary change.

Documentation/Tests

@igerber igerber added the ready-for-ci Triggers CI test workflows label Apr 18, 2026
@igerber igerber merged commit 44e7522 into main Apr 18, 2026
23 of 24 checks passed
@igerber igerber deleted the fix/fe-imputation-convergence-warnings branch April 18, 2026 15:31
igerber added a commit that referenced this pull request Apr 18, 2026
Absorbs #312 (sdid scale fix), #313 (roadmap refresh + dCDH docstring
rewrite), and #314 (within_transform convergence warnings) from main.

Conflicts resolved in:
- diff_diff/chaisemartin_dhaultfoeuille.py: took main's comprehensive
  Phase 1-3 feature list in the class docstring but merged in the
  PR #311 group-vs-PSU bootstrap-clustering framing and the
  replicate-weight survey-support line. Kept the PR #311
  'user-specified cluster= not supported + automatic PSU-level under
  survey_design' wording for the cluster= parameter docstring (strictly
  more accurate than main's 'always clusters at the group level' text).
- diff_diff/guides/llms-full.txt: kept main's more detailed placebo SE
  contract paragraph (which already distinguishes single-period NaN
  from multi-horizon analytical/bootstrap) and appended the sup-t /
  shared-weights / cross-horizon coverage details from the PR #311
  update. Kept the PR #311 survey_design signature comment that
  mentions TSL + replicate + PSU bootstrap.

Full regression across touched areas: 336 + 324 passing.
igerber added a commit that referenced this pull request Apr 18, 2026
… remove deprecated SyntheticDiD params

Package four merged PRs (#312 SDID catastrophic cancellation at extreme Y scale,
#313 roadmap refresh, #314 FE imputation non-convergence signaling, #315 Frank-Wolfe
SC weight solver non-convergence signaling) as 3.1.2. Also remove the
SyntheticDiD(lambda_reg=...) and SyntheticDiD(zeta=...) kwargs, which have been
deprecated with DeprecationWarning since v2.3.1 (2026-02-10) in favor of
zeta_omega / zeta_lambda; their warning messages announced removal in v3.1. Passing
the old kwargs now raises TypeError at __init__ and ValueError: Unknown parameter
at set_params. Internal ridge-regression helpers that accept a lambda_reg parameter
(compute_synthetic_weights, rank_control_units, Rust FFI bindings) are unaffected.

Version strings bumped in diff_diff/__init__.py, pyproject.toml, rust/Cargo.toml,
and diff_diff/guides/llms-full.txt. CHANGELOG populated with Fixed / Changed /
Removed sections and comparison-link footer. TODO.md's "Deprecated Code" entry
removed now that the task is done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 18, 2026
Per CI review feedback (#316): removing public kwargs under a patch version
violates Semantic Versioning, which CHANGELOG.md explicitly claims to adhere to.
Restore lambda_reg and zeta handling in SyntheticDiD.__init__ and set_params
as warning-only, and bump the removal target in the DeprecationWarning text
from "v3.1" to "v4.0.0". The 3.1.2 release now carries only the four fix/doc
PRs (#312 SDID scale, #313 roadmap, #314 FE imputation convergence, #315
Frank-Wolfe convergence) with no breaking changes.

- diff_diff/synthetic_did.py: restore deprecated kwargs + warnings (v4.0.0 text)
- tests/test_methodology_sdid.py: restore TestDeprecatedParams class + set_params deprecation test
- tests/test_estimators.py: restore test_deprecated_params
- CHANGELOG.md: drop Removed section; add Changed entry documenting the v3.1 -> v4.0.0 bump in the removal target
- TODO.md: restore Deprecated Code section with v4.0.0 removal target and SemVer rationale

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 18, 2026
CI review on #322 flagged that the 3.1.3 entries for PR #319 (sparse->dense
lstsq fallback) and PR #317 (TROP non-convergence) claimed ConvergenceWarning,
but the actual implementations emit UserWarning (imputation.py, two_stage.py,
utils.py, trop.py, and the REGISTRY.md contract all use UserWarning). Users
filtering warnings by category would be misled.

Same factual error was in the 3.1.2 entries I wrote in PR #316 for PR #314
and PR #315. Fixing both entries in this PR — CHANGELOG is a living doc and
the warning-category drift is actionable-ly misleading.

No code or test changes; CHANGELOG-only edit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant