v2.3.1.902: bug fixes, new broom methods, and documentation improvements#253
v2.3.1.902: bug fixes, new broom methods, and documentation improvements#253
Conversation
When clustervars includes a variable beyond idname, the analytical variance matrix V = t(inffunc) %*% inffunc / n ignores between-cluster correlation, making the Wald pre-test anti-conservative. Instead of silently producing a misleading p-value, emit a message() explaining why the test is skipped and direct users to bootstrap CIs. The test remains valid (and is reported as before) for balanced panels, unbalanced panels (influence functions aggregated to unit level), and repeated cross-sections (CLT applies per independent observation). Closes #186.
Two bugs fixed: 1. pre_process_did2.R: dreamerr's 'NULL | match' enforces scalar length, so passing clustervars = c(idname, extra_var) failed before idname could be stripped. Fix: strip idname from clustervars before the dreamerr check in validate_args, and again in the main body (validate_args works on a local copy). The non-faster_mode path (mboot.R) already handled this. 2. att_gt.R: extra_clustervars[!is.null(extra_clustervars)] is a no-op for NULL but produces NA_character_ when applied to character(0), causing a spurious Wald suppression message. Fix: remove the redundant line since the filter already produces character(0) when no extra vars remain. Closes #175.
pre_process_did.R had this check commented out as 'too slow'. Replaced the slow sapply/split approach with unique() + anyDuplicated(), which is O(n) and matches the check already in pre_process_did2.R (faster_mode). Both paths now throw the same clear error when gname is not constant within a unit across periods. Closes #138.
Document all return columns explicitly, explain the distinction between conf.low/conf.high (simultaneous bootstrap bands) and point.conf.low/point.conf.high (pointwise qnorm intervals), and add a @details section clarifying when to use each.
- Add nobs.MP and nobs.AGGTEobj S3 methods (closes #246) - Add statistic (t-stat) and p.value columns to tidy.MP and tidy.AGGTEobj following broom conventions: estimate | std.error | statistic | p.value | conf.low | conf.high (closes #243) - p.value is pointwise/marginal and documented as not accounting for multiple testing across ATT(g,t) cells - Covers all four aggregation types: dynamic, group, calendar, simple
cc29f67 to
40b7d53
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates the did R package to address clustering and irreversibility edge cases in att_gt(), improves the Wald pre-test behavior under extra clustering, and expands broom/stats integrations (tidy() and nobs()) with clearer documentation.
Changes:
- Fix
faster_modeargument validation whenclustervarsredundantly includesidname, and add an irreversibility check to the non-faster_modepath. - Skip the Wald pre-test when extra clustering is requested (beyond
idname) and inform the user. - Add
nobs()methods forMP/AGGTEobjand extendtidy()outputs withstatisticandp.value, plus expanded Rd documentation.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
R/pre_process_did2.R |
Strips idname from clustervars before validation (fixes faster_mode crash) and repeats stripping in caller scope. |
R/pre_process_did.R |
Adds an O(n) irreversibility/time-invariance check for gname within unit. |
R/att_gt.R |
Suppresses Wald pre-test when extra clustering is used and emits an informative message. |
R/tidy.R |
Adds nobs() S3 methods and adds statistic/p.value columns to tidy() outputs with expanded roxygen docs. |
man/tidy.MP.Rd |
Updates title/description and expands return-value documentation for tidy.MP. |
man/tidy.AGGTEobj.Rd |
Updates title/description and expands return-value documentation + details for tidy.AGGTEobj. |
man/nobs.MP.Rd |
New Rd entry for nobs.MP. |
man/nobs.AGGTEobj.Rd |
New Rd entry for nobs.AGGTEobj. |
NAMESPACE |
Registers nobs S3 methods and imports stats::nobs. |
DESCRIPTION |
Bumps package version to 2.3.1.903. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| estimate = x$att, | ||
| std.error = x$se, | ||
| statistic = x$att / x$se, | ||
| p.value = 2 * (1 - stats::pnorm(abs(x$att / x$se))), | ||
| conf.low = x$att - x$c * x$se, |
There was a problem hiding this comment.
tidy.MP() and tidy.AGGTEobj() now compute and expose statistic/p.value, but the test suite doesn’t appear to cover these new columns. Add a small testthat case that checks the columns exist and that p.value matches the expected normal-approximation calculation (including handling of NA standard errors).
| Package: did | ||
| Title: Treatment Effects with Multiple Periods and Groups | ||
| Version: 2.3.1.902 | ||
| Version: 2.3.1.903 |
There was a problem hiding this comment.
PR title/description reference version 2.3.1.902, but DESCRIPTION bumps the package to 2.3.1.903. Please align the stated release version (either update the PR metadata/changelog text, or keep the version at 2.3.1.902 if that’s what’s intended).
| #' @return A data frame with one row per ATT(g,t) estimate and columns: | ||
| #' \item{term}{ATT(g,t) label} | ||
| #' \item{group}{the treatment cohort g} | ||
| #' \item{time}{the time period t} | ||
| #' \item{estimate}{the ATT(g,t) point estimate} |
There was a problem hiding this comment.
The roxygen @return section uses \item{...}{...} directly. In the generated Rd this ends up inside \value{} without a surrounding \describe{}, which is invalid Rd markup and can break R CMD check. Wrap the column list in \describe{ ... } (or use an itemize/markdown list) and re-run roxygen so the man pages regenerate correctly.
R/tidy.R
Outdated
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | ||
| #' \item{p.value}{two-sided pointwise p-value (\code{2*(1-pnorm(|t|))}). |
There was a problem hiding this comment.
The docs label statistic as a “t-statistic” and show pnorm(|t|), but the implementation uses a normal approximation (pnorm(abs(...))). Consider documenting this as a z/Wald statistic and use valid R syntax in the formula (e.g., abs(statistic)), to avoid misleading users.
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | |
| #' \item{p.value}{two-sided pointwise p-value (\code{2*(1-pnorm(|t|))}). | |
| #' \item{statistic}{Wald z-statistic (\code{estimate / std.error})} | |
| #' \item{p.value}{two-sided pointwise p-value (\code{2 * (1 - pnorm(abs(statistic)))}). |
| #' @return A data frame whose columns depend on \code{type}: | ||
| #' \item{type}{the aggregation type: \code{"simple"}, \code{"dynamic"}, | ||
| #' \code{"group"}, or \code{"calendar"}} | ||
| #' \item{term}{label for each estimate} | ||
| #' \item{estimate}{point estimate} | ||
| #' \item{std.error}{standard error} | ||
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | ||
| #' \item{p.value}{two-sided pointwise p-value (\code{2*(1-pnorm(|t|))}). | ||
| #' Marginal per-estimate; does \strong{not} account for multiple testing | ||
| #' across event times or groups.} | ||
| #' \item{conf.low, conf.high}{simultaneous confidence band limits. When | ||
| #' \code{bstrap=TRUE} and \code{cband=TRUE} these use the bootstrap uniform | ||
| #' critical value (\code{crit.val.egt}); otherwise they equal the pointwise | ||
| #' intervals. For \code{type="simple"} and the overall average row of | ||
| #' \code{type="group"}, a single scalar is returned so simultaneous and | ||
| #' pointwise coincide.} | ||
| #' \item{point.conf.low, point.conf.high}{pointwise confidence interval limits | ||
| #' always using \code{qnorm(1 - alp/2)}.} |
There was a problem hiding this comment.
Same as above for tidy.AGGTEobj: the @return block uses \item{...} entries (which should be within a \describe{} list in Rd) and describes a “t-statistic” while computing p-values via the normal CDF. Adjust the roxygen markup/wording and re-run roxygen so man/tidy.AGGTEobj.Rd is valid and accurate.
| #' @return A data frame whose columns depend on \code{type}: | |
| #' \item{type}{the aggregation type: \code{"simple"}, \code{"dynamic"}, | |
| #' \code{"group"}, or \code{"calendar"}} | |
| #' \item{term}{label for each estimate} | |
| #' \item{estimate}{point estimate} | |
| #' \item{std.error}{standard error} | |
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | |
| #' \item{p.value}{two-sided pointwise p-value (\code{2*(1-pnorm(|t|))}). | |
| #' Marginal per-estimate; does \strong{not} account for multiple testing | |
| #' across event times or groups.} | |
| #' \item{conf.low, conf.high}{simultaneous confidence band limits. When | |
| #' \code{bstrap=TRUE} and \code{cband=TRUE} these use the bootstrap uniform | |
| #' critical value (\code{crit.val.egt}); otherwise they equal the pointwise | |
| #' intervals. For \code{type="simple"} and the overall average row of | |
| #' \code{type="group"}, a single scalar is returned so simultaneous and | |
| #' pointwise coincide.} | |
| #' \item{point.conf.low, point.conf.high}{pointwise confidence interval limits | |
| #' always using \code{qnorm(1 - alp/2)}.} | |
| #' @return A data frame with the following columns (depending on \code{type}): | |
| #' \describe{ | |
| #' \item{type}{the aggregation type: \code{"simple"}, \code{"dynamic"}, | |
| #' \code{"group"}, or \code{"calendar"}.} | |
| #' \item{term}{label for each estimate.} | |
| #' \item{estimate}{point estimate.} | |
| #' \item{std.error}{standard error.} | |
| #' \item{statistic}{standard normal (z) test statistic | |
| #' (\code{estimate / std.error}).} | |
| #' \item{p.value}{two-sided pointwise p-value based on the standard normal | |
| #' distribution (\code{2 * (1 - pnorm(|statistic|))}). Marginal | |
| #' per-estimate; does \strong{not} account for multiple testing across | |
| #' event times or groups.} | |
| #' \item{conf.low, conf.high}{simultaneous confidence band limits. When | |
| #' \code{bstrap=TRUE} and \code{cband=TRUE} these use the bootstrap uniform | |
| #' critical value (\code{crit.val.egt}); otherwise they equal the pointwise | |
| #' intervals. For \code{type="simple"} and the overall average row of | |
| #' \code{type="group"}, a single scalar is returned so simultaneous and | |
| #' pointwise coincide.} | |
| #' \item{point.conf.low, point.conf.high}{pointwise confidence interval limits | |
| #' always using \code{qnorm(1 - alp/2)}.} | |
| #' } |
| nobs.MP <- function(object, ...) { | ||
| object$n | ||
| } |
There was a problem hiding this comment.
New nobs() S3 methods were added, but there are no testthat assertions covering them. Please add tests that nobs(att_gt(...)) and nobs(aggte(...)) return the expected unit count for both faster_mode=TRUE and FALSE, so future refactors don’t silently change behavior.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
R/tidy.R
Outdated
| #' @return Integer. The number of unique units in the data. | ||
| #' @export | ||
| nobs.MP <- function(object, ...) { | ||
| object$n |
There was a problem hiding this comment.
nobs.MP() returns object$n directly, but MP$n is set in att_gt() via ifelse(faster_mode, dp$id_count, dp$n), which coerces the value to double. This makes stats::nobs() return a numeric instead of an integer (and contradicts the Rd/@return). Consider coercing to integer here (or avoiding ifelse() when computing n in att_gt()).
| object$n | |
| as.integer(object$n) |
tests/testthat/test-tidy.R
Outdated
| xformla = ~1, | ||
| data = mpdta, | ||
| bstrap = TRUE, | ||
| cband = TRUE |
There was a problem hiding this comment.
This test computes att_gt() with bstrap = TRUE and does not set biters, so it will run the default 1000 multiplier bootstrap iterations. That’s likely to make the unit test suite very slow/flaky compared to the rest of the repository’s tests (which generally set bstrap = FALSE). Consider setting bstrap = FALSE for these tidy/nobs structure checks, or explicitly setting a small biters value (and/or skip_on_cran()) if bootstrap-based output is required.
| cband = TRUE | |
| cband = TRUE, | |
| biters = 20 |
…trap/cband=FALSE)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
R/pre_process_did2.R:33
dreamerr::check_set_arg(args$clustervars, "NULL | match", ...)runs before the explicitlength(args$clustervars) > 1check below, and (per the comment) enforces scalar length. That makes the custom "only provide 1 cluster variable" error effectively unreachable for multi-variable inputs. If you want the clearer error message, checklength(args$clustervars)(after strippingidname) before callingcheck_set_arg().
# Flag for clustervars and weightsname
checkvar_message <- "__ARG__ must be NULL or a character scalar that is a name of a column from the dataset."
dreamerr::check_set_arg(args$weightsname, args$clustervars, "NULL | match", .choices = data_names, .message = checkvar_message, .up = 1)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std.error = c(x$overall.se, x$se.egt), | ||
| statistic = c(x$overall.att, x$att.egt) / c(x$overall.se, x$se.egt), | ||
| p.value = 2 * (1 - stats::pnorm(abs(c(x$overall.att, x$att.egt) / c(x$overall.se, x$se.egt)))), | ||
| conf.low = c(x$overall.att - stats::qnorm(1 - x$DIDparams$alp/2) * x$overall.se, x$att.egt - x$crit.val.egt * x$se.egt), | ||
| conf.high = c(x$overall.att + stats::qnorm(1 - x$DIDparams$alp/2) * x$overall.se, x$att.egt + x$crit.val.egt * x$se.egt), |
There was a problem hiding this comment.
In tidy.AGGTEobj for type == "group", the p.value expression has unbalanced parentheses, which will cause a parse error when the package is loaded. Adjust the parentheses so the 2 * (1 - pnorm(abs(...))) expression is syntactically valid (and consider assigning the statistic vector once to avoid repeating the division).
| test_that("tidy.MP returns expected columns", { | ||
| td <- broom::tidy(mp) | ||
| expect_s3_class(td, "data.frame") | ||
| expect_true(all(c("term", "group", "time", "estimate", "std.error", | ||
| "statistic", "p.value", | ||
| "conf.low", "conf.high", | ||
| "point.conf.low", "point.conf.high") %in% names(td))) |
There was a problem hiding this comment.
These tests call broom::tidy(), but broom is not listed in DESCRIPTION under Suggests (or Imports). This will make R CMD check fail when broom isn't installed. Add broom to Suggests (recommended) or guard the tests with testthat::skip_if_not_installed("broom").
tests/testthat/test-tidy.R
Outdated
|
|
||
| #----------------------------------------------------------------------------- | ||
| # | ||
| # Tests for tidy(), glance(), and nobs() S3 methods |
There was a problem hiding this comment.
File header says these are tests for tidy(), glance(), and nobs(), but there are no glance() assertions in this file. Either add glance() coverage or update the header comment to avoid misleading future maintainers.
| # Tests for tidy(), glance(), and nobs() S3 methods | |
| # Tests for tidy() and nobs() S3 methods |
| # Strip idname from clustervars in the caller's copy of args. | ||
| # validate_args works on a local copy, so we must repeat this here. | ||
| # Clustering at the unit level (idname) is already done implicitly. | ||
| if (!is.null(args$clustervars) && !is.null(args$idname) && (args$idname %in% args$clustervars)) { | ||
| args$clustervars <- setdiff(args$clustervars, args$idname) | ||
| if (length(args$clustervars) == 0L) args$clustervars <- NULL | ||
| } |
There was a problem hiding this comment.
The idname-stripping logic for clustervars is duplicated here and in validate_args(). Consider returning the (possibly modified) args from validate_args() (or factoring the stripping into a small helper) so the behavior stays consistent and future changes only need to be made in one place.
man/nobs.MP.Rd
Outdated
| Integer. The number of unique units in the data. | ||
| } | ||
| \description{ | ||
| Number of observations used to fit an MP object | ||
| } |
There was a problem hiding this comment.
The title/description say "Number of observations", but the return value is "the number of unique units" (which can differ from row count in panel data). Consider updating the title/description to say "unique units" to avoid ambiguity about what nobs() reports for MP objects.
man/nobs.AGGTEobj.Rd
Outdated
| Integer. The number of unique units in the data. | ||
| } | ||
| \description{ | ||
| Number of observations used to fit an AGGTEobj object | ||
| } |
There was a problem hiding this comment.
The title/description say "Number of observations", but the return value is "the number of unique units". Consider updating the title/description to explicitly say "unique units" to avoid confusion (especially for panel data where rows != units).
| Package: did | ||
| Title: Treatment Effects with Multiple Periods and Groups | ||
| Version: 2.3.1.902 | ||
| Version: 2.3.1.903 |
There was a problem hiding this comment.
PR title/description references version 2.3.1.902, but DESCRIPTION bumps to 2.3.1.903. Please confirm the intended release/dev version so the PR metadata and package version stay in sync.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Flag for clustervars and weightsname | ||
| checkvar_message <- "__ARG__ must be NULL or a character scalar that is a name of a column from the dataset." | ||
| dreamerr::check_set_arg(args$weightsname, args$clustervars, "NULL | match", .choices = data_names, .message = checkvar_message, .up = 1) |
There was a problem hiding this comment.
In validate_args(), clustervars is passed to dreamerr::check_set_arg() which (per the comment) enforces scalar length. That means user inputs like clustervars = c(idname, extra1, extra2) will error with the generic dreamerr message before reaching the later custom length check (lines ~70-76). Consider checking the post-strip length of args$clustervars before calling check_set_arg (and then validating each element is in data_names), so users get the intended “only 1 extra cluster var” error and the later length check isn't dead/unreachable code.
| # Flag for clustervars and weightsname | |
| checkvar_message <- "__ARG__ must be NULL or a character scalar that is a name of a column from the dataset." | |
| dreamerr::check_set_arg(args$weightsname, args$clustervars, "NULL | match", .choices = data_names, .message = checkvar_message, .up = 1) | |
| # After stripping idname, ensure all remaining clustervars (if any) are column names. | |
| if (!is.null(args$clustervars)) { | |
| invalid_clust <- setdiff(args$clustervars, data_names) | |
| if (length(invalid_clust) > 0L) { | |
| stop("Each element of 'clustervars' must be the name of a column from the dataset.") | |
| } | |
| } | |
| # Flag for clustervars and weightsname | |
| checkvar_message <- "__ARG__ must be NULL or a character scalar that is a name of a column from the dataset." | |
| dreamerr::check_set_arg(args$weightsname, "NULL | match", .choices = data_names, .message = checkvar_message, .up = 1) |
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | ||
| #' \item{p.value}{two-sided pointwise p-value | ||
| #' (\code{2 * (1 - pnorm(abs(statistic)))}). |
There was a problem hiding this comment.
The documentation calls statistic a “t-statistic”, but the code (and docs for p.value) use a standard normal approximation via pnorm(). To avoid confusion for users (and to match broom conventions), consider renaming this to “z-statistic” or explicitly noting it is a normal-approximation test statistic.
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | |
| #' \item{p.value}{two-sided pointwise p-value | |
| #' (\code{2 * (1 - pnorm(abs(statistic)))}). | |
| #' \item{statistic}{z-statistic / normal-approximation test statistic | |
| #' (\code{estimate / std.error})} | |
| #' \item{p.value}{two-sided pointwise p-value based on the standard normal | |
| #' approximation (\code{2 * (1 - pnorm(abs(statistic)))}). |
| #' \item{term}{label for each estimate} | ||
| #' \item{estimate}{point estimate} | ||
| #' \item{std.error}{standard error} | ||
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} |
There was a problem hiding this comment.
Same as above: the docs describe statistic as a “t-statistic”, but p-values are computed using pnorm() (normal approximation). Consider calling it a z-statistic (or clarifying it’s normal-approximation) to keep the return spec internally consistent.
| #' \item{statistic}{t-statistic (\code{estimate / std.error})} | |
| #' \item{statistic}{z-statistic (normal-approximation; \code{estimate / std.error})} |
Summary
This PR consolidates all changes on the
v2.3.1.902-bugfixesbranch, addressing several open issues and improving the package's robustness, usability, and documentation.Bug fixes
faster_modecrash whenclustervars = c(idname, extra_var)(closes Error in mboot when clustering at two variables #175):dreamerrwas enforcing scalar length beforeidnamewas stripped fromclustervars; also fixed a subtlecharacter(0)[!is.null(...)]→NA_character_bug inatt_gt.Rthat spuriously triggered the Wald suppression messagefaster_modepath (closes Nested unbalanced panel data #138): both code paths now give the same clear error when treatment is not irreversible, using an efficient O(n)anyDuplicatedcheckmessage()New broom methods
nobs()forMPandAGGTEobj(closes Implement nobs #246): returns the number of unique cross-sectional unitsstatisticandp.valuecolumns intidy()(closes Please supply p-values #243): added totidy.MPand all four aggregation types intidy.AGGTEobj, followingbroomcolumn conventions (estimate | std.error | statistic | p.value | conf.low | conf.high). p-values are pointwise/marginal and documented as not correcting for multiple testing.Documentation
tidy.MPandtidy.AGGTEobj: full@returnsections documenting every column, and a@detailssection explaining the distinction between simultaneous (conf.low/conf.high) and pointwise (point.conf.low/point.conf.high) confidence intervalsJEL replication & diagnostics (pre-existing on branch)
Test plan
devtools::test()— 0 FAIL, 1 SKIP (known upstream DRDID bug), 287 PASSdevtools::check()— 0 errors, 0 warningstidy(att_gt(...))returnsstatisticandp.valuecolumnsnobs(att_gt(...))returns number of unique unitsatt_gt(..., clustervars = c(idname, extra_var), faster_mode = TRUE)no longer crashes🤖 Generated with Claude Code