-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Make if_any() and if_all() consistent in all contexts
#7747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
f391498
c9ff2ed
16ba69d
3aecb7c
c22e0f6
1ed3235
a7afead
29dab94
21023cb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,3 +44,5 @@ | |
| ^\.cache$ | ||
| ^\.vscode$ | ||
| ^[\.]?air\.toml$ | ||
| ^\.claude$ | ||
| ^CLAUDE\.md$ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| { | ||
| "$schema": "https://json.schemastore.org/claude-code-settings.json", | ||
| "permissions": { | ||
| "allow": [ | ||
| "Bash(find:*)", | ||
| "Bash(R:*)", | ||
| "Bash(Rscript:*)", | ||
| "Bash(air format:*)", | ||
| "Edit(R/**)", | ||
| "Edit(tests/**)", | ||
| "Edit(vignettes/**)", | ||
| "Edit(NEWS.md)" | ||
| ], | ||
| "deny": [] | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| # CLAUDE.md | ||
|
|
||
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | ||
|
|
||
| ## Package Overview | ||
|
|
||
| dplyr is a tidyverse R package providing a grammar of data manipulation. | ||
|
|
||
| ## Key development commands | ||
|
|
||
| ### General | ||
|
|
||
| - When running R from the console, always run it with `--quiet --vanilla` | ||
| - Always run `air format .` after generating code | ||
|
|
||
| ### Testing | ||
|
|
||
| - Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R` | ||
| - Use `devtools::test(reporter = "check")` to run all tests | ||
| - Use `devtools::test(filter = "{name}", reporter = "check")` to run tests for `R/{name}.R` | ||
| - DO NOT USE `devtools::test_active_file()` | ||
| - All testing functions automatically load code; you don't need to | ||
| - All new code should have an accompanying test | ||
| - If there are existing tests, place new tests next to similar existing tests | ||
|
|
||
| ### Documentation | ||
|
|
||
| - Run `devtools::document()` after changing any roxygen2 docs | ||
| - Every user facing function should be exported and have roxygen2 documentation | ||
| - Whenever you add a new documentation file, make sure to also add the topic name to `_pkgdown.yml` | ||
| - Run `pkgdown::check_pkgdown()` to check that all topics are included in the reference index | ||
| - Use sentence case for all headings | ||
| - Any user facing changes should be briefly described in a bullet point at the top of `NEWS.md`, following the tidyverse style guide (https://style.tidyverse.org/news.html) | ||
| - Any bullets added to `NEWS.md` should always be placed at the top, just under `# {package} (development version)` | ||
|
|
||
| ### Code style | ||
|
|
||
| - Use newspaper style/high-level first function organisation. Main logic at the top and helper functions should come below | ||
| - Don't define functions inside of functions unless they are very brief | ||
| - Error messages should use `cli::cli_abort()` and follow the tidyverse style guide (https://style.tidyverse.org/errors.html) | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Key Design Patterns | ||
|
|
||
| #### Standalone Imports | ||
| - Self-contained utility functions | ||
| - `import-standalone-*.R` files reduce dependencies | ||
| - Imported from other tidyverse packages | ||
|
|
||
| ## Key Files | ||
|
|
||
| ### Core Implementation | ||
| - `R/` - R implementation files | ||
| - `src/` - C/C++ implementation files | ||
|
|
||
| ### Testing and Quality | ||
| - `tests/testthat/` - Test suite | ||
| - `vignettes/` - Documentation and examples | ||
| - `.github/workflows/` - CI/CD with R CMD check | ||
|
|
||
| ## Development Notes | ||
|
|
||
| ### Testing Strategy | ||
| - Snapshot testing for output validation | ||
| - Separate test files for each major component | ||
|
|
||
| ### Code Organization | ||
| - Utility functions grouped by purpose (`utils-*.R`) | ||
| - Standalone imports minimize external dependencies | ||
|
|
||
| ### Documentation | ||
| - Roxygen2 comments for all exported functions | ||
| - Vignettes demonstrate key use cases | ||
| - pkgdown site provides comprehensive documentation | ||
| - Examples use realistic but safe API interactions | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,13 @@ | ||
| # dplyr (development version) | ||
|
|
||
| * `if_any()` and `if_all()` are now more consistent in all use cases (#7059, #7077, #7746, @jrwinget). In particular: | ||
|
|
||
| * When called with zero inputs, `if_any()` returns `FALSE` and `if_all()` returns `TRUE`. | ||
|
|
||
| * When called with one input, both now return logical vectors rather than the original column. | ||
|
|
||
| * The result of applying `.fns` now must be a logical vector. | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a breaking change, but hopefully not a common issue? We really do want the result of The vctrs functions for pany/pall will probably also be pretty strict about requiring logical inputs. |
||
|
|
||
| * Empty `rowwise()` list-column elements now resolve to `logical()` rather than a random logical of length 1 (#7710). | ||
|
|
||
| * `last_dplyr_warnings()` no longer prevents objects from being garbage collected (#7649). | ||
|
|
@@ -97,10 +105,6 @@ | |
| * Fixed an issue where duckplyr's ALTREP data frames were being materialized | ||
| early due to internal usage of `ncol()` (#7049). | ||
|
|
||
| * `if_any()` and `if_all()` are now fully consistent with `any()` and `all()`. | ||
| In particular, when called with empty inputs `if_any()` returns `FALSE` and | ||
| `if_all()` returns `TRUE` (#7059, @jrwinget). | ||
|
|
||
| # dplyr 1.1.4 | ||
|
|
||
| * `join_by()` now allows its helper functions to be namespaced with `dplyr::`, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -349,31 +349,87 @@ across <- function(.cols, .fns, ..., .names = NULL, .unpack = FALSE) { | |
| if_any <- function(.cols, .fns, ..., .names = NULL) { | ||
| context_local("across_if_fn", "if_any") | ||
| context_local("across_frame", current_env()) | ||
| if_across(`|`, across({{ .cols }}, .fns, ..., .names = .names)) | ||
| df <- across({{ .cols }}, .fns, ..., .names = .names) | ||
| x <- dplyr_new_list(df) | ||
| size <- vec_size(df) | ||
| dplyr_list_pany(x, size = size) | ||
| } | ||
|
|
||
| #' @rdname across | ||
| #' @export | ||
| if_all <- function(.cols, .fns, ..., .names = NULL) { | ||
| context_local("across_if_fn", "if_all") | ||
| context_local("across_frame", current_env()) | ||
| if_across(`&`, across({{ .cols }}, .fns, ..., .names = .names)) | ||
| df <- across({{ .cols }}, .fns, ..., .names = .names) | ||
| x <- dplyr_new_list(df) | ||
| size <- vec_size(df) | ||
| dplyr_list_pall(x, size = size) | ||
| } | ||
|
|
||
| if_across <- function(op, df) { | ||
| n <- nrow(df) | ||
| dplyr_list_pall <- function( | ||
| x, | ||
| ..., | ||
| size = NULL, | ||
| error_call = caller_env() | ||
| ) { | ||
| dplyr_list_pany_pall(x, "all", ..., size = size, error_call = error_call) | ||
| } | ||
|
|
||
| dplyr_list_pany <- function( | ||
| x, | ||
| ..., | ||
| size = NULL, | ||
| error_call = caller_env() | ||
| ) { | ||
| dplyr_list_pany_pall(x, "any", ..., size = size, error_call = error_call) | ||
| } | ||
|
|
||
| dplyr_list_pany_pall <- function( | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm hoping to just remove this in favor of the vctrs versions soon, but I needed to get the semantics of it right enough to be able to add all the tests here |
||
| x, | ||
| variant, | ||
| ..., | ||
| size = NULL, | ||
| error_call = caller_env() | ||
| ) { | ||
| check_dots_empty0(...) | ||
|
|
||
| obj_check_list(x, arg = "", call = error_call) | ||
|
|
||
| if (!length(df)) { | ||
| return(TRUE) | ||
| # Doesn't allow `NULL`, doesn't do type casting | ||
| for (i in seq_along(x)) { | ||
| check_logical( | ||
| x[[i]], | ||
| arg = (vec_names(x) %||% paste0("..", seq_len(vec_size(x))))[[i]], | ||
| call = error_call | ||
| ) | ||
| } | ||
|
|
||
| combine <- function(x, y) { | ||
| if (is_null(x)) { | ||
| y | ||
| # Doesn't do recycling, asserts that all inputs are the same size | ||
| if (is.null(size)) { | ||
| if (length(x) == 0L) { | ||
| size <- 0L | ||
| } else { | ||
| op(x, y) | ||
| size <- vec_size(x[[1L]]) | ||
| } | ||
| } | ||
| reduce(df, combine, .init = NULL) | ||
| list_check_all_size(x, size, arg = "", call = error_call) | ||
|
|
||
| switch( | ||
| variant, | ||
| any = { | ||
| op <- `|` | ||
| init <- FALSE | ||
| }, | ||
| all = { | ||
| op <- `&` | ||
| init <- TRUE | ||
| }, | ||
| abort("Unreachable", .internal = TRUE) | ||
| ) | ||
|
|
||
| init <- vec_rep(init, times = size) | ||
|
|
||
| reduce(x, op, .init = init) | ||
|
Comment on lines
+430
to
+432
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having an |
||
| } | ||
|
|
||
| #' Combine values from multiple columns | ||
|
|
@@ -608,79 +664,67 @@ dplyr_quosures <- function(...) { | |
| quosures | ||
| } | ||
|
|
||
| # When mutate() or summarise() have an unnamed call to across() at the top level, e.g. | ||
| # summarise(across(<...>)) or mutate(across(<...>)) | ||
| # | ||
| # a call to top_across(<...>) is evaluated instead. | ||
| # top_across() returns a flattened list of expressions along with some | ||
| # information about the "current column" for each expression | ||
| # in the "columns" attribute: | ||
| # | ||
| # For example with: summarise(across(c(x, y), mean, .names = "mean_{.col}")) top_across() will return | ||
| # something like: | ||
| # | ||
| # structure( | ||
| # list(mean_x = expr(mean(x)), mean_y = expr(mean(y))) | ||
| # columns = c("x", "y") | ||
| # ) | ||
|
|
||
| # Technically this always returns a single quosure but we wrap it in a | ||
| # list to follow the pattern in `expand_across()` | ||
| # Always guaranteed to be 1 quosure in, 1 quosure out, unlike `expand_across()` | ||
| expand_if_across <- function(quo) { | ||
| quo_data <- attr(quo, "dplyr:::data") | ||
| if (!quo_is_call(quo, c("if_any", "if_all"), ns = c("", "dplyr"))) { | ||
| return(list(quo)) | ||
| if (quo_is_call(quo, "if_any", ns = c("", "dplyr"))) { | ||
| variant <- "any" | ||
| } else if (quo_is_call(quo, "if_all", ns = c("", "dplyr"))) { | ||
| variant <- "all" | ||
| } else { | ||
| # Refuse to expand | ||
| return(quo) | ||
| } | ||
|
|
||
| # `definition` is the same between the two for the purposes of `match.call()` | ||
| definition <- if_any | ||
|
|
||
| call <- match.call( | ||
| definition = if_any, | ||
| definition = definition, | ||
| call = quo_get_expr(quo), | ||
| expand.dots = FALSE, | ||
| envir = quo_get_env(quo) | ||
| ) | ||
|
|
||
| if (!is_null(call$...)) { | ||
| return(list(quo)) | ||
| # Refuse to expand | ||
| return(quo) | ||
| } | ||
|
|
||
| if (is_call(call, "if_any")) { | ||
| op <- "|" | ||
| if (variant == "any") { | ||
| if_fn <- "if_any" | ||
| empty <- FALSE | ||
| dplyr_fn <- "dplyr_list_pany" | ||
| } else { | ||
| op <- "&" | ||
| if_fn <- "if_all" | ||
| empty <- TRUE | ||
| dplyr_fn <- "dplyr_list_pall" | ||
| } | ||
|
|
||
| # `expand_across()` will always expand at this point given that we bailed on | ||
| # `...` usage early on, which is the only case that would stop expansion. | ||
| # | ||
| # Set frame here for backtrace truncation. But override error call via | ||
| # `local_error_call()` so it refers to the function we're expanding, e.g. | ||
| # `if_any()` and not `expand_if_across()`. | ||
| context_local("across_if_fn", if_fn) | ||
|
|
||
| # Set frame here for backtrace truncation. But override error call | ||
| # via `local_error_call()` so it refers to the function we're | ||
| # expanding, e.g. `if_any()` and not `expand_if_across()`. | ||
| context_local("across_frame", current_env()) | ||
| local_error_call(call(if_fn)) | ||
|
|
||
| call[[1]] <- quote(across) | ||
| quos <- expand_across(quo_set_expr(quo, call)) | ||
|
|
||
| # Select all rows if there are no inputs for if_all(), | ||
| # but select no rows if there are no inputs for if_any(). | ||
| if (!length(quos)) { | ||
| return(list(quo(!!empty))) | ||
| } | ||
| expr <- expr({ | ||
| x <- list(!!!quos) | ||
| ns <- asNamespace("dplyr") | ||
|
|
||
| combine <- function(x, y) { | ||
| if (is_null(x)) { | ||
| y | ||
| } else { | ||
| call(op, x, y) | ||
| } | ||
| } | ||
| expr <- reduce(quos, combine, .init = NULL) | ||
| # In the evaluation path, `across()` automatically recycles to common size, | ||
| # so we must here as well for compatibility. `across()` also returns a 0 | ||
| # col, 1 row data frame in the case of no inputs so that it will recycle to | ||
| # the group size, which we also do here. | ||
| size <- ns[["dplyr_list_size_common"]](x, absent = 1L, call = call(!!if_fn)) | ||
| x <- ns[["dplyr_list_recycle_common"]](x, size = size, call = call(!!if_fn)) | ||
|
|
||
| ns[[!!dplyr_fn]](x, size = size, error_call = call(!!if_fn)) | ||
| }) | ||
|
|
||
| # Use `as_quosure()` instead of `new_quosure()` to avoid rewrapping | ||
| # quosure in case of single input | ||
| list(as_quosure(expr, env = baseenv())) | ||
| new_quosure(expr, env = baseenv()) | ||
| } | ||
|
|
||
| expand_across <- function(quo) { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plucked from ellmer, with modifications. I'm just trying it out.