Skip to content
Open

v1.4 #55

Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: qualcontrol
Title: Quality Control
Version: 1.3.8
Version: 1.3.8.9000
Authors@R:
person("Vegard", "Lysne", , "vegard.lysne@helsedir.no", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0816-5075"))
Expand Down
31 changes: 31 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,34 @@
# qualcontrol (development version)

## Bugfix and other changes
1. In `plot_timeseries()`, the input data is limited to rows with actual data, which was causing a bug if all geo was missing data on specific years. These years was omitted from the plots.
- Now the data is rectangularized on year to ensure all years are included.
- The mechanism to limit number of years plotted filtered out 2 more years than requested, which is now fixed.
2. `check_friskvik()` changes
- gains a comment column
- Implemented `fix_case_insensitive_match` to fix case-insensitive matching of column names (e.g. aar vs AAR)
- Implemented `read_friskvik_spec()` to read spec file if it exists
- When checking if cube file is in publication folder, ignore file extension.
- Include a check of whether the indicator is included on page 4 of the profiles ([issue #54](https://github.com/helseprofil/qualcontrol/issues/54))
- UNGDATA-indicators excluded from periode_bm/nn-check
3. `plot_boxplot`
- More effective calculation of baseplotdata using collapse
- Calculates device size based on file with max number of panels, insted of always using the first file ([issue #53](https://github.com/helseprofil/qualcontrol/issues/53))
4. Options are now checked (and updated) when the package is attached, not just when loaded ([issue #51](https://github.com/helseprofil/qualcontrol/issues/51))
5. In `comparecube_summary`, new_prikk and expired_prikk are now only reported for SPVFLAGG, and based on 0 vs non-0 ([issue #41](https://github.com/helseprofil/qualcontrol/issues/41))
6. `plot_timeseries_bydel`
- Optimized using the same strategy as other plotting functions. Unneccessary panel text is removed and caption added ([issue #47](https://github.com/helseprofil/qualcontrol/issues/47)).
- Strata with only 1 observations is filtered out ([issue #52](https://github.com/helseprofil/qualcontrol/issues/52))
7. `aggregate_cube` now uses collapse, and keeps unique and max values of columns that cannot be aggregated.
8. `read_files` now collects naboprikk-columns into one 0|1-column, and adds missing censor-columns
9. `identify_coltypes` now collects censor-columns, and does not include `any_diffs`/`newrow`/`exprow` as values
10. New function `explore_different_censoring`, filtering out strata with differing censoring across the new and old file.
11. New function `diffvals_which_levels`, which identify the levels of a dimension which are different on a specific value column. Used to e.g. identify GEO-codes with TELLER/NEVNER-diffs.

X. Code cleaning
- Removed all use of get/mget
- Implemented the env-argument in data.table for cleaner syntax

# qualcontrol 1.3.8 (2026-01-20)

## Changes
Expand Down
4 changes: 2 additions & 2 deletions R/attack-secondary-censoring.R
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,14 @@ attack_naboprikk <- function(cubefile = NULL, cubepath = NULL){
naboprikkdims <- grep("nabopr", names(specs), value = T)
naboprikkdims <- setNames(naboprikkdims, toupper(gsub("Stata_nabopr([^_]+).*$", "\\1", naboprikkdims)))

out_pvern1 <- data.table::copy(d[0, ..dims])
out_pvern1 <- data.table::copy(d[0, .SD, .SDcols = dims])

for(dim in names(naboprikkdims)){
restdims <- setdiff(dims, dim)
triangles <- decode_triangles(dim = dim, naboprikkdims = naboprikkdims, specs = specs)

for(i in 1:length(triangles)){
d_subset <- d[get(dim) %in% triangles[[i]]]
d_subset <- d[x %in% triangles[[i]], env = list(x = dim)]
pvern1 <- d_subset[, .(n_pvern = sum(pvern, na.rm = T), n_prikkok = sum(prikket_ok, na.rm = T)), by = restdims][, (dim) := paste0("{", paste(triangles[[i]], collapse = ","), "}")][n_pvern > 0 & n_prikkok == 1][, let(n_pvern = NULL, n_prikkok = NULL)]
if(ncol(pvern1) > 0) out_pvern1 <- data.table::rbindlist(list(out_pvern1, pvern1), use.names = T)
}
Expand Down
4 changes: 2 additions & 2 deletions R/barometer.R
Original file line number Diff line number Diff line change
Expand Up @@ -91,10 +91,10 @@ check_barometer <- function(type = c("FHP", "OVP"),
}
cat("... \n")
withVar <- c("Aar", indV1)
bar[ind, (withVar) := mget(withVar), on = c(stedskode_string = "Sted_kode", LPnr = "LPnr")]
bar <- collapse::join(bar, ind, on = c(stedskode_string = "Sted_kode", LPnr = "LPnr"), verbose = 0, overid = 2)[, .SD, .SDcols = c(names(bar), withVar)]

verdiCol <- ifelse(geo == "fylke", "Verdi_mellomGeonivaa", "Verdi_lavesteGeonivaa")
outDT <- bar[get(verdiCol) == Verdi_referansenivaa, ][
outDT <- bar[x == Verdi_referansenivaa, env = list(x = verdiCol)][
!is.na(roede) | !is.na(groenne) | !is.na(hvitMprikk)]

return(outDT)
Expand Down
39 changes: 35 additions & 4 deletions R/check-censoring.R
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ check_censoring <- function(dt = newcube){
} else {
cat(paste0("\nTELLER variable controlled: ", tellerval))
cat(paste0("\nCriteria: No values <= ", lim_teller))
notcensored_teller <- dt[SPVFLAGG == 0 & get(tellerval) <= lim_teller]
notcensored_teller <- dt[SPVFLAGG == 0 & x <= lim_teller, env = list(x = tellerval)]
}

if(!is.null(notcensored_teller)){
Expand All @@ -61,7 +61,7 @@ check_censoring <- function(dt = newcube){
} else {
cat(paste0("\nNEVNER variable controlled: ", nevnerval))
cat(paste0("\nCriteria: No values <= ", lim_nevner))
notcensored_nevner <- dt[SPVFLAGG == 0 & get(nevnerval) <= lim_nevner]
notcensored_nevner <- dt[SPVFLAGG == 0 & x <= lim_nevner, env = list(x = nevnerval)]
}

if(!is.null(notcensored_nevner)){
Expand All @@ -75,6 +75,37 @@ check_censoring <- function(dt = newcube){
}
}

#' @title explore_different_censoring
#' @description filters out strata with different censoring and prints out censor-info columns to identify the reason for differences.
#' @param compare comparecube
#' @param dt_new newcube
#' @param dt_old oldcube
#' @export
explore_different_censoring <- function(compare = comparecube, dt_new = newcube, dt_old = oldcube){
if(is.null(dt_old)){
message("No old cube, check not possible")
return(invisible(NULL))
}
colinfo <- identify_coltypes(dt_new, dt_old)
dims <- colinfo$commondims
diffs <- compare[(SPVFLAGG_new == 0 & SPVFLAGG_old > 0) | (SPVFLAGG_new > 0 & SPVFLAGG_old == 0), .SD, .SDcols = c(dims, "SPVFLAGG_new", "SPVFLAGG_old")]
colorder <- names(diffs)
censor_new <- colinfo$censor.new

diffs <- collapse::join(diffs, dt_new, on = dims, overid = 2, verbose = 0)[, .SD, .SDcols = c(names(diffs), censor_new)]
data.table::setnames(diffs, censor_new, paste0(censor_new, "_new"))

censor_old <- colinfo$censor.old
if(length(censor_old) > 0){
diffs <- collapse::join(diffs, dt_old, on = dims, overid = 2, verbose = 0)[, .SD, .SDcols = c(names(diffs), censor_old)]
data.table::setnames(diffs, censor_old, paste0(censor_old, "_old"))
for(col in intersect(censor_new, censor_old)) colorder <- c(colorder, paste0(col, c("_new", "_old")))
}

data.table::setcolorder(diffs, colorder)
return(diffs)
}

#' @@title compare_censoring
#' @description
#' Calculate number of censored observations and calculate absolute and relative difference
Expand Down Expand Up @@ -168,8 +199,8 @@ compare_censoring_timeseries <- function(cube.new = newcube,
}

if(!is.null(cube.old)){
d <- data.table::rbindlist(list(data.table::copy(cube.new)[!grepl("99$", GEO), mget(colinfo$commoncols)][, cube := "New"],
data.table::copy(cube.old)[!grepl("99$", GEO), mget(colinfo$commoncols)][, cube := "Old"]))
d <- data.table::rbindlist(list(data.table::copy(cube.new)[!grepl("99$", GEO), .SD, .SDcols = colinfo$commoncols][, cube := "New"],
data.table::copy(cube.old)[!grepl("99$", GEO), .SD, .SDcols = colinfo$commoncols][, cube := "Old"]))

groupdims <- grep("^AAR$", c(colinfo$commondims), invert = T, value = T)
d <- d[, .(N_censored = sum(SPVFLAGG != 0, na.rm = T)), by = c("cube", groupdims)]
Expand Down
43 changes: 34 additions & 9 deletions R/check-comparecube.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,7 @@
#'
#' @return a DT output table
#' @export
comparecube_summary <- function(dt = comparecube,
save = TRUE){
comparecube_summary <- function(dt = comparecube, save = TRUE){
if(is.null(dt)){
cat("comparecube is NULL, no check performed")
return(invisible(NULL))
Expand Down Expand Up @@ -129,8 +128,7 @@ plot_diff_timetrends <- function(dt = comparecube,
d <- data.table::copy(dt[newrow == 0 & SPVFLAGG_new == 0 & SPVFLAGG_old == 0]) |>
translate_geoniv()

d[, let(Absolute = get(diffval),
Relative = get(reldiffval))]
d[, let(Absolute = x, Relative = y), env = list(x = diffval, y = reldiffval)]
d <- data.table::melt(d, measure.vars = c("Absolute", "Relative"))[, .(GEOniv, AAR, variable, value)]
allyears <- get_all_combinations(d, c("GEOniv", "AAR", "variable"))
d <- d[!(variable == "Absolute" & value == 0 | variable == "Relative" & value == 1)]
Expand All @@ -147,6 +145,28 @@ plot_diff_timetrends <- function(dt = comparecube,
}
}

#' @title diffvals_which_levels
#' @description Prints a list of unique values of a dimensions with diffs of a specified value
#' @param dt comparecube
#' @param dim dimension to list levels with diffs
#' @param val value column to identify diffs
#' @export
diffvals_which_levels <- function(dt = comparecube, dim = "GEO", val = "sumTELLER_uprikk"){
if(is.null(dt)){
cat("comparecube is NULL, no check performed")
return(invisible(NULL))
}
colinfo <- identify_coltypes(comparecube)
available_vals <- unique(sub("_new$|_old$", "", grep("_diff$|_reldiff$", colinfo$vals.new, invert = T, value = T)))

if(dim %notin% colinfo$dims.new) stop("dim må være en dimensjon i filen: ", paste(colinfo$dims.new, collapse = ", "))
if(val %notin% available_vals) stop("val må være en verdikolonne i filen: ", paste(available_vals, collapse = ", "))

diffval <- paste0(val, "_diff")
cat("Unike verdier for ", dim, " som har differ for ", val, ":\n")
qc_round(dt)[any_diffs == 1 & x != 0, unique(y), env = list(x = diffval, y = dim)]
}

## ---- HELPER FUNCTIONS ----

#' @title summarise_diffvals
Expand All @@ -169,18 +189,23 @@ summarise_diffvals <- function(out,
reldiff <- paste0(value, "_reldiff")
calculate_reldiff <- reldiff %in% names(subset)

identical <- subset[get(diff) == 0, .N]
different <- subset[get(diff) != 0, .N]
newprikk <- subset[is.na(get(new)) & !is.na(get(old)), .N]
expprikk <- subset[!is.na(get(new)) & is.na(get(old)), .N]
identical <- subset[x == 0, .N, env = list(x = diff)]
different <- subset[x != 0, .N, env = list(x = diff)]

newprikk <- expprikk <- NA_real_
if(value == "SPVFLAGG"){
newprikk <- subset[x > 0 & y == 0, .N, env = list(x = new, y = old)]
expprikk <- subset[x == 0 & y > 0, .N, env = list(x = new, y = old)]
}


out[GEOniv == geolevel & Value == value, let(Identical = identical,
Different = different,
New_prikk = newprikk,
Expired_prikk = expprikk)]

if(different > 0){
diffdata <- subset[get(diff) != 0 & !is.na(get(new)) & !is.na(get(old))]
diffdata <- subset[x != 0 & !is.na(y) & !is.na(z), env = list(x = diff, y = new, z = old)]
out[GEOniv == geolevel & Value == value, let(Mean_diff = round(mean(diffdata[[diff]], na.rm = T), 3),
Min_diff = round(min(diffdata[[diff]], na.rm = T), 3),
Max_diff = round(max(diffdata[[diff]], na.rm = T), 3))]
Expand Down
20 changes: 11 additions & 9 deletions R/check-geolevel.R
Original file line number Diff line number Diff line change
Expand Up @@ -54,11 +54,11 @@ compare_geolevels <- function(dt = newcube,
by <- c("GEOniv", groupdims)

# Estimate sum and diffs
d <- d[, .("sum" = collapse::fsum(get(teller_val))), keyby = by]
d <- d[, .("sum" = sum(x)), by = by, env = list(x = teller_val)]
d <- data.table::dcast(d, ... ~ GEOniv, value.var = "sum")
data.table::setcolorder(d, c(groupdims, outcols[1], outcols[2]))
d[, let(Absolute = get(outcols[1])-get(outcols[2]),
Relative = round(get(outcols[1])/get(outcols[2]), 3))]
d[, let(Absolute = GEOh-GEOl, Relative = round(GEOh/GEOl, 3)),
env = list(GEOh = outcols[[1]], GEOl = outcols[[2]])]

# Format output
d[, (groupdims) := lapply(.SD, as.factor), .SDcols = groupdims]
Expand Down Expand Up @@ -93,6 +93,11 @@ unknown_bydel <- function(dt = newcube,
return(invisible(NULL))
}

# Filter relevant rows
d <- d[GEO %in% c(301, 1103, 4601, 5001) | GEOniv == "B"]
contains_bydel <- d[GEOniv == "B" & SPVFLAGG == 0, unique(AAR)]
d <- d[AAR %in% contains_bydel]

cubefile <- get_cubefilename(dt)
savepath <- get_table_savefolder(get_cubename(dt))
suffix <- paste0("unknown_bydel_", type)
Expand All @@ -110,12 +115,9 @@ unknown_bydel <- function(dt = newcube,
return(invisible(NULL))
}

# Filter and format data
d <- d[GEO %in% c(301, 1103, 4601, 5001) | GEOniv == "B"]
contains_bydel <- d[GEOniv == "B" & SPVFLAGG == 0, unique(AAR)]
d <- d[AAR %in% contains_bydel]
# Format data
add_kommune(d)
d <- d[, mget(c("KOMMUNE", "GEOniv", colinfo$dims.new, targets, "SPVFLAGG"))]
d <- d[, .SD, .SDcols = c("KOMMUNE", "GEOniv", colinfo$dims.new, targets, "SPVFLAGG")]
d <- data.table::melt(d, measure.vars = targets, variable.name = "TARGET")
d <- get_complete_strata(d, c("KOMMUNE", "TARGET", bydims), type = type, "value")

Expand All @@ -129,7 +131,7 @@ unknown_bydel <- function(dt = newcube,
d <- data.table::dcast(d, ... ~ GEOniv, value.var = "sum")
d[, UNKNOWN := round(100*(1 - B/K), 2)]
d[B == 0 & K == 0, UNKNOWN := NA_real_]
d <- d[order(-UNKNOWN)]
data.table::setorder(d, -UNKNOWN, na.last = T)
data.table::setcolorder(d, c("KOMMUNE", bydims, "TARGET", "K", "B", "UNKNOWN"))
data.table::setnames(d, c("K", "B", "UNKNOWN"), c("Kommune", "Bydel", "UNKNOWN, %"))

Expand Down
Loading