Commit 93a9e8c

custom header, dropping arx_classifier smooth-qr
1 parent e638288 commit 93a9e8c

6 files changed: 242 additions and 844 deletions

R/arx_classifier.R

Lines changed: 101 additions & 3 deletions
@@ -1,8 +1,106 @@
 #' Direct autoregressive classifier with covariates
 #'
-#' This is an autoregressive classification model for
-#' [epiprocess::epi_df][epiprocess::as_epi_df] data. It does "direct" forecasting, meaning
-#' that it estimates a class at a particular target horizon.
+#'
+#' @description
+#' This is an autoregressive classification model for continuous data. It does
+#' "direct" forecasting, meaning that it estimates a class at a particular
+#' target horizon.
+#'
+#' @details
+#' The `arx_classifier()` is an autoregressive classification model for `epi_df`
+#' data that is used to predict a discrete class for each case under
+#' consideration. It is a direct forecaster in that it estimates the classes
+#' at a specific horizon or ahead value.
+#'
+#' To get a sense of how the `arx_classifier()` works, let's consider a simple
+#' example with minimal inputs. For this, we will use the built-in
+#' `covid_case_death_rates` that contains confirmed COVID-19 cases and deaths
+#' from JHU CSSE for all states over Dec 31, 2020 to Dec 31, 2021. From this,
+#' we'll take a subset of data for five states over June 4, 2021 to December
+#' 31, 2021. Our objective is to predict whether the case rates are increasing
+#' when considering the 0, 7 and 14 day case rates:
+#'
+#' ```{r}
+#' jhu <- covid_case_death_rates %>%
+#'   filter(
+#'     time_value >= "2021-06-04",
+#'     time_value <= "2021-12-31",
+#'     geo_value %in% c("ca", "fl", "tx", "ny", "nj")
+#'   )
+#'
+#' out <- arx_classifier(jhu, outcome = "case_rate", predictors = "case_rate")
+#'
+#' out$predictions
+#' ```
+#'
+#' The key takeaway from the predictions is that there are two prediction
+#' classes: (-Inf, 0.25] and (0.25, Inf). This is because for our goal of
+#' classification the classes must be discrete. The discretization of the
+#' real-valued outcome is controlled by the `breaks` argument, which defaults
+#' to 0.25. Such breaks will be automatically extended to cover the entire
+#' real line. For example, the default break of 0.25 is silently extended to
+#' breaks = c(-Inf, .25, Inf) and, therefore, results in two classes: (-Inf,
+#' 0.25] and (0.25, Inf). These two classes are used to discretize the
+#' outcome. The conversion of the outcome to such classes is handled
+#' internally. So if discrete classes already exist for the outcome in the
+#' `epi_df`, then we recommend coding a classifier from scratch using the
+#' `epi_workflow` framework for more control.
+#'
+#' The `trainer` is a `parsnip` model describing the type of estimation such
+#' that `mode = "classification"` is enforced. The two typical trainers that
+#' are used are `parsnip::logistic_reg()` for two classes or
+#' `parsnip::multinom_reg()` for more than two classes.
+#'
+#' ```{r}
+#' workflows::extract_spec_parsnip(out$epi_workflow)
+#' ```
+#'
+#' From the parsnip model specification, we can see that the trainer used is
+#' logistic regression, which is expected for our binary outcome. More
+#' complicated trainers like `parsnip::naive_Bayes()` or
+#' `parsnip::rand_forest()` may also be used (however, we will stick to the
+#' basics in this gentle introduction to the classifier).
+#'
+#' If you use the default trainer of logistic regression for binary
+#' classification and you decide against using the default break of 0.25, then
+#' you should only input one break so that there are two classification bins
+#' to properly dichotomize the outcome. For example, let's set a break of 0.5
+#' instead of relying on the default of 0.25. We can do this by passing 0.5 to
+#' the `breaks` argument in `arx_class_args_list()` as follows:
+#'
+#' ```{r}
+#' out_break_0.5 <- arx_classifier(
+#'   jhu,
+#'   outcome = "case_rate",
+#'   predictors = "case_rate",
+#'   args_list = arx_class_args_list(
+#'     breaks = 0.5
+#'   )
+#' )
+#'
+#' out_break_0.5$predictions
+#' ```
+#' Indeed, we can observe that the two `.pred_class` are now (-Inf, 0.5] and
+#' (0.5, Inf). See `help(arx_class_args_list)` for other available
+#' modifications.
+#'
+#' Additional arguments that may be supplied to `arx_class_args_list()` include
+#' the expected `lags` and `ahead` arguments for an autoregressive-type model.
+#' These have default values of 0, 7, and 14 days for the lags of the
+#' predictors and 7 days ahead of the forecast date for predicting the
+#' outcome. There is also `n_training` to indicate the upper bound for the
+#' number of training rows per key. If you would like some practice with using
+#' this, then remove the filtering command to obtain data within "2021-06-04"
+#' and "2021-12-31" and instead set `n_training` to be the number of days
+#' between these two dates, inclusive of the end points. The end results
+#' should be the same. In addition to `n_training`, there are `forecast_date`
+#' and `target_date` to specify the date the forecast is made and the date it
+#' targets, respectively. We will not dwell on such arguments here as they
+#' are not unique to this classifier or absolutely essential to understanding
+#' how it operates. The remaining arguments will be discussed organically, as
+#' they are needed to serve our purposes. For information on any remaining
+#' arguments that are not discussed here, please see the function
+#' documentation for a complete list and their definitions.
 #'
 #' @inheritParams arx_forecaster
 #' @param outcome A character (scalar) specifying the outcome (in the
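
The break extension described in the added `@details` text can be illustrated with base R. This is only a sketch: it assumes the discretization behaves like `cut()` with right-closed intervals, which agrees with the `(-Inf, 0.25]` class reported in the predictions but is not taken from the `epipredict` source. The `extend_breaks()` helper and the `growth` values are hypothetical. The last lines compute the inclusive day count that the `n_training` exercise refers to.

```r
# Hypothetical helper (not part of epipredict): pad user-supplied breaks
# so that the resulting bins cover the entire real line.
extend_breaks <- function(breaks) {
  sort(unique(c(-Inf, breaks, Inf)))
}

# Made-up outcome values, for illustration only.
growth <- c(-0.10, 0.25, 0.30, 1.20)

# A single break yields two right-closed bins; 0.25 itself lands in the lower bin.
classes <- cut(growth, breaks = extend_breaks(0.25))
table(classes)

# Inclusive number of days between the two filter dates used in the example.
n_days <- as.integer(as.Date("2021-12-31") - as.Date("2021-06-04")) + 1L
n_days
```

Passing two breaks, for example `extend_breaks(c(0, 0.25))`, would give three bins, which is the situation where the documentation points to `parsnip::multinom_reg()` rather than `parsnip::logistic_reg()`.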

README.md

Lines changed: 7 additions & 7 deletions
@@ -234,7 +234,7 @@ four_week_ahead <- arx_forecaster(
 four_week_ahead
 #> ══ A basic forecaster of type ARX Forecaster ════════════════════════════════
 #>
-#> This forecaster was fit on 2025-03-04 12:12:54.
+#> This forecaster was fit on 2025-03-03 14:43:07.
 #>
 #> Training data was an <epi_df> with:
 #> • Geography: state,
@@ -298,12 +298,12 @@ four_week_ahead$predictions |>
 #> # A tibble: 20 × 5
 #>   geo_value values quantile_levels forecast_date target_date
 #>   <chr>      <dbl>           <dbl> <date>        <date>
-#> 1 ca        0.199             0.1  2021-08-01    2021-08-29
-#> 2 ca        0.285             0.25 2021-08-01    2021-08-29
-#> 3 ca        0.345             0.5  2021-08-01    2021-08-29
-#> 4 ca        0.405             0.75 2021-08-01    2021-08-29
-#> 5 ca        0.491             0.9  2021-08-01    2021-08-29
-#> 6 ma        0.0285            0.1  2021-08-01    2021-08-29
+#> 1 ca        0.0425            0.1  2021-08-01    2021-08-29
+#> 2 ca        0.0803            0.25 2021-08-01    2021-08-29
+#> 3 ca        0.115             0.5  2021-08-01    2021-08-29
+#> 4 ca        0.150             0.75 2021-08-01    2021-08-29
+#> 5 ca        0.187             0.9  2021-08-01    2021-08-29
+#> 6 ma        0                 0.1  2021-08-01    2021-08-29
 #> # ℹ 14 more rows
 ```

_pkgdown.yml

Lines changed: 19 additions & 13 deletions
@@ -4,21 +4,27 @@ development:
   mode: devel

 template:
+  light-switch: true
   package: delphidocs

-articles:
-  - title: Get started
-    navbar: ~
-    contents:
-      - epipredict
-      - custom_epiworkflows
-      - backtesting
-      - update
-  - title: Advanced methods
-    contents:
-      - arx-classifier
-      - articles/smooth-qr
-      - panel-data
+navbar:
+  structure:
+    left: [intro, workflows, backtesting, reference, articles, news]
+    right: [search, github, lightswitch]
+  components:
+    workflows:
+      text: Epiworkflows
+      href: articles/custom_epiworkflows.html
+    backtesting:
+      text: Backtesting
+      href: articles/backtesting.html
+    articles:
+      text: Articles
+      menu:
+        - text: Using the add/update/remove/adjust functions
+          href: articles/update.html
+        - text: Using epipredict on non-epidemic panel data
+          href: articles/panel-data.html

 home:
   links:

man/arx_classifier.Rd

Lines changed: 115 additions & 3 deletions
Generated file; its diff is not rendered by default.
