
Commit 50a6c90

clean up the vignette, add news, better readme
1 parent 29eb15d commit 50a6c90

5 files changed

+238
-118
lines changed

DESCRIPTION

Lines changed: 2 additions & 1 deletion
@@ -43,5 +43,6 @@ Suggests:
     rmarkdown,
     testthat (>= 3.0.0)
 Config/testthat/edition: 3
-URL: https://github.com/cmu-delphi/epipredict/
+URL: https://github.com/cmu-delphi/epipredict/,
+    https://cmu-delphi.github.io/epiprocess
 VignetteBuilder: knitr

NEWS.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# epipredict (development)
+
+* Move to R6 interface
+
+
+
+# epipredict 0.0.0.9000
+
+* Publish public for easy navigation
+* Two simple forecasters as test beds
+* Working vignette

README.Rmd

Lines changed: 71 additions & 7 deletions
@@ -30,15 +30,79 @@ You can install the development version of epipredict from [GitHub](https://gith
 devtools::install_github("cmu-delphi/epipredict")
 ```
 
-## Example
+## Documentation
+
+You can view documentation for the `main` branch at <https://cmu-delphi.github.io/epipredict>.
+
+
+## Goals for `epipredict`
+
+**We hope to provide:**
+
+1. A set of basic, easy-to-use forecasters that work out of the box. You should be able to do a reasonably limited amount of customization on them. (Any serious customization happens with point number 2.) For the basic forecasters, we should provide, at least:
+    * Baseline flat-line forecaster
+    * Autoregressive forecaster
+    * Autoregressive classifier
+2. A framework for creating custom forecasters out of modular components. There are four types of components:
+    * Preprocessor: do things to the data before model training
+    * Trainer: train a model on data, resulting in a fitted model object
+    * Predictor: make predictions, using a fitted model object
+    * Postprocessor: do things to the predictions before returning
+
+**Target audience:**
+
+* Basic. Has data, calls forecaster with default arguments.
+* Intermediate. Wants to examine changes to the arguments, take advantage of built in flexibility.
+* Advanced. Wants to write their own forecasters. Maybe willing to build up from some components that we write.
+
+The Advanced user should find their task to be relatively easy (and we'll show them how).
+
+**Example:**
+During a quiet period, a user decides they want to first predict whether a surge is about to occur, say using variant information from GISAID. Then for surging locations, they want to train an AR model using past surges in the same location. Everywhere else, they predict a flat line. We should be able to do this in a few lines of code.
+
+Delphi's own forecasts have been produced/evaluated in this way for a while now, but the code base is scattered and evolving. We want to consolidate, generalize, and simplify to allow others to benefit as well.
+
+The basic framework should allow for something like the following. This would
+feel very familiar to anyone working in `R`+`{tidyverse}`.
 
-This is a basic example which shows you how to solve a common problem:
+**Simple linear autoregressive model with scaling (modular)**
 
-```{r example}
-library(epipredict)
-## basic example code
+```{r ideal-framework, eval=FALSE}
+my_fcaster = new_epi_predictor() %>%
+  add_preprocessor(scaler, var = cases, by = pop) %>%
+  add_preprocessor(lagger, var = dv_cli, lags = c(0, 7, 14)) %>%
+  add_trainer(lm) %>%
+  add_predictor(lm.predict) %>%
+  add_postprocessor(scaler, by = 1/pop)
 ```
 
-## Documentation
+Then you could run this on an `epi_df` with one line.
 
-You can view documentation for the `main` branch at <https://cmu-delphi.github.io/epipredict>.
+```{r run-ideal, eval=FALSE}
+my_fcaster(lead(cases, 7) ~ ., epi_df, key_vars, time_vars)
+```
+
+The hypothetical example of first classifying, then fitting different models would also fit into this framework. And this isn't far from our current production models.
+
+### Why doesn't this exist
+
+Closest neighbor is [`{fable}`](https://fable.tidyverts.org/). It does some of what we want but has a few major downsides:
+
+1. Small set of standard Time Series models.
+    * Small modifications are hard (e.g. can't "just use" `glmnet` instead of `lm`) in an AR model.
+1. Multi-period forecasting is model-based only.
+    * This is "iterative" forecasting, and is very bad in epidemiology.
+    * Much better with simple models to use "direct" forecasting.
+1. Confidence bands are model-based only.
+    * In epi tasks, these dramatically under-cover.
+1. Layering is not possible/natural
+1. Can't use methods that aren't already implemented.
+
+The forecasts we did above can't be produced with `{fable}`.
+
+
+**However:** The developers behind `{fable}` wrote a package called `{fabletools}` that powers model creation (based on `R6`). We can almost certainly borrow some of that technology to lever up.
+
+### What this isn't
+
+This is not a framework for SIR models. We intend to create some simple versions, but advanced models---those that use variants, hospitalizations, different types of immunity, age stratification, etc.---cannot be compartmentalized in the same way (though see [pypm](https://pypm.github.io/home/)). These types of models also are better at scenario modeling than short term forecasts unless they are quite complicated.
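
The README.Rmd text above contrasts "iterative" and "direct" multi-period forecasting without showing either. The following is a minimal base-R sketch of the difference and is not part of this commit. The simulated AR(1) series, the horizon `h = 7`, and all variable names are assumptions made for illustration.

```r
# Illustrative sketch only: simulated data, hypothetical names.
set.seed(42)
y <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200)) + 50
n <- length(y)
h <- 7  # forecast horizon, assumed for the example

# Iterative: fit a one-step-ahead model, then feed each prediction
# back in as the next input until the horizon is reached.
one_step <- lm(y_next ~ y_now,
               data = data.frame(y_next = y[-1], y_now = y[-n]))
iter_pred <- y[n]
for (i in seq_len(h)) {
  iter_pred <- unname(predict(one_step, newdata = data.frame(y_now = iter_pred)))
}

# Direct: regress the value h steps ahead on the current value,
# then predict the horizon in a single step.
direct_fit <- lm(y_ahead ~ y_now,
                 data = data.frame(y_ahead = y[(h + 1):n], y_now = y[1:(n - h)]))
direct_pred <- unname(predict(direct_fit, newdata = data.frame(y_now = y[n])))

c(iterative = iter_pred, direct = direct_pred)
```

In the iterative version any one-step model error is compounded `h` times, while the direct version is trained on exactly the quantity being forecast.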

README.md

Lines changed: 97 additions & 7 deletions
@@ -21,16 +21,106 @@ You can install the development version of epipredict from
 devtools::install_github("cmu-delphi/epipredict")
 ```
 
-## Example
+## Documentation
+
+You can view documentation for the `main` branch at
+<https://cmu-delphi.github.io/epipredict>.
+
+## Goals for `epipredict`
+
+**We hope to provide:**
+
+1. A set of basic, easy-to-use forecasters that work out of the box.
+    You should be able to do a reasonably limited amount of
+    customization on them. (Any serious customization happens with point
+    number 2.) For the basic forecasters, we should provide, at least:
+    - Baseline flat-line forecaster
+    - Autoregressive forecaster
+    - Autoregressive classifier
+2. A framework for creating custom forecasters out of modular
+    components. There are four types of components:
+    - Preprocessor: do things to the data before model training
+    - Trainer: train a model on data, resulting in a fitted model
+      object
+    - Predictor: make predictions, using a fitted model object
+    - Postprocessor: do things to the predictions before returning
+
+**Target audience:**
+
+- Basic. Has data, calls forecaster with default arguments.
+- Intermediate. Wants to examine changes to the arguments, take
+  advantage of built in flexibility.
+- Advanced. Wants to write their own forecasters. Maybe willing to
+  build up from some components that we write.
+
+The Advanced user should find their task to be relatively easy (and
+we’ll show them how).
+
+**Example:**
+During a quiet period, a user decides they want to first predict whether
+a surge is about to occur, say using variant information from GISAID.
+Then for surging locations, they want to train an AR model using past
+surges in the same location. Everywhere else, they predict a flat line.
+We should be able to do this in a few lines of code.
+
+Delphi’s own forecasts have been produced/evaluated in this way for a
+while now, but the code base is scattered and evolving. We want to
+consolidate, generalize, and simplify to allow others to benefit as
+well.
+
+The basic framework should allow for something like the following. This
+would feel very familiar to anyone working in `R`+`{tidyverse}`.
 
-This is a basic example which shows you how to solve a common problem:
+**Simple linear autoregressive model with scaling (modular)**
 
 ``` r
-library(epipredict)
-## basic example code
+my_fcaster = new_epi_predictor() %>%
+  add_preprocessor(scaler, var = cases, by = pop) %>%
+  add_preprocessor(lagger, var = dv_cli, lags = c(0, 7, 14)) %>%
+  add_trainer(lm) %>%
+  add_predictor(lm.predict) %>%
+  add_postprocessor(scaler, by = 1/pop)
 ```
 
-## Documentation
+Then you could run this on an `epi_df` with one line.
 
-You can view documentation for the `main` branch at
-<https://cmu-delphi.github.io/epipredict>.
+``` r
+my_fcaster(lead(cases, 7) ~ ., epi_df, key_vars, time_vars)
+```
+
+The hypothetical example of first classifying, then fitting different
+models would also fit into this framework. And this isn’t far from our
+current production models.
+
+### Why doesn’t this exist
+
+Closest neighbor is [`{fable}`](https://fable.tidyverts.org/). It does
+some of what we want but has a few major downsides:
+
+1. Small set of standard Time Series models.
+    - Small modifications are hard (e.g. can’t “just use” `glmnet`
+      instead of `lm`) in an AR model.
+2. Multi-period forecasting is model-based only.
+    - This is “iterative” forecasting, and is very bad in
+      epidemiology.
+    - Much better with simple models to use “direct” forecasting.
+3. Confidence bands are model-based only.
+    - In epi tasks, these dramatically under-cover.
+4. Layering is not possible/natural
+5. Can’t use methods that aren’t already implemented.
+
+The forecasts we did above can’t be produced with `{fable}`.
+
+**However:** The developers behind `{fable}` wrote a package called
+`{fabletools}` that powers model creation (based on `R6`). We can almost
+certainly borrow some of that technology to lever up.
+
+### What this isn’t
+
+This is not a framework for SIR models. We intend to create some simple
+versions, but advanced models—those that use variants, hospitalizations,
+different types of immunity, age stratification, etc.—cannot be
+compartmentalized in the same way (though see
+[pypm](https://pypm.github.io/home/)). These types of models also are
+better at scenario modeling than short term forecasts unless they are
+quite complicated.
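
The pipeline in the README is marked `eval=FALSE` and describes an interface that does not exist yet. As a rough illustration of how the four component types (preprocessor, trainer, predictor, postprocessor) could compose, here is a self-contained base-R sketch. It is not part of this commit and not the epipredict API: every function name below (`new_forecaster`, `as_step`, `lagger`, `leader`, `lm_trainer`, `lm_predictor`, `run_forecaster`) and the simulated single-location data are hypothetical.

```r
# Hypothetical sketch in base R (needs R >= 4.1 for the |> pipe).
# None of these names belong to epipredict.

new_forecaster <- function() {
  list(preprocessors = list(), trainer = NULL,
       predictor = NULL, postprocessors = list())
}

# Wrap `fun` and its extra arguments into a one-argument step.
as_step <- function(fun, ...) {
  force(fun)
  args <- list(...)
  function(x) do.call(fun, c(list(x), args))
}

add_preprocessor <- function(f, fun, ...) {
  f$preprocessors <- c(f$preprocessors, list(as_step(fun, ...)))
  f
}
add_postprocessor <- function(f, fun, ...) {
  f$postprocessors <- c(f$postprocessors, list(as_step(fun, ...)))
  f
}
add_trainer   <- function(f, fun) { f$trainer <- fun; f }
add_predictor <- function(f, fun) { f$predictor <- fun; f }

# Preprocessor: append lagged copies of `var` (rows assumed time-ordered).
lagger <- function(d, var, lags) {
  n <- nrow(d)
  for (l in lags) {
    d[[paste0(var, "_lag", l)]] <- c(rep(NA, l), d[[var]][seq_len(n - l)])
  }
  d
}

# Preprocessor: add the response, i.e. `var` shifted `ahead` steps forward.
leader <- function(d, var, ahead) {
  n <- nrow(d)
  d$target <- c(d[[var]][(ahead + 1):n], rep(NA, ahead))
  d
}

# Trainer: linear model of the target on every lag column.
lm_trainer <- function(d) {
  lag_cols <- grep("_lag", names(d), value = TRUE)
  lm(reformulate(lag_cols, response = "target"), data = d)
}

# Predictor: apply the fitted model to new rows.
lm_predictor <- function(fit, newdata) unname(predict(fit, newdata = newdata))

run_forecaster <- function(f, data) {
  for (p in f$preprocessors) data <- p(data)
  train <- data[complete.cases(data), ]   # rows with all lags and a target
  fit <- f$trainer(train)
  pred <- f$predictor(fit, data[nrow(data), ])  # forecast from the latest row
  for (p in f$postprocessors) pred <- p(pred)
  pred
}

# Simulated single-location series, for illustration only.
set.seed(1)
n <- 80
toy <- data.frame(time = seq_len(n),
                  cases = 100 + 10 * sin(seq_len(n) / 5) + rnorm(n))

fcaster <- new_forecaster() |>
  add_preprocessor(lagger, var = "cases", lags = c(0, 7, 14)) |>
  add_preprocessor(leader, var = "cases", ahead = 7) |>
  add_trainer(lm_trainer) |>
  add_predictor(lm_predictor) |>
  add_postprocessor(pmax, 0)   # postprocess: forecasts cannot be negative

forecast <- run_forecaster(fcaster, toy)
```

Because the response is built with `leader()` at the target horizon, this sketch is a "direct" forecaster in the README's terminology. Swapping `lm_trainer` for another fitting function would leave the rest of the pipeline unchanged, which is the modularity the README is aiming for.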
