|
17 | 17 | acknowledgements: | |
18 | 18 | We'd like to thank the Delphi engineering team for making this data available. |
19 | 19 | And we'd like to thank Roni Rosenfeld and Ryan Tibshirani for posing the |
20 | | - challenge of making counterfactual predictions. And thanks to Alex Reinhart |
21 | | - for getting our post into the appropriate format. |
| 20 | + challenge of making counterfactual predictions. We thank Rob Tibshirani for |
| 21 | + suggesting several improvements on this post, and Alex Reinhart for getting |
| 22 | + our post into the appropriate format. |
22 | 23 | related: |
23 | 24 | - 2020-08-28-api |
24 | 25 | output: |
|
55 | 56 | for this task. The important thing is that we can’t just use standard |
56 | 57 | prediction methods. We need to use specialized methods that were designed |
57 | 58 | for causal inference. In this post we will discuss our work on |
58 | | -the causal effect of social mobility on deaths from COVID.</p> |
| 59 | +the causal effect of social mobility on deaths from COVID. The social mobility |
| 60 | +variable we will use is the proportion of people staying home. We can think of |
| 61 | +this as anti-mobility, and expect that higher values of this variable will lead |
| 62 | +to fewer deaths from COVID.</p> |
59 | 63 | <p>Let’s start with a brief introduction to causal inference.</p> |
60 | 64 | <div id="causal-inference" class="section level2"> |
61 | 65 | <h2>Causal Inference</h2> |
@@ -131,18 +135,30 @@ <h2>Causal Inference</h2> |
131 | 135 | <p><span class="math display">\[ |
132 | 136 | \hat{\mathbb{E}}(Y^a) = \frac{1}{n}\sum_i \hat\mu(X_i,a). |
133 | 137 | \]</span></p> |
134 | | -<p>This is called the plug-in estimator. |
| 138 | +<p>This is called the plug-in estimator. (Note that for prediction we would not use |
| 139 | +this formula. We would just use <span class="math inline">\(\hat \mu(X, A)\)</span>.) |
135 | 140 | There are often better estimators, |
136 | 141 | but we won’t get into that here. |
137 | 142 | The important thing |
138 | 143 | is: |
139 | 144 | there is a formula for the causal effect |
140 | 145 | and we can estimate it.</p> |
| 146 | +<p>The first plot below shows an example where we would predict |
| 147 | +higher values of <span class="math inline">\(Y\)</span> when <span class="math inline">\(A\)</span> is large. For pure prediction, this is |
| 148 | +the correct conclusion. The second plot shows that once we |
| 149 | +account for <span class="math inline">\(X = \text{age}\)</span> (corresponding to different colors) there is |
| 150 | +a negative relationship between <span class="math inline">\(Y\)</span> and <span class="math inline">\(A\)</span>. In this case, age is a |
| 151 | +confounder and the <span class="math inline">\(g\)</span>-formula would correctly recover the negative |
| 152 | +relationship. For causal inference, this is the correct conclusion.</p> |
| 153 | +<p><img src="/blog/images/causal-simple-confounder.svg" /></p> |
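The contrast between the two plots can be reproduced with a small simulation. The sketch below is illustrative only and is not the code behind the figure: the data-generating process, the three hypothetical age groups, the linear form of the regression, and the helper names `simulate`, `fit_linear`, and `plug_in` are all assumptions made for this example. It compares the naive slope of Y on A with the plug-in estimate of the effect of raising A from 0 to 1.

```python
import numpy as np

def simulate(n=5000, seed=0):
    """Toy data (an assumption for illustration): age group X confounds A and Y."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 3, size=n).astype(float)  # hypothetical age groups 0, 1, 2
    a = 2.0 * x + rng.normal(size=n)              # A tends to be larger for older groups
    y = 5.0 * x - 1.0 * a + rng.normal(size=n)    # true causal effect of A on Y is -1
    return x, a, y

def fit_linear(x, a, y):
    """Least-squares fit of the assumed model mu(x, a) = b0 + b1 * x + b2 * a."""
    design = np.column_stack([np.ones_like(a), x, a])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def plug_in(coef, x, a_value):
    """Plug-in estimate of E(Y^a): average mu_hat(X_i, a) over the observed X_i."""
    mu_hat = coef[0] + coef[1] * x + coef[2] * a_value
    return mu_hat.mean()

x, a, y = simulate()

# Pure prediction: regress Y on A alone, ignoring age.
naive_slope = np.polyfit(a, y, 1)[0]

# Causal contrast E(Y^1) - E(Y^0) from the plug-in estimator.
coef = fit_linear(x, a, y)
effect = plug_in(coef, x, 1.0) - plug_in(coef, x, 0.0)

print(f"naive slope of Y on A: {naive_slope:.2f}")
print(f"plug-in estimate of E(Y^1) - E(Y^0): {effect:.2f}")
```

In this toy setup the naive slope is positive while the plug-in contrast is close to the true value of -1, which mirrors the sign flip between the two plots. With a linear model the contrast reduces to the fitted coefficient on A; a more flexible regression could be swapped in for `fit_linear` without changing `plug_in`'s averaging step.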
141 | 154 | <p>Things get trickier |
142 | 155 | when there are time varying variables. |
143 | 156 | Consider weekly mobility and death data |
144 | 157 | <span class="math inline">\((A_1,Y_1),\dots, (A_T,Y_T)\)</span> |
145 | 158 | in one state. |
| 159 | +For simplicity, we’ll assume that there are no <span class="math inline">\(X\)</span> variables. But we’ll see that at time <span class="math inline">\(t\)</span>, the |
| 160 | +variables <span class="math inline">\(Y_1, \dots, Y_{t-1}\)</span> are confounding variables for the causal effect of |
| 161 | +mobility on <span class="math inline">\(Y_t\)</span>. |
146 | 162 | Define |
147 | 163 | <span class="math inline">\(\overline{A}_t = (A_1,\dots, A_t)\)</span> and |
148 | 164 | <span class="math inline">\(\overline{Y}_t = (Y_1,\dots, Y_t)\)</span> |
|