title: "Futility bounds at design and analysis under non-proportional hazards"
3
-
author: "Keaven M. Anderson"
3
+
author: "Keaven M. Anderson and Yujie Zhao"
4
4
output:
5
5
rmarkdown::html_document:
6
6
toc: true
@@ -25,19 +25,17 @@ library(ggplot2)
25
25
26
26
# Overview
27
27
28
-
We set up futility bounds under a non-proportional hazards assumption.
29
-
We consider methods presented by @korn2018interim for setting such bounds and then consider an alternate futility bound based on $\beta-$spending under a delayed or crossing treatment effect to simplify implementation.
30
-
Finally, we show how to update this $\beta-$spending bound based on blinded interim data.
31
-
We will consider an example to reproduce a line of @korn2018interim Table 1 with the alternative futility bounds considered.
28
+
This vignette demonstrates possible ways to set up futility bounds in clinical trial designs under the assumption of non-proportional hazards.
29
+
We review the methods proposed by @wieand1994stopping and @korn2018interim.
30
+
To be more consistent with common practice, we propose a futility bound based on $\beta$-spending that automatically accounts for
31
+
non-proportional hazards as assumed in the design.
32
32
33
-
## Initial design set-up for fixed analysis
34
-
35
-
@korn2018interim considered delayed effect scenarios and proposed a futility bound that is a modification of an earlier method proposed by @wieand1994stopping.
36
-
We begin with the enrollment and failure rate assumptions which @korn2018interim based on an example by @chen2013statistical.
33
+
We start by specifying the enrollment and failure rate assumptions, following the example used by @korn2018interim (based on @chen2013statistical).
37
34
38
35
```{r}
39
36
# Enrollment assumed to be 680 patients over 12 months with no ramp-up
## Study duration was 34.8 in Korn & Freidlin Table 1
52
-
## We change to 34.86 here to obtain 512 expected events more precisely
51
+
## We change to 34.86 here to obtain the 512 expected events they presented
53
52
study_duration <- 34.86
54
-
```
55
53
56
-
We now derive a fixed sample size based on these assumptions.
57
-
Ideally, we would allow a targeted event count and variable follow-up in `fixed_design_ahr()` so that the study duration will be computed automatically.
54
+
# randomization ratio (exp:control)
55
+
ratio <- 1
56
+
```
58
57
58
+
In this example, with 680 subjects enrolled over 12 months, we expect 512 events to occur within 34.86 months,
59
+
yielding approximately 90.45% power if no interim analyses are performed.
59
60
```{r}
60
61
fixedevents <- fixed_design_ahr(
61
-
alpha = 0.025, power = NULL,
62
+
alpha = 0.025, power = NULL, ratio = ratio,
62
63
enroll_rate = enroll_rate,
63
64
fail_rate = fail_rate,
64
65
study_duration = study_duration
@@ -72,47 +73,61 @@ fixedevents |>
72
73
fmt_number(columns = 5:6, decimals = 3)
73
74
```
74
75
75
-
# Modified Wieand futility bound
76
-
77
-
The @wieand1994stopping rule recommends stopping after 50% of planned events accrue if the observed HR > 1.
78
-
kornfreidlin2018 modified this by adding a second interim analysis after 75% of planned events and stop if the observed HR > 1
79
-
This is implemented here by requiring a trend in favor of control with a direction $Z$-bound at 0 resulting in the *Nominal p* bound being 0.5 for interim analyses in the table below.
80
-
A fixed bound is specified with the `gs_b()` function for `upper` and `lower` and its corresponding parameters `upar` for the upper (efficacy) bound and `lpar` for the lower (futility) bound.
81
-
The final efficacy bound is for a 1-sided nominal p-value of 0.025; the futility bound lowers this to 0.0247 as noted in the lower-right-hand corner of the table below.
82
-
It is < 0.025 since the probability is computed with the binding assumption.
83
-
This is an arbitrary convention; if the futility bound is ignored,
84
-
this computation yields 0.025.
85
-
In the last row under *Alternate hypothesis* below we see the power is 88.44%.
86
-
@korn2018interim computed 88.4% power for this design with 100,000 simulations which estimate the standard error for the power calculation to be `r paste(100 * round(sqrt(.884 * (1 - .884) / 100000), 4), "%", sep = "")`.
87
-
88
-
```{r}
89
-
wieand <- gs_power_ahr(
90
-
enroll_rate = enroll_rate, fail_rate = fail_rate,
91
-
upper = gs_b, upar = c(rep(Inf, 2), qnorm(.975)),
92
-
lower = gs_b, lpar = c(0, 0, -Inf),
93
-
event = 512 * c(.5, .75, 1)
94
-
)
95
-
wieand |>
96
-
summary() |>
97
-
as_gt(
98
-
title = "Group sequential design with futility only at interim analyses",
99
-
subtitle = "Wieand futility rule stops if HR > 1"
100
-
)
101
-
```
102
-
103
76
# Beta-spending futility bound with AHR
104
77
105
-
Need to summarize here.
78
+
Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design.
79
+
At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary.
80
+
The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
81
+
The AHR model for the NPH alternate hypothesis accounts for the assumed early lack of benefit.
82
+
83
+
Methodologically, the futility bound at IA1 (denoted as $a_1$) is
84
+
$$
85
+
a_1 = \left\{
86
+
a_1
87
+
:
88
+
\text{Pr}
89
+
\left(
90
+
\underbrace{Z_1 \leq a_1}_{\text{fail at IA1}} \; | \; H_1
91
+
\right)
92
+
=
93
+
\beta(t_1)
94
+
\right\}.
95
+
$$
96
+
The futility bound at IA2 (denoted as $a_2$) is
97
+
$$
98
+
a_2
99
+
=
100
+
\left\{
101
+
a_2
102
+
:
103
+
\text{Pr}
104
+
\left(
105
+
\underbrace{Z_2 \leq a_2}_{\text{fail at IA2}} \;
106
+
\text{ and }
107
+
\underbrace{a_1 < Z_1 < b_1}_{\text{continue at IA1}}
108
+
\; | \;
109
+
H_1
110
+
\right)
111
+
=
112
+
\beta(t_2) - \beta(t_1)
113
+
\right\}.
114
+
$$
115
+
Futility bounds at analyses after IA2 can be derived with the same logic.
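
The two conditions above can be sketched numerically in base R. This is a rough illustration under stated assumptions, not the computation gsDesign2 performs: the Hwang-Shih-DeCani spending function with `gamma = -2`, the total $\beta$ of 0.10, and the AHR values at IA1 and IA2 are all hypothetical. Event counts stand in for statistical information (Schoenfeld approximation, 1:1 randomization), and there is no efficacy bound at IA1 ($b_1 = \infty$).

```{r}
# Hwang-Shih-DeCani spending function: cumulative error spent at fraction t
sf_hsd <- function(t, total, gamma) {
  if (gamma == 0) return(total * t)
  total * (1 - exp(-gamma * t)) / (1 - exp(-gamma))
}

beta_total <- 0.10          # hypothetical total Type II error
gamma <- -2                 # hypothetical spending parameter
events <- 512 * c(.5, .75)  # events at IA1 and IA2
ahr <- c(0.85, 0.78)        # hypothetical AHR under H1 at IA1 and IA2

# Approximate mean of Z_k under H1 and correlation of (Z_1, Z_2)
mu <- -log(ahr) * sqrt(events / 4)
rho <- sqrt(events[1] / events[2])

# Cumulative beta spent at each interim analysis
beta_cum <- sf_hsd(events / 512, beta_total, gamma)

# IA1: Pr(Z_1 <= a_1 | H1) = beta(t_1)
a1 <- mu[1] + qnorm(beta_cum[1])

# IA2: Pr(Z_2 <= a_2 and Z_1 > a_1 | H1) = beta(t_2) - beta(t_1)
cross_prob <- function(a2) {
  integrand <- function(z1) {
    dnorm(z1, mean = mu[1]) *
      pnorm((a2 - mu[2] - rho * (z1 - mu[1])) / sqrt(1 - rho^2))
  }
  integrate(integrand, lower = a1, upper = Inf)$value
}
a2 <- uniroot(
  function(a2) cross_prob(a2) - (beta_cum[2] - beta_cum[1]),
  interval = c(-10, 10), tol = 1e-8
)$root

c(a1 = a1, a2 = a2)
```

Because the mean of $Z_k$ under $H_1$ depends on the AHR at each analysis, a smaller assumed early benefit lowers `mu` and hence the futility bound, which is how the bound adapts to a delayed effect.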
106
116
117
+
In this example, the group sequential design with a $\beta$-spending futility bound under the AHR model is derived below.
The @wieand1994stopping rule recommends stopping the trial if the observed HR exceeds 1 after 50% of planned events. @korn2018interim extended this approach by adding a second interim analysis at 75% of planned events, also stopping if the observed HR > 1.
145
+
146
+
Here, we implement these futility rules by setting a Z-bound at 0, corresponding to a nominal p-value bound of 0.5 at interim analyses. Fixed bounds are specified via the `gs_b()` function for both efficacy and futility boundaries.
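
As a quick check of how the Z-bound, the nominal p-value, and the hazard ratio correspond, the sketch below uses the Schoenfeld approximation $Z \approx -\log(\text{HR})\sqrt{d/4}$ for 1:1 randomization with $d$ events. The approximation and the helper name `z_to_hr()` are assumptions made for illustration; they are not part of gsDesign2.

```{r}
# Hypothetical helper: approximate HR at a Z-bound with d events (1:1 randomization)
z_to_hr <- function(z, d) exp(-z * 2 / sqrt(d))

z_bound <- 0
# One-sided nominal p-value at the bound
nominal_p <- pnorm(-z_bound)
# Approximate HR at the bound for the two interim analyses
hr_at_bound <- z_to_hr(z_bound, d = 512 * c(.5, .75))

nominal_p
hr_at_bound
```

A Z-bound of 0 maps to HR = 1 regardless of the event count, so the rule "stop if the observed HR > 1" is the same bound at both interim analyses.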
147
+
148
+
The final efficacy bound is for a 1-sided nominal p-value of 0.025; the futility bound lowers this to 0.0247 as noted in the lower-right-hand corner of the table below. It is < 0.025 since the probability is computed with the binding assumption. This is an arbitrary convention; if the futility bound is ignored, this computation yields 0.025.
149
+
150
+
The design has 88.44% power. This closely matches the 88.4% power that @korn2018interim obtained from 100,000 simulations, for which the standard error of the power estimate is `r paste(100 * round(sqrt(.884 * (1 - .884) / 100000), 4), "%", sep = "")`.
151
+
152
+
```{r}
153
+
wieand <- gs_power_ahr(
154
+
enroll_rate = enroll_rate, fail_rate = fail_rate,
155
+
ratio = ratio,
156
+
# 2 IAs + 1 FA
157
+
event = 512 * c(.5, .75, 1),
158
+
# efficacy bound
159
+
upper = gs_b, upar = c(rep(Inf, 2), qnorm(.975)),
160
+
# futility bound
161
+
lower = gs_b, lpar = c(0, 0, -Inf)
162
+
163
+
)
164
+
165
+
wieand |>
166
+
summary() |>
167
+
as_gt(
168
+
title = "Group sequential design with futility only at interim analyses",
169
+
subtitle = "Wieand futility rule stops if HR > 1"
170
+
)
171
+
```
128
172
129
-
A classical $\beta-$spending bound would assume a constant treatment effect over time using the proportional hazards assumption. We use the average hazard ratio at the fixed design analysis for this purpose.
130
173
131
174
# Korn and Freidlin futility bound
132
175
133
-
The @korn2018interim futility bound is set *when at least 50% of the expected events have occurred and at least two thirds of the observed events have occurred later than 3 months from randomization*.
134
-
The expected timing for this is demonstrated below.
176
+
@korn2018interim addressed scenarios with delayed treatment effects by modifying the futility rule proposed by @wieand1994stopping. Their approach sets the futility bound when at least 50% of the expected events have occurred and at least two-thirds of the observed events occurred more than 3 months after randomization.
177
+
178
+
To illustrate this, we analyze the accumulation of events over time using `gsDesign2::expected_event()`, distinguishing between
179
+
+ events occurring during the initial 3-month no-effect period and
180
+
+ event accumulation through the planned trial duration of 34.86 months.
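
The timing rule can be sketched in base R as a search for the first time at which both conditions hold. The event counts, the time grid, and the column names below are hypothetical placeholders for illustration only; they are not output from `gsDesign2::expected_event()`.

```{r}
# Hypothetical expected cumulative events by calendar month (illustrative values)
acc <- data.frame(
  time = c(12, 16, 20, 24, 28),
  total_event = c(150, 230, 300, 360, 410),
  # events occurring within 3 months of randomization
  early_event = c(90, 110, 115, 118, 120)
)
planned_event <- 512

# Proportion of events occurring more than 3 months after randomization
acc$late_prop <- 1 - acc$early_event / acc$total_event

# Both conditions: at least 50% of planned events and at least 2/3 late events
ok <- acc$total_event >= 0.5 * planned_event & acc$late_prop >= 2 / 3
ia1_time_sketch <- min(acc$time[ok])
ia1_time_sketch
```

With these placeholder values the event-count condition is met at month 20, but the two-thirds condition is not met until month 24, so the second condition determines the timing.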
We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval.
139
-
This is done for the overall trial without dividing out by treatment group using the `gsDesign2::AHR()` function.
140
-
We consider monthly accumulation of events through the 34.86 months planned trial duration.
141
-
We note in the summary of early expected events below that all events during the first 3 months on-study are expected prior to the first interim analysis.
ggtitle("Proportion of expected events occurring 3 months after study start") +
270
+
theme(axis.title.x = element_text(size = 18),
271
+
axis.title.y = element_text(size = 18),
272
+
axis.text.x = element_text(size = 20),
273
+
axis.text.y = element_text(size = 18),
274
+
plot.title = element_text(size = 12))
275
+
276
+
# plot p1 and p2 together
277
+
cowplot::plot_grid(p1, p2, nrow = 2,
278
+
rel_heights = c(0.5, 0.5))
279
+
```
280
+
281
+
As shown by the above plot, the IA1 analysis time is `r ia1_time |> round(2)` months.
282
+
With the IA1 analysis time known, we now derive the group sequential design with the futility bound by @korn2018interim.
143
283
```{r}
144
-
event_accumulation <- pw_info(
145
-
enroll_rate = enroll_rate,
284
+
kf <- gs_power_ahr(
285
+
enroll_rate = enroll_rate,
146
286
fail_rate = fail_rate,
147
-
total_duration = c(1:34, 34.86),
148
-
ratio = 1
149
-
)
150
-
head(event_accumulation, n = 7) |> gt()
287
+
ratio = ratio,
288
+
# 2 IAs + 1 FA
289
+
event = 512 * c(.5, .75, 1),
290
+
analysis_time = c(ia1_time,
291
+
ia1_time + 0.01,
292
+
ia1_time + 0.02),
293
+
# efficacy bound
294
+
upper = gs_b,
295
+
upar = c(Inf, Inf, qnorm(.975)),
296
+
# futility bound
297
+
lower = gs_b,
298
+
lpar = c(0, 0, -Inf))
299
+
300
+
kf |>
301
+
summary() |>
302
+
as_gt(title = "Group sequential design with futility only",
303
+
subtitle = "Korn and Freidlin futility rule stops if HR > 1")
151
304
```
152
305
153
-
We can look at the proportion of events after the first 3 months as follows:
154
306
307
+
308
+
# Classical beta-spending futility bound
309
+
310
+
A classical $\beta$-spending bound assumes a constant treatment effect over time under the proportional hazards assumption. We use the average hazard ratio at the fixed design analysis for this purpose.
For the @korn2018interim bound the targeted timing is when both 50% of events have occurred and at least 2/3 are more than 3 months after enrollment with 3 months being the delayed effect period.
164
-
We see above that about 1/3 of events are still within 3 months of enrollment at month 20.
338
+
# Conclusion
165
339
166
-
## Korn and Freidlin bound
340
+
As an alternative to the ad hoc methods proposed by @wieand1994stopping and @korn2018interim to account for delayed effects,
341
+
we propose a method for $\beta$-spending that automatically accounts for delayed effects.
342
+
We have shown that the results compare favorably to the ad hoc methods while also controlling Type II error
343
+
and adapting to the timing and distribution of event times at the time of interim analysis.