From 949d60447e42ffc2fd3fa60fdb8be6817ec550fc Mon Sep 17 00:00:00 2001
From: LittleBeannie <yujie.zhao@merck.com>
Date: Tue, 26 Aug 2025 16:22:52 -0400
Subject: [PATCH 1/7] Revise the vignette by adding an introduction of beta
 spending

---
 vignettes/articles/story-nph-futility.Rmd | 82 +++++++++++++++++------
 1 file changed, 60 insertions(+), 22 deletions(-)

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index f22e51afa..863c5ccf9 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -102,7 +102,41 @@ wieand %>%
 
 # Beta-spending futility bound with AHR
 
-Need to summarize here.
+Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
+
+The futility bound of IA1 $a_1$ is
+$$
+  a_1 = \left\{
+  a_1
+  :
+  \text{Pr}
+  \left(
+     \underbrace{Z_1 \leq a_1}_{\text{fail at IA1}} \; | \; H_1
+  \right)
+  =
+  \beta(t_1) 
+  \right\}.
+$$
+The futility bound at IA2 $a_2$ is 
+$$
+  a_2
+  =
+  \left\{
+  a_2
+  :
+  \text{Pr}
+  \left(
+    \underbrace{Z_2 \leq a_2}_{\text{fail at IA2}} \;
+    \text{ and }
+    \underbrace{a_1 < Z_i < b_1}_{\text{continue at IA1}}
+    \; | \;
+    H_1
+  \right)
+  =
+  \beta(t_2) - \beta(t_1)
+  \right\}.
+$$
+The futility bound after IA2 can be derived in the similar logic. 
 
 ```{r}
 betaspending <- gs_power_ahr(
@@ -124,40 +158,44 @@ betaspending %>%
   )
 ```
 
-# Classical beta-spending futility bound
-
-A classical $\beta-$spending bound would assume a constant treatment effect over time using the  proportional hazards assumption. We use the average hazard ratio at the fixed design analysis for this purpose.
-
 # Korn and Freidlin futility bound
 
 The @korn2018interim futility bound is set *when at least 50% of the expected events have occurred and at least two thirds of the observed events have occurred later than 3 months from randomization*.
 The expected timing for this is demonstrated below.
 
-## Accumulation of events by time interval
-
-We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval.
-This is done for the overall trial without dividing out by treatment group using the `gsDesign2::AHR()` function.
+We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval by 
+This is done for the overall trial without dividing out by treatment group using the `gsDesign2::ahr()` function.
 We consider monthly accumulation of events through the 34.86 months planned trial duration.
 We note in the summary of early expected events below that all events during the first 3 months on-study are expected prior to the first interim analysis.
 
 ```{r}
-event_accumulation <- pw_info(
-  enroll_rate = enroll_rate,
-  fail_rate = fail_rate,
-  total_duration = c(1:34, 34.86),
-  ratio = 1
-)
-head(event_accumulation, n = 7) %>% gt()
+find_ia_time <- function(t) {
+  
+  e_event0 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / 2), 
+    fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
+    total_duration = t, simple = FALSE)
+  
+  e_event1 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / 2), 
+    fail_rate = betaspending$fail_rate %>% 
+      mutate(fail_rate = fail_rate * hr) |>
+      select(stratum, fail_rate, duration, dropout_rate), 
+    total_duration = t, simple = FALSE)
+  
+  total_event <- sum(e_event0$event) + sum(e_event1$event)
+  first3m_event <- sum(e_event0$event[1]) + sum(e_event1$event[1])
+    
+  return(2 / 3 * total_event - (total_event - first3m_event))
+} 
+
+ia1_time <- uniroot(find_ia_time, interval = c(1, 50))$root
 ```
 
-We can look at the proportion of events after the first 3 months as follows:
+The analysis time (in months) when at least 2/3 of the expected events occurred later than 3 months from randomization is 
 
 ```{r}
-event_accumulation %>%
-  group_by(time) %>%
-  summarize(`Total events` = sum(event), "Proportion early" = first(event) / `Total events`) %>%
-  ggplot(aes(x = time, y = `Proportion early`)) +
-  geom_line()
+ia1_time |> round(2)
 ```
 
 For the @korn2018interim bound the targeted timing is when both 50% of events have occurred and at least 2/3 are more than 3 months after enrollment with 3 months being the delayed effect period.

From 27c61fd5a42b53753e1acce03fd18f624ec297cf Mon Sep 17 00:00:00 2001
From: LittleBeannie <yujie.zhao@merck.com>
Date: Wed, 10 Sep 2025 13:29:43 -0400
Subject: [PATCH 2/7] get all code in

---
 vignettes/articles/story-nph-futility.Rmd | 220 ++++++++++++++++------
 1 file changed, 164 insertions(+), 56 deletions(-)

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index 863c5ccf9..72916c7ba 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -1,6 +1,6 @@
 ---
 title: "Futility bounds at design and analysis under non-proportional hazards"
-author: "Keaven M. Anderson"
+author: "Keaven M. Anderson and Yujie Zhao"
 output:
   rmarkdown::html_document:
     toc: true
@@ -26,18 +26,14 @@ library(ggplot2)
 # Overview
 
 We set up futility bounds under a non-proportional hazards assumption.
-We consider methods presented by @korn2018interim for setting such bounds and then consider an alternate futility bound based on $\beta-$spending under a delayed or crossing treatment effect to simplify implementation.
-Finally, we show how to update this $\beta-$spending bound based on blinded interim data.
-We will consider an example to reproduce a line of @korn2018interim Table 1 with the alternative futility bounds considered.
+We consider methods presented by @wieand1994stopping and @korn2018interim for setting such bounds and then consider an alternate futility bound based on $\beta-$spending under a delayed or crossing treatment effect to simplify implementation.
 
-## Initial design set-up for fixed analysis
-
-@korn2018interim considered delayed effect scenarios and proposed a futility bound that is a modification of an earlier method proposed by @wieand1994stopping.
 We begin with the enrollment and failure rate assumptions which @korn2018interim based on an example by @chen2013statistical.
 
 ```{r}
 # Enrollment assumed to be 680 patients over 12 months with no ramp-up
 enroll_rate <- define_enroll_rate(duration = 12, rate = 680 / 12)
+
 # Failure rates
 ## Control exponential with median of 12 mos
 ## Delayed effect with HR = 1 for 3 months and HR = .693 thereafter
@@ -48,17 +44,22 @@ fail_rate <- define_fail_rate(
   hr = c(1, .693),
   dropout_rate = 0
 )
+
 ## Study duration was 34.8 in Korn & Freidlin Table 1
 ## We change to 34.86 here to obtain 512 expected events more precisely
 study_duration <- 34.86
+
+# randomization ratio (exp:control)
+ratio <- 1
 ```
 
 We now derive a fixed sample size based on these assumptions.
 Ideally, we would allow a targeted event count and variable follow-up in `fixed_design_ahr()` so that the study duration will be computed automatically.
 
+As shown by the following fixed design, 512 events are expected over 34.86 months of study duration given the 680 subjects with a power around 90.45%. 
 ```{r}
 fixedevents <- fixed_design_ahr(
-  alpha = 0.025, power = NULL,
+  alpha = 0.025, power = NULL, ratio = ratio,
   enroll_rate = enroll_rate,
   fail_rate = fail_rate,
   study_duration = study_duration
@@ -88,9 +89,14 @@ In the last row under *Alternate hypothesis* below we see the power is 88.44%.
 ```{r}
 wieand <- gs_power_ahr(
   enroll_rate = enroll_rate, fail_rate = fail_rate,
+  ratio = ratio,
+  # 2 IAs + 1 FA
+  event = 512 * c(.5, .75, 1),
+  # efficacy bound
   upper = gs_b, upar = c(rep(Inf, 2), qnorm(.975)),
+  # futility bound
   lower = gs_b, lpar = c(0, 0, -Inf),
-  event = 512 * c(.5, .75, 1)
+  
 )
 wieand %>%
   summary() %>%
@@ -100,6 +106,149 @@ wieand %>%
   )
 ```
 
+
+# Korn and Freidlin futility bound
+
+@korn2018interim considered delayed effect scenarios and proposed a futility bound that is a modification of an earlier method proposed by @wieand1994stopping.
+
+The @korn2018interim futility bound is set *when at least 50% of the expected events have occurred and at least two thirds of the observed events have occurred later than 3 months from randomization*.
+The expected timing for this is demonstrated below.
+
+We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval by 
+This is done for the overall trial without dividing out by treatment group using the `gsDesign2::ahr()` function.
+We consider monthly accumulation of events through the 34.86 months planned trial duration.
+We note in the summary of early expected events below that all events during the first 3 months on-study are expected prior to the first interim analysis.
+
+```{r}
+find_ia_time <- function(t) {
+  
+  e_event0 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+    fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
+    total_duration = t, simple = FALSE)
+  
+  e_event1 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
+    fail_rate = betaspending$fail_rate %>% 
+      mutate(fail_rate = fail_rate * hr) |>
+      select(stratum, fail_rate, duration, dropout_rate), 
+    total_duration = t, simple = FALSE)
+  
+  total_event <- sum(e_event0$event) + sum(e_event1$event)
+  first3m_event <- sum(e_event0$event[1]) + sum(e_event1$event[1])
+    
+  return(2 / 3 * total_event - (total_event - first3m_event))
+} 
+
+ia1_time <- uniroot(find_ia_time, interval = c(1, 50))$root
+```
+
+The analysis time (in months) when at least 2/3 of the expected events occurred later than 3 months from randomization is 
+```{r}
+ia1_time |> round(2)
+```
+
+We now visualize the expected events accumulation over time.
+
+```{r}
+# expected total events
+e_event_overtime <- sapply(1:betaspending$analysis$time[3], function(t){
+  e_event0 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+    fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
+    total_duration = t, simple = TRUE)
+  
+  e_event1 <- expected_event(
+    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
+    fail_rate = betaspending$fail_rate %>% 
+      mutate(fail_rate = fail_rate * hr) |>
+      select(stratum, fail_rate, duration, dropout_rate), 
+    total_duration = t, simple = TRUE)
+  
+  sum(e_event0) + sum(e_event1)
+}) %>% unlist()
+
+
+# visualization of expected total events
+p1 <- ggplot(data = data.frame(time = 1:betaspending$analysis$time[3],
+                               event = e_event_overtime), 
+             aes(x = time, y = event)) +
+  geom_line(linewidth = 1.2) +
+  geom_vline(xintercept = ia1_time, linetype = "dashed", 
+             color = "red", linewidth = 1.2) +
+  scale_x_continuous(breaks = c(0, 6, 12, 18, 24, 30, 36),
+                     labels = c("0", "6", "12", "18", "24", "30", "36")) +
+  labs(x = "Months", y =  "Events") +
+  ggtitle("Expected events since study starts") +
+  theme(axis.title.x = element_text(size = 18),
+        axis.title.y = element_text(size = 18),
+        axis.text.x = element_text(size = 20),
+        axis.text.y = element_text(size = 18),
+        plot.title = element_text(size = 20))
+
+# expected events occur first 3 months
+e_event_first3m <- sapply(1:betaspending$analysis$time[3], function(t){
+  expected_event(enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+                 fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate), 
+                 total_duration = t, simple = FALSE)$event[1] +
+  expected_event(enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
+                 fail_rate = betaspending$fail_rate %>% 
+                   mutate(fail_rate = fail_rate * hr) |>
+                   select(stratum, fail_rate, duration, dropout_rate), 
+                 total_duration = t, simple = FALSE)$event[1]
+})
+
+# visualization of expected events occur first 3 months
+p2 <- ggplot(data = data.frame(time = 1:betaspending$analysis$time[3],
+                               prop = (e_event_overtime - e_event_first3m) / e_event_overtime * 100), 
+             aes(x = time, y = prop)) +
+  geom_line(linewidth = 1.2) +
+  scale_x_continuous(breaks = c(6, 12, 18, 24, 30, 36),
+                     labels = c("6", "12", "18", "24", "30", "36")) +
+  scale_y_continuous(breaks = c(10, 30, 50, 70, 90, 100),
+                     labels = c("10%", "30%", "50%", "70%", "90%", "100%")) +
+  geom_hline(yintercept = 2/3*100, linetype = "dashed", 
+             color = "red", linewidth = 1.2) +
+  geom_vline(xintercept = ia1_time, linetype = "dashed", 
+             color = "red", linewidth = 1.2) +
+  labs(x = "Months", 
+       y = "Proportion") +
+  ggtitle("Proportion of expected events occurring 3 months after study start") +
+  theme(axis.title.x = element_text(size = 18),
+        axis.title.y = element_text(size = 18),
+        axis.text.x = element_text(size = 20),
+        axis.text.y = element_text(size = 18),
+        plot.title = element_text(size = 16))
+
+# plot p1 and p2 together
+cowplot::plot_grid(p1, p2, nrow = 2,
+                   rel_heights = c(0.5, 0.5))
+```
+
+The bound proposed by @korn2018interim is implemented below.
+```{r}
+kf <- gs_power_ahr(
+  enroll_rate = enroll_rate, 
+  fail_rate = fail_rate,
+  ratio = ratio,
+  # 2 IAs + 1 FA
+  event = 512 * c(.5, .75, 1),
+  analysis_time = c(19.1, 
+                    19.1 + 0.01, 
+                    19.1 + 0.02),
+  # efficacy bound
+  upper = gs_b, 
+  upar = c(Inf, Inf, qnorm(.975)),
+  # futility bound
+  lower = gs_b, 
+  lpar = c(0, 0, -Inf))
+
+kf |>
+  summary() |>
+  as_gt(title = "Group sequential design with futility only",
+        subtitle = "Korn and Freidlin futility rule stops if HR > 1") 
+```
+
 # Beta-spending futility bound with AHR
 
 Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
@@ -138,15 +287,20 @@ $$
 $$
 The futility bound after IA2 can be derived in the similar logic. 
 
+We consider a beta-spending function that spends 2.5% of Type II error rate at the final analysis.
 ```{r}
 betaspending <- gs_power_ahr(
   enroll_rate = enroll_rate,
   fail_rate = fail_rate,
+  ratio = ratio,
+  # 2 IAs + 1 FA
+  event = 512 * c(.5, .75, 1),
+  # efficacy bound
   upper = gs_b,
   upar = c(rep(Inf, 2), qnorm(.975)),
+  # futility bound
   lower = gs_spending_bound,
   lpar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL),
-  event = 512 * c(.5, .75, 1),
   test_lower = c(TRUE, TRUE, FALSE)
 )
 
@@ -158,51 +312,5 @@ betaspending %>%
   )
 ```
 
-# Korn and Freidlin futility bound
-
-The @korn2018interim futility bound is set *when at least 50% of the expected events have occurred and at least two thirds of the observed events have occurred later than 3 months from randomization*.
-The expected timing for this is demonstrated below.
-
-We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval by 
-This is done for the overall trial without dividing out by treatment group using the `gsDesign2::ahr()` function.
-We consider monthly accumulation of events through the 34.86 months planned trial duration.
-We note in the summary of early expected events below that all events during the first 3 months on-study are expected prior to the first interim analysis.
-
-```{r}
-find_ia_time <- function(t) {
-  
-  e_event0 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / 2), 
-    fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
-    total_duration = t, simple = FALSE)
-  
-  e_event1 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / 2), 
-    fail_rate = betaspending$fail_rate %>% 
-      mutate(fail_rate = fail_rate * hr) |>
-      select(stratum, fail_rate, duration, dropout_rate), 
-    total_duration = t, simple = FALSE)
-  
-  total_event <- sum(e_event0$event) + sum(e_event1$event)
-  first3m_event <- sum(e_event0$event[1]) + sum(e_event1$event[1])
-    
-  return(2 / 3 * total_event - (total_event - first3m_event))
-} 
-
-ia1_time <- uniroot(find_ia_time, interval = c(1, 50))$root
-```
-
-The analysis time (in months) when at least 2/3 of the expected events occurred later than 3 months from randomization is 
-
-```{r}
-ia1_time |> round(2)
-```
-
-For the @korn2018interim bound the targeted timing is when both 50% of events have occurred and at least 2/3 are more than 3 months after enrollment with 3 months being the delayed effect period.
-We see above that about 1/3 of events are still within 3 months of enrollment at month 20.
-
-## Korn and Freidlin bound
-
-The bound proposed by @korn2018interim
 
 # References

From 1adf02cd1c9f5c244e166677d134320b0b372363 Mon Sep 17 00:00:00 2001
From: LittleBeannie <yujie.zhao@merck.com>
Date: Wed, 10 Sep 2025 14:03:21 -0400
Subject: [PATCH 3/7] Revise per Coplite's wording suggestions

---
 vignettes/articles/story-nph-futility.Rmd | 170 ++++++++++++----------
 1 file changed, 91 insertions(+), 79 deletions(-)

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index 72916c7ba..295924424 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -25,10 +25,9 @@ library(ggplot2)
 
 # Overview
 
-We set up futility bounds under a non-proportional hazards assumption.
-We consider methods presented by @wieand1994stopping and @korn2018interim for setting such bounds and then consider an alternate futility bound based on $\beta-$spending under a delayed or crossing treatment effect to simplify implementation.
+This vignette demonstrates ways to set up futility bounds in clinical trial designs under non-proportional hazards. We introduce the methods proposed by @wieand1994stopping and @korn2018interim, and also cover an alternative futility bound based on $\beta$-spending.
 
-We begin with the enrollment and failure rate assumptions which @korn2018interim based on an example by @chen2013statistical.
+We start by specifying the enrollment and failure rate assumptions, following the example used by @korn2018interim (based on @chen2013statistical).
 
 ```{r}
 # Enrollment assumed to be 680 patients over 12 months with no ramp-up
@@ -53,10 +52,7 @@ study_duration <- 34.86
 ratio <- 1
 ```
 
-We now derive a fixed sample size based on these assumptions.
-Ideally, we would allow a targeted event count and variable follow-up in `fixed_design_ahr()` so that the study duration will be computed automatically.
-
-As shown by the following fixed design, 512 events are expected over 34.86 months of study duration given the 680 subjects with a power around 90.45%. 
+In this example, with 680 subjects enrolled over 12 months, we expect 512 events to occur within 34.86 months, yielding approximately 90.45% power.
 ```{r}
 fixedevents <- fixed_design_ahr(
   alpha = 0.025, power = NULL, ratio = ratio,
@@ -72,19 +68,78 @@ fixedevents %>%
   fmt_number(columns = 3:4, decimals = 2) %>%
   fmt_number(columns = 5:6, decimals = 3)
 ```
+# Beta-spending futility bound with AHR
+
+Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
+
+Methodologically, the futility bound at IA1 (denoted as $a_1$) is
+$$
+  a_1 = \left\{
+  a_1
+  :
+  \text{Pr}
+  \left(
+     \underbrace{Z_1 \leq a_1}_{\text{fail at IA1}} \; | \; H_1
+  \right)
+  =
+  \beta(t_1) 
+  \right\}.
+$$
+The futility bound at IA2 (denoted as $a_2$) is
+$$
+  a_2
+  =
+  \left\{
+  a_2
+  :
+  \text{Pr}
+  \left(
+    \underbrace{Z_2 \leq a_2}_{\text{fail at IA2}} \;
+    \text{ and }
+    \underbrace{a_1 < Z_1 < b_1}_{\text{continue at IA1}}
+    \; | \;
+    H_1
+  \right)
+  =
+  \beta(t_2) - \beta(t_1)
+  \right\}.
+$$
+The futility bounds at analyses after IA2 can be derived by the same logic.
+
+In this example, the group sequential design with a $\beta$-spending futility bound under the AHR model is derived below.
+```{r}
+betaspending <- gs_power_ahr(
+  enroll_rate = enroll_rate,
+  fail_rate = fail_rate,
+  ratio = ratio,
+  # 2 IAs + 1 FA
+  event = 512 * c(.5, .75, 1),
+  # efficacy bound
+  upper = gs_b,
+  upar = c(rep(Inf, 2), qnorm(.975)),
+  # futility bound
+  lower = gs_spending_bound,
+  lpar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL),
+  test_lower = c(TRUE, TRUE, FALSE)
+)
+
+betaspending %>%
+  summary() %>%
+  as_gt(
+    title = "Group sequential design with futility only",
+    subtitle = "Beta-spending futility bound"
+  )
+```
 
 # Modified Wieand futility bound
 
-The @wieand1994stopping rule recommends stopping after 50% of planned events accrue if the observed HR > 1.
-kornfreidlin2018 modified this by adding a second interim analysis after 75% of planned events and stop if the observed HR > 1
-This is implemented here by requiring a trend in favor of control with a direction $Z$-bound at 0 resulting in the *Nominal p* bound being 0.5 for interim analyses in the table below.
-A fixed bound is specified with the `gs_b()` function for `upper` and `lower` and its corresponding parameters `upar` for the upper (efficacy) bound and `lpar` for the lower (futility) bound.
-The final efficacy bound is for a 1-sided nominal p-value of 0.025; the futility bound lowers this to 0.0247 as noted in the lower-right-hand corner of the table below.
-It is < 0.025 since the probability is computed with the binding assumption.
-This is an arbitrary convention; if the futility bound is ignored,
-this computation yields 0.025.
-In the last row under *Alternate hypothesis* below we see the power is 88.44%.
-@korn2018interim computed 88.4% power for this design with 100,000 simulations which estimate the standard error for the power calculation to be `r paste(100 * round(sqrt(.884 * (1 - .884) / 100000), 4), "%", sep = "")`.
+The @wieand1994stopping rule recommends stopping the trial if the observed HR exceeds 1 after 50% of planned events. @korn2018interim extends this approach by adding a second interim analysis at 75% of planned events, also stopping if HR > 1.
+
+Here, we implement these futility rules by setting a Z-bound at 0, corresponding to a nominal p-value bound of approximately 0.5 at interim analyses. Fixed bounds are specified via the `gs_b()` function for both efficacy and futility boundaries.
+
+The final efficacy bound is for a 1-sided nominal p-value of 0.025; the futility bound lowers this to 0.0247 as noted in the lower-right-hand corner of the table below. It is < 0.025 since the probability is computed with the binding assumption. This is an arbitrary convention; if the futility bound is ignored, this computation yields 0.025.
+
+The design has 88.44% power. This closely matches the 88.4% power that @korn2018interim obtained from 100,000 simulations, for which the estimated standard error of the power is `r paste(100 * round(sqrt(.884 * (1 - .884) / 100000), 4), "%", sep = "")`.
 
 ```{r}
 wieand <- gs_power_ahr(
@@ -109,15 +164,11 @@ wieand %>%
 
 # Korn and Freidlin futility bound
 
-@korn2018interim considered delayed effect scenarios and proposed a futility bound that is a modification of an earlier method proposed by @wieand1994stopping.
+@korn2018interim addressed scenarios with delayed treatment effects by modifying the futility rule proposed by @wieand1994stopping. Their approach applies the futility bound once at least 50% of the expected events have occurred and at least two-thirds of the observed events have occurred later than 3 months after randomization.
 
-The @korn2018interim futility bound is set *when at least 50% of the expected events have occurred and at least two thirds of the observed events have occurred later than 3 months from randomization*.
-The expected timing for this is demonstrated below.
-
-We consider the accumulation of events over time that occur during the no-effect interval for the first 3 months after randomization and events after this time interval by 
-This is done for the overall trial without dividing out by treatment group using the `gsDesign2::ahr()` function.
-We consider monthly accumulation of events through the 34.86 months planned trial duration.
-We note in the summary of early expected events below that all events during the first 3 months on-study are expected prior to the first interim analysis.
+To illustrate this, we compute the accumulation of expected events over time with `gsDesign2::expected_event()`, distinguishing between
++ events occurring during the initial 3-month no-effect period and 
++ event accumulation through the 34.86 months planned trial duration.
 
 ```{r}
 find_ia_time <- function(t) {
@@ -143,13 +194,6 @@ find_ia_time <- function(t) {
 ia1_time <- uniroot(find_ia_time, interval = c(1, 50))$root
 ```
 
-The analysis time (in months) when at least 2/3 of the expected events occurred later than 3 months from randomization is 
-```{r}
-ia1_time |> round(2)
-```
-
-We now visualize the expected events accumulation over time.
-
 ```{r}
 # expected total events
 e_event_overtime <- sapply(1:betaspending$analysis$time[3], function(t){
@@ -218,14 +262,15 @@ p2 <- ggplot(data = data.frame(time = 1:betaspending$analysis$time[3],
         axis.title.y = element_text(size = 18),
         axis.text.x = element_text(size = 20),
         axis.text.y = element_text(size = 18),
-        plot.title = element_text(size = 16))
+        plot.title = element_text(size = 12))
 
 # plot p1 and p2 together
 cowplot::plot_grid(p1, p2, nrow = 2,
                    rel_heights = c(0.5, 0.5))
 ```
 
-The bound proposed by @korn2018interim is implemented below.
+As shown in the plot above, the IA1 analysis time is `r ia1_time |> round(2)` months.
+With the IA1 analysis time known, we now derive the group sequential design with the futility bound by @korn2018interim.
 ```{r}
 kf <- gs_power_ahr(
   enroll_rate = enroll_rate, 
@@ -233,9 +278,9 @@ kf <- gs_power_ahr(
   ratio = ratio,
   # 2 IAs + 1 FA
   event = 512 * c(.5, .75, 1),
-  analysis_time = c(19.1, 
-                    19.1 + 0.01, 
-                    19.1 + 0.02),
+  analysis_time = c(ia1_time, 
+                    ia1_time + 0.01, 
+                    ia1_time + 0.02),
   # efficacy bound
   upper = gs_b, 
   upar = c(Inf, Inf, qnorm(.975)),
@@ -249,49 +294,18 @@ kf |>
         subtitle = "Korn and Freidlin futility rule stops if HR > 1") 
 ```
 
-# Beta-spending futility bound with AHR
 
-Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
 
-The futility bound of IA1 $a_1$ is
-$$
-  a_1 = \left\{
-  a_1
-  :
-  \text{Pr}
-  \left(
-     \underbrace{Z_1 \leq a_1}_{\text{fail at IA1}} \; | \; H_1
-  \right)
-  =
-  \beta(t_1) 
-  \right\}.
-$$
-The futility bound at IA2 $a_2$ is 
-$$
-  a_2
-  =
-  \left\{
-  a_2
-  :
-  \text{Pr}
-  \left(
-    \underbrace{Z_2 \leq a_2}_{\text{fail at IA2}} \;
-    \text{ and }
-    \underbrace{a_1 < Z_i < b_1}_{\text{continue at IA1}}
-    \; | \;
-    H_1
-  \right)
-  =
-  \beta(t_2) - \beta(t_1)
-  \right\}.
-$$
-The futility bound after IA2 can be derived in the similar logic. 
+# Classical beta-spending futility bound
 
-We consider a beta-spending function that spends 2.5% of Type II error rate at the final analysis.
+A classical $\beta$-spending bound assumes a constant treatment effect over time, i.e., proportional hazards. For this purpose, we use the average hazard ratio at the analysis of the fixed design.
 ```{r}
-betaspending <- gs_power_ahr(
+betaspending_classic <- gs_power_ahr(
   enroll_rate = enroll_rate,
-  fail_rate = fail_rate,
+  fail_rate = define_fail_rate(duration = Inf, 
+                               fail_rate = -log(.5) / 12,
+                               hr = fixedevents$analysis$ahr,
+                               dropout_rate = 0),
   ratio = ratio,
   # 2 IAs + 1 FA
   event = 512 * c(.5, .75, 1),
@@ -304,13 +318,11 @@ betaspending <- gs_power_ahr(
   test_lower = c(TRUE, TRUE, FALSE)
 )
 
-betaspending %>%
+betaspending_classic %>%
   summary() %>%
   as_gt(
     title = "Group sequential design with futility only",
-    subtitle = "Beta-spending futility bound"
+    subtitle = "Classical beta-spending futility bound"
   )
 ```
-
-
 # References

From d9a41da32da4c1176a15f7902014a18a164d8f8a Mon Sep 17 00:00:00 2001
From: LittleBeannie <yujie.zhao@merck.com>
Date: Wed, 10 Sep 2025 14:30:18 -0400
Subject: [PATCH 4/7] add `cowplot` to DESCRIPTION

---
 DESCRIPTION | 1 +
 1 file changed, 1 insertion(+)

diff --git a/DESCRIPTION b/DESCRIPTION
index 360a4ebd9..327cc9eaa 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -57,6 +57,7 @@ Imports:
 Suggests:
     covr,
     ggplot2,
+    cowplot,
     kableExtra,
     knitr,
     rmarkdown,

From cb817095ec69cf99e3979ee3b0c7d1e06b1fc37b Mon Sep 17 00:00:00 2001
From: LittleBeannie <yujie.zhao@merck.com>
Date: Wed, 10 Sep 2025 14:30:35 -0400
Subject: [PATCH 5/7] `%>%` -> `|>`

---
 vignettes/articles/story-nph-futility.Rmd | 28 +++++++++++------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index 47e4951cd..6afecb2a2 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -155,8 +155,8 @@ wieand <- gs_power_ahr(
   
 )
 
-wieand %>%
-  summary() %>%
+wieand |>
+  summary() |>
   as_gt(
     title = "Group sequential design with futility only at interim analyses",
     subtitle = "Wieand futility rule stops if HR > 1"
@@ -176,13 +176,13 @@ To illustrate this, we analyze the accumulation of events over time by `gsDesign
 find_ia_time <- function(t) {
   
   e_event0 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+    enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio)), 
     fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
     total_duration = t, simple = FALSE)
   
   e_event1 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
-    fail_rate = betaspending$fail_rate %>% 
+    enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio) * ratio), 
+    fail_rate = betaspending$fail_rate |> 
       mutate(fail_rate = fail_rate * hr) |>
       select(stratum, fail_rate, duration, dropout_rate), 
     total_duration = t, simple = FALSE)
@@ -200,19 +200,19 @@ ia1_time <- uniroot(find_ia_time, interval = c(1, 50))$root
 # expected total events
 e_event_overtime <- sapply(1:betaspending$analysis$time[3], function(t){
   e_event0 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+    enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio)), 
     fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate),
     total_duration = t, simple = TRUE)
   
   e_event1 <- expected_event(
-    enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
-    fail_rate = betaspending$fail_rate %>% 
+    enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio) * ratio), 
+    fail_rate = betaspending$fail_rate |> 
       mutate(fail_rate = fail_rate * hr) |>
       select(stratum, fail_rate, duration, dropout_rate), 
     total_duration = t, simple = TRUE)
   
   sum(e_event0) + sum(e_event1)
-}) %>% unlist()
+}) |> unlist()
 
 
 # visualization of expected total events
@@ -234,11 +234,11 @@ p1 <- ggplot(data = data.frame(time = 1:betaspending$analysis$time[3],
 
 # expected events occur first 3 months
 e_event_first3m <- sapply(1:betaspending$analysis$time[3], function(t){
-  expected_event(enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio)), 
+  expected_event(enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio)), 
                  fail_rate = betaspending$fail_rate |> select(stratum, fail_rate, duration, dropout_rate), 
                  total_duration = t, simple = FALSE)$event[1] +
-  expected_event(enroll_rate = betaspending$enroll_rate %>% mutate(rate = rate / (1 + ratio) * ratio), 
-                 fail_rate = betaspending$fail_rate %>% 
+  expected_event(enroll_rate = betaspending$enroll_rate |> mutate(rate = rate / (1 + ratio) * ratio), 
+                 fail_rate = betaspending$fail_rate |> 
                    mutate(fail_rate = fail_rate * hr) |>
                    select(stratum, fail_rate, duration, dropout_rate), 
                  total_duration = t, simple = FALSE)$event[1]
@@ -320,8 +320,8 @@ betaspending_classic <- gs_power_ahr(
   test_lower = c(TRUE, TRUE, FALSE)
 )
 
-betaspending_classic %>%
-  summary() %>%
+betaspending_classic |>
+  summary() |>
   as_gt(
     title = "Group sequential design with futility only",
     subtitle = "Classical beta-spending futility bound"

From 160d431cb5440e6fbae3edb393ff15d5deca4cb2 Mon Sep 17 00:00:00 2001
From: Keaven <temzq@yahoo.com>
Date: Fri, 12 Sep 2025 06:51:49 -0400
Subject: [PATCH 6/7] Update story-nph-futility.Rmd

Ongoing edits
---
 vignettes/articles/story-nph-futility.Rmd | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index 6afecb2a2..76180775d 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -25,7 +25,10 @@ library(ggplot2)
 
 # Overview
 
-This vignette demonstrates possible ways to set up futility bounds in clinical trial designs under the assumption of non-proportional hazards. We introduce the methods proposed by @wieand1994stopping and @korn2018interim, and also cover an alternative futility bound based on $\beta$-spending .
+This vignette demonstrates possible ways to set up futility bounds in clinical trial designs under the assumption of non-proportional hazards. 
+We review the methods proposed by @wieand1994stopping and @korn2018interim.
+To be more consistent with common practice, we propose a futility bound based on $\beta$-spending that automatically accounts for
+non-proportional hazards as assumed in the design.
 
 We start by specifying the enrollment and failure rate assumptions, following the example used by @korn2018interim (based on @chen2013statistical).
 
@@ -45,14 +48,15 @@ fail_rate <- define_fail_rate(
 )
 
 ## Study duration was 34.8 in Korn & Freidlin Table 1
-## We change to 34.86 here to obtain 512 expected events more precisely
+## We change to 34.86 here to obtain the 512 expected events they presented
 study_duration <- 34.86
 
 # randomization ratio (exp:control)
 ratio <- 1
 ```
 
-In this example, with 680 subjects enrolled over 12 months, we expect 512 events to occur within 34.86 months, yielding approximately 90.45% power.
+In this example, with 680 subjects enrolled over 12 months, we expect 512 events to occur within 34.86 months, 
+yielding approximately 90.45% power if no interim analyses are performed.
 ```{r}
 fixedevents <- fixed_design_ahr(
   alpha = 0.025, power = NULL, ratio = ratio,
@@ -71,7 +75,9 @@ fixedevents |>
 
 # Beta-spending futility bound with AHR
 
-Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
+Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. 
+At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. 
+The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
 
 Methodology, the futility bound of IA1 (denoted as $a_1$) is
 $$

From ff3e79132047a8cc591dab2253215cc5369eda88 Mon Sep 17 00:00:00 2001
From: Keaven <temzq@yahoo.com>
Date: Fri, 12 Sep 2025 09:42:12 -0400
Subject: [PATCH 7/7] Further rationale for NPH futility method vignette

---
 vignettes/articles/story-nph-futility.Rmd | 10 ++++++++++
 1 file changed, 10 insertions(+)
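
Note for reviewers (not part of the commit): the sentence added in this patch says the AHR model accounts for the assumed early lack of benefit. A minimal base-R sketch of how that feeds into the first futility bound is given here. The IA1 event count and the IA1 average hazard ratio are hypothetical, the Lan-DeMets O'Brien-Fleming spending function is an assumed choice, and the Schoenfeld approximation (information of about events / 4 under 1:1 randomization) with event fraction as spending time stands in for what `gs_power_ahr()` computes; this is an illustration, not the package's implementation.

```r
# Lan-DeMets O'Brien-Fleming-type spending function (assumed choice for this sketch)
spend_ldof <- function(t, total) {
  2 - 2 * pnorm(qnorm(1 - total / 2) / sqrt(t))
}

beta_total   <- 0.1   # total Type II error to spend (hypothetical)
events_final <- 512   # expected events at the final analysis (from the vignette)
events_ia1   <- 150   # hypothetical expected events at IA1
ahr_ia1      <- 0.9   # hypothetical AHR at IA1; a delayed effect keeps it close to 1

# Spending time approximated by the event fraction (an assumption)
t1      <- events_ia1 / events_final
beta_t1 <- spend_ldof(t1, beta_total)

# Mean of Z_1 under H1, using information ~ events / 4 for 1:1 randomization
drift_ia1 <- -log(ahr_ia1) * sqrt(events_ia1 / 4)

# Solve Pr(Z_1 <= a_1 | H1) = beta(t_1), where Z_1 ~ N(drift_ia1, 1) under H1
a1 <- drift_ia1 + qnorm(beta_t1)
a1
```

Because the drift is computed from the AHR expected at IA1 rather than from the final-analysis hazard ratio, a smaller early benefit gives a smaller drift and therefore a lower (less aggressive) futility bound for the same amount of $\beta$ spent.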

diff --git a/vignettes/articles/story-nph-futility.Rmd b/vignettes/articles/story-nph-futility.Rmd
index 76180775d..3f08890e3 100644
--- a/vignettes/articles/story-nph-futility.Rmd
+++ b/vignettes/articles/story-nph-futility.Rmd
@@ -78,6 +78,7 @@ fixedevents |>
 Beta-spending allocates the Type II error rate ($\beta$) across interim analyses in a group sequential design. 
 At each interim analysis, a portion of the total allowed $\beta$ is spent to determine the futility boundary. 
 The cumulative $\beta$ spent up to each analysis is specified by a beta-spending function ($\beta(t)$ with $\beta(0) = 0$ and $\beta(1) = \beta$).
+The AHR model for the NPH alternative hypothesis accounts for the assumed early lack of benefit.
 
 Methodology, the futility bound of IA1 (denoted as $a_1$) is
 $$
@@ -333,4 +334,13 @@ betaspending_classic |>
     subtitle = "Classical beta-spending futility bound"
   )
 ```
+
+# Conclusion
+
+As an alternative to the ad hoc methods for delayed effects proposed by @wieand1994stopping and @korn2018interim,
+we propose a $\beta$-spending method that automatically accounts for delayed effects.
+We have shown that its results compare favorably to the ad hoc methods while also controlling Type II error
+and adapting to the timing and distribution of events at each interim analysis.
+
+
 # References