bplapply can get stuck #40

@halterc

Related to issue: carmonalab/HiTME#42

scGate can get stuck when run on a list of Seurat objects.

Specifically, the second bplapply progress bar typically gets stuck on the last worker, so the hang has to be in this function call in scGate:

preds <- bplapply(X = names(model), BPPARAM = BPPARAM, FUN = function(m) {
  col.id <- paste0(output.col.name, "_", m)
  x <- run_scGate_singlemodel(data, model = model[[m]], 
    k.param = k.param, smooth.decay = smooth.decay, 
    smooth.up.only = smooth.up.only, param_decay = param_decay, 
    pca.dim = pca.dim, nfeatures = nfeatures, min.cells = min.cells, 
    assay = assay, slot = slot, genes.blacklist = genes.blacklist, 
    pos.thr = pos.thr, neg.thr = neg.thr, verbose = verbose, 
    reduction = reduction, colname = col.id, save.levels = save.levels)
  n_pure <- sum(x[, col.id] == "Pure")
  frac.to.keep <- n_pure/nrow(x)
  mess <- sprintf("\n### Detected a total of %i pure '%s' cells (%.2f%% of total)", 
    n_pure, m, 100 * frac.to.keep)
  message(mess)
  x
})

I tried setting the bparam parameter, including the built-in "timeout" parameter, but it made no difference.

bparam = BiocParallel::MulticoreParam(workers = ncores,
                                       progressbar = TRUE,
                                       timeout = timeout)

I don't know why the timeout has no effect on the bplapply call in scGate, because it does work on a simple example like this:

library(BiocParallel)

# This runs (each task sleeps 2 s, below the 3 s timeout):
bplapply(X = 1:3, BPPARAM = MulticoreParam(workers = 3, progressbar = TRUE, timeout = 3), FUN = function(m) {
  Sys.sleep(2)
  m+1
})

# This returns a timeout error (each task sleeps 5 s, above the 3 s timeout):
bplapply(X = 1:3, BPPARAM = MulticoreParam(workers = 3, progressbar = TRUE, timeout = 3), FUN = function(m) {
  Sys.sleep(5)
  m+1
})
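To see which tasks actually hit the timeout, rather than only getting an error for the whole call, the same toy example can be wrapped in bptry() and inspected with bpok(). This is a minimal sketch, not a fix for the scGate hang. The 1 s timeout and 3 s sleep are assumptions chosen only to keep the demo short.

```r
library(BiocParallel)

# Assumption: a short timeout, chosen only to keep the demo fast.
param <- MulticoreParam(workers = 3, timeout = 1)

# bptry() returns the (partial) result list instead of raising an error.
res <- bptry(bplapply(X = 1:3, BPPARAM = param, FUN = function(m) {
  if (m == 3) Sys.sleep(3)  # only this task exceeds the timeout
  m + 1
}))

# bpok() flags which elements are real results (TRUE) and which are
# error objects (FALSE).
ok <- bpok(res)
print(ok)
```

Elements where bpok() is FALSE hold the error condition for that task, so printing res[!ok] shows whether the failure was a timeout or something else.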

I also tried wrapping the call in tryCatch and withTimeout, to enforce a timeout from outside the whole function call. It does time out, but a stuck worker is never shut down, so when I try Run.HiTME again, scGate does not start. I assume bplapply cannot be started again because the stuck zombie process was never closed and keeps lingering in the background (as seen in the Activity Monitor):

library(BiocParallel)
library(R.utils) # provides withTimeout()

# Retry 5 times
for (a in 1:5) {
  cat(paste("Attempt", a, "to Run.HiTME ...\n"))
  result <- tryCatch({
    # The 'withTimeout' function will stop the execution after x seconds
    withTimeout({
      bplapply(X = 1:3, BPPARAM = MulticoreParam(workers = 3, progressbar = TRUE, timeout = 3), FUN = function(m) {
        Sys.sleep(2)
        m+1
      })
      "Complete"
    }, timeout = 10)
  }, TimeoutException = function(te) {
    # This block runs if a timeout occurs
    cat("  Timeout occurred, retrying...\n")
    NULL # Return NULL to signify a failure
  }, error = function(er) {
    # This block runs if an error occurs
    cat("  Error occurred, retrying...\n")
    NULL # Return NULL to signify a failure
  })
  
  # If the result is not NULL, the function finished successfully
  if (!is.null(result)) {
    cat(paste("  Success on attempt", a, "with result:", result, "\n"))
    result <- NULL
    break # Exit the loop
  }
}
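One thing that might help with the lingering worker is to manage the backend explicitly: start it with bpstart() and always stop it with bpstop() in a finally clause, so the workers are torn down whether the call succeeds, errors, or times out. The sketch below only illustrates that pattern on the toy example. Whether bpstop() can actually reap a worker that is truly hung inside scGate is an assumption that has not been tested here.

```r
library(BiocParallel)

# Create the backend once and start its workers explicitly.
param <- MulticoreParam(workers = 3, timeout = 3)
bpstart(param)

result <- tryCatch({
  bplapply(X = 1:3, BPPARAM = param, FUN = function(m) {
    Sys.sleep(0.1)
    m + 1
  })
}, error = function(er) {
  cat("  Error occurred\n")
  NULL # Return NULL to signify a failure
}, finally = {
  # Always shut the workers down, even after an error or timeout.
  bpstop(param)
})

# After bpstop() the backend should no longer report as running.
print(bpisup(param))
```

Because the workers belong to param rather than to a throwaway MulticoreParam created inside the call, each retry can start from a fresh backend instead of inheriting leftover processes.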
