
ensemble_twas_weights.ipynb review #2

@gaow

Description

Again, eyeballing but:

  1. This notebook can later be moved to https://github.com/StatFunGen/pecotmr/tree/main/inst/prototype. Unit tests can be moved into the pecotmr unit tests; the contents of README.md can be merged into this notebook or into the R code in pecotmr @danielnachun
  2. For the evaluation metric, as I mentioned in ensemble_weights.R review issues #1 , after we remove the biased in-sample metric we will only keep the ground truth and the out-of-sample Y predictions
  3. Also as mentioned in ensemble_weights.R review issues #1 , this code can be simplified once we also create an ensemble_twas_pipeline function
  4. I don't think we should set max_cv_variants to 500; it can be unlimited. I think we previously did so for practical reasons? Here we don't have to
  5. For a failed pipeline, it is best to save the data to the failure results regardless of whether save_data is set, so we can later debug what went wrong. That is, in the code below we should add the data to the saved list, and you don't have to leave it as NULL:
    if (is.null(res)) {
        cat("Pipeline failed — exiting non-zero so SoS flags this replicate\n")
        saveRDS(list(results = NULL, ensemble = NULL,
                     truth = list(rep_id = "${_replicates['rep_id']}",
                                  seed = ${_replicates['seed']},
                                  error = TRUE)),
                "${_output[0]}")
        quit(save = "no", status = 1)
    }
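    A minimal sketch of that suggestion, factored into a helper so the input data is always persisted on failure (the `save_failure` name and its `data` argument are hypothetical stand-ins for whatever inputs the pipeline consumed):

    ```r
    # Hedged sketch: always persist the pipeline inputs in the failure record,
    # regardless of the save_data flag, so failed replicates can be debugged
    # post hoc. The helper name and argument names are hypothetical.
    save_failure <- function(data, rep_id, seed, out_path) {
      saveRDS(list(results = NULL, ensemble = NULL,
                   data = data,  # inputs kept for debugging the failed replicate
                   truth = list(rep_id = rep_id, seed = seed, error = TRUE)),
              out_path)
    }
    ```

    The replicate script would then call this before `quit(save = "no", status = 1)`.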
  6. For this, we should also handle the failure the same as above: if it fails, record the data and quit:
    ens <- tryCatch(
        ensemble_weights(
            cv_results = res$twas_cv_result,
            Y = y,
            twas_weight_list = res$twas_weights
        ),
        error = function(e) {
            cat("Ensemble error:", conditionMessage(e), "\n")
            NULL
        }
    )
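    The same record-and-quit handling can wrap this call; a hedged sketch (the wrapper name is hypothetical, and `ensemble_weights` is assumed to be the pecotmr function used above):

    ```r
    # Hedged sketch: on error, return NULL so the caller can save the data
    # and exit non-zero, mirroring the pipeline-failure handling above.
    # The wrapper name is hypothetical.
    run_ensemble_safely <- function(cv_results, Y, twas_weight_list) {
      tryCatch(
        ensemble_weights(cv_results = cv_results, Y = Y,
                         twas_weight_list = twas_weight_list),
        error = function(e) {
          cat("Ensemble error:", conditionMessage(e), "\n")
          NULL  # signal failure; caller records the data and quits with status 1
        }
      )
    }
    ```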
  7. Fix the metrics as I specified above so that only a few are kept
  8. Fix the documentation to make it consistent with what we actually do, in terms of the metrics

@jaempawi we don't have to make any changes until this function is part of pecotmr, hopefully some time next week before Broad runs this at large scale. But I think once the code is in, they can run it and then we can take our time to change the benchmark pipeline. I will coordinate on Slack
