
ensemble_twas_weights.ipynb review #2

@gaow

Description

Again, eyeballing but:

  1. This notebook can later be moved to https://github.com/StatFunGen/pecotmr/tree/main/inst/prototype. Unit tests can be moved into the pecotmr unit tests; the contents of README.md can be merged into this notebook or into the R code in pecotmr @danielnachun
  2. For the evaluation metric, as I mentioned in ensemble_weights.R review issues #1 , after we remove the biased in-sample metric we will only keep the ground truth and the out-of-sample Y predictions
  3. Also as mentioned in ensemble_weights.R review issues #1 , this code can be simplified once we also create an ensemble_twas_pipeline function
  4. I don't think we should set max_cv_variants to 500; it can be unlimited. I think we previously did so for practical reasons? Here we don't have to
  5. For a failed pipeline, it is best to save the data to the failure results regardless of whether save_data is set, so we can later debug what went wrong. That is, in the code below we should add the data to the saved list, and you don't have to leave it as NULL:
    if (is.null(res)) {
        cat("Pipeline failed — exiting non-zero so SoS flags this replicate\n")
        saveRDS(list(results = NULL, ensemble = NULL,
                     truth = list(rep_id = "${_replicates['rep_id']}",
                                  seed = ${_replicates['seed']},
                                  error = TRUE)),
                "${_output[0]}")
        quit(save = "no", status = 1)
    }
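    A minimal sketch of that suggestion, factored into a helper so the input data is always persisted on failure (the `save_failure` name and its `data` argument are hypothetical stand-ins for whatever inputs the pipeline consumed):

    ```r
    # Hedged sketch: always persist the pipeline inputs in the failure record,
    # regardless of the save_data flag, so failed replicates can be debugged
    # post hoc. The helper name and argument names are hypothetical.
    save_failure <- function(data, rep_id, seed, out_path) {
      saveRDS(list(results = NULL, ensemble = NULL,
                   data = data,  # inputs kept for debugging the failed replicate
                   truth = list(rep_id = rep_id, seed = seed, error = TRUE)),
              out_path)
    }
    ```

    The replicate script would then call this before `quit(save = "no", status = 1)`.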
  6. For this, we should also handle the failure the same as above: if it fails, record the data and quit:
    ens <- tryCatch(
        ensemble_weights(
            cv_results = res$twas_cv_result,
            Y = y,
            twas_weight_list = res$twas_weights
        ),
        error = function(e) {
            cat("Ensemble error:", conditionMessage(e), "\n")
            NULL
        }
    )
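    The same record-and-quit handling can wrap this call; a hedged sketch (the wrapper name is hypothetical, and `ensemble_weights` is assumed to be the pecotmr function used above):

    ```r
    # Hedged sketch: on error, return NULL so the caller can save the data
    # and exit non-zero, mirroring the pipeline-failure handling above.
    # The wrapper name is hypothetical.
    run_ensemble_safely <- function(cv_results, Y, twas_weight_list) {
      tryCatch(
        ensemble_weights(cv_results = cv_results, Y = Y,
                         twas_weight_list = twas_weight_list),
        error = function(e) {
          cat("Ensemble error:", conditionMessage(e), "\n")
          NULL  # signal failure; caller records the data and quits with status 1
        }
      )
    }
    ```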
  7. Fix the metrics as I specified above so that only a few are kept
  8. Fix the documentation to make it consistent with what we actually do, in terms of the metrics

@jaempawi we don't have to make any changes until this function is part of pecotmr, hopefully some time next week before Broad runs this at large scale. But I think once the code is in, they can run it and then we can take our time to change the benchmark pipeline. I will coordinate on Slack
