This review was done by eyeballing the code, so some comments may not be correct. Further AI-assisted review might help:
- We need to make sure Y aligns with each fold of data by sample name rather than assuming the samples follow the default order. At this point I am not sure how your Y connects to the cv_results output; if the connection is not watertight, perhaps we need to output some quantities from pecotmr's CV code so that this match can be verified (see the alignment sketch after this list).
- I don't think it is a good idea to output some of the "performance metrics", because these are in-sample metrics and therefore overfitted. In our CV results we output metrics because they are out-of-sample and reliable. I would remove the in-sample metrics.
- The final weight is based only on the first data-set of the input? That is unfair; we should compute it the proper way, as in Dai et al., considering all data-sets and all methods.
- I think we should add an ensemble_twas_pipeline function to run the CV from pecotmr and then combine the results with ensembe_twas_weights. There we can provide a default (cheap) ensemble method and compare it with the full set of methods (a rough sketch follows after this list).
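
To make the first point concrete, here is a minimal sketch of the kind of alignment check I have in mind. It assumes the CV output carries per-fold sample identifiers; the field names (`cv_results$folds`, `sample_ids`) are hypothetical placeholders, not the actual pecotmr output structure.

```r
# Hypothetical alignment check: match Y to each CV fold by sample name,
# never by position. Field names below are assumed, not pecotmr's actual API.
check_fold_alignment <- function(Y, cv_results) {
  stopifnot(!is.null(rownames(Y)))  # Y must be indexed by sample name
  for (k in seq_along(cv_results$folds)) {
    fold_ids <- cv_results$folds[[k]]$sample_ids
    missing <- setdiff(fold_ids, rownames(Y))
    if (length(missing) > 0) {
      stop(sprintf("Fold %d: %d fold samples not found in Y", k, length(missing)))
    }
    # Reorder Y by name for this fold instead of trusting the default order
    Y_fold <- Y[fold_ids, , drop = FALSE]
  }
  invisible(TRUE)
}
```

If pecotmr's CV code does not currently export sample IDs per fold, adding them to cv_results would make this check possible on the caller's side.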
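
For the last point, here is a rough sketch of what ensemble_twas_pipeline could look like. The function names and signatures used here (twas_weights_cv, the arguments passed to ensembe_twas_weights) are assumptions about the pecotmr interface, so treat this as pseudocode for the intended flow rather than a working implementation.

```r
# Hypothetical wrapper: run pecotmr CV, then build both a cheap default
# ensemble and the full ensemble so the two can be compared.
ensemble_twas_pipeline <- function(X, Y, methods, cheap_methods = methods[1]) {
  # 1. Cross-validation from pecotmr (assumed function name and signature)
  cv_results <- twas_weights_cv(X = X, Y = Y, methods = methods)

  # 2. Default (cheap) ensemble vs. full ensemble over all methods,
  #    both driven by the out-of-sample CV results
  cheap <- ensembe_twas_weights(cv_results, methods = cheap_methods)
  full  <- ensembe_twas_weights(cv_results, methods = methods)

  list(cv = cv_results, cheap_ensemble = cheap, full_ensemble = full)
}
```

Comparing cheap_ensemble against full_ensemble would tell us whether the expensive full set actually buys anything.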