forked from BaderLab/netDx
Open
Labels
enhancement (New feature or request)
Description
We need a preprocessing pipeline that assesses data quality and cleans the data before the predictor is run. No such mechanism currently exists. The pipeline would run the following operations:
Identify structure in the missingness of the data.
Identify and flag outlier samples.
Run unsupervised analyses on the samples, e.g. PCA and hierarchical clustering.
For continuous-valued data, compare several similarity metrics to find the one that best separates the classes, e.g. RNAcorr.R, written by SP for PanCancer.
Apply hierarchical clustering and PCA to the classes, following the same idea.
Run univariate tests to prune the matrix of variables that goes into netDx.
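To make the scope concrete, here is a minimal sketch of three of the steps above: summarizing missingness, flagging outlier samples, and univariate pruning. It is an illustration only, written in plain Python rather than R, and none of it is existing netDx code. The function names, the variables-by-samples layout with None for missing values, the median/MAD outlier rule, the Welch t statistic, and both thresholds are assumptions chosen for the example.

```python
import math
import statistics


def missing_fraction(matrix):
    """Fraction of missing values (None) per variable (row) and per sample (column)."""
    n_vars, n_samples = len(matrix), len(matrix[0])
    per_variable = [sum(v is None for v in row) / n_samples for row in matrix]
    per_sample = [sum(row[j] is None for row in matrix) / n_vars
                  for j in range(n_samples)]
    return per_variable, per_sample


def flag_outlier_samples(matrix, sample_ids, threshold=3.5):
    """Flag samples whose mean observed value has a modified z-score above threshold.

    Assumption: median/MAD on per-sample means is only one possible outlier rule.
    """
    means = {}
    for j, sid in enumerate(sample_ids):
        observed = [row[j] for row in matrix if row[j] is not None]
        if observed:  # samples with no observed values are skipped here
            means[sid] = statistics.fmean(observed)
    center = statistics.median(means.values())
    mad = statistics.median(abs(m - center) for m in means.values())
    if mad == 0:
        return []
    return [sid for sid, m in means.items()
            if 0.6745 * abs(m - center) / mad > threshold]


def welch_t(a, b):
    """Welch's t statistic for two groups; None if it cannot be computed."""
    if len(a) < 2 or len(b) < 2:
        return None
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        return None
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def prune_variables(matrix, labels, class_a, class_b, min_abs_t=2.0):
    """Return indices of variables whose |t| between the two classes meets the cutoff."""
    keep = []
    for i, row in enumerate(matrix):
        a = [v for v, lab in zip(row, labels) if lab == class_a and v is not None]
        b = [v for v, lab in zip(row, labels) if lab == class_b and v is not None]
        t = welch_t(a, b)
        if t is not None and abs(t) >= min_abs_t:
            keep.append(i)
    return keep


# Toy data: 3 variables (rows) by 6 samples (columns), two classes.
sample_ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
labels = ["x", "x", "x", "y", "y", "y"]
matrix = [
    [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    [2.0, None, 2.1, 2.0, 2.2, None],
    [0.5, 0.4, 0.6, 0.5, 0.4, 9.0],
]

per_variable, per_sample = missing_fraction(matrix)
flagged = flag_outlier_samples(matrix, sample_ids)
kept = prune_variables(matrix, labels, "x", "y")
print(per_variable, per_sample, flagged, kept)
```

In a real pipeline the flagged samples and pruned variables would probably be reported for review rather than dropped silently, and the cutoffs would need to be tuned per data type.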