Once you obtain your gene expression matrix
-
Similarity Measures
-
Calculate the Pearson correlation coefficient (PCC) or Spearman's rank correlation (Similarity measures folder):
python 01_PCC_pandas_20180918.py <expression matrix>
python 01_spearman_pandas_20180918.py <expression matrix>
-
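As a rough illustration of what this step produces, the sketch below builds gene-by-gene PCC and Spearman matrices with pandas. It is not the code in the scripts above: the toy matrix, the gene and sample names, and the assumption that rows are genes and columns are samples are all made up for the example.

```python
import pandas as pd

# Hypothetical toy matrix: rows are genes, columns are samples (an assumption).
expr = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [4.0, 3.0, 2.0, 1.0],
     [1.0, 4.0, 9.0, 16.0]],
    index=["geneA", "geneB", "geneC", "geneD"],
    columns=["s1", "s2", "s3", "s4"],
)

# DataFrame.corr correlates columns, so transpose to correlate genes.
pcc = expr.T.corr(method="pearson")

# Spearman's rank correlation is the Pearson correlation of the ranks,
# so rank each gene across samples first, then correlate as before.
spearman = expr.rank(axis=1).T.corr(method="pearson")

print(pcc.round(3))
print(spearman.round(3))
```

Both results are symmetric gene-by-gene matrices with 1 on the diagonal.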
Get random background for gene pairs (Clustering/PCC-SP folder):
python get_random_distr_from_PCCmat.py <pcc_matrix> <number of random draws> <threshold>
-
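One way to think about the random background is as the distribution of PCC values over randomly drawn gene pairs, with the threshold taken at a chosen percentile of that distribution. The sketch below follows that reading. It is an assumption, not a description of get_random_distr_from_PCCmat.py: the function name `random_pair_threshold`, its parameters, and the interpretation of the threshold as a percentile are hypothetical.

```python
import numpy as np

def random_pair_threshold(pcc, n_draws, percentile, seed=0):
    """Return the PCC value at `percentile` among randomly drawn gene pairs.

    `pcc` is a square NumPy array of pairwise correlations (assumed layout).
    """
    rng = np.random.default_rng(seed)
    n = pcc.shape[0]
    i = rng.integers(0, n, size=n_draws)
    # An offset in 1..n-1 guarantees j != i, so self-pairs are never drawn.
    j = (i + rng.integers(1, n, size=n_draws)) % n
    return float(np.percentile(pcc[i, j], percentile))

# Hypothetical data: 50 unrelated "genes" measured in 12 samples.
rng = np.random.default_rng(1)
expr = rng.normal(size=(50, 12))
pcc = np.corrcoef(expr)  # rows are treated as variables (genes)

pcc95 = random_pair_threshold(pcc, n_draws=10_000, percentile=95)
print(pcc95)
```

A value obtained this way could then serve as the <random_threshold> argument in the clustering step, if the scripts use the same definition.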
-
Clustering
For PCC/SP:
python get_clust_from_PCCmat_multipr.py <pcc_matrix> <random_threshold> <number of cpu nodes>
Other clustering techniques include:
kmeans, hclust (ward, complete and average linkages), cmeans, akkmeans, WGCNA
For mutual rank see:
-
Calculate pathway EC and random pathway ECs (for pathways only):
EC = expression coherence = (# of gene pairs with PCC > PCC95) / (total # of gene pairs)
For a pathway with n genes, the total number of gene pairs is n*(n-1)/2, which excludes self-pairs.
-
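The EC definition above can be sketched directly in Python. This is a minimal example under stated assumptions: the function name `expression_coherence`, the toy PCC matrix, the gene labels, and the PCC95 value of 0.6 are all hypothetical, and the PCC matrix is assumed to be a pandas DataFrame indexed by gene on both axes.

```python
from itertools import combinations

import pandas as pd

def expression_coherence(pcc, genes, pcc95):
    """Fraction of gene pairs in `genes` whose PCC exceeds `pcc95`."""
    # combinations() yields each unordered pair once: n*(n-1)/2 pairs,
    # with no self-pairs.
    pairs = list(combinations(genes, 2))
    if not pairs:
        raise ValueError("a pathway needs at least two genes")
    n_above = sum(pcc.loc[a, b] > pcc95 for a, b in pairs)
    return n_above / len(pairs)

# Hypothetical PCC matrix for a four-gene pathway.
pcc = pd.DataFrame(
    [[1.0, 0.9, 0.2, 0.8],
     [0.9, 1.0, 0.1, 0.7],
     [0.2, 0.1, 1.0, 0.3],
     [0.8, 0.7, 0.3, 1.0]],
    index=list("ABCD"),
    columns=list("ABCD"),
)

# Pairs above 0.6: A-B, A-D, B-D, so EC = 3 / 6.
ec = expression_coherence(pcc, ["A", "B", "C", "D"], pcc95=0.6)
print(ec)  # 0.5
```

For the random pathway ECs, the same function could be applied to randomly sampled gene sets of the same size n.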
Visualize clusters
-
Normalize expression matrix: all values are normalized from 0 to 1 per gene
python normalization.py <expression matrix> <row or col>
-
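Scaling each gene to the 0 to 1 range is commonly done with min-max normalization, sketched below with pandas. This is an illustration, not the contents of normalization.py: the function name `minmax_normalize`, the `by` parameter (mirroring the <row or col> argument), and the choice to map constant genes to 0 are assumptions.

```python
import numpy as np
import pandas as pd

def minmax_normalize(expr, by="row"):
    """Scale each row (by="row") or column (by="col") to the range 0..1."""
    if by not in ("row", "col"):
        raise ValueError("by must be 'row' or 'col'")
    reduce_axis = 1 if by == "row" else 0  # axis the min/max are taken over
    align_axis = 0 if by == "row" else 1   # axis the per-gene values align on
    lo = expr.min(axis=reduce_axis)
    # A constant gene has zero range; use NaN to avoid dividing by zero.
    span = (expr.max(axis=reduce_axis) - lo).replace(0, np.nan)
    scaled = expr.sub(lo, axis=align_axis).div(span, axis=align_axis)
    # Assumption: constant genes (and any missing values) are set to 0.
    return scaled.fillna(0.0)

# Hypothetical matrix with genes as rows.
expr = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [10.0, 10.0, 10.0],
     [0.0, 5.0, 10.0]],
    index=["g1", "g2", "g3"],
    columns=["s1", "s2", "s3"],
)

norm = minmax_normalize(expr, by="row")
print(norm)
```

Use by="col" when genes are stored as columns instead.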
Combine the expression matrix with each cluster
python combine_exressionmatrix.py <cluster file> <normalized expression file>
-
Get visualized expression cluster
The input is the output of step 2 (the cluster file combined with the normalized expression matrix). This script is meant to be run locally on your computer.
coexpression_profile_from_cluster_plot_loop.R
-