bmmoore43/Gene_coexpression_scripts

 
 

Gene_coexpression_scripts

Once you have your gene expression matrix, run the following steps:

  1. Similarity Measures

    1. Calculate PCC (Pearson correlation coefficient) or Spearman's rank correlation (Similarity measures folder):

           python 01_PCC_pandas_20180918.py <expression matrix>
      
           python 01_spearman_pandas_20180918.py <expression matrix>
      
    2. Get random background for gene pairs (Clustering/PCC-SP folder):

           python get_random_distr_from_PCCmat.py <pcc_matrix> <number of random draws> <threshold>
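The two similarity steps above can be illustrated with a minimal pandas sketch. It is not the code of the repository scripts: the function names `gene_correlation` and `random_pair_threshold` are hypothetical, the matrix is assumed to have genes as rows and samples as columns, and the threshold argument is assumed to be a percentile of the correlations among randomly drawn gene pairs.

```python
import numpy as np
import pandas as pd


def gene_correlation(expr: pd.DataFrame, method: str = "pearson") -> pd.DataFrame:
    """Gene-by-gene correlation matrix (assumes rows = genes, columns = samples)."""
    # DataFrame.corr correlates columns, so transpose to put genes in columns.
    return expr.T.corr(method=method)


def random_pair_threshold(corr: pd.DataFrame, n_draws: int,
                          percentile: float = 95.0, seed: int = 0) -> float:
    """Percentile of the correlation values of randomly drawn gene pairs.

    Assumption: the background is built by sampling gene pairs uniformly
    at random and discarding self-pairs.
    """
    rng = np.random.default_rng(seed)
    values = corr.to_numpy()
    n_genes = values.shape[0]
    i = rng.integers(0, n_genes, size=n_draws)
    j = rng.integers(0, n_genes, size=n_draws)
    keep = i != j  # drop self-pairs, whose correlation is trivially 1
    return float(np.percentile(values[i[keep], j[keep]], percentile))


# Toy expression matrix: geneB follows geneA exactly, geneC is its mirror image.
expr = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [4.0, 3.0, 2.0, 1.0]],
    index=["geneA", "geneB", "geneC"],
    columns=["s1", "s2", "s3", "s4"],
)
pcc = gene_correlation(expr, "pearson")
scc = gene_correlation(expr, "spearman")
threshold = random_pair_threshold(pcc, n_draws=1000, percentile=95)
```

On a real matrix, the returned threshold would play the role of the `<random_threshold>` argument used in the clustering step, under the assumptions stated above.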
      
  2. Clustering

    For PCC/SP:

         python get_clust_from_PCCmat_multipr.py <pcc_matrix> <random_threshold> <number of cpu nodes>
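For intuition, one simple way to turn a correlation matrix and a threshold into clusters is to link every gene pair whose correlation exceeds the threshold and take the connected components of that graph. The sketch below shows this idea only; it is an assumption and may differ from what `get_clust_from_PCCmat_multipr.py` actually does. The function name `clusters_above_threshold` is hypothetical.

```python
import pandas as pd


def clusters_above_threshold(corr: pd.DataFrame, threshold: float) -> list:
    """Connected components of the graph whose edges are pairs with corr > threshold."""
    genes = list(corr.index)
    neighbours = {g: set() for g in genes}
    # Build an undirected graph from the upper triangle of the matrix.
    for pos, a in enumerate(genes):
        for b in genes[pos + 1:]:
            if corr.loc[a, b] > threshold:
                neighbours[a].add(b)
                neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in genes:
        if start in seen:
            continue
        # Depth-first traversal collects every gene reachable from `start`.
        stack = [start]
        seen.add(start)
        members = []
        while stack:
            current = stack.pop()
            members.append(current)
            for nb in neighbours[current]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(members))
    return clusters


# Toy correlation matrix with two obvious groups: (A, B) and (C, D).
toy_genes = ["A", "B", "C", "D"]
toy_corr = pd.DataFrame(0.1, index=toy_genes, columns=toy_genes)
for g in toy_genes:
    toy_corr.loc[g, g] = 1.0
toy_corr.loc["A", "B"] = toy_corr.loc["B", "A"] = 0.9
toy_corr.loc["C", "D"] = toy_corr.loc["D", "C"] = 0.8

toy_clusters = clusters_above_threshold(toy_corr, threshold=0.5)
```

Genes with no partner above the threshold end up as single-member clusters, which you may want to filter out before visualization.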
    

    Other clustering techniques include:

         kmeans, hclust (ward, complete and average linkages), cmeans, akkmeans, WGCNA
    

    For mutual rank see:

    https://github.com/bmmoore43/Gene_coexpression_MR

  3. Calculate pathway EC and random pathway ECs (for pathways only):

         EC = expression coherence = (# of gene pairs with PCC > PCC95) / (total # of gene pairs)
    
         For a pathway with n genes, the total number of gene pairs is [n*(n-1)]/2, which excludes self-pairs
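The EC definition above can be written as a short function. This is a sketch under stated assumptions: `corr` is a gene-by-gene PCC matrix indexed by gene ID, `pcc95` is a threshold you have already obtained from the random background, and the names `expression_coherence` and `random_pathway_ecs` are hypothetical. Random pathways are assumed here to be random gene sets of the same size as the real pathway.

```python
import random
from itertools import combinations

import pandas as pd


def expression_coherence(corr: pd.DataFrame, pathway_genes, pcc95: float) -> float:
    """Fraction of the n*(n-1)/2 pathway gene pairs with PCC > pcc95."""
    # Keep each gene once, and only genes present in the matrix.
    genes = [g for g in dict.fromkeys(pathway_genes) if g in corr.index]
    pairs = list(combinations(genes, 2))  # unordered pairs, no self-pairs
    if not pairs:
        return float("nan")
    n_above = sum(bool(corr.loc[a, b] > pcc95) for a, b in pairs)
    return n_above / len(pairs)


def random_pathway_ecs(corr: pd.DataFrame, size: int, pcc95: float,
                       n_random: int = 100, seed: int = 0) -> list:
    """EC values for random gene sets of a given size (assumed background)."""
    rng = random.Random(seed)
    all_genes = list(corr.index)
    return [expression_coherence(corr, rng.sample(all_genes, size), pcc95)
            for _ in range(n_random)]


# Toy matrix: 4 genes give 6 pairs, 2 of which exceed the threshold.
ec_genes = ["A", "B", "C", "D"]
ec_corr = pd.DataFrame(0.1, index=ec_genes, columns=ec_genes)
for g in ec_genes:
    ec_corr.loc[g, g] = 1.0
ec_corr.loc["A", "B"] = ec_corr.loc["B", "A"] = 0.9
ec_corr.loc["A", "C"] = ec_corr.loc["C", "A"] = 0.8

ec = expression_coherence(ec_corr, ec_genes, pcc95=0.5)
background = random_pathway_ecs(ec_corr, size=3, pcc95=0.5, n_random=20)
```

Comparing the pathway EC against the distribution in `background` gives a rough sense of whether the pathway is more coherent than random gene sets of the same size.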
    
  4. Visualize clusters

    1. Normalize expression matrix: all values are normalized from 0 to 1 per gene

           python normalization.py <expression matrix> <row or col>
      
    2. Combine the expression matrix with each cluster

           python combine_exressionmatrix.py <cluster file> <normalized expression file>
      
    3. Get visualized expression cluster

      The input is the output from step 2. This script is meant to be run locally on your computer.

           coexpression_profile_from_cluster_plot_loop.R
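The 0-to-1 normalization in the first visualization step corresponds to min-max scaling. Below is a minimal pandas sketch of that idea, not the code of `normalization.py`: the function name `min_max_normalize` is hypothetical, and the `axis` argument is assumed to mirror the script's `<row or col>` option, with "row" meaning one gene per row.

```python
import pandas as pd


def min_max_normalize(expr: pd.DataFrame, axis: str = "row") -> pd.DataFrame:
    """Scale values to the 0-1 range within each row or each column."""
    if axis == "row":
        low = expr.min(axis=1)
        span = (expr.max(axis=1) - low).replace(0, 1)  # constant rows map to 0
        return expr.sub(low, axis=0).div(span, axis=0)
    if axis == "col":
        low = expr.min(axis=0)
        span = (expr.max(axis=0) - low).replace(0, 1)  # constant columns map to 0
        return expr.sub(low, axis=1).div(span, axis=1)
    raise ValueError("axis must be 'row' or 'col'")


raw = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [10.0, 10.0, 10.0]],
    index=["geneA", "geneB"],
    columns=["s1", "s2", "s3"],
)
normalized = min_max_normalize(raw, axis="row")
```

Replacing a zero range with 1 is a design choice of this sketch: it avoids division by zero for genes with constant expression and leaves them at 0 across all samples.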
      
