This software is for determining Evolutionary Rate Covariation (ERC) between pairs of genes, using R. Details on ERC and its use can be found here:
Little et al. ERC 2.0 - evolutionary rate covariation update improves inference of functional interactions across large phylogenies. 2025. Genome Research. 35: 2041-2051.
Little et al. Evolutionary rate covariation is a reliable predictor of co-functional interactions but not necessarily physical interactions. 2024. eLife. https://doi.org/10.7554/eLife.93333.3
Clark et al. Evolutionary rate covariation reveals shared functionality and coexpression of genes. 2012. 22: 714-720.
You will need a .tre file containing Newick format trees of the genes you want to examine. Part of the workflow is to create a master tree of all the species and their corresponding branch lengths. In light of this, optionally, you can provide a master tree, but if you do not have one the program will make one from the list of trees you provide in your .tre file.
The code will output a correlation residual matrix and a matrix with branch values. The residual matrix has numbers representing ERC correlations; The higher the value is, the more correlated the gene pair's relative evolutionary rates are.
Check our installation page for more details.
Here we provide a simple explanation of how to use ERC, but you can also check out our step-by-step tutorial .
The script you should open and use is runERC.R. This contains the workflow, which is comprised of several data-processing (and time-consuming) subfunctions.
For an overview of what these functions input, output, and do, please visit our functions page.
First, set the tree and output file names, on lines 11 and 12. You may also need to add the full path to the other R files in this package, such as ERC_functions.R (further up in the code). Finally, you're ready to run the code!
At the end of execution, you will have a data object called "corres" which will also be saved as your output .RDS file. Use corres[["cor"]] to see and operate on the correlation matrix. corres[["count"]] represents the number of observations/branches that went into each correlation. To further examine the output, use the following analysis functions.
You will also have ft_data, the Fisher transformed matrix of data. This is what we recommend operating on.
Analysis functions to evaluate your ERC matrices are found at the end of the ERC_functions.R file. Here is what they do:
clean_list(list, names): makes sure that every item inlistis found innamespair_list(list, erc_matrix, na.val = -2): creates vector list of gene x gene ERC values, with no repeats or self x self entries. Optional NA replacement valuemake_symmetric(erc_matrix): makes input ERC matrix symmetrical.permTestMat(list,matrix,perms=10000): tests forpermspermutatinos whetherlistgene list is significantly different from genes selected by random chancefishertransformed_updated(erc_matrix): used to fisher transform the values from computeERCbetweencomplex(list1,list2, erc_matrix): returns subset matrix based on two gene lists
- Install dependencies (visit install page)
- Download package (either download from github or use
git clone) - Open runERC.R. This is the file that you will need to execute the pipeline.
- Set input and output file names on lines 11-12 of runERC.R
- Run the program
- You may need to be explicit about source file paths if the code gives you an error.
- If using Rstudio, go to Session>Set Working Directory>To Source File Location to set the file location to runERC.R's containing folder
- Manipulate corres matrices
# 10 sample genes
genes = c("NSE5_1", "NSE6_3", "CSE1_3", "CSE1_1", "EXO70_1",
"MCM2_4", "MDY2_1", "ATP1_2", "MCM5_1", "SEC8_2")
# You could also generate a random sample with:
# genes = colnames(ft_data)[sample(1:length(ft_data), 10, replace=FALSE)]
# makes a matrix of the 10 genes against themselves
# (it can be against different genes too)
ft_filtered = betweencomplex(genes,genes,sym_ft)
#output
ft_filtered