Prediction of Saccharomyces cerevisiae fitness in different environments and cross-environment prediction of fitness using transfer learning.
Data
- Genetic interactions (Costanzo): Data Dryad
- Whole genome RNA-seq data for 1,000 isolates: SRA
- Genetic marker data (.gvcf): here
- Phenotype data: 35 growth conditions, 4 replicates each; fitness = colony size normalized to the control (standard YPD medium)
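
The fitness normalization described above can be sketched as follows. This is a minimal illustration, not the procedure used by Peter et al. 2018: it assumes fitness is the ratio of the mean colony size in a condition to the mean colony size on YPD, and the function name `normalized_fitness`, the condition label `YPD_NaCl`, and the colony sizes are all hypothetical.

```python
from statistics import mean


def normalized_fitness(colony_sizes, control="YPD"):
    """Normalize colony sizes for one isolate against the control medium.

    colony_sizes: dict mapping condition name -> list of replicate colony sizes.
    Returns a dict mapping each non-control condition to
    (mean size in that condition) / (mean size on the control).
    Assumption: ratio of replicate means; the real pipeline may differ.
    """
    control_mean = mean(colony_sizes[control])
    if control_mean == 0:
        raise ValueError("control colony size is zero; cannot normalize")
    return {
        condition: mean(replicates) / control_mean
        for condition, replicates in colony_sizes.items()
        if condition != control
    }


# Hypothetical measurements for one isolate, 4 replicates per condition.
example = {
    "YPD": [100.0, 104.0, 98.0, 98.0],
    "YPD_NaCl": [50.0, 52.0, 49.0, 49.0],
}
fitness = normalized_fitness(example)
print(fitness)
```

A value below 1 would indicate that the isolate grows worse in the test condition than on the YPD control.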
Literature
- Peter et al. 2018: https://doi.org/10.1038/s41586-018-0030-5 (genetic markers, phenotype, whole genome RNA-seq)
- Costanzo et al. 2016: https://doi.org/10.1126/science.aaf1420 (genetic interactions)
| File/Directory | Description |
|---|---|
| Data | Datasets from the literature |
| Costanzo_S1/ | Data File S1. Raw genetic interaction datasets: Pair-wise interaction format |
| Costanzo_S2/ | Data File S2. Raw genetic interaction datasets: Matrix format |
| Peter_2018/ | Yeast diploid isolates' bi-allelic SNP and fitness data for 35 growth environments |
| S288C_reference_genome_R64-2-1_20150113/ | Reference yeast genome S288C files |
| All_genes_and_pathways_in_S._cerevisiae_S288c.txt | Yeast (S288C) genes and which pathways they belong to |
| All_pathways_S._cerevisiae_S288c.txt | Pathways and which yeast (S288C) genes are in them |
| Scripts | Code for various statistical and machine learning algorithms |
| 06_classify_SNPs_switchgrass.py | Peipei Wang's original code for classifying switchgrass SNPs |
| 06_classify_SNPs_yeast.ipynb | Jupyter notebook used for development |
| 06_classify_SNPs_yeast.py | Adapted from Peipei Wang's code to classify yeast SNPs |
| External_software | See the following section |
| Job_Submission_Scripts | Contains SLURM job submission scripts for each prediction model |
| yeast_rrBLUP_results | Input and output files and figures for rrBLUP modelling |
| yeast_RF_results | Output files and figures for RF modelling |

External software

| Software | Description |
|---|---|
| fastPHASE | Executable for imputation of missing genotypes from population data |
| Genomic_prediction_in_Switchgrass/ | Peipei Wang's code for rrBLUP |
| GWAS_NN | Code for "Gene-Gene Interaction Detection with Deep Learning" |
| ML-Pipeline/ | Shiu Lab Machine Learning Pipeline (RF code) |
| phase.2.1.1.linux | PHASE source code (https://stephenslab.uchicago.edu/software.html) |
| tasseladmin-tassel-5-standalone-8b0f83692ccb | TASSEL5 for kinship and linkage disequilibrium analysis |

Google Docs with information about all scripts and their development are stored on Google Drive under Segura Abá_ShiuLab/Projects/Yeast GI Network/.