This repository contains the codebase for the undergraduate thesis "Modeling Tissue-Specific Aging Using Machine Learning" by Wasif Jalal and Mubasshira Musarrat, conducted at the Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, under the supervision of Dr. M. Sohel Rahman.
Place the necessary tissue-specific expression data from Adult GTEx v10 in the following directories:
- TPM Expression Data: gtex/datav10
- Read Count Data: gtex/data
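Before running the processing step, it can help to confirm that both input directories are populated. The following is a minimal sketch, not part of the repository: the function name `find_missing_inputs` and the labels are hypothetical, and only the two directory paths come from this README.

```python
from pathlib import Path

# Directory paths are taken from this README; the labels and the helper
# below are illustrative assumptions, not part of the thesis codebase.
EXPECTED_DIRS = {
    "TPM expression data": Path("gtex/datav10"),
    "Read count data": Path("gtex/data"),
}


def find_missing_inputs(root="."):
    """Return labels of expected directories that are missing or empty."""
    missing = []
    for label, rel_path in EXPECTED_DIRS.items():
        directory = Path(root) / rel_path
        # A directory counts as ready only if it exists and holds at least one entry.
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(label)
    return missing


if __name__ == "__main__":
    for label in find_missing_inputs():
        print(f"Missing or empty: {label} ({EXPECTED_DIRS[label]})")
```

Run from the repository root, the script prints one line per directory that still needs data and prints nothing when both are populated.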
To process the data into our study's format, run proc/gtexv10_to_organage.sh.
Run the following scripts:
- gcttoCsv.py
- Scripts in clustering/all_organ and clustering/per_organ
- true_age_interpolation.py
- true_age_output_view.py
- pick_genes.py
- deg_thresholding.py
- deg_thresh_finetune.py
- Set the optimal threshold from finetuning results in pick_deg_optim.py
- identify_organ_enriched_genes.py
  - Note: Artery Coronary and Aorta are treated as one organ.
- stratified_split_dthhrdy.py
- train_gtex_all_<regr>.py
- test_gtex_train.py
- tissue_agegap_analytics_multi.py
- Run: stf_sp_train_test_multi.sh
- Run: lpo_coeff_multi.sh
- all_agegap_analytics_multi.py
- agegap_lpo_stats.py
To reproduce the experiments, follow the steps outlined above in the correct order. Adjust script parameters as needed for specific analyses.
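Running the steps in sequence and stopping at the first failure can be sketched with a small driver. This is a hedged illustration only: the helpers `build_command` and `run_pipeline` are hypothetical, the step list is a subset of the scripts named above, the paths are assumed to be relative to the working directory, and choosing the interpreter from the file extension is an assumption rather than something the repository prescribes.

```python
import shlex
import subprocess
import sys

# Illustrative subset of the scripts named in this README, in listed order.
# Paths are assumed relative to the working directory; adjust as needed.
STEPS = [
    "gcttoCsv.py",
    "true_age_interpolation.py",
    "pick_genes.py",
    "stf_sp_train_test_multi.sh",
    "lpo_coeff_multi.sh",
]


def build_command(script):
    """Choose an interpreter from the file extension (an assumption)."""
    if script.endswith(".py"):
        return [sys.executable, script]
    if script.endswith(".sh"):
        return ["bash", script]
    raise ValueError(f"Unsupported script type: {script}")


def run_pipeline(steps, dry_run=True):
    """Print each command in order and, unless dry_run, execute it."""
    commands = [build_command(step) for step in steps]
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run on top of a failed earlier one.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_pipeline(STEPS, dry_run=True)
```

With `dry_run=True` the driver only prints the planned commands, which is a cheap way to review the order before committing to a long run.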
For any questions, feel free to open an issue or reach out!