Using the existing gene expression program (GEP) dataset from the Nature Methods study (Li et al., 2025), we want to analyze the post-transplant CD34+ dataset with starCAT. The resulting dataset should show the usage scores for each of the GEPs.
- Visualizing plots in R and Python (matching the UMAP from the sc package)
- 40k reference not working for starCAT
- Negative values in cNMF data
- starCAT not matching all genes in reference
- 2k gene/GEP reference
- 40k gene/GEP reference
- Errors converting the Seurat object (R) to h5ad for Python
- Code/repo readability
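Two of the items above (genes in the reference not matching the query, and negative values in the cNMF data) can be checked before running starCAT. The following is a minimal sketch in plain Python, not the code used in the notebooks. The function names `gene_overlap` and `count_negatives`, the gene lists, and the spectra values are all hypothetical stand-ins for the real reference and query.

```python
def gene_overlap(reference_genes, query_genes):
    """Split reference genes into those found in the query and those missing."""
    query_set = set(query_genes)
    common = [g for g in reference_genes if g in query_set]
    missing = [g for g in reference_genes if g not in query_set]
    return common, missing


def count_negatives(spectra):
    """Count negative entries per GEP in a {gep_name: [values]} mapping."""
    return {gep: sum(v < 0 for v in values) for gep, values in spectra.items()}


# Toy data standing in for the real reference and query (hypothetical values).
reference_genes = ["CD34", "GATA1", "SPI1", "MPO"]
query_genes = ["MPO", "CD34", "ACTB", "SPI1"]
spectra = {"GEP1": [0.8, -0.1, 0.0, 0.3], "GEP2": [0.2, 0.5, 0.1, 0.0]}

common, missing = gene_overlap(reference_genes, query_genes)
coverage = len(common) / len(reference_genes)
negatives = count_negatives(spectra)

print(f"{len(common)}/{len(reference_genes)} reference genes found ({coverage:.0%})")
print("missing from query:", missing)
print("negative entries per GEP:", negatives)
```

A low coverage value or a nonzero negative count points to the reference file as the thing to inspect first. How to resolve either case (for example, restricting to common genes, as in the second starCAT run) depends on the dataset.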
postTrans_cd34
├───.gitignore
├───README.md
├───postTrans_v1
│ ├───postTrans_coordinates.csv (UMAP coordinates for Python)
│ ├───postTrans_starCAT.rf_usage_normalized.txt (starCAT first run)
│ ├───postTrans_common_starCAT.rf_usage_normalized.txt (starCAT second run with common genes)
│ ├───starcat_analysis.ipynb (analysis notebook, R)
│ └───starcat_visualization.ipynb (visualization notebook, Python)
├───postTrans_v2
│ ├───postTrans_coordinates.csv (UMAP coordinates for Python)
│ ├───seu_common_starCAT.rf_usage_normalized.txt (starCAT run with common genes)
│ ├───starcat_analysis.ipynb (analysis notebook, R)
│ └───starcat_visualization.ipynb (visualization notebook, Python)
└───References
  ├───cNMF4.gene_spectra_score.k_35.dt_0_15.csv (reference, 2k genes [csv file])
  ├───cNMF4.spectra.k_35.dt_0_15.consensus.txt (reference, 2k genes [txt file]) <= Used for starCAT analysis
  ├───cNMF4.spectra.k_35.dt_0_15.consensus.csv (reference, 40k genes [csv file])
  └───cNMF4.spectra.k_35.dt_0_15.consensus.txt (reference, 40k genes [txt file])
- postTrans_v1: First run of starCAT analysis and data visualization on the preliminary dataset
- postTrans_v2: Second run of starCAT analysis and data visualization on the completed dataset
- References: GEP/gene references needed for starCAT analysis
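Each run folder pairs a UMAP coordinates file with a starCAT usage file, and the visualization step needs both joined per cell. The sketch below shows one way to do that join with the standard library only. It is an illustration under assumptions, not the notebook code: the file layouts (barcode in the first column, comma-separated coordinates, tab-separated usage), the column names `UMAP_1` and `UMAP_2`, the barcodes, and the helper names `read_table` and `merge_by_cell` are all hypothetical.

```python
import csv
import io


def read_table(handle, delimiter):
    """Read a delimited table whose first column is the cell barcode."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)[1:]  # drop the barcode column name
    return {row[0]: dict(zip(header, map(float, row[1:]))) for row in reader if row}


def merge_by_cell(coords, usage):
    """Keep cells present in both tables and record each cell's top-scoring GEP."""
    merged = {}
    for cell, xy in coords.items():
        if cell not in usage:
            continue  # cell has coordinates but no usage scores
        scores = usage[cell]
        merged[cell] = {**xy, **scores, "top_gep": max(scores, key=scores.get)}
    return merged


# In-memory toy files (hypothetical contents) in place of the real ones.
coords_text = "cell,UMAP_1,UMAP_2\nAAAC-1,1.5,-2.0\nAAAG-1,0.3,4.1\nAAAT-1,-3.2,0.8\n"
usage_text = "\tGEP1\tGEP2\nAAAC-1\t0.7\t0.3\nAAAG-1\t0.2\t0.8\n"

coords = read_table(io.StringIO(coords_text), ",")
usage = read_table(io.StringIO(usage_text), "\t")
merged = merge_by_cell(coords, usage)

for cell, row in merged.items():
    print(cell, row["UMAP_1"], row["UMAP_2"], row["top_gep"])
```

Joining on the barcode rather than on row order guards against the two files listing cells in a different order. Cells missing from either file are dropped instead of being misaligned.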
- Immunogenomics. (n.d.). Immunogenomics/starcat: Implements *cellannotator (aka *cat/starcat), annotating scrna-seq with predefined gene expression programs. GitHub. https://github.com/immunogenomics/starCAT
- Kotliar, D., Curtis, M., Agnew, R., Weinand, K., Nathan, A., Baglaenko, Y., Zhao, Y., Sabeti, P. C., Rao, D. A., & Raychaudhuri, S. (2024). Reproducible Single Cell Annotation of Programs Underlying T-Cell Subsets, Activation States, and Functions. https://doi.org/10.1101/2024.05.03.592310
- Kotliar, D., Veres, A., Nagy, M. A., Tabrizi, S., Hodis, E., Melton, D. A., & Sabeti, P. C. (2019). Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-seq. eLife, 8. https://doi.org/10.7554/elife.43803
- Li, H., Côté, P., Kuoch, M., Ezike, J., Frenis, K., Afanassiev, A., Greenstreet, L., Tanaka-Yano, M., Tarantino, G., Zhang, S., Whangbo, J., Butty, V. L., Moiso, E., Falchetti, M., Lu, K., Connelly, G. G., Morris, V., Wang, D., Chen, A. F., Bianchi, G., … Rowe, R. G. (2025). The dynamics of hematopoiesis over the human lifespan. Nature Methods, 22(2), 422–434. https://doi.org/10.1038/s41592-024-02495-0
Research under the Li Lab, University of California, San Diego
Completed; bugs may still appear.