Cross-Ancestry Polygenic Risk Scores Enhance Alzheimer’s Disease Risk Prediction in Multiethnic Cohorts

Meri Okorie, Caroline Jonson, PhD, Alexis P. Oddi, Patricia A. Castruita, Brian Fulton-Howard, PhD, Kristine Yaffe, MD, Jennifer S. Yokoyama, PhD, Chinedu Udeh-Momoh, PhD, Shea J. Andrews, PhD for the Alzheimer’s Disease Sequencing Project and the Healthy Aging Brain Study - Health Disparities

Project Information

Evaluation of single-, multi-, and cross-ancestry approaches to Alzheimer's disease polygenic risk scores in diverse cohorts. Association analysis of PRS models with AD diagnosis and endophenotypes.

Data Preprocessing

Quality control of the ADSP dataset

  1. Pre-filtering of low-quality samples using DP and GQ (PLINK 1.9, available at https://www.cog-genomics.org/plink/)
  2. Default GenoTools variant- and sample-level QC and filtering (https://github.com/dvitale199/GenoTools) + MAF filter
  3. Genetic ancestry estimation using pgsc_calc (pgsc_calc available at https://github.com/PGScatalog/pgsc_calc)
  4. Phenotype harmonization followed the NIAGADS phenotype harmonization protocol (https://github.com/NIAGADS/ADSPIntegratedPhenotypes); our adapted code can be found at workflow/scripts/makingphenofiles.qmd
  5. Admixture analysis was conducted using ADMIXTURE v1.3 available at https://github.com/NovembreLab/admixture/tree/master/releases

Quality control of the HABS-HD dataset

  1. Variant-level QC: exclude SNPs with call rate < 0.95 or HWE p < 1 × 10⁻⁶.
  2. Sample-level QC: remove samples with call rate < 0.95, sex discordance (X-chromosome heterozygosity), outlier heterozygosity, or cryptic relatedness (IBD > 0.1875 using KING).
  3. Imputation performed on the TOPMed Imputation Server (vR3) using Eagle (phasing) and Minimac3 (imputation).
  4. Post-imputation QC: remove variants with r² < 0.3 or MAF < 0.01; merge ancestry groups; remove poorly imputed variants (call rate < 95%).
  5. Genetic ancestry estimation using pgsc_calc (pgsc_calc available at https://github.com/PGScatalog/pgsc_calc)
  6. Excluded individuals with major neurological, psychiatric, or medical conditions affecting assessments.
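
The variant-level thresholds in steps 1 and 4 can be sketched as simple predicates. This is a minimal illustration only, not the pipeline's actual code: the `Variant` record, its field names, and the example values are hypothetical, and in practice these filters would be applied with dedicated genetics tools rather than pure Python.

```python
from dataclasses import dataclass

# Thresholds copied from the QC steps listed above.
MIN_CALL_RATE = 0.95
MIN_HWE_P = 1e-6
MIN_IMPUTATION_R2 = 0.3
MIN_MAF = 0.01


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative."""
    variant_id: str
    call_rate: float  # fraction of samples with a non-missing genotype
    hwe_p: float      # Hardy-Weinberg equilibrium test p-value
    r2: float         # imputation quality
    maf: float        # minor allele frequency


def passes_pre_imputation_qc(v: Variant) -> bool:
    # Step 1: exclude variants with call rate < 0.95 or HWE p < 1e-6.
    return v.call_rate >= MIN_CALL_RATE and v.hwe_p >= MIN_HWE_P


def passes_post_imputation_qc(v: Variant) -> bool:
    # Step 4: exclude variants with r2 < 0.3, MAF < 0.01, or call rate < 95%.
    return (
        v.r2 >= MIN_IMPUTATION_R2
        and v.maf >= MIN_MAF
        and v.call_rate >= MIN_CALL_RATE
    )


# Made-up example records, one per failure mode.
variants = [
    Variant("var_ok", 0.99, 0.5, 0.9, 0.20),
    Variant("var_low_r2", 0.99, 0.5, 0.1, 0.20),
    Variant("var_rare", 0.99, 0.5, 0.9, 0.001),
    Variant("var_hwe", 0.99, 1e-8, 0.9, 0.20),
]

pre_ok = [v.variant_id for v in variants if passes_pre_imputation_qc(v)]
post_ok = [v.variant_id for v in variants if passes_post_imputation_qc(v)]
print(pre_ok)
print(post_ok)
```

Keeping the thresholds as named constants makes it easy to check them against the values reported in the list.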

PRS Construction

Pruning and thresholding (P+T) PRS models (EUR and MAMA) were constructed using PRSice (https://choishingwan.github.io/PRSice/) and PLINK 1.9 (https://www.cog-genomics.org/plink/). PRSCSx PRS models were constructed using PRSCSx (https://github.com/getian107/PRScsx.git).
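
For readers unfamiliar with P+T scoring, the thresholding and scoring steps can be sketched as follows. This is a conceptual illustration, not the PRSice or PLINK implementation: it assumes LD pruning has already produced an independent SNP set, the `SnpWeight` record and all example values are hypothetical, and it returns an unnormalized weighted sum, whereas real tools may average or standardize scores.

```python
from typing import Dict, List, NamedTuple


class SnpWeight(NamedTuple):
    """Hypothetical GWAS summary-statistic row (assumed already LD-pruned)."""
    snp_id: str
    beta: float     # per-allele effect size
    p_value: float  # GWAS association p-value


def threshold_snps(weights: List[SnpWeight], p_threshold: float) -> List[SnpWeight]:
    # Thresholding: keep SNPs whose p-value is at or below the cutoff.
    return [w for w in weights if w.p_value <= p_threshold]


def score_individual(dosages: Dict[str, float], weights: List[SnpWeight]) -> float:
    # Weighted sum of effect-allele dosages (0 to 2) over the retained SNPs;
    # SNPs absent from the individual's genotype data are skipped.
    return sum(w.beta * dosages[w.snp_id] for w in weights if w.snp_id in dosages)


# Made-up weights and genotype dosages for one individual.
weights = [
    SnpWeight("rs1", 0.2, 1e-9),
    SnpWeight("rs2", -0.1, 1e-4),
    SnpWeight("rs3", 0.05, 0.3),
]
dosages = {"rs1": 2.0, "rs2": 1.0, "rs3": 2.0}

selected = threshold_snps(weights, p_threshold=1e-3)
prs = score_individual(dosages, selected)
print([w.snp_id for w in selected], prs)
```

In practice several p-value thresholds are usually evaluated, and the choice among them is made on held-out data.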

1000 Genomes Project phase 3 reference LD panels were downloaded from the PRSCSx GitHub repository (https://github.com/getian107/PRScsx.git) and were used for both P+T and PRSCSx models. HapMap 3 reference SNPs (downloaded from https://www.broadinstitute.org/medical-and-population-genetics/hapmap-3) were used to select SNP sets for PRSCSx.

Data Structure

Project directory:

Tree diagram for data, code, outputs within the project directory.

project_directory # The working directory
└── workflow
    └── script
        ├── analyses_regression.qmd
        ├── data_harmonization.qmd
        ├── prscsx.py
        └── data_visualziation.qmd

Funding sources

C.J. is supported in part by the NIH Intramural Center for Alzheimer’s and Related Dementias (CARD), project NIH-NIA ZIAAG000534. J.S.Y. receives funding from NIH-NIA R01AG062588, R01AG057234, P30AG062422, P01AG019724, and U19AG079774; NIH-NINDS U54NS123985; the Rainwater Charitable Foundation; the Alzheimer’s Association; the Global Brain Health Institute; Genentech; the French Foundation; and the Mary Oakley Foundation. This work was conducted using the National Alzheimer’s Coordinating Center Uniform Dataset under application 10238; the Alzheimer’s Disease Neuroimaging Initiative under application SJA; and the Alzheimer’s Disease Sequencing Project under application 10050. S.J.A. is supported by the National Alzheimer’s Coordinating Center New Investigator Award.