TPMI-Taiwan/tpmi

1. Phenotyping

Diseases were defined by phecodes. Quantitative traits, including anthropometric, vital-sign, and laboratory measurements, were extracted from the electronic medical record (EMR). Flow charts of the quality control for quantitative traits are in 1-Phenotyping.

2. Genotyping QC and imputation

2.1 Genotyping QC

The quality control steps for genotyping are described in 2.1-Genotyping_QC.

2.2 Imputation

Phasing was conducted with SHAPEIT5. Genome imputation was carried out with IMPUTE5. See 2.2-Imputation for details.

3. Genome-wide association study

3.1 PC-AiR, PC-Relate, and PRIMUS

PC-AiR and PC-Relate (GENESIS package) were used for PCA and relatedness estimation, and PRIMUS was used to identify the maximum unrelated set. See the pedigree reconstruction in genotyping QC for details.

3.2 Generalized linear mixed model

SAIGE was used for the mixed-effects model GWAS (SAIGE.sh and SAIGE_qtrait.sh).

3.3 Generalized linear model

PLINK2 was applied for the generalized linear model GWAS (plink_for_ldsc.sh).

4. Post GWAS analysis

4.1 Known loci replication

To evaluate the performance of our GWAS, PGRM was used to calculate the overall and power-adjusted replication rates and the actual-over-expected ratio (PGRM.R).
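As a rough illustration of these three summary metrics, the Python sketch below assumes the following definitions: the overall replication rate is replicated / tested, the power-adjusted rate is the same fraction restricted to associations with power at or above a threshold (0.8 is assumed here), and the actual-over-expected ratio is the number replicated divided by the summed power. The `Association` class and its field names are hypothetical; PGRM.R remains the reference for what was actually computed.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """One known association tested for replication (hypothetical layout)."""
    replicated: bool  # did the association replicate in this cohort?
    power: float      # estimated power to replicate, between 0 and 1


def replication_summary(assocs, power_threshold=0.8):
    """Return overall rate, power-adjusted rate, and actual/expected ratio."""
    tested = len(assocs)
    n_rep = sum(a.replicated for a in assocs)
    # Restrict to associations that were adequately powered (assumed cutoff).
    powered = [a for a in assocs if a.power >= power_threshold]
    n_rep_powered = sum(a.replicated for a in powered)
    # Expected number of replications is the sum of per-association power.
    expected = sum(a.power for a in assocs)
    nan = float("nan")
    return {
        "rr_all": n_rep / tested if tested else nan,
        "rr_power": n_rep_powered / len(powered) if powered else nan,
        "aer": n_rep / expected if expected > 0 else nan,
    }


# Toy input, not real results.
example = [
    Association(True, 0.95),
    Association(True, 0.85),
    Association(False, 0.90),
    Association(False, 0.30),
]
summary = replication_summary(example)
```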

4.2 Fine-mapping

SuSiE was used for summary-statistics-based fine-mapping (fine-mapping.sh).
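SuSiE reports a credible set for each single effect. The Python sketch below shows, under simplifying assumptions, how such a set can be formed from one effect's posterior inclusion probabilities: rank the variants and keep adding them until the cumulative probability reaches the requested coverage. The variant IDs and probabilities are made up, and SuSiE's purity filtering is omitted; this is not the content of fine-mapping.sh.

```python
def credible_set(probs, coverage=0.95):
    """Smallest set of variants whose summed probability reaches `coverage`.

    `probs` maps variant ID -> posterior probability for ONE effect.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, p in ranked:
        chosen.append(variant)
        total += p
        if total >= coverage:
            break
    return chosen


# Hypothetical probabilities for a single effect at one locus.
alpha = {"rs_a": 0.70, "rs_b": 0.20, "rs_c": 0.06, "rs_d": 0.04}
cs = credible_set(alpha)
```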

4.3 Heritability estimation

LDSC and LSH were used to estimate the SNP-based heritability (LDSC.sh and LSH.R).
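For intuition, LD score regression relies on the relation E[chi2_j] = intercept + (N * h2 / M) * l_j, where l_j is the LD score of SNP j, N the sample size, and M the number of SNPs. The Python sketch below fits that line by plain unweighted least squares on synthetic numbers. It is a simplification: the LDSC software additionally uses regression weights and a block jackknife for standard errors, and the function and argument names here are hypothetical.

```python
def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Unweighted least-squares fit of chi2 on LD score.

    Returns (h2, intercept), with h2 = slope * M / N.
    """
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    return slope * n_snps / n_samples, intercept


# Synthetic, noise-free data generated with h2 = 0.3 and intercept = 1.
N, M = 100_000, 1_000_000
ld = [10.0, 50.0, 100.0, 200.0]
chi2 = [1.0 + (N * 0.3 / M) * l for l in ld]
h2_hat, intercept_hat = ldsc_h2(chi2, ld, N, M)
```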

4.4 Gene-level heritability estimation

h2gene analysis was conducted to partition SNP-based heritability to the gene level. (H2Gene.sh)

4.5 Colocalization

To examine whether tissue-specific gene expression and the traits of interest share common causal genetic variants, coloc was used to evaluate colocalization between gene expression and each trait, using expression quantitative trait locus (eQTL) resources from 49 tissues in GTEx v8 (Coloc.R).

4.6 Genetic correlation and clustering (LDSC: phenome-wide; Popcorn: across populations)

LDSC was used to obtain pairwise genetic correlations, and Popcorn was used for the cross-population genetic correlation (genetic_correlation.sh and popcorn.sh).

5. PRS

5.1 Individual phenotype

Five popular PRS tools were used to build the single-trait PRS models.

5.2 Multi-phenotype

PRSmix+ was used to build the multi-trait PRS models (run_PRSmix_by_phecode.R).

5.3 Performance evaluation

The explained variance (r2) was used to evaluate PRS performance for quantitative traits. For disease PRS, two indices were used: the area under the receiver operating characteristic curve (AUC) and liability-scaled r2 (calc_auc.py and calc_r2.R).
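The two disease indices can be sketched in Python as follows. The AUC is computed as the probability that a randomly chosen case has a higher score than a randomly chosen control (ties count half). The liability-scale conversion assumes the widely used transformation of Lee et al. (2012), which takes an observed-scale r2, the population prevalence, and the case fraction in the sample. Whether calc_auc.py and calc_r2.R follow exactly these formulas is an assumption, and all function and parameter names are hypothetical.

```python
from statistics import NormalDist


def auc(scores, labels):
    """AUC via pairwise case/control comparison (O(cases * controls))."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def liability_r2(r2_obs, prevalence, case_fraction):
    """Convert observed-scale r2 to the liability scale (assumed Lee et al. form)."""
    nd = NormalDist()
    t = nd.inv_cdf(1.0 - prevalence)   # liability threshold
    z = nd.pdf(t)                      # normal density at the threshold
    m = z / prevalence                 # mean liability of cases
    c = (prevalence * (1 - prevalence)) ** 2 / (
        z ** 2 * case_fraction * (1 - case_fraction)
    )
    shift = m * (case_fraction - prevalence) / (1 - prevalence)
    theta = shift * (shift - t)        # ascertainment correction term
    return r2_obs * c / (1 + r2_obs * theta * c)


# Toy inputs, not real results.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
toy_r2 = liability_r2(0.05, prevalence=0.10, case_fraction=0.10)
```

For large cohorts a rank-based AUC is faster than the pairwise loop shown here; the pairwise form is kept only because it mirrors the definition directly.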
