Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,8 @@ Authors@R: c(
Maintainer: Xuewei Cao <xc2270@cumc.columbia.edu>
Description: A multi-task learning approach to variable selection regression with highly correlated predictors and sparse effects,
based on frequentist statistical inference. It provides statistical evidence to identify which subsets of predictors have non-zero
effects on which subsets of response variables, motivated and designed for colocalization analysis of multiple genetic association studies.
effects on which subsets of response variables, motivated and designed for colocalization analysis across genome-wide association studies (GWAS)
and quantitative trait loci (QTL) studies.
The ColocBoost model is described in Cao et. al. (2025) <doi:10.1101/2025.04.17.25326042>.
Encoding: UTF-8
LazyDataCompression: xz
Expand Down
3 changes: 2 additions & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
YEAR: 2025
COPYRIGHT HOLDER: StatFunGen Lab, Columbia University
COPYRIGHT HOLDER: Xuewei Cao, Haochen Sun, Ru Feng, Daniel Nachun, Kushal Dey, Gao Wang
ORGANIZATION: StatFunGen Lab, Columbia University
1 change: 0 additions & 1 deletion R/colocboost_init.R
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,6 @@ colocboost_init_model <- function(cb_data,
}
tmp$multi_correction <- multiple_testing_correction
tmp$multi_correction_univariate <- multiple_testing_correction
if (length(multiple_testing_correction) == 1) print(class(multiple_testing_correction))
if (all(multiple_testing_correction == 1)) {
tmp$stop_null <- 1
} else if (min(multiple_testing_correction) > multi_test_max) {
Expand Down
3 changes: 3 additions & 0 deletions R/colocboost_plot.R
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,9 @@ colocboost_plot <- function(cb_output, y = "log10p",
} else {
bottom <- 2
}
# - restore users' options
oldpar <- par(no.readonly = TRUE)
on.exit(par(oldpar))
if (!is.null(cb_plot_init$title)) {
par(mfrow = c(nrow, plot_cols), mar = c(bottom, 5, 2, 1), oma = c(0, 0, 3, 0))
} else {
Expand Down
15 changes: 15 additions & 0 deletions cran-comments.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
## R CMD check results

0 errors | 0 warnings | 1 note

* This is a new release.

## Addtional comments

* This package implements methods described in our paper "ColocBoost" (Cao et al., 2025), added in DESCRIPTION
* Fixed issues requested by CRAN in previous submission:
- Reduced tarball less than 5 MB
- Fixed reset users' options issues
- Added proper COPYRIGHT HOLDER and ORGANIZATION to LICENSE
- Added explanation of acronyms used in this package to inst/WORDLIST
* The examples and vignettes use small datasets to avoid long check times
178 changes: 92 additions & 86 deletions inst/WORDLIST
Original file line number Diff line number Diff line change
@@ -1,86 +1,92 @@
Biobank
Bioinformatics
CFB
COLOC
CoS
Codecov
ColocBoost
Colocalization
Colocalized
Conda
FineBoost
GTEx
GWAS
HyPrColoc
INDELs
Jager
KK
LD
MAF
Mazumder
Micromamba
NPC
Najar
Nealelab
PIPs
PLINK
Pre
Recalibrate
SEL
SuSiE
Sumstat
UKBB
VCP
VPA
Xcorr
YI
al
bioinformatics
chrom
cis
colocalization
colocalize
colocalized
conda
de
decayrate
doi
eQTL
et
grey
iteratively
jk
ld
lfsr
lth
maf
medRxiv
modularity
nd
npc
omics
phenotypes
pixi
pos
pre
probabilistically
pvalue
qc
rcond
recalibrate
recalibrated
reconciliate
repo
rss
sQTL
subsampled
sumstat
sumstats
tabix
uCoS
uS
ucos
uncolocalized
vcp
xQTL
xQTLs
# Analysis Tools and Methods
ColocBoost # Multiple trait colocalization algorithm we propsed
Colocalization # Method to identify shared genetic signals
COLOC # A pairwise colocalization method name
FineBoost # Single trait fine-mapping algorithm we proposed
HyPrColoc # Hypothesis Prioritization in multi-trait Colocalization method name
SuSiE # Sum of Single Effects regression model

# Statistical and Genetic Terms
eQTL # Expression Quantitative Trait Loci
GWAS # Genome-Wide Association Study
INDELs # Insertions and Deletions
LD # Linkage Disequilibrium
ld # Linkage Disequilibrium
MAF # Minor Allele Frequency
maf # Minor Allele Frequency
pvalue # Statistical p-value
sQTL # Splicing Quantitative Trait Loci
Sumstat # Summary Statistics
sumstat # Summary Statistics
sumstats # Summary Statistics
xQTL # Any molecular Quantitative Trait Loci

# Software, Platforms and Tools
Biobank # Refer to UK Biobank, a large-scale biomedical database and research resource
Bioinformatics # Computational biology field
bioinformatics # Computational biology field
Codecov # Code coverage testing tool
Conda # Package and environment management system
GTEx # Genotype-Tissue Expression project
Micromamba # Lightweight Conda implementation
Nealelab # Lab developing genetics GWAS summary statistics
PLINK # Whole genome association analysis toolset
tabix # Tool for indexing genomic data
UKBB # UK Biobank dataset
medRxiv # A preprint resource
pixi # An environment manager
conda # Package and environment management system

# Researcher Names
Jager # Philip L. de Jager
KK # Kushal K. Dey
Mazumder # Rahul Mazumder
Najar # Cristian F. B. Najar
CFB # Cristian F. B. Najar
YI # Yang I. Li
et # and more
al # and more

# Technical Terms
cis # Referring to nearby location of a regulatory element
chrom # Chromosome
decayrate # Decay rate, an input parameter in our package
doi # Digital Object Identifier
grey # One color name in R
iteratively # Performed through iterations
lfsr # Local False Sign Rate
lth # Lower threshold
modularity # Property of network structure
omics # Collective biological data fields
phenotypes # Observable traits
pos # Position in genome
probabilistically # Based on probability theory
qc # Quality Control
rcond # Reciprocal condition number
reconciliate # Process of resolving discrepancies
repo # Repository
rss # Residual Sum of Squares
subsampled # Analyzed using data subsets
uncolocalized # Not showing colocalization
uCoS # Trait-specific (uncolocalization) confidence set
ucos # Trait-specific (uncolocalization) confidence set
VCP # Variant colocalization probability
VPA # Variant probability of association
vcp # Variant colocalization probability
Xcorr # Cross-correlation matrix
Recalibrate # Process of adjusting statistical calibration
recalibrate # Process of adjusting statistical calibration
recalibrated # Process of adjusting statistical calibration
Colocalized # Having undergone colocalization analysis
colocalize # Having undergone colocalization analysis
colocalized # Having undergone colocalization analysis
CoS # Colocalization confidence set in our proposed ColocBoost method
NPC # Normalization probability of colocalization in our proposed ColocBoost method
PIPs # Posterior Inclusion Probabilities
SEL # Single-effect learner in our proposed ColocBoost method
uS # The number of uCoS
npc # Normalization probability of colocalization in our proposed ColocBoost method
Pre # Before
pre # Before
jk # Index used in ColocBoost
nd # Second
6 changes: 6 additions & 0 deletions vignettes/Interpret_ColocBoost_Output.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -234,19 +234,25 @@ plot(res$model_info$outcome_profile_loglik[[i]], type="p", col="#CC3333", lwd=2,


```{r objective-proximity}
# Save to restore default options
oldpar <- par(no.readonly = TRUE)
# Plotting trait-specific proximity objective
par(mfrow=c(2,3), mar=c(4,4,2,1))
for(i in 1:5){
plot(res$model_info$outcome_proximity_obj[[i]], type="p", col="#3366CC", lwd=2, xlab="", ylab="Trait-specific Objective", main = paste0("Trait ", i))
}
par(oldpar)
```

```{r objective-best}
# Save to restore default options
oldpar <- par(no.readonly = TRUE)
# Plotting trait-specific objective at the best update variant
par(mfrow=c(2,3), mar=c(4,4,2,1))
for(i in 1:5){
plot(res$model_info$outcome_coupled_best_update_obj[[i]], type="p", col="#CC3333", lwd=2, xlab="", ylab=paste0("Objective at best update variant"), main = paste0("Trait ", i))
}
par(oldpar)
```

### 3.5. Trait-specific effects information (**`ucos_details`**)
Expand Down