Hello! Thank you again for creating and maintaining this package. I am running into a potential bug: if I call harmonize() and then run LDSC in the same script to account for test statistic inflation, I get an error. My code is as follows:

    sumstats.harmonize(
        ref_seq=args.fasta,
        ref_rsid_vcf=args.dbsnp,
        ref_infer=args.popvcf,  # Ancestry specific. Logic handled by Nextflow
        ref_alt_freq="AF",
        threads=args.threads,  # Pass threads from the Nextflow process
        sweep_mode=True,
    )

    # Perform LDSC correction
    sumstats_hapmap3 = sumstats.filter_hapmap3(inplace=False)
    sumstats_hapmap3.estimate_h2_by_ldsc(ref_ld=args.ldsc, w_ld=args.ldsc)
    if np.float64(sumstats_hapmap3.ldsc_h2['Intercept'][0]) > 1:
        # Perform correction

The error traceback is:

    Traceback (most recent call last):
      File "ampregnall/tools/nf-meta-gwas/bin/munge_sumstats.py", line 43, in <module>
        sumstats_hapmap3.estimate_h2_by_ldsc(ref_ld = args.ldsc, w_ld = args.ldsc)
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/g_Sumstats.py", line 1544, in estimate_h2_by_ldsc
        self.ldsc_h2, self.ldsc_h2_results = _estimate_h2_by_ldsc(insumstats=insumstats,
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/qc/qc_decorator.py", line 241, in wrapper
        result = func(*args, **kwargs)
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/util/util_ex_ldsc.py", line 404, in _estimate_h2_by_ldsc
        summary = estimate_h2(sumstats, args = default_kwargs, log = log)
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/extension/ldsc/ldsc_sumstats.py", line 331, in estimate_h2
        M_annot, w_ld_cname, ref_ld_cnames, sumstats, novar_cols = _read_ld_sumstats(
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/extension/ldsc/ldsc_sumstats.py", line 254, in _read_ld_sumstats
        sumstats = _merge_and_log(ref_ld, sumstats, 'reference panel LD', log)
      File "/opt/conda/lib/python3.12/site-packages/gwaslab/extension/ldsc/ldsc_sumstats.py", line 239, in _merge_and_log
        raise ValueError(msg.format(N=len(sumstats), F=noun))
    ValueError: -After merging with reference panel LD, 0 SNPs remain.

However, if I save the summary statistics from the harmonize step and then call a separate script that loads that file and performs LDSC, everything works. From a pipeline design perspective, I think separating the two processes is actually better, but something odd seems to be happening when both steps run in the same script. I have attached my log file. Thanks!
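In case it helps narrow this down, here is a minimal sketch of the kind of check that could show whether the variant IDs held in memory after harmonize() still overlap with the IDs in the LD score files. It is plain Python and not part of gwaslab: `id_overlap` is a hypothetical helper, and the toy lists stand in for the real ID columns (for example the `SNP` column of an `.l2.ldscore` file, and whichever ID column the in-memory summary statistics carry, which is an assumption on my part).

```python
import math


def _clean_ids(values):
    """Return the set of usable IDs, skipping None/NaN and trimming whitespace."""
    cleaned = set()
    for value in values:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        cleaned.add(str(value).strip())
    return cleaned


def id_overlap(sumstats_ids, ld_ids):
    """Count distinct IDs on each side and how many are shared.

    Hypothetical helper for debugging a 0-SNP merge; it only compares
    identifiers and does not reproduce the LDSC merge itself.
    """
    left = _clean_ids(sumstats_ids)
    right = _clean_ids(ld_ids)
    return {
        "n_sumstats": len(left),
        "n_ld": len(right),
        "n_shared": len(left & right),
    }


# Toy stand-ins for the real ID columns (illustrative values only).
toy_sumstats_ids = ["rs1", "rs2", None, float("nan")]
toy_ld_ids = ["rs2", " rs3 "]

overlap = id_overlap(toy_sumstats_ids, toy_ld_ids)
print(overlap)
```

If `n_shared` were 0 on the in-memory object but non-zero after saving and reloading the file, that would at least point to the ID column changing between the two paths, though this sketch cannot say why.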
gwaslab-logs.txt