
FFPolish errors out midway #11

@MoyukhShabon

I previously reported an issue with installation of FFPolish via conda.

I worked around this by following the recipe in the build.sh script in the conda directory, then creating a conda environment with all the dependencies listed in meta.yaml. With this, I could get FFPolish to run:

$ ffpolish -h
usage: ffpolish [-h] [-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                {filter,extract} ...

FFPolish - Filter Artifacts From FFPE Variant Calls
Version 0.1
Copyright (C) 2020 Matthew Nguyen

positional arguments:
  {filter,extract}      Choose an FFPolish tool
    filter              Filter variants
    extract             Extract features for re-training

optional arguments:
  -h, --help            show this help message and exit
  -ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level

However, when I tried filtering FFPE artifacts, I ran into this issue:

$ ffpolish filter -o ./ -p ffpe.ffpolish ../../../../ref/tp53_hg38.fasta ffpe.calls.vcf.gz ffpe.bam
02:35:55 AM (1102 ms) -> INFO: Running FFPolish prediction
02:35:55 AM (1103 ms) -> INFO: Converting VCF to bed file
02:35:55 AM (1117 ms) -> INFO: Preparing data
-----------------------------------------------------
Starting preprocessing

Creating sites tsv
Traceback (most recent call last):
  File "/home/moyukh/repo/FFPolish/bin/ffpolish", line 55, in <module>
    args.cores, args.seed, args.loglevel)
  File "/home/moyukh/repo/FFPolish/bin/filter.py", line 94, in filter
    prep_data = dp.PrepareData(prefix, bam, bed_file_path, ref, outdir)
  File "/home/moyukh/repo/FFPolish/bin/deepsvr_utils.py", line 428, in __init__
    self._run_bam_readcount(skip_readcount)
  File "/home/moyukh/repo/FFPolish/bin/deepsvr_utils.py", line 448, in _run_bam_readcount
    review = self._parse_bed_file(self.bed, sites_file_path, self.sample)
  File "/home/moyukh/repo/FFPolish/bin/deepsvr_utils.py", line 478, in _parse_bed_file
    manual_review = pd.read_csv(bed_file_path, sep='\t', header=None, index_col=None)
  File "/home/moyukh/miniconda3/envs/ffpolish/lib/python3.7/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/moyukh/miniconda3/envs/ffpolish/lib/python3.7/site-packages/pandas/io/parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/moyukh/miniconda3/envs/ffpolish/lib/python3.7/site-packages/pandas/io/parsers.py", line 946, in __init__
    self._make_engine(self.engine)
  File "/home/moyukh/miniconda3/envs/ffpolish/lib/python3.7/site-packages/pandas/io/parsers.py", line 1178, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/moyukh/miniconda3/envs/ffpolish/lib/python3.7/site-packages/pandas/io/parsers.py", line 2008, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
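
The last frames show pandas failing inside `pd.read_csv(bed_file_path, ...)`. As far as I understand, pandas raises `EmptyDataError` when the file it is given has no parseable content, so the intermediate BED file written during the "Converting VCF to bed file" step was probably empty. A minimal sketch of a check for that, using only the Python standard library. The helper name `bed_has_records` is hypothetical and not part of FFPolish, and the path to the intermediate BED file has to be supplied by hand:

```python
import os


def bed_has_records(path):
    """Return True if the file at `path` has at least one non-blank line.

    Hypothetical helper for debugging; not part of FFPolish.
    """
    # A missing file is treated the same as an empty one.
    if not os.path.exists(path):
        return False
    with open(path, "r") as handle:
        # Blank and whitespace-only lines do not count as records.
        return any(line.strip() for line in handle)
```

If this returns False for the BED file, the problem is likely upstream of pandas, in the VCF-to-BED conversion or in the VCF itself.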

I have double-checked my gzipped VCF, which was created from the original VCF with bcftools and indexed with GATK:

bcftools view ffpe.calls.vcf -Oz -o ffpe.calls.vcf.gz
gatk IndexFeatureFile -I ffpe.calls.vcf.gz

The compressed VCF and its index both look fine.
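
One further sanity check is to count the variant records in the compressed VCF, since a file with only header lines would be valid and indexable but would leave nothing to convert. A minimal sketch using only the standard library (bgzip output from `bcftools view -Oz` should be readable with the `gzip` module). The function name `count_vcf_records` is hypothetical and not part of FFPolish or bcftools:

```python
import gzip


def count_vcf_records(vcf_gz_path):
    """Count non-header, non-blank lines in a gzip/bgzip-compressed VCF.

    Hypothetical helper for debugging; header lines start with '#'.
    """
    count = 0
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                count += 1
    return count


# Example call, with the file name taken from the commands above:
# count_vcf_records("ffpe.calls.vcf.gz")
```

A count of zero would point to the VCF having no records; a non-zero count would suggest the records are being dropped somewhere in the conversion step instead.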

Is there a known reason for this error? I would appreciate it if anyone could suggest a way around it.
