New Branch in Git repo
- make a new branch f1000_dev on image /home/ubuntu/scratch/ngseasy
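A minimal sketch of the branch step, assuming the repository is already cloned at the path above and that the remote is called origin (the remote name and the helper `make_branch` are assumptions, not taken from these notes):

```shell
#!/usr/bin/env bash
# Path and branch name are taken from the notes above.
repo_dir="/home/ubuntu/scratch/ngseasy"
branch="f1000_dev"

# make_branch is a hypothetical helper: create and switch to a new
# branch inside the given working copy.
make_branch() {
  local dir="$1" name="$2"
  git -C "$dir" checkout -b "$name"
}

# Usage (not executed here):
#   make_branch "$repo_dir" "$branch"
#   git -C "$repo_dir" push -u origin "$branch"   # assumes a remote named origin
```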
Openstack VM
- space
- send key to amos
- 30+ CPU
- max RAM
- Volume : 4TB
Images
- build images
- build tool set
- build one image with all tools
Get Genomes
- hg19.fasta
- hs37d5.fasta
- GRCh38.p7.fasta
- hs38DH.fasta
- gatk resources bundles
17.05.2016
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000001405.22_GRCh38.p7/GCA_000001405.22_GRCh38.p7_genomic.fna.gz
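A rough sketch of the download step for the GRCh38.p7 URL above. The helper names `fasta_name_for` and `fetch_genome` and the output directory are hypothetical, and the URL is copied as noted on 17.05.2016, so it may have moved since:

```shell
#!/usr/bin/env bash
# URL copied verbatim from the note above.
genome_url="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000001405.22_GRCh38.p7/GCA_000001405.22_GRCh38.p7_genomic.fna.gz"

# Derive the uncompressed file name from a .gz URL.
fasta_name_for() {
  basename "$1" .gz
}

# Download the archive into out_dir and decompress it next to the archive.
# The compressed file is kept so an interrupted download can be resumed.
fetch_genome() {
  local url="$1" out_dir="$2"
  local gz fa
  gz="${out_dir}/$(basename "$url")"
  fa="${out_dir}/$(fasta_name_for "$url")"
  mkdir -p "$out_dir"
  curl --fail --location --continue-at - --output "$gz" "$url" || return 1
  gunzip -c "$gz" > "$fa"
}

# Usage (not executed here; the directory is an assumption):
#   fetch_genome "$genome_url" /home/ubuntu/scratch/genomes
```

Renaming the result to GRCh38.p7.fasta, as listed above, would be a separate `mv` once the download has been checked.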
Get test data
- small 30-150x data set
Index Genomes
- bwa
- hg19.fasta
- hs37d5.fasta
- hs38DH.fasta
- snap
- hg19.fasta
- hs37d5.fasta
- hs38DH.fasta
- novoalign
- hg19.fasta
- hs37d5.fasta
- hs38DH.fasta
- bowtie2
- hg19.fasta
- hs37d5.fasta
- hs38DH.fasta
bwa
├── hs37d5.fasta
├── hs37d5.fasta.amb
├── hs37d5.fasta.ann
├── hs37d5.fasta.bwt
├── hs37d5.fasta.pac
├── hs37d5.fasta.sa
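The indexing plan above could be driven by a small loop like the sketch below. It only prints the commands (a dry run) so the plan can be reviewed first. The layout of one directory per aligner, each holding its own copy of the FASTA, follows the bwa listing above. The helper names, the snap output directory, and the exact binary names (`snap-aligner`, `novoindex`) are assumptions that depend on the installed versions:

```shell
#!/usr/bin/env bash
# Print the index command for one aligner and one FASTA file.
index_cmd() {
  local tool="$1" fasta="$2"
  case "$tool" in
    bwa)       echo "bwa index $fasta" ;;
    bowtie2)   echo "bowtie2-build $fasta ${fasta%.fasta}" ;;
    snap)      echo "snap-aligner index $fasta ${fasta%.fasta}_snap_index" ;;
    novoalign) echo "novoindex ${fasta%.fasta}.nix $fasta" ;;
    *)         echo "unknown aligner: $tool" >&2; return 1 ;;
  esac
}

# Print the full plan: every aligner against every genome in the list above.
print_index_plan() {
  local root="$1" tool genome
  for tool in bwa snap novoalign bowtie2; do
    for genome in hg19 hs37d5 hs38DH; do
      index_cmd "$tool" "${root}/${tool}/${genome}.fasta"
    done
  done
}

# Usage (not executed here; the root directory is an assumption):
#   print_index_plan /home/ubuntu/scratch/genomes          # review
#   print_index_plan /home/ubuntu/scratch/genomes | bash   # run
```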
PLAN BY MONDAY 23rd
giab_data_indexes
https://github.com/genome-in-a-bottle/giab_data_indexes
Test Data
- 30x Exome
- 150x Exome
- 1x WGS at 30x min. (source a better WGS data set, as the X10 data is messy)
GATK Gold Standard Run
- run bwa-realign-bqsr-haplotypecaller on all 3 data sets
This is the "Gold Standard". This will take a week if there are no bugs.
The Glue
Open :-
- BASH done better than before
- logging
- read a user supplied config file (spreadsheet like)
- user specifies the pipeline
- SJN TO ADD CONFIG PARAMETER LIST
- consider converting to .yaml behind the scenes
- self checks: if the input exists, move on
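A minimal sketch of how the glue could look in BASH, covering logging, a spreadsheet-like config, and the input self check. The column layout (sample,fastq1,fastq2,aligner) and the function names are placeholders until the config parameter list is added. The sketch only reports which samples would run and does not launch anything:

```shell
#!/usr/bin/env bash
# Timestamped log line on stderr, so stdout stays free for results.
log() {
  printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" >&2
}

# Read a comma-separated config with the hypothetical columns
# sample,fastq1,fastq2,aligner. For each row, check that both inputs exist;
# if so, print the sample name (the pipeline step would start here),
# otherwise log a warning and skip the row.
run_config() {
  local config="$1" sample fq1 fq2 aligner
  if [ ! -r "$config" ]; then
    log ERROR "cannot read config: $config"
    return 1
  fi
  while IFS=, read -r sample fq1 fq2 aligner || [ -n "$sample" ]; do
    [ -z "$sample" ] && continue            # blank line
    [ "$sample" = "sample" ] && continue    # header row
    if [ -e "$fq1" ] && [ -e "$fq2" ]; then
      log INFO "$sample: inputs found, would run $aligner"
      echo "$sample"
    else
      log WARN "$sample: input missing, skipping"
    fi
  done < "$config"
}

# Usage (not executed here):
#   run_config samples.csv 2>> pipeline.log
```

Keeping the log on stderr means the list of runnable samples can be piped straight into the next step while the log goes to a file.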
RECON BY MONDAY NEXT WEEK