- SSH to the HPC and make a project directory
- Run `git clone git@github.com:schumacherlab/neolution-prep.git` (set up SSH keys or clone over HTTPS instead)
- Required: In the `neolution-prep` directory, create sub-directory `1a_variants`
  - in it, create sub-directory `vcf` and copy VCF file(s) to it
- Optional: Create directory `1b_rnaseq_data` in the `neolution-prep` folder, then:
  - create sub-directory `processed_salmon` or `processed` and copy expression level data to it (for Salmon or Cufflinks output, respectively)
  - create sub-directory `bam` and copy BAM and BAI file(s) to it (necessary if you want to determine expression of the mutant allele)
- Edit the `runConfig.R` file (some vars to check: `commonPaths`, `runOptions$varcontext`, `runOptions$neolution`)
- Fill in the `sample_info.tsv` file. Make sure the fields stay tab-separated!
- Open `prepareNeolutionInput.R`, edit the working directory and run it interactively to prepare input files for neo-antigen predictions
- If all files were generated successfully, start predictions by running the following command from the `neolution-prep` dir in a shell:

  `nohup nice -n 10 Rscript runPredictions.R > nohup_preds.out &`

  This will start predictions in the background. stdout & stderr will be written to `nohup_preds.out`

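
The directory layout from the steps above can be sketched in a few shell commands. This is a minimal sketch, assuming it is run from the project directory and that the repository lives in `./neolution-prep`; the `repo` variable and the source paths in the commented `cp` lines are placeholders, not something the pipeline defines.

```shell
# Sketch of the input directory layout; run from the project directory.
# Assumption: the repository is (or will be) cloned into ./neolution-prep.
repo="neolution-prep"

# Required: directory for the VCF file(s)
mkdir -p "$repo/1a_variants/vcf"

# Optional: RNA-seq data.
# Use processed_salmon for Salmon output, or processed for Cufflinks output.
mkdir -p "$repo/1b_rnaseq_data/processed_salmon"
mkdir -p "$repo/1b_rnaseq_data/bam"

# Copy the input files into place (source paths are placeholders):
# cp /path/to/variants/*.vcf "$repo/1a_variants/vcf/"
# cp /path/to/alignments/*.bam /path/to/alignments/*.bai "$repo/1b_rnaseq_data/bam/"
```

Once predictions are running, `tail -f nohup_preds.out` is a convenient way to follow the log.
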
| patient_id | dna_data_prefix | rna_data_prefix | hla_a_1 | hla_a_2 | hla_b_1 | hla_b_2 | hla_c_1 | hla_c_2 |
|---|---|---|---|---|---|---|---|---|
| TRIAL_ID #1 | 4152_1_CF8585_GATAGACA | 4153_1_CF8597_TTAGGCA | A0301 | A0101 | B0801 | B1601 | NA | NA |
| TRIAL_ID #2 | 4152_2_CF8714_GCCACATA | 4153_2_CF8716_ACTTGAA | A0201 | A0901 | B3603 | B5201 | NA | NA |
- The `_prefix` columns should contain the (unique) beginnings of the DNA and RNA input filenames
- Don't use special characters in HLA allele names (no asterisk `*` or colon `:`)
- Empty cells or NAs can be used to exclude alleles
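
The rules above can be checked before running the pipeline. The function below is a hypothetical helper, not part of neolution-prep. It is a rough sketch that assumes `sample_info.tsv` has one header row and nine tab-separated columns, with the HLA alleles in columns 4 to 9 as in the example table.

```shell
# Hypothetical helper (not part of neolution-prep): sanity-check a sample sheet.
# Assumptions: first line is a header; every other line has nine
# tab-separated fields; HLA alleles sit in columns 4-9.
check_sample_info() {
  awk -F '\t' '
    NR == 1 { next }   # skip the header row
    NF != 9 {
      printf "line %d: expected 9 tab-separated fields, found %d\n", NR, NF
      bad = 1
    }
    {
      # Empty cells and NA pass; only * and : are rejected.
      for (i = 4; i <= NF && i <= 9; i++)
        if ($i ~ /[*:]/) {
          printf "line %d: HLA allele \"%s\" contains * or :\n", NR, $i
          bad = 1
        }
    }
    END { exit bad }   # non-zero exit status if any problem was reported
  ' "$1"
}
```

Calling `check_sample_info sample_info.tsv` prints one message per problem and returns a non-zero status if any were found. It does not check that the prefixes are unique or that they match real filenames.
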