Usage Instructions
Follow these instructions to run the ATAC-seq QC analysis pipeline on your own data on your own computer.
Note for Windows users: install Git for Windows, and wherever these instructions say to run a command in your "command terminal", run that command in your "Git Bash" shell instead.
- Install docker and/or verify that it's working
- Carefully follow all the instructions here for your operating system: https://docs.docker.com/installation/
- Do not proceed past this step until you have verified that docker is working for you. To test it, run the following command in your terminal and continue to the next step only if it completes without any errors:
docker run hello-world
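If you want to script this check, here is a minimal sketch; the `require_cmd` helper name is my own, not part of the pipeline:

```shell
# Sketch: fail fast when a required command is missing, instead of letting
# later steps error out mysteriously. require_cmd is a hypothetical helper.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 -- install it before continuing" >&2
    return 1
  fi
}
```

With that defined, `require_cmd docker && docker run hello-world` should print `found: docker` and then run the hello-world test only if docker is actually installed.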
- Clone this repository to your computer
In your command terminal, run:
git clone https://github.com/greysAcademicCode/docker-pipelines.git
You'll now have a docker-pipelines folder on your computer. You should see a few files and folders in there, including README.md.
- Later on, if changes are made to this repository, you can update the files on your computer by running git pull in your docker-pipelines folder.
- Get the bowtie2 indices
In your command terminal run:
cd docker-pipelines
bash miscScripts/getBT2Index.sh
This will download ~15GB of bowtie2 index files for mm10, mm9, hg19, and hg38; it only needs to be done once.
- Alternatively, you can trade this large download for another large download and a bunch of CPU time by re-building the indices yourself: run miscScripts/makeBT2Index.sh. This requires that you have bowtie2-build in your PATH and, optionally, udr if you want super-fast downloads. If you don't want to bother with getting udr, edit that script so that SUPER_FAST_DOWNLOAD=false.
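Once the download finishes, you can sanity-check what arrived. A sketch, assuming the index files carry the usual .bt2 extension; `check_indices` is my own helper name, and the directory you pass it is wherever getBT2Index.sh placed the files on your machine:

```shell
# Sketch: count the bowtie2 index files present for each genome build.
# check_indices is a hypothetical helper, not part of the repo; point it
# at the directory that getBT2Index.sh populated.
check_indices() {
  local dir="$1" genome n
  for genome in mm9 mm10 hg19 hg38; do
    n=$(find "$dir" -name "${genome}*.bt2*" 2>/dev/null | grep -c . || true)
    echo "$genome: $n index file(s)"
  done
}
```

For example, `check_indices someIndexFolder` prints one line per genome build so you can spot a build whose download failed (it would show 0 files).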
- Get the latest Docker image
In your command terminal, run:
docker pull greyson/pipelines
If you get errors here, note the following: Access to this container is restricted on the request of Anshul Kundaje (it contains some of his group's code). To get access, you must do the following in order:
- Create a Docker user account
- Email grey@christoforo.net a message containing your Docker user name and that you'd like access to greyson/pipelines
- Sign in as your user on the command line by running docker login
- This login is remembered for the future, so it must be done only once per computer you wish to authorize
- (Optional) Test the pipeline
In your terminal, make sure you're in the docker-pipelines directory from above and run:
bash runATACPipeline.sh
This will run the pipeline on some (very small) mouse and human example data I've included in the support file package. After waiting a bit, the result files should appear in a newly created folder called ATACPipeOutput; all of the pipeline analysis files will end up there. Look for the summary report in a file whose name ends with report.pdf
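If you'd rather not dig through the output folder by hand, one way to locate every report file (a sketch; ATACPipeOutput is the folder the test run creates):

```shell
# Sketch: list every summary report under the pipeline's output folder.
if [ -d ATACPipeOutput ]; then
  find ATACPipeOutput -name '*report.pdf'
else
  echo "ATACPipeOutput not found -- run the pipeline first" >&2
fi
```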
- Analyze your data
If the above test worked out alright, you're ready to do the analysis on your own fastq data files. This is the same process as running the test above except that you must first name your data files in a special way and put them in the proper folder structure for the pipeline to find them:
In the docker-pipelines directory from above, a folder called inputData was created when you cloned the repo. This folder contains mouse and human subfolders which hold the data to be processed for that species. Inside these species folders are data folders (named anything) that each contain two fastq input data files. You can put as many data folders as you like into the species folders; they will all be processed. Each of the two fastq files you put into a data folder must uniquely match one of the naming patterns *R1*fastq* and *R2*fastq*. So an example would be putting your two fastq input data files in a folder structure like this:
inputData/mouse/trialA/billyTheMouse_R1_brain.fastq.gz
inputData/mouse/trialA/billyTheMouse_R2_brain.fastq.gz
Then you simply run the wrapper script:
bash runATACPipeline.sh
and all of the data (for both mouse and human) you put into inputData will be processed sequentially, with outputs appearing in ATACPipeOutput for each data folder you added. Look for the QC report files ending in report.pdf in a newly created folder called ATACPipeOutput/reports.
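Before kicking off a long run, it can be worth checking that each data folder really satisfies the *R1*fastq* / *R2*fastq* matching rules. A minimal sketch; `check_data_folder` is my own name, not part of the pipeline:

```shell
# Sketch: confirm a data folder holds exactly one *R1*fastq* file and one
# *R2*fastq* file, as the pipeline's matching rules require.
# check_data_folder is a hypothetical helper, not part of the repo.
check_data_folder() {
  local dir="$1" r1 r2
  r1=$(find "$dir" -maxdepth 1 -name '*R1*fastq*' | grep -c . || true)
  r2=$(find "$dir" -maxdepth 1 -name '*R2*fastq*' | grep -c . || true)
  if [ "$r1" -eq 1 ] && [ "$r2" -eq 1 ]; then
    echo "OK: $dir"
  else
    echo "BAD: $dir (R1 matches: $r1, R2 matches: $r2)"
  fi
}
```

For the example layout above, `check_data_folder inputData/mouse/trialA` would report OK, while a folder with a missing or duplicated R1/R2 file would be flagged BAD.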