MatrixEQTL.R

About

Most of this script is the MatrixEQTL package itself; the remainder is a lightweight wrapper that exposes command-line options so the package can be run more flexibly. Please refer to the MatrixEQTL documentation for more information on how MatrixEQTL runs.

Important Options

Complete Options

      -sg or --snpgenotype
          File containing the sample genotypes for each SNP.
      -sl or --snplocaiton
          File containing the location of each SNP.
      -ge or --geneexpression
          File containing gene expression data for each sample.
      -gl or --genelocation
          File containing the gene location data.
      -t or --tag
          Label for your output.
      -o or --outputdir
          Directory to write output to. Default is the current working directory.
      --cis
          Significance threshold for writing out cis eQTLs. Default is 1.
      --cov
          Path to the covariates file for MatrixEQTL.
      --trans
          Significance threshold for writing out trans eQTLs. Default is 0.
      --window
          Window within which SNPs are considered cis. Any SNP-gene pair found within the window is considered cis-acting. Default is 1e6.
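
To illustrate how these options combine, the sketch below assembles a command line and prints it for review. Only the option names come from the list above. The `Rscript MatrixEQTL.R` launch form, every file name, the tag, the output directory, and the threshold values are assumptions chosen for illustration.

```shell
# Hypothetical input files; replace these with your own paths.
snp_genotypes="snp_genotypes.txt"
snp_locations="snp_locations.txt"
gene_expression="gene_expression.txt"
gene_locations="gene_locations.txt"

# Assumed launch form (Rscript); the thresholds are illustrative, not recommendations.
cmd="Rscript MatrixEQTL.R -sg $snp_genotypes -sl $snp_locations -ge $gene_expression -gl $gene_locations -t example_run -o results --cis 1e-5 --trans 1e-8 --window 1e6"

# Print the command so it can be checked before it is run.
echo "$cmd"
```

Once the printed command looks right, it can be pasted into a terminal or run with `eval "$cmd"`.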

Defaults

By default the script writes its output to the current working directory. The other defaults are:

Maximum cis distance: 1e6 bp
Threshold to write cis eQTLs: p-value <= 1
Threshold to write trans eQTLs: p-value <= 0

List Samples

Sample list format

The sample list follows a fairly simple format.

There should be no headers.
The first column should be a list of samples.
There should be one sample name per line.
Sample names should not contain file extensions (for example, .fastq and/or .gz endings should be removed, so sample1.fastq becomes sample1).
Sample names should not contain path information (for example, /home/data/sample1.fastq should be listed as sample1).
Sample names should not contain read-end information (for example, the paired-end files sample1_R1.fastq and sample1_R2.fastq should both be reduced to sample1).
The sample list should not contain duplicates (for example, the paired-end files above should produce a single sample1 entry, not two).
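
The rules above can be sketched as a short shell pipeline. This is a minimal illustration rather than part of the pipeline itself: the file names are hypothetical, and the `_R1`/`_R2` suffix pattern is assumed from the paired-end example in the list.

```shell
# Hypothetical raw file names, one per line, as they might appear in a data directory.
raw_names='/home/data/sample1_R1.fastq.gz
/home/data/sample1_R2.fastq.gz
/home/data/sample2_R1.fastq
/home/data/sample2_R2.fastq'

# Strip the path, then the .gz and .fastq endings, then the read-end suffix,
# and finally drop duplicates so each sample appears once.
sample_list=$(printf '%s\n' "$raw_names" |
  sed -e 's|.*/||' -e 's/\.gz$//' -e 's/\.fastq$//' -e 's/_R[12]$//' |
  sort -u)

printf '%s\n' "$sample_list"
```

Each `sed` expression handles one rule, so a different naming scheme only requires changing the matching expression.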

Automatic name translation

Often the sample names attached to fastq files differ from the names needed downstream. For example, genotype files may contain unique identifiers that differ from the fastq file names. To handle this, users can introduce different names for their samples early in the pipeline by adding an optional second column of sample names to the sample list. The first column still identifies the input sample, and outputs are renamed to match the corresponding name in the second column. Name translation takes place only during the alignment steps, so users who need to translate names should do so from the beginning. The list only needs to be made once, and it will not interfere with downstream analysis.
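
As an illustration of the two-column layout, the sketch below writes a small example list and looks up the translated name for one sample. It is not the pipeline's own code. The tab separator, the file name `sample_list_example.txt`, and the `GT_0001`-style identifiers are all assumptions, since the column delimiter is not specified here.

```shell
# Hypothetical two-column sample list: input sample name, then the name to use in outputs.
# A tab separator is assumed.
printf 'sample1\tGT_0001\nsample2\tGT_0002\n' > sample_list_example.txt

# Look up the translated name for one input sample.
translated=$(awk -F '\t' -v s="sample2" '$1 == s { print $2 }' sample_list_example.txt)
echo "$translated"
```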

Generating the sample list

A quick and easy way to generate the sample list is the Unix command ls | cut -f 1 -d <delimiter> > sample_list.txt. This splits each file name into fields at the delimiter, and -f selects which field to keep. Please see the cut manual page for more details.
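
A concrete version of that command might look like the sketch below, assuming file names such as sample1.fastq where `.` is a suitable delimiter. The scratch directory and file names are hypothetical, and `sort -u` is an addition that removes repeated names.

```shell
# Work in a scratch directory with hypothetical file names.
workdir=$(mktemp -d)
touch "$workdir/sample1.fastq" "$workdir/sample2.fastq"

# Keep the first '.'-delimited field of each file name and drop repeats.
# Capturing the output first keeps sample_list.txt itself out of the listing.
sample_list=$(ls "$workdir" | cut -f 1 -d '.' | sort -u)
printf '%s\n' "$sample_list" > "$workdir/sample_list.txt"

cat "$workdir/sample_list.txt"
```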
