idaios/psmc
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
This software package infers population size history from a diploid sequence
using the Pairwise Sequentially Markovian Coalescent (PSMC) model. The
detailed model is described in file `psmc.tex'.
To compile the binaries, you may run
make; (cd utils; make)
After that, you may try
utils/fq2psmcfa -q20 diploid.fq.gz > diploid.psmcfa
psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o diploid.psmc diploid.psmcfa
utils/psmc2history.pl diploid.psmc | utils/history2ms.pl > ms-cmd.sh
utils/psmc_plot.pl diploid diploid.psmc
where `diploid.fq.gz' is typically the whole-genome diploid consensus sequence
of one human individual, which can be generated by, for example:
samtools mpileup -C50 -uf ref.fa aln.bam | bcftools view -c - \
| vcfutils.pl vcf2fq -D 100 | gzip > diploid.fq.gz
Program `fq2psmcfa' transforms the consensus sequence into a fasta-like format
where the i-th character in the output sequence indicates whether there is at
least one heterozygote in the bin [100i, 100i+100).
Program `psmc' infers the population size history. In particular, the `-p'
option specifies that there are 64 atomic time intervals and 28 (=1+25+1+1)
free interval parameters. The first parameter spans the first 4 atomic time
intervals, each of the next 25 parameters spans 2 intervals, the 27th spans 4
intervals and the last parameter spans the last 6 time intervals. The `-p' and
`-t' options are manually chosen such that after 20 rounds of iterations, at
least ~10 recombinations are inferred to occur in the intervals each parameter
spans. Impropriate settings may lead to overfitting. The command line in the
example above has been shown to be suitable for modern humans.
The `psmc' program infers the scaled mutation rate, the recombination rate and
the free population size parameters. All these parameters are scaled to 2N0. You
may run `psmc2history.pl' combined with `history2ms.pl' to generate the ms
command line that simulates the history inferred by PSMC, or visualize the result
with `psmc_plot.pl'.