<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head>
<title> Champuru 2 Tutorial </title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<h1> Tutorial for Champuru 2 </h1>

<p> Here is a detailed description of how the example dataset was
generated from the original chromatograms accessible from the main page.
Please note, however, that the method outlined here is only one among
many; other approaches may be more suitable for your data. </p>

<p> First, the two traces were imported into a contig editor (namely,
Sequencher) and assembled using the following assembly parameters: </p>

<p><img src="imgs/assembly_parameters.jpg" width="25%" alt="Screenshot of Sequencher showing the assembly parameters"></p>

<p> Here is what the aligned chromatograms look like: </p>

<p><img src="imgs/initial_chromatograms.jpg" width="55%" alt="Screenshot from Sequencher showing the alignment of the forward and reverse chromatograms."></p>

<p> Many double peaks are not properly called, and it is always a good
idea to start by correcting as many calling mistakes as possible before
submitting the sequences to Champuru; this can save a lot of time in the
following steps. A convenient and fast way to do so is to look
for positions where the forward and reverse chromatograms are
incompatible in the alignment (represented by black dots on the figure
below): </p>

<p><img src="imgs/mismatches_trimmed.jpg" width="55%" alt="Screenshot from Sequencher showing the incompatibilities between the forward and reverse chromatograms"></p>
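<p> The same kind of check can be sketched in a few lines of Python. This
is only an illustration, not the way Sequencher or Champuru do it: it
assumes that both reads are already aligned column by column in the same
orientation, and that two base calls are "compatible" when their IUPAC
codes share at least one base. The function names and the short
sequences are hypothetical. </p>

<pre>
```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def compatible(x, y):
    """True when the two base calls share at least one possible base."""
    return not set(IUPAC[x.upper()]).isdisjoint(IUPAC[y.upper()])

def incompatible_positions(forward, reverse):
    """0-based columns where two aligned, same-orientation reads disagree."""
    return [i for i, (x, y) in enumerate(zip(forward, reverse))
            if not compatible(x, y)]

# Hypothetical aligned fragments, not taken from the example dataset.
print(incompatible_positions("ACRTWGA", "ACGTCGM"))  # [4]: W (A or T) vs C
```
</pre>

<p> Each column reported this way is a candidate base-calling mistake to
inspect on the traces, in one read or the other. </p>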

<p> Once all base-calling mistakes visible as sequence incompatibilities
are corrected in the first alignment position, one may want to find the
second, alternative alignment in order to correct other mistakes. This
can be achieved onscreen by dragging one sequence relative to the
other until finding another alignment that minimizes the number of
incompatible positions. The distance observed at the end of each
chromatogram between the two end peaks can also help to estimate the
length difference between the two haplotypes and how far one
chromatogram needs to be dragged relative to the other in order to find
the alternative alignment position: </p>

<p><img src="imgs/endpeaks_trimmed.jpg" width="55%" alt="Screenshot from Sequencher showing the end of the forward chromatogram, with two endpeaks two bases apart."></p>

<p> Here the distance between the two end peaks shows unambiguously that
the length difference between the two haplotypes is 2 bp; hence, the
second, alternative alignment can be obtained from the first one by
dragging one chromatogram 2 bp in one direction or the other relative to
the other chromatogram. </p>
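<p> Dragging one sequence along the other and counting incompatible
positions can be mimicked with a short Python sketch. It is a toy model
under stated assumptions, not Champuru's scoring: the two haplotypes are
invented, the traces are built by superimposing them, the reverse read
is shown in forward orientation, and compatibility means that two IUPAC
codes share at least one base. </p>

<pre>
```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> IUPAC code, used to build synthetic traces.
CODE = {frozenset(bases): code for code, bases in IUPAC.items()}

def mix(a, b):
    """IUPAC code for a double peak made of bases `a` and `b`."""
    return CODE[frozenset(a + b)]

def count_incompatible(forward, reverse, shift):
    """Incompatible columns when `reverse` is dragged `shift` bases to the
    right of `forward` (negative values drag it to the left)."""
    count = 0
    for i, x in enumerate(forward):
        j = i - shift
        # Columns where the two reads do not overlap are ignored.
        if j in range(len(reverse)) and set(IUPAC[x]).isdisjoint(IUPAC[reverse[j]]):
            count += 1
    return count

# Two hypothetical haplotypes differing by a 2 bp deletion.
hap1 = "ACGTTGCAAGCTTACGGATC"
hap2 = "ACGTTGCACTTACGGATC"
gap = len(hap1) - len(hap2)

# Forward trace: both haplotypes superimposed from their first base.
forward = "".join(mix(a, b) for a, b in zip(hap1, hap2)) + hap1[len(hap2):]
# Reverse trace (in forward orientation): superimposed from their last base.
reverse = hap1[:gap] + "".join(mix(a, b) for a, b in zip(hap1[gap:], hap2))

for shift in range(-4, 5):
    print(shift, count_incompatible(forward, reverse, shift))
```
</pre>

<p> By construction, shifts 0 and -2 both give zero incompatible columns
in this toy case, mirroring two alignment positions that are 2 bp apart.
Raw counts favour short overlaps, so a real scoring scheme would also
have to take the overlap length into account. </p>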

<p> The same information can also be obtained by submitting the
partially corrected sequences to Champuru. The output will look like: </p>

<p><img src="imgs/demoA_trimmed.jpg" width="55%" alt="First Champuru screenshot"></p>

<p> Since the offset between the forward and reverse chromatograms is
31 bp for the first alignment (best compatibility score) and 33 bp for
the second alignment (second best compatibility score), one alignment
can be obtained from the other by dragging one chromatogram by
33 - 31 = 2 bp. Alternatively, one can use the offset information to
position the chromatograms directly against each other. Here is the
first alignment (offset = 31 bp): </p>

<p><img src="imgs/startpeaks_1_trimmed.jpg" width="55%" alt="Screenshot from Sequencher showing the first offset alignment."></p>

<p> And here is the second alignment (offset = 33 bp): </p>

<p><img src="imgs/startpeaks_2_trimmed.jpg" width="55%" alt="Screenshot from Sequencher showing the second offset alignment."></p>
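<p> The offset arithmetic can be written down as a small Python sketch.
The function names are hypothetical, and the sign convention used to
place the reads is an assumption made for illustration; it is not
necessarily the convention Champuru uses for its offsets. </p>

<pre>
```python
def drag_distance(first_offset, second_offset):
    """Bases one chromatogram must be dragged to switch between two alignments."""
    return abs(second_offset - first_offset)

def place(forward, reverse, offset):
    """Return both reads as text rows that start in the same column.

    Assumed convention: a positive offset starts `reverse` that many columns
    after `forward`; a negative offset pads `forward` instead.
    """
    pad = "-" * abs(offset)
    if offset >= 0:
        return forward, pad + reverse
    return pad + forward, reverse

print(drag_distance(31, 33))  # 2, as in 33 - 31 = 2

# Short hypothetical reads with a small offset, for readability.
top, bottom = place("ACGTACGT", "TACG", 3)
print(top)     # ACGTACGT
print(bottom)  # ---TACG
```
</pre>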

<p> Once all easily detected mistakes have been corrected, it is
time to submit the corrected sequences to Champuru; the output will look
like: </p>

<p><img src="imgs/Champuru_demoB.jpg" width="55%" alt="Second Champuru screenshot."></p>

<p> Thanks to all the pre-cleaning of base-calling mistakes, there are
only 4 problems left to be corrected, one in the forward chromatogram and the other three in the reverse chromatogram. Let us have a look, for instance, at position 143 on the
reverse chromatogram: </p>

<p><img src="imgs/cleanup_trimmed.jpg" width="55%" alt="Screenshot from Sequencher showing the mistake in front of position 143 on the reverse chromatogram."></p>

<p> One may be surprised to see that the base calling on the reverse
chromatogram at that position is perfectly correct; in fact, the
problem is located on the other chromatogram (M instead of A) due to the
presence of an artefactual peak. The locations given by Champuru for
the problems it detects are only approximate, and one sometimes needs to
look closely at the surroundings of the indicated location in order to
find out what the problem is. </p>
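<p> M is the IUPAC code for "A or C", which is why an artefactual C peak
under a genuine A produces it. When hunting for such a mistake around an
approximate location, it can help to list the ambiguity codes found
nearby. The Python sketch below does this for a single read; the
function name, the window size and the sequence are hypothetical, and it
assumes that the corresponding position on that read is already known. </p>

<pre>
```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def ambiguous_calls_near(read, position, window=5):
    """List (1-based position, code, possible bases) for every ambiguity
    code found within `window` bases of a reported 1-based position."""
    start = max(0, position - 1 - window)
    stop = min(len(read), position + window)
    return [(i + 1, read[i], IUPAC[read[i]])
            for i in range(start, stop)
            if len(IUPAC[read[i]]) != 1]

# Hypothetical read: the reported location is 4, the double call sits at 5.
print(ambiguous_calls_near("ACGTMCGTAC", 4, window=2))  # [(5, 'M', 'AC')]
```
</pre>

<p> Each listed call still has to be checked on the trace itself, since a
double peak may be genuine rather than artefactual. </p>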
<p> After the last remaining problems are corrected, Champuru is run one
last time on the corrected chromatogram sequences in order to produce
the reconstructed haplotype sequences. The corrected chromatogram
sequences are the ones provided on the main page as the example dataset. </p>

<p><img src="imgs/demoC_trimmed.jpg" width="55%" alt="Third Champuru Screenshot"></p>

<p> Finally, the haplotype sequences can be saved to disk in a single FASTA file, or copied and pasted into other applications. </p>
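<p> For readers who prefer to script this last step, here is a minimal
Python sketch of how two sequences can be written to one FASTA file. The
record names, the sequences and the helper names are hypothetical; this
is not how Champuru itself produces its download. </p>

<pre>
```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapping sequences at `width`."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

def save_fasta(path, records, width=60):
    """Write the records to a single FASTA file at `path`."""
    with open(path, "w") as handle:
        handle.write(to_fasta(records, width))

# Hypothetical names and short sequences, for illustration only.
haplotypes = [
    ("haplotype_1", "ACGTTGCAAGCTTACGGATC"),
    ("haplotype_2", "ACGTTGCACTTACGGATC"),
]
print(to_fasta(haplotypes), end="")
```
</pre>

<p> Calling <code>save_fasta("haplotypes.fasta", haplotypes)</code> would
then store both records in one file. </p>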

<p><strong> This detailed description of the process may look
complicated, but with a bit of training it is possible to clean a pair
of chromatograms and obtain the corresponding haplotype sequences in less
than five minutes. <a href="index.html">Have a try!</a></strong></p>


</body></html>