Configuration
WebDiffLogo can be configured as follows. There are a number of properties for the mode of motif analysis as well as the visualization of the result.
The user can select the measure that quantifies symbol distribution differences; this measure determines the height of each symbol stack in the difference logo. Depending on the measure, different properties are emphasized. Please find the mathematical details in section 3.1 in Additional File 1 of the DiffLogo publication. There are the following measures.
Shannon-Divergence: The Jensen–Shannon divergence is a measure for the difference of two probability distributions based on information theory. This measure is symmetric and limited to [0, 1]. It especially emphasizes large distribution differences. Please find mathematical details in section 3.1.1 of Additional File 1.
Sum of absolute probability differences: The Sum of absolute probability differences is a measure for the absolute change of probabilities between two probability distributions. This measure is symmetric and limited to [0, 2]. It especially emphasizes large changes of probabilities. Please find mathematical details in section 3.1.4 of Additional File 1.
Sum of absolute information content (IC) differences: The Sum of absolute IC differences is a measure for the absolute change of information content between two probability distributions. This measure is symmetric and limited to [0, 2*log2(alphabet size)]. It especially emphasizes large changes of information content. Please find mathematical details in section 3.1.2 of Additional File 1.
Loss of absolute information content (IC) differences: The Loss of absolute IC differences is a measure for the absolute change of information content relative to the average information content of two probability distributions. This measure is limited to [0, 2*log2(alphabet size)]. It especially emphasizes large changes of information content relative to the information content of the given distributions. Please find mathematical details in section 3.1.3 of Additional File 1.
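As an illustration of the first two stack-height measures, the following is a minimal Python sketch, not WebDiffLogo's own code. The function names are hypothetical, and the base-2 logarithm is an assumption chosen because it yields the stated range [0, 1]. The two IC-based measures are left out because their exact formulas are only given in Additional File 1.

```python
import math


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence of two distributions, using base-2 logarithms."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing (0 * log 0 is treated as 0).
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def sum_of_absolute_probability_differences(p, q):
    """Sum over all symbols of |p_a - q_a|."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


# One aligned position of two DNA motifs, symbol order A, C, G, T (made-up values).
only_a = [1.0, 0.0, 0.0, 0.0]
only_t = [0.0, 0.0, 0.0, 1.0]

jsd_identical = jensen_shannon_divergence(only_a, only_a)               # 0.0
jsd_disjoint = jensen_shannon_divergence(only_a, only_t)                # 1.0
abs_disjoint = sum_of_absolute_probability_differences(only_a, only_t)  # 2.0
```

Identical distributions give a stack height of 0 under both measures, while distributions with no shared symbols reach the upper limits of 1 and 2.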
The user is able to select the measure for the determination of symbol sizes in the difference logo. Depending on the measure, different properties are emphasized. Please find the mathematical details in section 3.2 in Additional File 1 of the DiffLogo publication. There are the following measures.
Normalized difference of probabilities: The Normalized difference of probabilities is a measure for the change of symbol-specific probability relative to the sum of absolute symbol-specific probability differences of the given probability distributions. This measure is antisymmetric and limited to [-1/2, 1/2]. It especially emphasizes a large change of symbol probability. For each position of the difference logo, the height of the symbol stack with negative values is equal to the height of the symbol stack with positive values, because each gain of probability for one symbol implies a loss of probability for the remaining symbols and vice versa. Please find mathematical details in section 3.2.1 of Additional File 1.
Difference of information contents (ICs): The Difference of information contents is a measure for the symbol-specific change of information content relative to the sum of absolute symbol-specific differences of information content of the given probability distributions. This measure is antisymmetric and limited to [−1, 1]. It especially emphasizes a large change of symbol-specific information content. Please find mathematical details in section 3.2.2 of Additional File 1.
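The Normalized difference of probabilities described above can be sketched in a few lines of Python. This is an illustration built from the verbal definition only; the function name and the sign convention (first motif minus second motif) are assumptions, and the example probabilities are made up.

```python
def normalized_probability_differences(p, q):
    """Per-symbol (p_a - q_a) divided by the sum of all |p_b - q_b|."""
    diffs = [pi - qi for pi, qi in zip(p, q)]
    total = sum(abs(d) for d in diffs)
    if total == 0:
        # Identical distributions: no symbol gains or loses probability.
        return [0.0] * len(diffs)
    return [d / total for d in diffs]


# One aligned position of two DNA motifs, symbol order A, C, G, T.
p = [0.7, 0.1, 0.1, 0.1]
q = [0.1, 0.2, 0.3, 0.4]

sizes = normalized_probability_differences(p, q)
positive_part = sum(s for s in sizes if s > 0)  # share of the stack above the axis
negative_part = sum(s for s in sizes if s < 0)  # share of the stack below the axis
```

Because both distributions sum to 1, the gains and losses cancel, so the positive and negative parts each make up half of the normalized stack (up to floating-point rounding). This is why the values are limited to [-1/2, 1/2].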
If checked, the sequence logos of the input motifs will be displayed on top of the table of difference logos. This option applies only in case of more than two input motifs.
If checked, p-values for the significance of motif differences are calculated and displayed. More specifically, for each pair of input motifs and for each aligned motif position, a permutation test is applied to calculate the p-value for the null hypothesis that both sets of given symbols are sampled from the same symbol distribution. The significance of the corresponding symbol stack in the corresponding difference logo is displayed with one asterisk for a p-value smaller than 0.05, two asterisks for a p-value smaller than 0.01, and three asterisks for a p-value smaller than 0.001.
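The following Python sketch illustrates how such a per-position permutation test and the asterisk annotation could work. It is not WebDiffLogo's implementation: the test statistic (Jensen–Shannon divergence), the number of permutations, the fixed seed, the DNA alphabet, and all function names are assumptions made for the example. Only the asterisk thresholds are taken from the description above.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"  # assumed alphabet for this example


def distribution(symbols):
    """Relative symbol frequencies in the order of ALPHABET."""
    counts = Counter(symbols)
    return [counts[a] / len(symbols) for a in ALPHABET]


def jsd(p, q):
    """Jensen-Shannon divergence (base 2), used here as the test statistic."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def permutation_p_value(symbols_1, symbols_2, n_permutations=999, seed=0):
    """Estimate how often a random relabeling differs at least as much as observed."""
    rng = random.Random(seed)
    observed = jsd(distribution(symbols_1), distribution(symbols_2))
    pooled = list(symbols_1) + list(symbols_2)
    n1 = len(symbols_1)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign symbols to the two groups at random
        stat = jsd(distribution(pooled[:n1]), distribution(pooled[n1:]))
        if stat >= observed - 1e-12:  # small tolerance for rounding errors
            at_least_as_extreme += 1
    # Add-one correction so that the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


def significance_stars(p_value):
    """Asterisk annotation with the thresholds 0.05, 0.01, and 0.001."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Symbols observed at one aligned position in the sequences behind two motifs.
p_same = permutation_p_value(list("AC" * 25), list("AC" * 25))
p_different = permutation_p_value(["A"] * 50, ["C"] * 50)
```

In this sketch, two symbol sets with identical composition yield a p-value of 1, whereas two sets without any shared symbol yield the smallest p-value that the chosen number of permutations can resolve.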
If checked, the cluster tree is displayed on top of the table of difference logos. This option applies only in case of more than two input motifs and if the clustering of motifs is enabled.
If checked, the input motifs will be clustered by their overall similarity. This option applies only in case of more than two input motifs. Clustering has the advantage that similar motifs are placed close to each other and dissimilar motifs are placed apart from each other. The resulting table of difference logos is arranged more clearly, which makes it possible to detect subsets of similar motifs by eye.
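To illustrate the effect of clustering on the table layout, here is a minimal Python sketch that orders motifs by average-linkage agglomerative clustering. The clustering method, the distance (sum of per-position Jensen–Shannon divergences), and all names are assumptions for the example; the text above does not state which method WebDiffLogo uses.

```python
import math


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) of two distributions."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def motif_distance(motif_1, motif_2):
    """Assumed overall distance: per-position divergences summed over the motif."""
    return sum(jsd(p, q) for p, q in zip(motif_1, motif_2))


def cluster_order(motifs):
    """Return motif indices in the leaf order of an average-linkage clustering."""
    n = len(motifs)
    if n == 0:
        return []
    dist = [[motif_distance(motifs[i], motifs[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all motif pairs across the two clusters.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


# Four made-up motifs of length 2 (symbol order A, C, G, T):
# motifs 0 and 2 prefer A, motifs 1 and 3 prefer T.
motifs = [
    [[0.9, 0.1, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]],
    [[0.0, 0.0, 0.1, 0.9], [0.0, 0.0, 0.1, 0.9]],
    [[0.8, 0.2, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0]],
    [[0.0, 0.0, 0.2, 0.8], [0.0, 0.0, 0.2, 0.8]],
]
order = cluster_order(motifs)
```

Using the resulting order for the rows and columns of the table places the two A-rich motifs next to each other and the two T-rich motifs next to each other, which is the arrangement effect described above.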
Citation: Nettling, Treutler, et al. "DiffLogo: a comparative visualization of sequence motifs." BMC bioinformatics 16.1 (2015): 1. DOI: 10.1186/s12859-015-0767-x