Skip to content

Commit 8e9bb8c

Browse files
authored
Merge pull request #3070 from chrishalcrow/unify-isi-violation-notation
Unify compute_isi_violation docs and add UltraMegaSort2000 citation
2 parents 211c222 + d1ea215 commit 8e9bb8c

File tree

3 files changed

+56
-36
lines changed

3 files changed

+56
-36
lines changed

doc/modules/qualitymetrics/isi_violations.rst

Lines changed: 29 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -10,46 +10,48 @@ Calculation
1010
Neurons have a refractory period after a spiking event during which they cannot fire again.
1111
Inter-spike-interval (ISI) violations refers to the rate of refractory period violations (as described by [Hill]_).
1212

13+
We aim to calculate the contamination rate :math:`C`, measuring the ratio of isi violations in the spike-train of a unit.
1314
The calculation works under the assumption that the contaminant events happen randomly or come from another neuron that is not correlated with our unit.
1415
A correlation will lead to an overestimation of the contamination, whereas an anti-correlation will lead to an underestimation.
1516

16-
Different formulas have been developed over the years.
17+
Different formulas have been developed, but all require:
1718

18-
Calculation from the [Hill]_ paper
19-
----------------------------------
19+
- :math:`T` the duration of the recording in seconds.
20+
- :math:`N` the number of spikes in the unit's spike train.
21+
- :math:`t_r` the duration of the unit's refractory period in seconds.
22+
23+
Calculation from the [UMS]_ package
24+
-----------------------------------
25+
26+
Originally implemented in the `rpv_contamination` calculation of the UltraMegaSort2000 package: `<https://github.com/danamics/UMS2K/blob/master/quality_measures/rpv_contamination.m>`_.
2027

21-
The following quantities are required:
28+
In this method the number of spikes whose refractory period are violated, denoted :math:`n_v`, is used.
2229

23-
- :math:`ISI_t` : biological threshold for ISI violation.
24-
- :math:`ISI_{min}`: minimum ISI threshold enforced by the data recording system used.
25-
- :math:`ISI_s` : the array of ISI violations which are observed in the unit's spike train.
26-
- :math:`\#`: denotes count.
30+
Here, the refactory period :math:`t_r` is adjusted to take account of the data recording system's minimum possible refactory
31+
period. E.g. if a system has a sampling rate of :math:`f \text{ Hz}`, the closest that two spikes from the same unit can possibly
32+
be is :math:`1/f \, \text{s}`. Hence the refactory period :math:`t_r` is the expected biological threshold minus this minimum possible
33+
threshold.
2734

28-
The threshold for ISI violations is the biological ISI threshold, :math:`ISI_t`, minus the minimum ISI threshold, :math:`ISI_{min}` enforced by the data recording system used.
29-
The array of inter-spike-intervals observed in the unit's spike train, :math:`ISI_s`, is used to identify the count (:math:`\#`) of observed ISI's below this threshold.
30-
For a recording with a duration of :math:`T_r` seconds, and a unit with :math:`N_s` spikes, the rate of ISI violations is:
35+
The contamination rate is calculated to be
3136

3237
.. math::
3338
34-
\textrm{ISI violations} = \frac{ \#( ISI_s < ISI_t) T_r }{ 2 N_s^2 (ISI_t - ISI_{min}) }
39+
C = \frac{ n_v T }{ 2 N^2 t_r }
3540
3641
Calculation from the [Llobet]_ paper
3742
------------------------------------
3843

39-
The following quantities are required:
40-
41-
- :math:`T` the duration of the recording.
42-
- :math:`N` the number of spikes in the unit's spike train.
43-
- :math:`t_r` the duration of the unit's refractory period.
44-
- :math:`n_v` the number of violations of the refractory period.
44+
In this method the number spikes which violate other spikes' refractory periods, denoted :math:`\tilde{n}_v`, is used.
4545

46-
The estimated contamination :math:`C` can be calculated with 2 extreme scenarios. In the first one, the contaminant spikes are completely random (or come from an infinite number of other neurons). In the second one, the contaminant spikes come from a single other neuron:
46+
The estimated contamination :math:`C` is calculated in 2 extreme scenarios. In the first, the contaminant spikes
47+
are completely random (or come from an infinite number of other neurons). In the second, the contaminant spikes
48+
come from a single other neuron. In these scenarios, the contamination rate is
4749

4850
.. math::
4951
5052
C = \frac{FP}{TP + FP} \approx \begin{cases}
51-
1 - \sqrt{1 - \frac{n_v T}{N^2 t_r}} \text{ for the case of random contamination} \\
52-
\frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 n_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron}
53+
1 - \sqrt{1 - \frac{\tilde{n}_v T}{N^2 t_r}} \text{ for the case of random contamination} \\
54+
\frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 \tilde{n}_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron}
5355
\end{cases}
5456
5557
Where :math:`TP` is the number of true positives (detected spikes that come from the neuron) and :math:`FP` is the number of false positives (detected spikes that don't come from the neuron).
@@ -58,7 +60,9 @@ Expectation and use
5860
-------------------
5961

6062
ISI violations identifies unit contamination - a high value indicates a highly contaminated unit.
61-
Despite being a ratio, ISI violations can exceed 1 (or become a complex number in the [Llobet]_ formula). This is usually due to the contaminant events being correlated with our neuron, and their number is greater than a purely random spike train.
63+
Despite being a ratio, the contamination can exceed 1 (or become a complex number in the [Llobet]_ formula).
64+
This is usually due to the contaminant events being correlated with our neuron, and their number is
65+
greater than a purely random spike train.
6266

6367
Example code
6468
------------
@@ -86,8 +90,8 @@ With SpikeInterface:
8690
References
8791
----------
8892

89-
Hill implementation (:code:`isi_violation`)
90-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
93+
UMS implementation (:code:`isi_violation`)
94+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9195

9296
.. autofunction:: spikeinterface.qualitymetrics.misc_metrics.compute_isi_violations
9397

@@ -160,5 +164,5 @@ Links to original implementations
160164
Literature
161165
----------
162166

163-
Introduced by [Hill]_ (2011).
167+
Introduced in UltraMegaSort2000 [UMS]_ (2011).
164168
Also described by [Llobet]_ (2022)

doc/references.rst

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -50,9 +50,11 @@ If you use the :code:`qualitymetrics` module, i.e. you use the :code:`analyzer.c
5050
or :code:`compute_quality_metrics()` methods, please include the citations for the :code:`metric_names` that were particularly
5151
important for your research:
5252

53-
- :code:`amplitude_cutoff` or :code:`isi_violation` [Hill]_
54-
- :code:`amplitude_median` or :code:`sliding_rp_violation` [IBL]_
53+
- :code:`amplitude_cutoff` [Hill]_
54+
- :code:`amplitude_median` [IBL]_
55+
- :code:`sliding_rp_violation` [IBL]_
5556
- :code:`drift` [Siegle]_
57+
- :code:`isi_violation` [UMS]_
5658
- :code:`rp_violation` [Llobet]_
5759
- :code:`sd_ratio` [Pouzat]_
5860
- :code:`snr` [Lemon]_ [Jackson]_
@@ -122,6 +124,8 @@ References
122124
123125
.. [Siegle] `Survey of Spiking in the Mouse Visual System Reveals Functional Hierarchy. 2021. <https://pubmed.ncbi.nlm.nih.gov/33473216/>`_
124126
127+
.. [UMS] `UltraMegaSort2000 - Spike sorting and quality metrics for extracellular spike data. 2011. <https://github.com/danamics/UMS2K>`_
128+
125129
.. [Varol] `Decentralized Motion Inference and Registration of Neuropixel Data. 2021. <https://ieeexplore.ieee.org/document/9414145>`_
126130
127131
.. [Windolf] `Robust Online Multiband Drift Estimation in Electrophysiology Data. 2022. <https://www.biorxiv.org/content/10.1101/2022.12.04.519043v2>`_

src/spikeinterface/qualitymetrics/misc_metrics.py

Lines changed: 21 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -241,7 +241,7 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0,
241241
242242
It computes several metrics related to isi violations:
243243
* isi_violations_ratio: the relative firing rate of the hypothetical neurons that are
244-
generating the ISI violations. Described in [Hill]_. See Notes.
244+
generating the ISI violations. See Notes.
245245
* isi_violation_count: number of ISI violations
246246
247247
Parameters
@@ -261,22 +261,29 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0,
261261
Returns
262262
-------
263263
isi_violations_ratio : dict
264-
The isi violation ratio described in [Hill]_.
264+
The isi violation ratio.
265265
isi_violation_count : dict
266266
Number of violations.
267267
268268
Notes
269269
-----
270-
You can interpret an ISI violations ratio value of 0.5 as meaning that contaminating spikes are
271-
occurring at roughly half the rate of "true" spikes for that unit.
272-
In cases of highly contaminated units, the ISI violations ratio can sometimes be greater than 1.
270+
The returned ISI violations ratio approximates the fraction of spikes in each
271+
unit which are contaminted. The formulation assumes that the contaminating spikes
272+
are statistically independent from the other spikes in that cluster. This
273+
approximation can break down in reality, especially for highly contaminated units.
274+
See the discussion in Section 4.1 of [Llobet]_ for more details.
275+
276+
This method counts the number of spikes whose isi is violated. If there are three
277+
spikes within `isi_threshold_ms`, the first and second are violated. Hence there are two
278+
spikes which have been violated. This is is contrast to `compute_refrac_period_violations`,
279+
which counts the number of violations.
273280
274281
References
275282
----------
276-
Based on metrics described in [Hill]_
283+
Based on metrics originally implemented in Ultra Mega Sort [UMS]_.
277284
278-
Originally written in Matlab by Nick Steinmetz (https://github.com/cortex-lab/sortingQuality)
279-
and converted to Python by Daniel Denman.
285+
This implementation is based on one of the original implementations written in Matlab by Nick Steinmetz
286+
(https://github.com/cortex-lab/sortingQuality) and converted to Python by Daniel Denman.
280287
"""
281288
res = namedtuple("isi_violation", ["isi_violations_ratio", "isi_violations_count"])
282289

@@ -324,7 +331,6 @@ def compute_refrac_period_violations(
324331
Calculate the number of refractory period violations.
325332
326333
This is similar (but slightly different) to the ISI violations.
327-
The key difference being that the violations are not only computed on consecutive spikes.
328334
329335
This is required for some formulas (e.g. the ones from Llobet & Wyngaard 2022).
330336
@@ -351,6 +357,12 @@ def compute_refrac_period_violations(
351357
-----
352358
Requires "numba" package
353359
360+
This method counts the number of violations which occur during the refactory period.
361+
For example, if there are three spikes within `refractory_period_ms`, the second and third spikes
362+
violate the first spike and the third spike violates the second spike. Hence there
363+
are three violations. This is in contrast to `compute_isi_violations`, which
364+
computes the number of spikes which have been violated.
365+
354366
References
355367
----------
356368
Based on metrics described in [Llobet]_

0 commit comments

Comments
 (0)