ENH: Support for Multi-Wavelength (>2) NIRS/SNIRF Data Processing #13408
Conversation
Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴
larsoner left a comment
Looks like a reasonable start to me!
Looks like the style CIs are unhappy -- until they pass, most tests won't run.
@zEdS15B3GCwq friendly ping on this one -- still interested in moving this forward?
Yes, I'll get back to this ASAP. Sorry for the tardiness.
I was quite busy with work, will focus on this PR more from now on. My apologies for appearing to have abandoned this.
Next I'll look into why the docs build is failing. I might need some help here, as I'm unfamiliar with how to interpret the minimal information CircleCI is providing me. I'll try to
Force-pushed from a36d15b to 926b064
I'll need to push a commit to get CircleCI to run, or you'll have to create a CircleCI account. Let me know when the other CIs are happy and I can do it. For now it looks like you have some test failures in Windows pip and pip-pre like:
I know why the tests fail.

**Multiple actual wavelengths stored in `info`**
```python
all_freqs = [info["chs"][ii]["loc"][9] for ii in picks_wave]
if np.any(np.isnan(all_freqs)):
    picks = _throw_or_return_empty(
        f"NIRS channels is missing wavelength information in the "
        f'info["chs"] structure. The encoded wavelengths are {all_freqs}.',
        throw_errors,
    )
```
**MNE inconsistent in what is stored in `info`**
On the other hand, there are many indications in the code that this is not the expected behaviour; in fact, it is incorrect:

- To my understanding, MNE doesn't have a clear way of storing nominal and actual wavelengths. Keeping one in the channel name and the other in `info` without explicit separation of their purposes seems to be a hack at best. Perhaps I misunderstand the intent behind MNE's current handling of the data, however...
- Actual wavelength values are also supported by SNIRF, but the SNIRF reader does not even check for `wavelengthActual` fields and only uses `wavelengthIndex` values, which select one of the nominal values. This enforces that the wavelength values stored in `info` and in the channel names are the same (except that the names are rounded to integers). So there's the confusion that in one format the `info` structure has nominal values while in another it has actual values. This cannot/should not be the intention of MNE's design... or, well, we could say that `info` has the most accurate available wavelength value.
- A test in `mne.preprocessing.nirs.tests.test_beer_lambert_law.py` explicitly mentions that it's an error "if channel naming frequency doesn't match what is stored in loc[9]". The test didn't actually work as intended, as there was no validation check to test this. I added a new validation check which corrects this, but it causes the Hitachi code to fail. I also wonder why there's a test in `test_beer_lambert_law.py` which actually tests the validation code in `nirs.py` and not BLL, but that's a discussion for another time.
- The most important place where the wavelengths matter is the Beer-Lambert law (BLL) calculation. Assuming that the actual wavelengths can differ per channel (as they do in the Hitachi data), one should either use the nominal wavelengths to calculate overall absorption coefficients, or get individual coefficients for each channel based on its actual wavelengths. The code does neither: it incorrectly calculates absorption coefficients for all channels based only on the first pair of actual wavelengths, not the nominal ones. This must be a mistake, as actual wavelengths can deviate from the expected values, and if the first pair deviates significantly, the calculation for all channels will be off.
The BLL calculation computes the absorption coefficients once and uses them for all channel pairs:
mne-python/mne/preprocessing/nirs/_beer_lambert_law.py
Lines 48 to 72 in 0a06818
```python
freqs = np.array([raw.info["chs"][pick]["loc"][9] for pick in picks], float)
abs_coef = _load_absorption(freqs)
distances = source_detector_distances(raw.info, picks="all")
bad = ~np.isfinite(distances[picks])
bad |= distances[picks] <= 0
if bad.any():
    warn(
        "Source-detector distances are zero on NaN, some resulting "
        "concentrations will be zero. Consider setting a montage "
        "with raw.set_montage."
    )
distances[picks[bad]] = 0.0
if (distances[picks] > 0.1).any():
    warn(
        "Source-detector distances are greater than 10 cm. "
        "Large distances will result in invalid data, and are "
        "likely due to optode locations being stored in a "
        " unit other than meters."
    )
rename = dict()
for ii, jj in zip(picks[::2], picks[1::2]):
    EL = abs_coef * distances[ii] * ppf
    iEL = pinv(EL)
    raw._data[[ii, jj]] = iEL @ raw._data[[ii, jj]] * 1e-3
```
Absorption coefficient calculation only returns values for the 1st frequency pair:
mne-python/mne/preprocessing/nirs/_beer_lambert_law.py
Lines 107 to 112 in 0a06818
```python
ext_coef = np.array(
    [
        [interp_hbo(freqs[0]), interp_hb(freqs[0])],
        [interp_hbo(freqs[1]), interp_hb(freqs[1])],
    ]
)
```
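For illustration, the hard-coded `freqs[0]`/`freqs[1]` evaluation could be generalized by evaluating the interpolators at every wavelength, yielding an (n, 2) matrix. This is only a sketch: the table values are made-up placeholders (not real HbO/HbR extinction spectra), the interpolators use plain `np.interp` instead of MNE's actual loading code, and `ext_coef_n` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical extinction table: wavelength (nm), HbO, HbR.
# The numbers are illustrative placeholders, not real absorption spectra.
table = np.array(
    [
        [690.0, 0.35, 0.21],
        [750.0, 0.60, 0.40],
        [808.0, 0.86, 0.80],
        [850.0, 1.06, 0.78],
    ]
)

def interp_hbo(f):
    return np.interp(f, table[:, 0], table[:, 1])

def interp_hb(f):
    return np.interp(f, table[:, 0], table[:, 2])

def ext_coef_n(freqs):
    """Evaluate the interpolators at every wavelength, giving an (n, 2)
    matrix instead of hard-coding freqs[0] and freqs[1]."""
    return np.array([[interp_hbo(f), interp_hb(f)] for f in freqs])

print(ext_coef_n([690.0, 850.0]).shape)         # (2, 2) -- the current case
print(ext_coef_n([690.0, 750.0, 850.0]).shape)  # (3, 2) -- three wavelengths
```

For n = 2 this reproduces the current 2 x 2 shape, so the existing behaviour would be a special case of the general one.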
**Tests fail because they expect the same values in `info` and channel names**

Finally, the tests are failing now because I assumed that the wavelengths in the `info` field must be the same as in the channel names. I added a validation test based on that assumption:
https://github.com/zEdS15B3GCwq/mne-python/blob/926b064be5c564fd2fb489481908b1b1a6b223c4/mne/preprocessing/nirs/nirs.py#L167-L173
```python
# test if the info structure has the same number of freqs as pair_vals
if len(pair_vals) != len(set(all_freqs)):
    picks = _throw_or_return_empty(
        f"The {error_word} in info must match the number of wavelengths, "
        f"but the data contains {len(set(all_freqs))} wavelengths instead.",
        throw_errors,
    )
```

This code compares `pair_vals`, the nominal wavelength array calculated from the channel names, to the wavelengths in the `info` structure. Obviously, this check is pointless if `info` can contain arbitrary wavelengths. It is also wrong, as `set()` or `np.unique` is guaranteed to cause a mess with such data.
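A tiny self-contained illustration (numbers made up) of why `set()` misbehaves once `info` holds per-channel actual wavelengths:

```python
nominal = [760, 850]                   # pair_vals, parsed from channel names
actual = [758.7, 851.2, 759.1, 850.8]  # per-channel actual values, as a
                                       # Hitachi-style reader might store in info

# Every actual value is slightly different, so set() collapses nothing:
print(len(set(actual)))                  # 4
print(len(nominal) == len(set(actual)))  # False -- the check always trips
```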
**How to proceed (feedback please)**
I can see several potential paths:
1. Nominal and actual wavelengths have to be the same. This clarifies that `info` stores nominal wavelengths. The Hitachi reader has to be fixed to store nominal values in `info` and discard the actual ones. The failing test will pass.
2. Allow actual wavelengths in `info`, use nominal ones in BLL. This keeps the current way of handling information but fixes BLL. My new validation check will be removed, and the failing tests will pass. However, if actual wavelengths are not used anywhere, why bother storing them in the first place?
3. Allow actual wavelengths in `info` and use them in BLL. The current way of storing wavelengths in `info` is preserved. BLL is fixed, and its accuracy is potentially improved in cases where actual wavelength information exists. In the implementation, nominal values will be used to define the size of channel groups, but for each channel the absorption coefficients will be based on its individual actual wavelength. The new validation check will be removed, and the test will pass.
Even though I wrote that the new validation check can be removed, it might be useful to add some kind of validation for the values stored in `info`. As of now, it's left to the respective format readers to decide what to put in it.
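As a rough sketch of what option (2) implies, the nominal wavelengths for BLL could be recovered from the channel names rather than from `info`. The helper name is made up for this sketch; the channel-name format assumed is MNE's fNIRS amplitude convention `S<i>_D<j> <freq>` (e.g. `S1_D1 760`):

```python
import re

def nominal_wavelengths(ch_names):
    """Collect the nominal wavelengths encoded in fNIRS channel names.

    Assumes names of the form 'S1_D1 760' (source, detector, nominal
    wavelength in nm); the function name is illustrative only.
    """
    freqs = set()
    for name in ch_names:
        m = re.fullmatch(r"S\d+_D\d+ (\d+)", name)
        if m is None:
            raise ValueError(f"unexpected fNIRS channel name: {name!r}")
        freqs.add(int(m.group(1)))
    return sorted(freqs)

print(nominal_wavelengths(["S1_D1 760", "S1_D1 850", "S2_D1 760", "S2_D1 850"]))
# [760, 850]
```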
**Others**
- Docs build locally without any issue on Ubuntu 24 LTS on WSL2. You mentioned that you're going to make CircleCI work, so I'm assuming the docs are not an issue.
- Both on `main` and in this PR branch, I have some failing tests when the suite is run on Windows. Is that a problem?

```
> uv run pytest -m "not ultraslowtest" mne
...
6164 passed 2 failed
FAILED mne/viz/_brain/tests/test_brain.py::test_brain_screenshot[pyvistaqt] - assert (450, 525, 3) == (np.int64(600...int64(700), 3)
FAILED mne/viz/tests/test_epochs.py::test_plot_epochs_clicks[qt] - KeyError: 'MEG 0113'
ERROR mne/viz/_brain/tests/test_brain.py::test_brain_time_viewer[pyvistaqt] - AssertionError: []
```

- When the tests are run on Ubuntu 24 LTS/WSL2, `mne/viz/tests/test_epochs.py` passes (it failed on Windows), but many tests in `mne/viz` fail, for example:

```
mne/viz/_brain/tests/test_brain.py X Error of failed request: BadWindow (invalid Window parameter)
```

Do you have any suggestions on how I should configure WSL2 for testing? I can run xeyes and xcalc, and several MNE tests successfully pop up windows and pass.
I would start with (2) as it's the path of least resistance and maintains backward compat. Then I'd add (3) as an option in Beer-Lambert -- ideally in a new PR to minimize diff! -- like I don't totally recall why we went with nominal rather than actual wavelengths in the first place... maybe (hopefully!) it makes a negligible difference in the calculations. But storing them makes it so that later if we decide they do matter, we can use them. I agree having the channel name store this nominal info isn't good... we could consider adding another info field for it or using another field in
I would expect WSL2 to be problematic for most viz tests. Instead I would use (and have used for many years)

I will proceed with 2 then and keep 3 in mind for a future PR. Thanks for your feedback on running tests. The main reason I was asking about how to run tests on something other than Windows was that some of the tests were failing, even on the main branch without my modifications (see the quoted output at the end of my previous message). I might give
The code I added to SCI and BLL assumed that the number of wavelengths could be calculated from the info structure, but when it contained actual freqs, this method failed and led to errors. Now these routines use the
- … in np.zeros so there was need to write 0 to it when correlation was infinite.
- …ments to show correct return format
- …ode over several lines
- for more information, see https://pre-commit.ci
FIX: mne/io/hitachi tests with actual wavelength data now succeed

* Beer-Lambert Law (BLL) and Scalp Coupling Index (SCI) calculations used the info structure to determine the number of wavelengths, but that led to errors as the info structure can contain arbitrary data (e.g. different wavelengths for each channel).
* BLL now uses the nominal wavelengths for the calculation for all channels. Previously, it used the actual wavelengths of the first channel pair for all channels, which was incorrect if the channels have different actual wavelengths.
* In the future, there may be an option in BLL to use the actual freq values for each channel, to improve calculation accuracy.
* The Hitachi tests have data with actual wavelengths for each channel stored in the info structure. This caused issues with counting the number of wavelengths, which caused tests to fail.
Force-pushed from bca2f12 to ba5d48b
larsoner left a comment
Can you add a `doc/changes/13408.newfeature.rst`, change from draft mode to ready, and ping me when you think this is ready @zEdS15B3GCwq? I merged `main` into this branch to get all CIs to run, but feel free to force-push over it if you want.
I had a quick look at the code and it looks reasonable to me!
**Reference issue (if any)**
Started discussion in #13401. Most of the description is copied from there with changes.
The same idea has also been raised in issue #9816, but that discussion did not result in a PR. In short, (some) Shimadzu fNIRS devices work with 3 wavelengths, and in the future there might be other devices with more, but MNE's SNIRF I/O and processing functions are coded to accept only 2 wavelengths.
From my perspective, these changes would be mostly beneficial as I'm using a Shimadzu device. While the device doesn't support .snirf output, there are ways to convert the data to that format. And while it's possible to preprocess and convert to Hb data first in some other tool(s) and then import to MNE, it would be ideal to keep the entire processing pipeline in one tool.
**Describe the new feature or enhancement**
Goal: extend MNE-Python's NIRS processing to support an arbitrary number of wavelengths (≥2) instead of being limited (hard-coded in several places) to two.
I'd like to discuss implementation details and to explore the devs' opinion on how feasible such changes would be. For this reason, this is a draft PR, and some important points (like tests) haven't been addressed (at all) yet.
**Describe your proposed implementation**
I've done some exploration in the codebase to identify relevant bits of code, both manually and with the help of Claude 4. Based on this, it seems that relevant parts of the code are in SNIRF I/O (mne.io.snirf) and NIRS preprocessing (mne.preprocessing.nirs).
**Major changes**

- The current Beer-Lambert implementation uses `pinv()` to calculate ΔOD = E × L × PPF × Δc. I think it's possible to update this with minimal changes, just changing the EL matrix to n × 2, where n is the number of wavelengths, instead of the current 2 × 2. After Beer-Lambert, excess channels will need to be dropped.

**Minor changes**
List of affected files and functions:

- `class RawSNIRF(BaseRaw)` has a simple check for 2 wavelengths and an error message.
- `beer_lambert_law()` logic needs to be updated to >= 2 wavelengths, plus it will need to remove excess channels after assigning HbO/R data to the first 2.
- `_load_absorption()` is hard-coded to 2 wavelengths.
- `scalp_coupling_index()` needs an algorithmic update.
- `_check_channels_ordered()` has various validation checks, and `_fnirs_spread_bads()` needs to spread bad markings over all wavelengths in a group.

I'm hoping to receive feedback on the correctness of the above proposed algorithmic changes. For n = 2 wavelengths, the new calculation methods wouldn't introduce any changes (E in Beer-Lambert would still be 2 × 2, and there is only one correlation pair, so the minimum is irrelevant), and if necessary, the current implementation could be preserved in a separate code path. Current tests for n = 2 should also work.
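To make the n × 2 pseudo-inverse idea concrete, here is a minimal sketch with synthetic numbers (not MNE code; the extinction coefficients, distance, and PPF values are made up). For n > 2 the system ΔOD = E · L · PPF · Δc is overdetermined, and `pinv` returns the least-squares solution, so the same one-liner covers n = 2 and n > 2:

```python
import numpy as np
from numpy.linalg import pinv

# Illustrative (n, 2) extinction matrix for n = 3 wavelengths;
# columns are (HbO, HbR), values are placeholders.
E = np.array(
    [
        [0.35, 0.21],
        [0.60, 0.40],
        [1.06, 0.78],
    ]
)
L = 0.03   # source-detector distance in meters (assumed)
ppf = 6.0  # partial pathlength factor (assumed)

EL = E * L * ppf                 # (3, 2): n wavelengths x 2 chromophores
true_dc = np.array([1.2, -0.4])  # synthetic HbO/HbR concentration changes
dod = EL @ true_dc               # simulated OD changes, one per wavelength

# pinv of the tall matrix recovers the 2 chromophores from n channels.
dc = pinv(EL) @ dod
print(np.allclose(dc, true_dc))  # True
```

After this step, the n amplitude channels per pair would collapse to 2 Hb channels, which is why the excess channels need to be dropped.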
**Describe possible alternatives**

The algorithmic changes (Beer-Lambert and SCI) could be implemented in other ways. The Beer-Lambert pseudo-inverse should work with n > 2 with minimal changes, and the change seems mathematically correct. The SCI calculation using the minimum of all correlations appears to be the most conservative in terms of being able to spot any coupling issues, but without studies to rely on, it's a choice that should be discussed.
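A sketch of the minimum-of-all-correlations idea for SCI (the function name is illustrative, not the project's implementation, and the cardiac-band filtering that SCI normally applies is replaced here by a synthetic shared sinusoid):

```python
import numpy as np
from itertools import combinations

def sci_min_corr(signals):
    """Minimum pairwise Pearson correlation across one source-detector
    group's signals. With n = 2 this reduces to the single correlation
    the current SCI uses; for n > 2 it is the most conservative choice."""
    n = len(signals)
    return min(
        np.corrcoef(signals[i], signals[j])[0, 1]
        for i, j in combinations(range(n), 2)
    )

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 1000)
cardiac = np.sin(2 * np.pi * 1.2 * t)  # shared ~1.2 Hz cardiac component
sigs = np.array([cardiac + 0.1 * rng.standard_normal(t.size) for _ in range(3)])

print(sci_min_corr(sigs) > 0.9)      # well-coupled: all pairs correlate highly
print(sci_min_corr(sigs[:2]) > 0.9)  # n = 2 case reduces to the single pair
```

Taking the minimum means one badly coupled wavelength flags the whole group, which matches the "spot any coupling issue" goal stated above.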
**Additional context**

I'll open another PR for the `mne-nirs` project once this PR has advanced a bit. In particular, the scalp coupling algorithm should be agreed on, since that project has a very similar, windowed implementation of it that should probably be updated with the changes here in mind.

As I mentioned above, the current code is untested, as in I haven't actually run it with any .snirf files yet. I'm sorry for this; I'll try to do some testing as soon as possible.