
Comments (18)

a-hurst commented on May 25, 2024

In my efforts yesterday I took a look at autoreject's interpolation code, which is also based on mne's private _make_interpolation_matrix and bypasses the built-in MNE raw.interpolate_bads method. Since it's all in a utils.py file, I tried replacing the raw.interpolate_bads instances in PyPREP with autoreject's utils.interpolate_bads and I got different results with the same file and random seed.

On my test file (again, the notorious 007), the autoreject interpolation ended up producing seemingly worse results, with more channels interpolated (22 vs 20) and significantly more remaining bad channels (23 vs 4). You can see results here:
prep_interpolation_compare.zip

Also, @jasmainak mentioned having made a pull request to MNE years back to get _make_interpolation_matrix to behave correctly and match MATLAB PREP, and the function has been modified in MNE since then, so I tried replacing the current MNE version with the version from that pull request (both in NoisyChannels and in interpolate_bads) to see whether it made any difference for the autoreject code. It made absolutely no difference in the results, so we can at least rule that out as a source of headaches (at least in terms of AutoReject vs PyPREP; the MATLAB implementation may still differ).
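For reference, a rough sketch of the kind of comparison harness behind these counts, assuming pyprep's documented PrepPipeline attributes (the script actually used differed in its details, and swapping the interpolation backend requires editing pyprep itself):

    import mne
    from pyprep import PrepPipeline

    def count_bads(raw, seed=435656):
        """Run the PREP pipeline with a fixed seed and count the initially bad,
        interpolated, and still-bad channels. Attribute/key names follow the
        pyprep documentation; double-check them against the installed version."""
        prep_params = {
            "ref_chs": "eeg",
            "reref_chs": "eeg",
            "line_freqs": [],  # empty: line noise was already removed separately
        }
        montage = mne.channels.make_standard_montage("standard_1020")
        prep = PrepPipeline(raw, prep_params, montage, random_state=seed)
        prep.fit()
        return {
            "initial_bad": len(prep.noisy_channels_original["bad_all"]),
            "interpolated": len(prep.interpolated_channels),
            "remaining_bad": len(prep.still_noisy_channels),
        }

Running this once per interpolation backend, with the same file and the same seed, gives the three counts being compared above.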


a-hurst commented on May 25, 2024

I am not aware of a suitable dataset / simulation for that :-/ see also: autoreject/autoreject#50

I did a little digging yesterday and found this paper, which describes a MATLAB tool that sort of fits the bill? From what I understand, it takes fMRI data and uses a forward model to generate "biologically-plausible" EEG signals for a given set of electrodes at given coordinates.

Assuming the model is actually good enough to be useful, we could use it to simulate EEG signals for a given dataset, write some code to add different kinds of noise to different channels at random, and then use that to test the performance of PyPREP, Matlab PREP, AutoReject, or any other similar type of EEG preprocessing pipeline. Unfortunately it's in MATLAB so I wouldn't really know how to use it, but maybe I can find a friend who knows the language and owes me a favour...
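As a rough illustration of the second half of that idea, here is a sketch (independent of any particular simulator) that corrupts a random subset of EEG channels in a clean or simulated Raw and keeps the ground-truth labels for later scoring:

    import numpy as np
    import mne

    def corrupt_random_channels(raw, n_bad=4, seed=0):
        """Corrupt `n_bad` randomly chosen EEG channels in a copy of `raw` and
        return it together with the ground-truth list of corrupted channels."""
        rng = np.random.default_rng(seed)
        picks = mne.pick_types(raw.info, eeg=True)
        bad_idx = rng.choice(picks, size=n_bad, replace=False)
        data = raw.get_data()
        for i in bad_idx:
            if rng.random() < 0.5:
                # high-amplitude white noise (illustrative 100 microvolt scale)
                data[i] = rng.normal(0.0, 100e-6, size=data.shape[1])
            else:
                # flat-lined channel
                data[i] = 0.0
        corrupted = mne.io.RawArray(data, raw.info.copy())
        truth = [raw.ch_names[i] for i in bad_idx]
        return corrupted, truth

With the ground truth in hand, the sensitivity/specificity of PyPREP, MATLAB PREP, AutoReject, etc. could then be compared on the same corrupted recordings.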


sappelhoff commented on May 25, 2024

you could also look into this: https://mne.tools/stable/auto_examples/simulation/plot_simulate_raw_data.html?highlight=simulating


sappelhoff commented on May 25, 2024

To link to the code --> currently we are using a private function:

interpol_mat = _make_interpolation_matrix(reconstr_pos, chn_pos)

and that may bypass some of the checks and adjustments that MNE does, which may be necessary. It'd just be safer to use the public API, although it might be slower.
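Roughly, the two routes look like this (a sketch; the private import path is an MNE implementation detail and could change between releases, and the channel split below is just illustrative):

    import numpy as np
    import mne
    # private helper currently used by PyPREP; its import path is an MNE
    # implementation detail and may change between releases
    from mne.channels.interpolation import _make_interpolation_matrix

    # toy positions from a standard montage (the good/bad split is illustrative)
    montage = mne.channels.make_standard_montage("standard_1020")
    pos = montage.get_positions()["ch_pos"]
    good, bad = ["Fz", "F3", "F4", "C3", "C4", "Pz", "P3", "P4", "Oz"], ["Cz"]
    good_pos = np.array([pos[ch] for ch in good])
    bad_pos = np.array([pos[ch] for ch in bad])

    # Route 1: private function -- build the spline matrix directly from raw
    # (x, y, z) coordinates; no origin or unit checking happens here
    interp_mat = _make_interpolation_matrix(good_pos, bad_pos)  # (n_bad, n_good)
    fake_data = np.random.randn(len(good), 1000) * 1e-5
    predicted = interp_mat @ fake_data

    # Route 2: public API -- mark the channel bad on a Raw object and let MNE
    # handle origin fitting, unit checks, etc. (slower, but safer)
    info = mne.create_info(good + bad, sfreq=100.0, ch_types="eeg")
    raw = mne.io.RawArray(np.vstack([fake_data, np.zeros((1, 1000))]), info)
    raw.set_montage(montage)
    raw.info["bads"] = bad
    raw.interpolate_bads(reset_bads=True, origin="auto")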


yjmantilla commented on May 25, 2024

Ahh I see, it seems I misremembered the way interpolation is done in autoreject in #41. They do use Epochs.interpolate_bads(), but in the transform step, that is, not to detect the bad channels but to correct them.

So there is indeed something going on with the implementation.


a-hurst commented on May 25, 2024

@sappelhoff I was looking into this again this afternoon and noticed something that might explain some of #41 (I'll need to formally test)!

Basically, the user-facing MNE interpolate_bads function ensures that the electrode (x, y, z) coordinates are centered on an origin of (0, 0, 0) so that the assumptions of the spherical interpolation are satisfied (see the _check_origin function here). This seems to be crucial for accuracy, since the "standard_1005" montage is in MNI "realistic" units instead of spherical ones, and is thus not centered on zero by default.

PyPREP's (and seemingly autoreject's) interpolation methods both ignore this completely, which is bound to cause problems. A quick fix here would be to mimic interpolate_bads and use the public fit_sphere_to_headshape method to get the actual origin of a given montage and adjust the channel coordinates accordingly.
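A sketch of that fix, assuming the public mne.bem.fit_sphere_to_headshape together with the same private matrix helper PyPREP already uses:

    import numpy as np
    from mne.bem import fit_sphere_to_headshape
    from mne.channels.interpolation import _make_interpolation_matrix  # private

    def centered_interpolation_matrix(info, pos_good, pos_bad):
        """Re-center electrode positions on the fitted sphere origin before
        building the spline matrix, mimicking (roughly) what
        raw.interpolate_bads(origin="auto") does internally."""
        # fit_sphere_to_headshape returns (radius, origin_head, origin_device)
        _, origin_head, _ = fit_sphere_to_headshape(info, units="m")
        pos_good = np.asarray(pos_good) - origin_head
        pos_bad = np.asarray(pos_bad) - origin_head
        return _make_interpolation_matrix(pos_good, pos_bad)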

Additionally, I ran into a few MNE issue threads discussing spherical vs "realistic" montages and their costs/benefits, and noticed your posts about your eeg_positions package. For the kind of spherical interpolation we're doing here, would we expect the x/y/z coordinates from a "realistic" or "spherical" montage to give us better results?


sappelhoff commented on May 25, 2024

Nice, thanks for digging deeper into this!

A quick fix here would be to mimic interpolate_bads and use the public fit_sphere_to_headshape method to get the actual origin of a given montage and adjust the channel coordinates accordingly.

πŸ‘ yes πŸ€” ... unless people supply their own montage --> in that case, we should trust them to do the right thing?

For the kind of spherical interpolation we're doing here, would we expect the x/y/z coordinates from a "realistic" or "spherical" montage to give us better results?

For a dataset I collected, I also collected the electrode positions. I then compared the interpolation when using the "true" positions (measured), versus the "standard_1005" MNE montage --> the differences were very tiny. --> Of course that doesn't mean that the differences between the "realistic" (my measured ones, mne "standard_1005") and "spherical" (e.g., from eeg_positions) models would also be small πŸ€” --> this calls for a test! 😎


sappelhoff commented on May 25, 2024

I dug out the "gist" of what I did in my test (posted on some private lab channel):

interpolation with spherical splines: measured electrode positions ~ template electrode positions

Maybe an interesting fun fact for the EEG folks, yesterday evening I played around a bit with my electrode position data (from CapTrak) and the "spherical spline interpolation" (which builds a spherical head model based on electrode positions).

Taking a channel (true data), then interpolating it using spherical splines:
- once with the measured CapTrak electrode positions
- once with a 10-20 standard template for electrode positions (not measured)
and then correlating the interpolated signal with the true channel data.

I did this for all channels and subjects, and then compared the mean correlation between true data and interpolated data.

for measured electrode positions: 0.7062
for template electrode positions: 0.7083

--> I am a bit surprised that measured electrode positions really don't provide an advantage for spherical spline interpolation πŸ€·β€β™‚οΈ

--> I would have expected the correlation for measured >> template
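A minimal sketch of that comparison, assuming MNE's public interpolate_bads and one channel at a time:

    import numpy as np
    import mne

    def interp_correlation(raw, montage, ch_name):
        """Interpolate `ch_name` from the remaining channels using `montage`
        and correlate the result with the originally recorded signal."""
        true_sig = raw.get_data(picks=[ch_name])[0]
        test = raw.copy().load_data()
        test.set_montage(montage)            # measured or template positions
        test.info["bads"] = [ch_name]
        test.interpolate_bads(reset_bads=True, origin="auto")
        interp_sig = test.get_data(picks=[ch_name])[0]
        return np.corrcoef(true_sig, interp_sig)[0, 1]

Averaging this over channels and subjects, once with the measured CapTrak montage and once with the MNE standard template, would give the two mean correlations quoted above.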


a-hurst commented on May 25, 2024

πŸ‘ yes πŸ€” ... unless people supply their own montage --> in that case, we should trust them to do the right thing?

Well, the way MNE accomplishes it is to offer "origin" as a parameter, using that directly if it's a 3-value array, calculating it automatically with fit_sphere_to_headshape if it's the string "auto", and defaulting to an origin of (0, 0, 0) if the string is anything else. I guess we could modify the NoisyChannels.find_bads_by_ransac() method to add an origin parameter to mimic this, and add a corresponding "origin" parameter in the prep_params dict. I'll put together a PR for this and you can let me know what you think of the API!

I did this for all channels and subjects, and then compared the mean correlation between true data and interpolated data.

for measured electrode positions: 0.7062
for template electrode positions: 0.7083

--> I am a bit surprised that measured electrode positions really don't provide an advantage for spherical spline interpolation πŸ€·β€β™‚οΈ

--> I would have expected the correlation for measured >> template

This is cool, thanks for testing! Just to be clear: for the template electrode positions, are you using the standard MNE "standard_1020" montage or a spherical one generated using eeg_positions? If the latter, were you using the "Fpz-T8-Oz-T7" equator or the "Nz-T10-Iz-T9" one? I feel like that might impact the accuracy of the interpolated values as well.

I'm still wrapping my head around all the spherical interpolation logic, but my amateur intuition would be that the Perrin et al. (1989) interpolation math that both MNE and EEGLAB use is making the assumption of the electrodes being on a perfectly spherical head, and that using more accurate real-world electrode positions might end up breaking that assumption with problematic consequences.

Also in terms of comparison to MatPREP (?), from what I can find the default montages in EEGLAB were spherical ones up until this year, so that's likely what they expect/use as well.


sappelhoff commented on May 25, 2024

are you using the standard MNE "standard_1020" montage or a spherical one generated using eeg_positions

I was using the former --> I didn't test this with a spherical model (such as from eeg_positions) yet :-)

If the latter, were you using the "Fpz-T8-Oz-T7" equator or the "Nz-T10-Iz-T9" one?

this is a very new feature - why do you think that should impact the accuracy of the interpolated values? And in what way?

In both cases, positions are calculated on a sphere, and the relative electrode distances should be equivalent πŸ€”
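A quick empirical check of that claim could look like this. It assumes get_elec_coords exposes an equator argument selecting between the two contours; if the argument is named differently, adjust accordingly against the eeg_positions documentation:

    import numpy as np
    from scipy.spatial.distance import pdist
    from eeg_positions import get_elec_coords

    names = ["Fz", "Cz", "Pz", "Oz", "C3", "C4", "P3", "P4"]

    def pairwise_dists(equator):
        # assumption: get_elec_coords accepts an `equator` argument choosing the
        # equator contour -- check the eeg_positions docs for the exact name/values
        montage = get_elec_coords(elec_names=names, as_mne_montage=True,
                                  equator=equator)
        pos = montage.get_positions()["ch_pos"]
        return pdist(np.array([pos[ch] for ch in names]))

    d_nz = pairwise_dists("Nz-T10-Iz-T9")
    d_fpz = pairwise_dists("Fpz-T8-Oz-T7")
    print(np.abs(d_nz - d_fpz).max())  # near zero would support the claim above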

is making the assumption of the electrodes being on a perfectly spherical head, and that using more accurate real-world electrode positions might end up breaking that assumption with problematic consequences.

I don't have proof of this, but I was under the impression that the spherical spline approach is relatively robust - even when the underlying montage is not spherical.

Quoting my phd supervisor on this:

when interpolating a channel, those with the strongest influence will be those that surround this channel. It shouldn’t matter much if their locations vary by a few millimeters (standard <-> measured), as long as the basic layout is preserved (which it should). For which of our analyses would you need that? Unless doing source reconstruction, the recorded elec positions don’t have a real advantage

btw for his question

For which of our analyses would you need that?

my answer was more or less: "just out of curiosity"

from what I can find the default montages in EEGLAB were spherical ones up until this year

interesting! why did they change? do you have a link to a PR or issue?


a-hurst commented on May 25, 2024

If the latter, were you using the "Fpz-T8-Oz-T7" equator or the "Nz-T10-Iz-T9" one?

this is a very new feature - why do you think that should impact the accuracy of the interpolated values? And in what way?

Again, I haven't thought this through a ton, but my initial thinking was that the change in equator might change where the center of the sphere was, thus changing the polar coordinates of the electrodes and therefore the interpolation results. If the polar coordinates wouldn't change with a different equator, please disregard.

Regarding your PhD supervisor's thoughts on this: my understanding is that RANSAC works by choosing a bunch of random subsets from the "good" channels, using those random channels to repeatedly predict the signal for each channel in the file, and then comparing the predicted signal with the actual signal to identify "bad" channels. Since the default number of channel picks for each interpolation is 25% of all good channels, we would frequently get situations where none of the prediction channels are adjacent, so variance in the spatial coordinates would matter a lot more. Does that make sense?
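For concreteness, a stripped-down sketch of that RANSAC logic (using MNE's private spline-matrix helper; the per-window correlations and PREP's exact defaults are simplified away, so this is not pyprep's actual implementation):

    import numpy as np
    from mne.channels.interpolation import _make_interpolation_matrix  # private

    def ransac_correlations(data, positions, n_samples=50, frac=0.25, seed=0):
        """For each channel, the median correlation between its recorded signal
        and predictions made from random subsets of the other channels."""
        rng = np.random.default_rng(seed)
        n_chans = data.shape[0]
        n_pick = max(int(round(frac * n_chans)), 3)
        corrs = np.zeros((n_samples, n_chans))
        for s in range(n_samples):
            picks = rng.choice(n_chans, size=n_pick, replace=False)
            # spherical-spline matrix predicting *all* channels from the subset
            interp = _make_interpolation_matrix(positions[picks], positions)
            predicted = interp @ data[picks]
            for ch in range(n_chans):
                corrs[s, ch] = np.corrcoef(data[ch], predicted[ch])[0, 1]
        return np.median(corrs, axis=0)

    # channels whose median correlation falls below a threshold (~0.75 in PREP)
    # would be flagged as bad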

Really I'm just hoping this might help with #41, since it's a PyPREP vs MatPREP (?) difference I wasn't aware of until now and I'd love it if a spherical model magically fixed anything.

from what I can find the default montages in EEGLAB were spherical ones up until this year

interesting! why did they change? do you have a link to a PR or issue?

It's mentioned in their wiki page on channel locations:

As of 2021, the default channel location file for electrode position is the MNI file, which is best suited for source localization. Before 2021, it was the BESA spherical location file.


sappelhoff commented on May 25, 2024

Does that make sense?

I am really not sure. Intuitively it makes more sense to me that it doesn't make much of a difference where the equator is. If I select a small subset of electrodes on a sphere and imagine an interpolation across the sphere between them, then the relative distances between the electrodes are what matters for the interpolation across the surface, not whether LPA is on the equator or below it. 🤔

Really I'm just hoping this might help with #41, since it's a PyPREP vs MatPREP (?) difference I wasn't aware of until now and I'd love it if a spherical model magically fixed anything.

It might still be a fix 🤷‍♂️. In any case, I think using a spherical template makes more sense than one based on some "realistic" positions that we don't know anything about.

It's mentioned in their wiki page on channel locations:

thanks


a-hurst commented on May 25, 2024

@sappelhoff Okay, so I ran some head-to-head spherical-vs-realistic testing on the same open dataset I've been testing with in #41, and so far the results are... interesting. Note that in my local copy of PyPREP, I've also modified it to always correct the montage origin to (0, 0, 0) for the _make_interpolation_matrix part of RANSAC, just as is done in the official MNE raw.interpolate_bads method we use following robust re-referencing.

For the following plots, I first did the import and cleanline part of PREP for the first 10 IDs from the open dataset and saved them to .fif files, all with the same random seed, so that the only thing varying between runs would be the Robust Re-referencing part of the pipeline. Then I did 8 runs with varying random seeds for each ID: 4 with a spherical template generated by eeg_positions, and 4 with the built-in MNE standard_1020 which uses "realistic" positions. The difference in setup code between spherical and realistic runs can be seen here:

    # Imports assumed by this snippet
    import mne
    from eeg_positions import get_elec_coords

    # Load EEG data
    raw = mne.io.read_raw_fif(file_path)
    ch_names_eeg = raw.info["ch_names"]

    # Add a montage to the data (spherical from eeg_positions, or MNE's
    # "realistic" standard_1020 template)
    mne.datasets.eegbci.standardize(raw)
    if spherical:
        montage = get_elec_coords(elec_names=ch_names_eeg, as_mne_montage=True)
    else:
        montage_kind = "standard_1020"
        montage = mne.channels.make_standard_montage(montage_kind)
    raw.set_montage(montage)

Then I averaged the counts of "initial bad", "interpolated", and "remaining bad" channels for each ID/montage type and got the spherical-minus-realistic difference in counts for each type of bad channel for each ID. Below, you can see the mean spherical-minus-realistic differences (Β± 1 standard error) for each type of bad channel:
[Screenshot: mean spherical-minus-realistic differences (± 1 standard error) for each type of bad channel]
And here we have the same data in boxplot form to show the full ranges more clearly:
[Screenshot: boxplots of the spherical-minus-realistic differences for each type of bad channel]

Basically, using the spherical montage seems to be slightly worse than using the realistic one in terms of both initial bad channels and interpolated bad channels, and about even for remaining bads.
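The aggregation behind those plots was simple; roughly like this (a sketch assuming a long-format table of per-run counts; the file and column names are illustrative):

    import pandas as pd

    # hypothetical long-format results file with one row per run and bad-channel
    # type: columns id, montage ("spherical"/"realistic"), seed, bad_type, count
    runs = pd.read_csv("prep_run_counts.csv")

    means = (
        runs.groupby(["id", "bad_type", "montage"])["count"]
        .mean()
        .unstack("montage")
    )
    diff = means["spherical"] - means["realistic"]   # spherical minus realistic
    summary = diff.groupby(level="bad_type").agg(["mean", "sem"])
    print(summary)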


sappelhoff commented on May 25, 2024

I don't fully understand this part of your interpretation:

using the spherical montage seems to be slightly worse than using the realistic one in terms of both initial bad channels and interpolated bad channels, and about even for remaining bads

why is "spherical" worse? there are more initial bad channels and more interpolated channels than for "realistic" --> but we don't know what the true "labels" are, right? so all we can say is that spherical seems to detect more bad channels and also interpolated them (more "liberal") than realistic?

In any case, I am surprised about the difference here, thanks for testing.

It would be easy to add an option for users to select a spherical or a realistic template, ... but:

  1. we don't understand the caveats yet - we just know that they yield slightly different results
  2. although we could do this change by adding eeg_positions as a dependency to pyprep, I think the better change would be to make it in MNE-Python, where there was talk of adding a spherical template next to the already-existing realistic one (you probably saw that in mne-tools/mne-python#7472)

Both of the above issues would be important to address next, but I don't know where to start with "1", except perhaps to start asking experts on the spherical spline interpolation method, or to design new tests.


a-hurst commented on May 25, 2024

why is "spherical" worse? there are more initial bad channels and more interpolated channels than for "realistic" --> but we don't know what the true "labels" are, right? so all we can say is that spherical seems to detect more bad channels and also interpolated them (more "liberal") than realistic?

For the purposes of testing, I'm just using "worse" here to mean "matches up less closely with MATLAB PREP", which seems to detect far fewer bad channels on the same dataset (see my other post in #41). You're correct though that it would be useful to have testing data where we know a priori which channels are bad and why: either simulated data or known-clean data where we deliberately add different kinds of noise to different channels, so we could objectively measure the sensitivity/specificity of different EEG cleaning methods. Do you know if anything like this already exists?
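If such labelled data existed, the scoring itself would be trivial; a sketch:

    def score_detection(true_bads, detected_bads, all_chans):
        """Sensitivity/specificity of a bad-channel detector against known labels."""
        true_bads, detected_bads = set(true_bads), set(detected_bads)
        true_goods = set(all_chans) - true_bads
        tp = len(true_bads & detected_bads)
        tn = len(true_goods - detected_bads)
        sensitivity = tp / len(true_bads) if true_bads else float("nan")
        specificity = tn / len(true_goods) if true_goods else float("nan")
        return sensitivity, specificity

    # e.g. score_detection(known_bads, prep.interpolated_channels, raw.ch_names)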


sappelhoff commented on May 25, 2024

I am not aware of a suitable dataset / simulation for that :-/ see also: autoreject/autoreject#50


a-hurst commented on May 25, 2024

@sappelhoff Now that we've verified that the RANSAC interpolation matrix code itself is identical between MatPREP and PyPREP, can this issue be closed? Or is this something you'd like to revisit for maintenance reasons?


sappelhoff commented on May 25, 2024

Good point @a-hurst I think we can close this.

