While chugging along on the PyPREP/MatPREP comparison unit tests, I had a thought abou

I understood that <a class="user-mention notranslate" data-hovercard-type="user" data-

I understood that <a class="user-mention notranslate" data-hovercard-type

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Extend PrepPipeline to allow performing PREP in stages about pyprep HOT 11 OPEN

sappelhoff commented on May 25, 2024

Extend PrepPipeline to allow performing PREP in stages

from pyprep.

Comments (11)

yjmantilla commented on May 25, 2024 1

I understood that @a-hurst meant without the "final" interpolation. Not "without any interpolation at all". Now, I do wonder how it could be done that way:

The interpolation inside the robust referencing is always done, but it doesn't affect the real signal but a dummy-temporal one , that interpolation is only done to calculate the robust reference.

pyprep/pyprep/reference.py

Line 116 in 051edd6

dummy.interpolate_bads()

pyprep/pyprep/reference.py

Line 318 in 051edd6

raw_tmp.interpolate_bads()

What I get is that at the end of robust referencing PREP once again searches for bad channels but this time he interpolates it on the "real" signal . These channels then become the "interpolated channels":

pyprep/pyprep/reference.py

Line 153 in 051edd6

self.raw.interpolate_bads()

Then prep once again searches for bad channels and what he finds is named "still_noisy_channels".

from pyprep.

yjmantilla commented on May 25, 2024

I agree. Usually I dont do the remove line noise at all because the signals were already filtered.

from pyprep.

sappelhoff commented on May 25, 2024

I think I forgot how the robust re referencing works ... I am surprised that it can work without interpolation. Can you explain?

Other than that, the api proposal sounds good to me. As long as we have good documentation an warnings in place to prevent users from accidentally screwing up their data, it should be fine.

from pyprep.

a-hurst commented on May 25, 2024

I understood that @a-hurst meant without the "final" interpolation.

Yep, that's what I was referring to! As you said, during referencing NoisyChannels only ever does interpolation to make predictions during RANSAC. Then once it does its loops of NoisyChannels, it interpolates all 'bad' channels on a dummy copy of the data and uses that to calculate an average reference signal free of bad channels, and then re-references the data.

After that, NoisyChannels is run again, and any channels that are still flagged as 'bad' get interpolated. The signal is then average-referenced to account for the interpolated channels, and final run of NoisyChannels is used to detect any 'remaining bad' channels even after interpolating bads.

What I'm proposing is to have a method robust_reference that does the above up until interpolating bad channels post-rereference, and a separate interpolate_bads that does the interpolation and subsequent referencing/NoisyChannels detection, so it's possible to opt out of the extra steps and time required if it's not needed.

EDIT: Nevermind, I totally missed it! See the lines here:

pyprep/pyprep/reference.py

Lines 315 to 325 in 051edd6

 if self.noisy_channels["bad_all"]: 

 raw_tmp._data = signal * 1e-6 

 raw_tmp.info["bads"] = self.noisy_channels["bad_all"] 

 raw_tmp.interpolate_bads() 

 signal_tmp = raw_tmp.get_data() * 1e6 

 else: 

 signal_tmp = signal 

 self.reference_signal = ( 

 np.nanmean(raw_tmp.get_data(picks=self.reference_channels), axis=0) 

 * 1e6 

 )

I guess that means reference signal calculation is going to depend on MNE's interpolation code being identical to EEGLAB's (which, confusingly, uses a different function than interpolation does during RANSAC). Still, that doesn't change that initial re-referencing and final interpolation can be done separately (which is what I originally meant).

from pyprep.

a-hurst commented on May 25, 2024

@sappelhoff @yjmantilla So I started taking a crack at this thinking it might simplify #95 a bit, but I've run into a couple snags I'd like to get your thoughts on.

For one, a lot of the pipeline attributes are named in a way that assumes interpolation will always happen:

pyprep/pyprep/prep_pipeline.py

Lines 226 to 237 in 1c9589c

 self.noisy_channels_before_interpolation = ( 

 reference.noisy_channels_before_interpolation 

 ) 

 self.noisy_channels_after_interpolation = ( 

 reference.noisy_channels_after_interpolation 

 ) 

 self.bad_before_interpolation = reference.bad_before_interpolation 

 self.EEG_before_interpolation = reference.EEG_before_interpolation 

 self.reference_before_interpolation = reference.reference_signal 

 self.reference_after_interpolation = reference.reference_signal_new 

 self.interpolated_channels = reference.interpolated_channels 

 self.still_noisy_channels = reference.still_noisy_channels

With final interpolation of bads being made optional, should all the before_interpolation attributes be renamed along the lines of after_reference or something? That would be the most API-breaking change that's been made so far since they're properties existing scripts are likely to use, but the existing naming seems confusing if interpolation doesn't necessarily happen. What do you think?

Relatedly, I'm trying to decide whether it makes sense for interpolate_bads to be its own method, instead of just having a interpolate_bads argument to PrepPipeline.robust_reference. The code would be simpler with it all in one method, but is there any use-case you can think of where you'd want to robust reference, do something else with the data, and then interpolate bad channels afterwards?

from pyprep.

sappelhoff commented on May 25, 2024

I think that the upcoming release should be the one where we go wild in terms of API changes, with the hopes of having a much more stable API after 0.4.0 ... and eventually a fixed API (with deprecation cycles if necessary) starting at 1.0.0 --> so I agree with renaming attributes that don't make sense semantically (any more).

Re: Your second question I cannot think of a use case right now 🤔 if a user really wanted to do that wouldn't they be able to pass interpolate_bads=False and then later interpolate "manually" using the MNE-Python API and the information they can extract manually from the pyprep objects?

from pyprep.

yjmantilla commented on May 25, 2024

With final interpolation of bads being made optional, should all the before_interpolation attributes be renamed along the lines of after_reference or something?

Regarding the renaming to "reference"... Yeap, I dont see any more proper name, after all the reference does indeed split those two "before" and "after" objects regardless the interpolation being done or not.

That would be the most API-breaking change that's been made so far since they're properties existing scripts are likely to use, but the existing naming seems confusing if interpolation doesn't necessarily happen. What do you think?

I dont know if this is too hacky or confusing but what about using a decorator to redirect EEG_X_interpolation to EEG_X_reference (the later would be the one in the code). It would be similar to what was done here:

pyprep/pyprep/prep_pipeline.py

Lines 168 to 175 in 1c9589c

 @property 

 def raw(self): 

 """Return a version of self.raw_eeg that includes the non-eeg channels.""" 

 full_raw = self.raw_eeg.copy() 

 if self.raw_non_eeg is None: 

 return full_raw 

 else: 

 return full_raw.add_channels([self.raw_non_eeg])

That way we wouldn't break the previous notation... Downside is that it may be confusing (but could be just clarified in the docstring).

Relatedly, I'm trying to decide whether it makes sense for interpolate_bads to be its own method, instead of just having a interpolate_bads argument to PrepPipeline.robust_reference. The code would be simpler with it all in one method, but is there any use-case you can think of where you'd want to robust reference, do something else with the data, and then interpolate bad channels afterwards?

if a user really wanted to do that wouldn't they be able to pass interpolate_bads=False and then later interpolate "manually" using the MNE-Python API and the information they can extract manually from the pyprep objects?

I agree with @sappelhoff , in that case he would pass false and then interpolate with his preferred way.

Now, I understand dummy.interpolate() in the following code is just mne stuff, so maybe we can expose the _eeglab_interpolate_bads ; I mean, this function is what diverges from MNE interpolation so in a way is the only thing I see we could provide as a "different interpolation" from what is already on MNE. Nevertheless , this would be a too technical use-case and an hypothetical user that needs it wouldn't have trouble with just using the private function. So, I think overall is not necessary to make interpolated_bads a method.

pyprep/pyprep/reference.py

Lines 151 to 154 in 1c9589c

 if self.matlab_strict: 

 _eeglab_interpolate_bads(self.raw) 

 else: 

 self.raw.interpolate_bads()

from pyprep.

a-hurst commented on May 25, 2024

@sappelhoff @yjmantilla Okay, I've got a rough draft for reorganized/renamed attributes for the PrepPipeline object, am hoping to get your thoughts before I go and write docstrings or anything:

        # NOTE: 'original' refers to before initial average reference, not first
        # pass afterwards. Not necessarily comparable to later values?
        self.noisy_info = {
            "original": None, "post-reference": None, "post-interpolation": None
        }
        self.bad_channels = {
            "original": None, "post-reference": None, "post-interpolation": None
        }
        self.reference_signals = {"robust": None, "post-interpolation": None}

        self.robust_reference_signal = None
        self._interpolated_reference_signal = None

    @property
    def current_noisy_info(self):
        post_ref = self.noisy_info["post-reference"]
        post_interp = self.noisy_info["post-interpolation"]
        return post_interp if post_interp else post_ref

    @property
    def remaining_bad_channels(self):
        post_ref = self.bad_channels["post-reference"]
        post_interp = self.bad_channels["post-interpolation"]
        return post_interp if post_interp else post_ref

    @property
    def current_reference_signal(self):
        post_ref = self.robust_reference_signal
        post_interp = self._interpolated_reference_signal
        return post_interp if post_interp else post_ref

Basically I've grouped all the bad channel lists & noisy channels dicts into dictionaries with keys for each major timepoint, with the values being None if that part of the pipeline hasn't been run. I've also added some getter methods to get the info for the pipeline's current state, which will fetch the post-interpolation values if interpolation was used, or the post-reference values otherwise. Does this seem like a good approach? Or is it a big and largely unnecessary change when it's just the post-reference variable names that are a problem?

I dont know if this is too hacky or confusing but what about using a decorator to redirect EEG_X_interpolation to EEG_X_reference (the later would be the one in the code). It would be similar to what was done here:

I was thinking about that: it's definitely possible (including with the above reorganization), I'm just not sure whether the project has enough users at this stage to warrant properly deprecating methods/attributes instead of just dropping them. Do we have any PyPi download stats that would give us an idea?

Now, I understand dummy.interpolate() in the following code is just mne stuff, so maybe we can expose the _eeglab_interpolate_bads ; I mean, this function is what diverges from MNE interpolation so in a way is the only thing I see we could provide as a "different interpolation" from what is already on MNE. Nevertheless , this would be a too technical use-case and an hypothetical user that needs it wouldn't have trouble with just using the private function. So, I think overall is not necessary to make interpolated_bads a method.

In the Reference class itself I've made interpolate_bads a separate method, so I guess if someone really has a use-case where they want to re-reference and interpolate separately they can do it that way.

from pyprep.

sappelhoff commented on May 25, 2024

Do we have any PyPi download stats that would give us an idea?

https://pypistats.org/packages/pyprep

But any person who reasonably uses pyprep currently is not installing it via pip, I think - but rather directly from GitHub.
Also, I have had the experience that the pypi stats are often inflated by bots, CI runs, mirrors, etc. and it's hard to get clean data on human users that actually use your package. I think that GitHub stars/forks/watchings and number of raised issues (by different people) are better indicators.

I'm just not sure whether the project has enough users at this stage to warrant properly deprecating methods/attributes instead of just dropping them.

IMHO we should not properly deprecate ... let's do what we think is best without looking back - for now. We have a prominent warning in the README that this will be the case (alpha software).

Regarding the proposed getter methods: I think as a user I'd like to have access to all data, and not "being served a part of the data based on the state my object is in", while the other data is present, but hidden from me. Or at least, I'd like to have control over what data I can get (with reasonable defaults so that I don't have to give it much thought).

from pyprep.

a-hurst commented on May 25, 2024

Regarding the proposed getter methods: I think as a user I'd like to have access to all data, and not "being served a part of the data based on the state my object is in", while the other data is present, but hidden from me. Or at least, I'd like to have control over what data I can get (with reasonable defaults so that I don't have to give it much thought).

I was thinking that the self.noisy_info and self.bad_channels dicts would be proper documented attributes as well, so that users could access the NoisyChannels dicts and bad channel names at each stage directly. The "getter" methods here are just so that there would be a consistent interface for getting post-reference info regardless of whether interpolating bad channels was enabled or disabled (allows for writing code where the option is easily toggled on/off).

If you think that separate attributes for the original/post-reference/post-interpolation info makes for a better API than the dict approach though I'll change it accordingly 🙂

from pyprep.

sappelhoff commented on May 25, 2024

ahh I see, makes sense - thanks for the explanation :)

from pyprep.

Extend PrepPipeline to allow performing PREP in stages about pyprep HOT 11 OPEN

Comments (11)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

	if self.noisy_channels["bad_all"]:
	raw_tmp._data = signal * 1e-6
	raw_tmp.info["bads"] = self.noisy_channels["bad_all"]
	raw_tmp.interpolate_bads()
	signal_tmp = raw_tmp.get_data() * 1e6
	else:
	signal_tmp = signal
	self.reference_signal = (
	np.nanmean(raw_tmp.get_data(picks=self.reference_channels), axis=0)
	* 1e6
	)

	self.noisy_channels_before_interpolation = (
	reference.noisy_channels_before_interpolation
	)
	self.noisy_channels_after_interpolation = (
	reference.noisy_channels_after_interpolation
	)
	self.bad_before_interpolation = reference.bad_before_interpolation
	self.EEG_before_interpolation = reference.EEG_before_interpolation
	self.reference_before_interpolation = reference.reference_signal
	self.reference_after_interpolation = reference.reference_signal_new
	self.interpolated_channels = reference.interpolated_channels
	self.still_noisy_channels = reference.still_noisy_channels

	@property
	def raw(self):
	"""Return a version of self.raw_eeg that includes the non-eeg channels."""
	full_raw = self.raw_eeg.copy()
	if self.raw_non_eeg is None:
	return full_raw
	else:
	return full_raw.add_channels([self.raw_non_eeg])

	if self.matlab_strict:
	_eeglab_interpolate_bads(self.raw)
	else:
	self.raw.interpolate_bads()