
Comments (11)

robintibor avatar robintibor commented on June 3, 2024 1

I will have a try with that right now, actually, since it is giving unexpected results with dict. We can still see which interface we prefer after.

from braindecode.

gemeinl avatar gemeinl commented on June 3, 2024
  1. Requiring mne object-awareness seems fine to me. We rely on mne a lot and hence also expect users to know mne and the mne API. We should also keep `apply_function()` (https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.apply_function) in mind for custom transforms.

  2. I think we first need to know whether on-the-fly transforms are fast enough to be used with lazy loading. Would storing transforms in the datasets then be in addition to the current transforms?


hubertjb avatar hubertjb commented on June 3, 2024
> Requiring mne object-awareness seems fine to me. We rely on mne a lot and hence also expect users to know mne and the mne API. We should also keep `apply_function()` (https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.apply_function) in mind for custom transforms.

True, `apply_function` would help here, and it conveniently works in-place! We could then rewrite `_transform`:

```python
def _transform(raw_or_epochs, transforms):
    for transform, transform_kwargs in transforms.items():
        if callable(transform):
            # kwargs must be unpacked: apply_function's second positional
            # argument is picks, not a kwargs dict
            raw_or_epochs.apply_function(transform, **transform_kwargs)
        else:
            if not hasattr(raw_or_epochs, transform):
                raise AttributeError(
                    f'MNE object does not have {transform} method.')
            getattr(raw_or_epochs.load_data(), transform)(**transform_kwargs)
```
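To see this dispatch in action without mne, here is a self-contained sketch; `StubRaw` and its methods are stand-ins invented for illustration, not real mne API:

```python
# StubRaw mimics just enough of an mne Raw for the dispatch loop below;
# everything here is illustrative, not real mne code.
class StubRaw:
    def __init__(self, data):
        self.data = data
        self.filtered = None

    def load_data(self):
        return self

    def apply_function(self, fun, **kwargs):
        # a real Raw.apply_function applies fun to the signal array
        self.data = [fun(v) for v in self.data]

    def filter(self, l_freq=None, h_freq=None):
        # a real Raw.filter would band-pass the signal; we just record the band
        self.filtered = (l_freq, h_freq)


def _transform(raw_or_epochs, transforms):
    # same dispatch as above: callables go through apply_function,
    # strings are looked up as method names
    for transform, transform_kwargs in transforms.items():
        if callable(transform):
            raw_or_epochs.apply_function(transform, **transform_kwargs)
        else:
            if not hasattr(raw_or_epochs, transform):
                raise AttributeError(
                    f'MNE object does not have {transform} method.')
            getattr(raw_or_epochs.load_data(), transform)(**transform_kwargs)


raw = StubRaw([1.0, 2.0])
_transform(raw, {'filter': dict(l_freq=4., h_freq=38.),
                 (lambda x: x * 1e6): {}})
print(raw.data)      # [1000000.0, 2000000.0]
print(raw.filtered)  # (4.0, 38.0)
```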

> I think we first need to know whether on-the-fly transforms are fast enough to be used with lazy loading.

I think there is no way around this with super large datasets, don't you think? Unless we want to keep preprocessing everything offline, saving the preprocessed files to disk, and then loading them at training time.

Another point here is that if we were to work directly on numpy arrays (instead of on MNE objects) we might get a little performance gain with some transforms.

> Would storing transforms in the datasets then be in addition to the current transforms?

That was part of my question: the way we had implemented it during the sprint, the transforms were stored in the dataset and either applied in `__init__` (if loading == eager) or in `__getitem__` (if loading == lazy). The new `transform_concat_ds` function takes care of the first case, but outside of the dataset object. Do we need both approaches? I liked the solution with a list of transforms stored inside the dataset object, as it's pretty flexible and also follows torchvision. The `transform_concat_ds` function is nice too, but I think it limits us to the eager loading scenario.
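The eager/lazy split described here can be sketched as follows; `TransformableDataset` is a hypothetical toy class following the torchvision-style design, not braindecode's actual implementation:

```python
# Toy dataset that stores its own transform list, following the torchvision
# pattern of keeping transforms on the dataset object; names are made up.
class TransformableDataset:
    def __init__(self, examples, transforms, loading='eager'):
        self.transforms = transforms
        self.loading = loading
        if loading == 'eager':
            # eager: apply once in __init__, paying the cost up front
            examples = [self._apply(x) for x in examples]
        self.examples = examples

    def _apply(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x

    def __getitem__(self, idx):
        x = self.examples[idx]
        # lazy: apply on every access, so the raw data stays untouched
        return self._apply(x) if self.loading == 'lazy' else x

    def __len__(self):
        return len(self.examples)


transforms = [lambda x: x * 2, lambda x: x + 1]
eager = TransformableDataset([1, 2], transforms, loading='eager')
lazy = TransformableDataset([1, 2], transforms, loading='lazy')
print(eager[0], lazy[0])  # 3 3 -- same result, work done at different times
```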


robintibor avatar robintibor commented on June 3, 2024

I would argue for first having a clean and simple implementation of the eager / already-preprocessed lazy loading version. From that we can still extend to on-the-fly transforms, which we may only need on a per-crop basis in the end anyway. Better to have a clean, fully implemented version that runs on actual examples and to go from there.


robintibor avatar robintibor commented on June 3, 2024

ok, now we have a working version for the eager case. I would still like to make some small improvements to the usage, e.g., right now we supply an OrderedDict, while we should simply supply a list. We don't use the dict functionality as far as I can see(?), and we have already created a bug/unintended behavior in an example!

```python
raw_transform_dict = OrderedDict([
    ("pick_types", dict(eeg=True, meg=False, stim=False)),
    ('apply_function', dict(fun=lambda x: x*1e6, channel_wise=False)),
    ('filter', dict(l_freq=low_cut_hz, h_freq=high_cut_hz)),
    ('apply_function', dict(fun=standardize_func, channel_wise=False))
])
```

The first `apply_function` is overwritten and never called! @gemeinl
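A quick pure-Python check confirms the silent overwrite (the transform names and values here are just for illustration):

```python
from collections import OrderedDict

# Duplicate keys collapse: the constructor keeps the position of the first
# 'apply_function' entry but overwrites its value with the last one.
transforms = OrderedDict([
    ('pick_types', dict(eeg=True)),
    ('apply_function', dict(fun='scale_to_uV')),
    ('filter', dict(l_freq=4., h_freq=38.)),
    ('apply_function', dict(fun='standardize')),
])
print(len(transforms))               # 3, not 4
print(transforms['apply_function'])  # {'fun': 'standardize'} -- scaling is lost
```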
So let's remove the OrderedDict and simply supply a list. Another alternative, as discussed before, would be to call them line by line with some syntax like:

```python
transform_concat_ds(pick_types, eeg=True, meg=False, stim=False)
transform_concat_ds(apply_function, fun=lambda x: x * 1e6, channel_wise=False)
```

Which one do you guys prefer? @hubertjb @sliwy @gemeinl


robintibor avatar robintibor commented on June 3, 2024

Also, as we have confused @agramfort ;) #90 (comment), maybe we can also think about a different name for the function? Like `apply_to_raws` or `apply_to_concat_ds`? Although I feel `transform_raws` would be fine for me (ducking from @agramfort ;))


agramfort avatar agramfort commented on June 3, 2024


hubertjb avatar hubertjb commented on June 3, 2024

@robintibor what would the list look like for the example you linked? I suppose something like this?

```python
raw_transforms = [
    ["pick_types", dict(eeg=True, meg=False, stim=False)],
    ['apply_function', dict(fun=lambda x: x*1e6, channel_wise=False)],
    ['filter', dict(l_freq=low_cut_hz, h_freq=high_cut_hz)],
    ['apply_function', dict(fun=standardize_func, channel_wise=False)]
]
```
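Applying such a list is then just a loop over (name, kwargs) pairs. A sketch with a stub object (`StubRaw` is invented for illustration, and the frequency values stand in for `low_cut_hz`/`high_cut_hz`):

```python
# StubRaw records which methods were called; with a list, every entry runs,
# including both 'apply_function' transforms that the OrderedDict collapsed.
class StubRaw:
    def __init__(self):
        self.calls = []

    def pick_types(self, **kwargs):
        self.calls.append('pick_types')

    def apply_function(self, **kwargs):
        self.calls.append('apply_function')

    def filter(self, **kwargs):
        self.calls.append('filter')


raw_transforms = [
    ['pick_types', dict(eeg=True, meg=False, stim=False)],
    ['apply_function', dict(fun=lambda x: x * 1e6, channel_wise=False)],
    ['filter', dict(l_freq=4., h_freq=38.)],
    ['apply_function', dict(fun=lambda x: x, channel_wise=False)],
]

raw = StubRaw()
for name, kwargs in raw_transforms:
    getattr(raw, name)(**kwargs)

print(raw.calls.count('apply_function'))  # 2 -- nothing is silently dropped
```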


robintibor avatar robintibor commented on June 3, 2024

Yes, like that @hubertjb, so really just removing the wrapping in OrderedDict.


hubertjb avatar hubertjb commented on June 3, 2024

I'm personally in favor of the list approach then. I like the convenience of having all the transforms in a single variable that can be reused if needed.


robintibor avatar robintibor commented on June 3, 2024

oki, now implemented in #94

