
Comments (26)

a-hurst commented on May 25, 2024

@sappelhoff I've never used it, but there's a promising-looking package called oct2py that lets you run MATLAB functions via Octave through Python, passing and receiving data as numpy arrays.

If we can get it working with the basics of PREP, it could be handy for both CI and for our validation efforts since it would let us substitute individual PyPREP functions with their MATLAB PREP equivalents to compare things like interpolation directly between packages.
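
For reference, here's a minimal sketch of what that could look like, assuming Octave is installed and using a made-up path for the MATLAB PREP code:

import numpy as np
from oct2py import Oct2Py

oc = Oct2Py()
oc.addpath("/path/to/PrepPipeline")   # placeholder location of the MATLAB PREP sources

# numpy arrays are passed in and returned transparently
data = np.random.randn(1000, 32)      # samples x channels
corrs = oc.feval("corrcoef", data)    # calls Octave's corrcoef, returns a 32 x 32 ndarray
print(corrs.shape)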

jasmainak commented on May 25, 2024

I did a quick search of autoreject issues and prep (matlab) issues and PRs, but couldn't find anything

There are none. Autoreject was still a private repo at the time the RANSAC in PREP was implemented, and I didn't want to go down a rabbit hole fixing PREP. I had some ideas to improve RANSAC in PREP but never got around to them :)

a-hurst commented on May 25, 2024

@sappelhoff I'll get to doing a proper fork eventually: right now I'm still ironing out what's Octave-compatible and what isn't (which is made a lot more difficult by every change/fix taking half a day of chugging along to test). This morning I woke up to this error after two successful iterations of findNoisyChannels, so I guess that's the next thing to figure out:

oct2py.utils.Oct2PyError: Octave evaluation error:
error: structure has no member 'etc'
error: called from:
    pop_select at line 196, column 1
    eeg_interp at line 158, column 19
    interpolateChannels at line 49, column 12
    robustReference at line 80, column 19
    performReference>doRobustPost at line 118, column 22
    performReference at line 76, column 5

Because I'm lazy and don't know MATLAB too well, I'm going to try "print-by-line" profiling here to get a better sense of what's going on with the correlation speed. That said, if you know of a better/easier way of profiling MATLAB code that's Octave-compatible, please let me know!

a-hurst commented on May 25, 2024

So far the changes are minimal, but of course you're right: I'll start tracking things more formally as I go.

Also, found the cause of the slowness: the calculation of the correlation coefficients for each 1-second window of the file takes 2-to-3 seconds per window for some reason?

windowCorrelation = corrcoef(eegPortion);

Since the file is about 2700 seconds long, that adds up to several hours. No clue why it's so slow here (I'm assuming it's faster in regular MATLAB, since it's massively faster in PyPREP), but I'll look into it!
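
For reference, a rough numpy equivalent of that loop (toy sizes and made-up variable names) should take well under a second per window, so something is clearly off on the Octave side:

import numpy as np

srate, n_chans, n_secs = 160, 64, 60              # toy sizes; the real file is ~2700 s long
eeg = np.random.randn(n_chans, n_secs * srate)    # channels x samples

corrs = np.empty((n_secs, n_chans, n_chans))
for w in range(n_secs):
    window = eeg[:, w * srate:(w + 1) * srate]    # one 1-second window
    corrs[w] = np.corrcoef(window)                # channel x channel correlations for that window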

a-hurst commented on May 25, 2024

why not set up a MATLAB CI flow that installs EEGLAB and a recent version of PREP and then runs on a dataset of our choice, and then the results are passed on to the next step in the CI where we run pyprep and then compare results?

Two questions on this:

  1. Would the caching of build artifacts like this mean that it'd only re-run the MATLAB script any time it was changed, instead of every time the CI tests are run?
  2. Would the artifacts be accessible from all the different CI bots (Windows/macOS/Linux), or just the one it was run in? I ask because MATLAB support on both GitHub Actions and Travis CI is Linux-only. I guess the MATLAB PREP comparison could be its own separate runner in the CI matrix.

My main rationale behind the separate-repo-artifacts approach was that it would get around potential issues 1) and 2) easily, and also allow for running of local unit tests of MatPREP vs. PyPREP without a full MATLAB/EEGLAB/PREP install (using urllib or similar to just download the artifacts to a temporary folder for local tests). If 1 & 2 aren't issues, then doing everything in a single repository definitely makes sense.

my only concern is that we also start testing IO roundtrip accuracies there and I hope that won't trip us up.

From my local testing, the EEG float values seem to be identical in an MNE-imported .set vs the same .set in EEGLAB (other than the volts/microVolts conversion) so this thankfully shouldn't be an issue!

Let's only use .set and not the set+fdt combo

Ah cool, didn't even know that was possible! Single-file is definitely much cleaner to work with. The MATLAB/EEGLAB world is still very foreign to me...

a-hurst commented on May 25, 2024

@sappelhoff Spent most of the day hacking away at this, and after much hacking and debugging I have a basic proof-of-concept working: https://github.com/a-hurst/matprep_artifacts/runs/2478319171?check_suite_focus=true

Right now it's just generating the .mat with the internal NoisyChannels info and it's still in a hacked-until-it-worked state, but definitely a relief to see the basics up and running!

EDIT: Got a chance to tidy it up; now it's got a README and generates single-file .set files at each main stage of the pipeline, in addition to the noisy_info .mat. All that's left is to set up a "release-on-tag" mechanism and finalize the choice of the test file (using S004R04.edf from the eegbci dataset instead of S004R01.edf, which pyprep currently uses for tests). I opted for the other file because it's twice as long, so there's a better chance of RANSAC having enough windows to detect bads properly, but I'm also open to S004R01.edf for speed/consistency.

a-hurst commented on May 25, 2024

I just took a look --> that looks awesome! Super clean / clear! I really like that you separated config/settings/setup and the .m script.

Thanks! One of my biggest long-standing annoyances with scripting in the sciences is that people like to bury all the configurable setting switches throughout their scripts (or don't make key things configurable at all) instead of having it all in one clear and obvious place, so I always try to make a separate settings.R/.m/.py script whenever it makes sense.
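
As a tiny illustration of what I mean (the file names and values here are made up):

# settings.py -- every configurable switch in one obvious place
TEST_FILE = "S004R04.edf"
LINE_FREQS = [60.0]
DO_RANSAC = True
RANDOM_SEED = 42

# run_prep.py -- the processing script only imports the settings, never hard-codes them
# from settings import TEST_FILE, LINE_FREQS, DO_RANSAC, RANDOM_SEED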

If you've got no strong preference on the test file I'll move along with S004R04.edf and see how things go on the Python end! The nice thing about the script setup is that the test file could be changed at any time and it wouldn't affect the Python side at all: since it's using 1_matprep_raw.set as its starting point, it'll always be starting with the same data as MATLAB PREP.

sappelhoff commented on May 25, 2024

regarding the organization of repos etc., we could also:

  • leave pyprep under sappelhoff
  • leave matprep_artifacts under a-hurst

... but include matprep_artifacts in pyprep as a git submodule ... to keep a record of it within pyprep.

I'm thinking about this also with a view to the longevity of the project: for release 0.4 I'd want to switch on Zenodo tracking of the repo, i.e., to permanently store/archive a copy of the full code on an independent site. And it'd be cool if the matprep artifacts could be part of that archive, which I think they would be via the git submodules solution. 🤔 💭 (but I have never tested this)

WDYT?

EDIT (even before sending off): Actually git submodules seem to not be included in the zenodo archive 😕 it's an open issue: zenodo/zenodo#1036

So at least you know my thought process now. If anyone can come up with a good solution for proper archival and cross-referencing of "all things pyprep", that'd be great.

a-hurst commented on May 25, 2024

@sappelhoff Sounds like a good idea to me! Apologies for never replying to your earlier post, I must have missed it somehow.

And the Zenodo archiving sounds good too, would be nice to have a proper DOI to cite for PyPREP!

a-hurst commented on May 25, 2024

Direct comparison during CI like this is a good idea, but I think we can probably do it without actually adding any MATLAB code to the CI: as long as the MATLAB PREP results are consistent given a fixed random seed, we could run MATLAB PREP on a file locally, upload the resulting .set to Google Drive or a similar service that allows downloading files with curl/wget, and then have the CI just download it prior to running PyPREP tests.

Also, I remember @yjmantilla mentioning that PyPREP intentionally differs in some minor ways from MATLAB PREP because they do some weird things with medians or rounding or something. To make tests like this work, we'd need to add a special flag of some kind that would make PyPREP do the same weird things for comparison purposes, so that we can be sure the only functional differences are in those areas.

sappelhoff commented on May 25, 2024

but I think we can probably do it without actually adding any MATLAB code to the CI: as long as the MATLAB PREP results are consistent given a fixed random seed, we could run MATLAB PREP on a file locally, upload the resulting .set to Google Drive or a similar service that allows downloading files with curl/wget, and then have the CI just download it prior to running PyPREP tests.

agreed, that'd be a good first step

Having it properly implemented in CI however serves the additional purposes of:

  • having the MATLAB pipeline documented transparently and easily accessible
  • having the opportunity to be notified about upstream changes in MATLAB prep

because they do some weird things with medians or rounding or something.

yes, I think they didn't take the proper median for an even number of channels ... not sure if that has been fixed by now. We should check it out if we find the time. I did a quick search of autoreject issues and prep (matlab) issues and PRs, but couldn't find anything. @jasmainak originally detected that problem, maybe he knows where to find it again.

To make tests like this work, we'd need to add a special flag of some kind that would make PyPREP do the same weird things for comparison purposes, so that we can be sure the only functional differences are in those areas.

yes, that'd need some clever fixes ... such as flags. or preferably, selecting a test dataset where that specific median-difference doesn't matter between PREP and PYPREP.
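
Just to sketch the flag idea: something like this could live at each point where the two implementations differ. The function name and the even-channel branch below are placeholders (not pyprep's actual API); the real quirk would be copied from the MATLAB source once we find it:

import numpy as np

def channel_median(values, matlab_strict=False):
    """Median across channels, optionally mimicking MATLAB PREP's variant."""
    values = np.sort(np.asarray(values, dtype=float))
    if matlab_strict and values.size % 2 == 0:
        # placeholder for whatever MATLAB PREP actually does with an even channel count
        return float(values[values.size // 2])
    return float(np.median(values))               # the "proper" median

The comparison tests would then run everything with matlab_strict=True, while the default stays at the "best practice" behaviour.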

yjmantilla commented on May 25, 2024

@sappelhoff

yes, that'd need some clever fixes ... such as flags. or preferably, selecting a test dataset where that specific median-difference doesn't matter between PREP and PYPREP.

On this one I would go for the flag: that is, if someone wants to go with MATLAB PREP's way, they can do it through the flag. I guess we should really open an issue there to see what's going on with that. Maybe they will correct it and there won't be any need for the flag.

Also, I don't know how easy it would be to find such a dataset... maybe it's super easy, but the flag would suffice in case it's hard to find.

With #45 the CI comparison should become easier.

sappelhoff commented on May 25, 2024

yes, would be fine with me. I guess the first job would be to go back to MATLAB PREP and find the ominous false-median-computation.

... and then all other points in code where pyprep and prep deviate. THEN we can think about how best to allow users to change pyprep from "best practice in our opinion" to original PREP.

yjmantilla commented on May 25, 2024

jasmainak reported these lines in #21

a-hurst commented on May 25, 2024

@sappelhoff @yjmantilla A quick update I thought you'd be interested in: during my recent burst of PyPREP debugging I spent a few hours hacking away at things and got EEGLAB + PREP (somewhat) running via oct2py:

[Screenshot (2021-03-22): EEGLAB + PREP running via oct2py]

Most of the work was figuring out how to convert MNE Raw objects into the EEGLAB format, which now seems to be working well enough for PREP to not error out immediately. A quick rundown of some problems to eventually overcome:

  • PREP is not natively Octave-compatible: it relies on a couple odd MATLAB-only functions that I needed to substitute or comment out. Fine for testing locally, but would require workarounds for CI usage.
  • The findNoisyChannels bit that does the channel correlations before RANSAC is slooooooooooooowwwwwwww, like 4-5 hours per iteration slow. Guessing this is an Octave issue as well? At its current pace, it's going to take a full day to do robust referencing on a file that takes maybe 6-10 minutes for the same via PyPREP.

In other words, it seems to work, but there seem to be some changes needed to the MATLAB code itself before CI comparison is doable.
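
For the record, the conversion boils down to something like the sketch below; the EEGLAB path is a placeholder, channel locations (the fiddly part) are left out, and the exact pop_importdata call may still need tweaking:

from oct2py import Oct2Py

def raw_to_eeglab(raw, eeglab_path="/path/to/eeglab"):
    """Push an MNE Raw object into an Octave session as an EEGLAB EEG struct."""
    oc = Oct2Py()
    oc.eval("addpath(genpath('%s'));" % eeglab_path)
    oc.push("data", raw.get_data() * 1e6)          # MNE stores volts, EEGLAB expects microvolts
    oc.push("srate", float(raw.info["sfreq"]))
    oc.eval("EEG = pop_importdata('dataformat', 'array', 'data', data, 'srate', srate);")
    return oc                                      # keep the session around to call PREP functions on EEG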

sappelhoff commented on May 25, 2024

nice, thanks! 😎

Now did you by any chance do your changes by

  1. forking PREP
  2. making a new branch on your fork and adjusting for octave

?

Because then you could push that branch up to GitHub and open a PR to your own fork --> that'll let us inspect the git diff, so we can better see the extent of the adjustments :-)

it's surprising that it's so ultra slow now. Perhaps profiling the function would be a good idea, to see exactly where this slowing comes from. 🤔

sappelhoff commented on May 25, 2024

I see - and I think there's a very high chance of getting lost by not following the workflow of:

fork, then clone, then check out a new branch, and then add your changes/adjustments incrementally with new commits that you then push to GitHub (on your fork), using meaningful commit messages

... so just a recommendation to start with that better early than late :)

apart from that I don't have good advice unfortunately :-(

a-hurst commented on May 25, 2024

@sappelhoff Just a quick status update: I've given up on oct2py for the time being given how horribly slow Octave seems to be at findNoisyChannels, but I realized I've got a site licence for MATLAB from my institution, so I installed it and got MatPREP working smoothly using the script @yjmantilla shared a few months back.

In doing so, I noticed that MatPREP retains a lot of debug info regarding the findNoisyChannels process that PyPREP doesn't: e.g. the full matrices of regular correlations and RANSAC correlations for each channel/window, the HF noise z-scores for each channel, the fractions of "bad" RANSAC windows for each channel, etc. What I did yesterday was:

  1. Write a Python script to import and parse the PREP debug info .mat files into Python/csv format (see the sketch just after this list)
  2. Adapt PyPREP to collect most of the same debug information
  3. Modify the MatPREP script to only perform CleanLine and save the resulting data for each participant to a .set/.fdt
  4. Modify my PyPREP script to import the .set/.fdt files, apply the EEGLAB montage, and perform robust re-referencing, saving the debug info to pickle files for later parsing. This theoretically allows for the closest possible comparison between PyPREP's NoisyChannels and MatPREP's, and also saves a ton of time waiting for CleanLine during testing.
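
For step 1, the parsing side is basically just scipy's loadmat plus a recursive struct-to-dict conversion; no PREP field names are hard-coded here, since they depend on what the MATLAB script actually saves:

from scipy.io import loadmat

def _to_dict(obj):
    """Recursively convert loadmat's struct objects into nested dicts."""
    if hasattr(obj, "_fieldnames"):                # mat_struct from struct_as_record=False
        return {name: _to_dict(getattr(obj, name)) for name in obj._fieldnames}
    return obj

def load_matprep_info(path):
    mat = loadmat(path, squeeze_me=True, struct_as_record=False)
    return {key: _to_dict(val) for key, val in mat.items() if not key.startswith("__")}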

With the detailed debug information for both implementations saved such that they can be directly compared, hopefully I can get to the bottom of the NoisyChannels differences!

EDIT: From what I can tell, it looks like PyPREP and MatPREP start to diverge as soon as removeTrend is done: the input values are identical in both immediately before (I set breakpoints in both and checked), but immediately after, the results diverge quite a bit:

First sample of first 10 channels in MATLAB PREP before trend removal:

signal.data(:, 1)

ans =

   1.0e+04 *

   -0.3335
    1.0037
   -0.3581
   -0.1795
    0.9913
   -0.7414
   -0.7198
   -0.5665
    0.5557
   -1.2237

First sample of first 10 channels in MATLAB PREP after trend removal:

signal2.data(:, 1)

ans =

   -3.7068
   -4.8012
   -0.8008
   -2.5809
   -1.5205
    0.1093
    1.9928
   -4.6906
  -10.2859
    1.6814

First sample of first 10 channels in PyPREP before trend removal:

raw_copy._data[:10, 0]
array([-0.00333541,  0.01003745, -0.00358109, -0.00179516,  0.00991251,
       -0.00741443, -0.00719762, -0.00566537,  0.00555661, -0.01223686])

First sample of first 10 channels in PyPREP after trend removal:

raw_filtered._data[:10, 0]
array([-4.11658012e-19, -1.79973325e-18,  6.98802181e-19,  2.89261751e-19,
        2.50128829e-18, -9.28348110e-19,  7.57247455e-19,  3.55753838e-19,
        3.82858892e-19, -7.01343280e-19])

As you can see, PyPREP's removeTrend produces a huge order-of-magnitude shift that's much smaller in MatPREP, and the values also don't line up. It looks like our filter is doing something different from theirs, although I'm not quite sure what that is yet.
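
For anyone who wants to reproduce this kind of check, the comparison itself is simple; the file name is a placeholder for one of the saved MatPREP stages, and pyprep_raw would be the matching in-memory PyPREP result:

import numpy as np
import mne

def max_abs_diff_uv(matprep_set_path, pyprep_raw):
    """Largest absolute difference (in microvolts) between a saved MatPREP stage and PyPREP's result."""
    matprep_raw = mne.io.read_raw_eeglab(matprep_set_path, preload=True)   # loaded in volts, like all MNE data
    return np.max(np.abs(matprep_raw.get_data() - pyprep_raw.get_data())) * 1e6

# e.g. max_abs_diff_uv("matprep_detrended.set", pyprep_detrended_raw)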

sappelhoff commented on May 25, 2024

That's great that you got your hands on a MATLAB license! Looking forward to what you may find :-)

a-hurst commented on May 25, 2024

@sappelhoff Okay, so once #70 is merged and I dig a bit more into #55 to see if it's an easy fix, I think we'll be at the point where we can start adding MATLAB comparison for NoisyChannels, removeTrend, and potentially Reference into the CI! I guess the question now is what's the best way to do it.

Since Oct2Py seems to have been a dead-end in terms of PREP compatibility, here's my current rough proposal:

  1. Create a separate Git repository with a MATLAB script for running all stages of PREP on a given file and saving the resulting .set/.fdts following each stage (plus a .mat containing the extra info described in #68).
  2. Set up CI for the above repository to run the script and save all the output files as GitHub Release artifacts.
  3. Set up PyPREP to (optionally?) automatically grab the latest release artifacts from that repository during pytest and use the data for comparison

The advantages of this approach (in my mind) would be that we wouldn't need to re-run MATLAB PREP every time we run a CI job, and we also wouldn't need to depend on PyPREP's CI service (or anyone running unit tests locally) having a properly-configured MATLAB environment. Separate .set/.fdt sets after each major step (post-cleanline, post-detrend, post-rereference, post-interpolation) would mean that a difference in an earlier part (e.g. CleanLine) wouldn't automatically cause comparisons of later parts (e.g. NoisyChannels) to fail, since we could load the relevant .set/.fdt before each stage to make sure PyPREP starts with the same data.
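
As a rough sketch of step 3, a session-scoped pytest fixture could grab the latest release assets and skip the comparison tests whenever they can't be fetched (the repository URL below is a placeholder):

import json
import urllib.request

import pytest

RELEASES_URL = "https://api.github.com/repos/<user>/matprep_artifacts/releases/latest"

@pytest.fixture(scope="session")
def matprep_artifacts(tmp_path_factory):
    """Download the latest MatPREP release artifacts into a temporary folder."""
    try:
        with urllib.request.urlopen(RELEASES_URL) as resp:
            release = json.load(resp)
        out_dir = tmp_path_factory.mktemp("matprep")
        paths = {}
        for asset in release["assets"]:
            target = out_dir / asset["name"]
            urllib.request.urlretrieve(asset["browser_download_url"], target)
            paths[asset["name"]] = target
        return paths
    except OSError:
        pytest.skip("MatPREP artifacts could not be downloaded")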

I've already got a Python script that reads MATLAB PREP's extra info (saved into a .mat) and transforms it into a useful Python dict, so that part's already done, and my local MATLAB script for testing could be quickly adapted for the purposes described above. I think the main headache will be automating the setup of the MATLAB environment within a CI context: that said, apparently GitHub Actions and Travis CI both have some form of MATLAB support, so I guess we'll see.

sappelhoff commented on May 25, 2024

  • +1 for using MATLAB in GitHub Actions ... let's resort to Travis only if GH Actions turns out to be difficult
  • re: separate repos: I think I would prefer everything to live in one (this) repo
  • why not set up a MATLAB CI flow that installs EEGLAB and a recent version of PREP and then runs on a dataset of our choice, and then the results are passed on to the next step in the CI where we run pyprep and then compare results?

Let me explain what I mean with a brief circleci example:

Separate .set/.fdt sets after each major step

I think that's a nice idea, my only concern is that we also start testing IO roundtrip accuracies there and I hope that won't trip us up. --> sidenote: Let's only use .set and not the set+fdt combo (I personally find the one file approach nicer, and with the new EEGLAB set style of storing the matlab EEGLAB data at the "root" of the set file (which is a mat file) instead of an EEG "struct", one can also access metadata without loading the full binary data). FYI: mne-tools/mne-python#8864 (comment)

sappelhoff commented on May 25, 2024

Would the caching of build artifacts like this mean that it'd only re-run the MATLAB script any time it was changed, instead of every time the CI tests are run?

no, that'd be an "every time" approach instead of computing once per release. Why would that be a problem? The results should stay the same if all versions are properly pinned, and a seed is set, right?

The only issue I could potentially see is (a) waiting time, (b) computation is expensive and pollutes the environment

Would the artifacts be accessible from all the different CI bots (Windows/macOS/Linux), or just the one it was run in?

no, pretty sure that the "artifacts" (in the way we talk about it currently) are only available to the current CI "bot" (matrix).

But I think comparing PREP and PyPREP for equivalence would be fine enough to do only under one specific set of parameters, as long as we continue to run all other unit tests under more diverse conditions.

and also allow for running of local unit tests of MatPREP vs. PyPREP without a full MATLAB/EEGLAB/PREP install

yes, that is a fair point. Needs to be balanced against the increased amount of infrastructure (a whole separate repo). How would you want to do the new repo? under your user account? mine? a new pyprep GH account that we then also transfer sappelhoff/pyprep to?

a-hurst commented on May 25, 2024

I think comparing PREP and PyPREP for equivalence would be fine enough to do only under one specific set of parameters

Agreed, I definitely think it would be overkill to use multiple test files/configurations: setting up just one will be hard enough! I only mentioned artifacts across CIs to ask whether the comparison tests could run on macOS/Windows runners too, given that the MATLAB CI only works on Linux instances. I guess there'd be no reason to expect MATLAB's results to differ between OSes, though.

that'd be an "every time" approach instead of computing once per release. Why would that be a problem?

From my reading into the MATLAB GitHub Actions, it looks like it needs to download & install MATLAB on the instance every single time it's run, so I'd imagine it would slow down CI considerably if the results can't be cached for future runs.

Based on your thoughts/feedback, here's my current proposal: for now, I'll try and set up a MATLAB PREP script with CI and artifact saving in a new repository of my own, just to see what it's like in terms of speed/functionality. If it's pretty quick/painless and wouldn't add much time to the PyPREP CI, I'll take my work and make it into a PyPREP PR so everything's in one repository. If it's excessively slow and/or messy, we'll keep it as a separate repository that rarely needs to be run. If that happens, I can transfer the repository to you or a separate PyPREP organization (if you'd prefer control), but I'm also happy to keep it as my own if you'd prefer not to be responsible for its maintenance. Does that sound like a reasonable plan?

sappelhoff commented on May 25, 2024

Sounds great! Good luck and let me know if I can help.

sappelhoff commented on May 25, 2024

I just took a look --> that looks awesome! Super clean / clear! I really like that you separated config/settings/setup and the .m script.

Choice of the data is up to you, I haven't worked with either of the datasets.

You could add a GH actions "badge" to the README that shows whether CI is passing and that acts as a link to the most recent run.

sappelhoff commented on May 25, 2024

Update: I added matprep_artifacts as a git submodule in ab07d85 to "have it all in one place". Technically, nothing changes for future contributors or our general workflow. I'll just sleep more soundly having a comprehensive (and tracked) cross-reference to that now-crucial part of the pyprep infrastructure within pyprep.

Cc @a-hurst @yjmantilla
