gsoc2019's Issues

Pipeline API: Use JSON

Let me start by quoting @jasmainak quoting @agramfort

we should not focus on specific pipelines but rather on an API that allows people to build their own pipelines

Here is a short proposal for what I'd like to try out first in terms of an "API", i.e., a way to operate the pipeline.

An analysis pipeline is defined by

  1. some input data
  2. a number of processing steps in a certain order
    • note: a step can be a function, like raw.filter or similar
  3. a set of parameters that define each processing step function
    • coming back to raw.filter, we have for example the highpass frequency (see the sketch after this list)
  4. some output data + documentation
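
As a concrete illustration of points 2 and 3, a single step boils down to one MNE-Python function call plus its parameters. Below is a minimal sketch, assuming MNE-Python is installed; the file names and filter settings are placeholders, not recommendations:

```python
import mne

# 1. Input data (placeholder file name; any Raw object would do)
raw = mne.io.read_raw_fif("sub-01_task-rest_raw.fif", preload=True)

# 2./3. One processing step: the function raw.filter plus its parameters
#       (a 1 Hz highpass and 40 Hz lowpass, chosen only for illustration)
raw.filter(l_freq=1.0, h_freq=40.0)

# 4. Output data; the applied band edges are documented in
#    raw.info["highpass"] and raw.info["lowpass"]
raw.save("sub-01_task-rest_proc-filt_raw.fif", overwrite=True)
```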

Other assumptions

  1. we want to reach many people
  2. we want to allow these people to run pipelines
    • without having to do a lot of programming / environment-setup
    • transparently, so that the pipelines remain easily understandable and reproducible
  3. rather than aiming for something "perfect", we should cover what 80% of researchers do 80% of the time

Suggestion

  • Each analysis step is defined by a JSON
    • i.e., the JSON has a field step: 'filter' ... and then some required parameter fields that depend on the value of the step field (e.g., hfreq, ...)
  • On top, there is always one "general parameters" JSON file for the whole pipeline
    • here, we can define things like output_dir: '~/Desktop/tmp', n_jobs: 3, jobs_over: 'subjects', ...
  • API-wise, we have a single function that takes as argument a tuple of JSON objects (or dictionaries, ...) --> this tuple of JSONs defines the immutable order of steps in the pipeline (see the sketch below this list)
  • Implementation-wise, users could interact with this "pipeline" in one of three ways:
    • directly from within python (passing a tuple of dicts)
    • from the python command line (passing paths to JSON files)
    • from the command line with a docker image (BIDS-app)
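
To make the suggestion more tangible, here is a minimal sketch of what the step dicts, the general-parameters dict, and the single entry-point function could look like. Everything in it (field names, parameter values, and the run_pipeline name) is hypothetical and only meant to show the shape of the API:

```python
import json

# Hypothetical step definitions: "step" selects the operation, the remaining
# fields are that step's parameters (all names and values are illustrative)
filter_step = {"step": "filter", "l_freq": 1.0, "h_freq": 40.0}
ica_step = {"step": "ica", "n_components": 0.99}

# Hypothetical "general parameters" that apply to the whole pipeline
general = {"output_dir": "~/Desktop/tmp", "n_jobs": 3, "jobs_over": "subjects"}

# Each dict maps one-to-one onto a JSON file, which is what the command-line
# and BIDS-app entry points would read and parse
print(json.dumps(filter_step, indent=2))


def run_pipeline(general, steps):
    """Hypothetical single entry point: the tuple order fixes the step order."""
    for step in steps:
        params = {k: v for k, v in step.items() if k != "step"}
        print("would run step", step["step"], "with parameters", params)


run_pipeline(general, steps=(filter_step, ica_step))
```

From within Python one would pass the dicts directly; the command-line wrappers would instead take paths to the corresponding JSON files and json.load them before calling the same function.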

Overall, the pipeline will be targeted at BIDS, because

  • the JSONs/step-wise-dicts will require fewer input arguments, because we know a lot about where to find which metadata etc.
  • we also know where to find the data
  • we can save the outputs in a derivatives/pipeline_name/XXX directory alongside the raw data (see the sketch below)
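
Because the inputs are BIDS, locating the data and its metadata and choosing an output location can be automated. A rough sketch, assuming mne-bids is installed; the dataset root, the entities, and the pipeline name "mne_pipeline" are placeholders:

```python
from pathlib import Path

from mne_bids import BIDSPath, read_raw_bids

# Placeholder BIDS root and entities, purely for illustration
bids_root = Path("/data/my_bids_dataset")
bids_path = BIDSPath(subject="01", session="01", task="rest",
                     datatype="eeg", root=bids_root)

# BIDS tells us where the data and its sidecar metadata live
raw = read_raw_bids(bids_path=bids_path)

# Outputs go under derivatives/<pipeline_name>/, next to the raw data
out_dir = bids_root / "derivatives" / "mne_pipeline" / f"sub-{bids_path.subject}"
out_dir.mkdir(parents=True, exist_ok=True)
```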

Example pipeline

One thing we could do, however, is demo this functionality with some example pipelines (maybe in a separate repo, as binder-examples does it).

Apart from the discussion on specific API implementations in #1, I want to suggest an "example pipeline" here, based on a discussion with Teon and Mainak at the coding sprint in Paris:

Preprocessing pipeline

  • given a BIDS dataset
  • allow an input that specifies which (i) subjects, (ii) tasks, (iii) sessions, (iv) runs, ... to analyze
  • get all the raw data
  • run autoreject global on each data file and save the rejection dictionary as a JSON in /derivatives/pipelines/<pipeline_name>/sub-X/...
  • use rejection dict to clean data, then filter, then run ICA
  • somehow save ICA result to /derivatives ... (best would be in a non-MNE-centric format ... combination of JSON and TSV? ... NPY format?)
  • Automatically identify stereotypical artifacts: blinks, horizontal eye movements, and heartbeat ... use one of the following
    • either VEOG, HEOG, and ECG channel names are provided in the input arguments ... then find artifacts via correlation
    • or "templates" (artifact topographies) are provided and we'll use Corrmap to auto-identify artifacts
  • Reject the artifact components and transform the raw data into ICA-cleaned data ... save that data
  • Then load the ICA-cleaned data and apply autoreject-local ... save a report and the clean data (a step-dict encoding of this pipeline is sketched below)
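
Tying this back to the API discussion in #1, the same pipeline could be written down as a tuple of step dicts. This is only a sketch of how the bullets above might be encoded; every step name and parameter in it is hypothetical:

```python
# Hypothetical general parameters for the example pipeline
general = {
    "bids_root": "/data/my_bids_dataset",
    "subjects": ["01", "02"],
    "tasks": ["rest"],
    "output_dir": "derivatives/<pipeline_name>",
    "n_jobs": 3,
}

# Hypothetical step dicts, one per step above; the tuple fixes their order
steps = (
    {"step": "autoreject_global", "save_reject_json": True},
    {"step": "filter", "l_freq": 1.0, "h_freq": 40.0},
    {"step": "ica", "n_components": 0.99, "save_as": ["json", "tsv"]},
    {"step": "find_artifacts", "veog": "VEOG", "heog": "HEOG", "ecg": "ECG"},
    {"step": "apply_ica", "save_cleaned": True},
    {"step": "autoreject_local", "save_report": True},
)
```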
