gsoc2019's Issues

Pipeline API: Use JSON

Let me start by quoting @jasmainak quoting @agramfort

we should not focus on specific pipelines but rather on an API that allows people to build their own pipelines

Here is a short proposal for what I'd like to try out first in terms of an "API", i.e., a way to operate the pipeline.

An analysis pipeline is defined by

  1. some input data
  2. a number of processing steps in a certain order
    • note: a step can be a function, like raw.filter or similar
  3. a set of parameters that define each processing step function
    • coming back to raw.filter, we have for example the highpass frequency (see the sketch after this list)
  4. some output data + documentation
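
As a concrete illustration of points 2 and 3, a single step boils down to one MNE-Python function call plus its parameters. Below is a minimal sketch, assuming MNE-Python is installed; the file names and filter settings are placeholders, not recommendations:

```python
import mne

# 1. Input data (placeholder file name; any Raw object would do)
raw = mne.io.read_raw_fif("sub-01_task-rest_raw.fif", preload=True)

# 2./3. One processing step: the function raw.filter plus its parameters
#       (a 1 Hz highpass and 40 Hz lowpass, chosen only for illustration)
raw.filter(l_freq=1.0, h_freq=40.0)

# 4. Output data; the applied band edges are documented in
#    raw.info["highpass"] and raw.info["lowpass"]
raw.save("sub-01_task-rest_proc-filt_raw.fif", overwrite=True)
```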

Other assumptions

  1. we want to reach many people
  2. we want to allow these people to run pipelines
    • without having to do a lot of programming / environment-setup
    • transparently, so that the pipelines remain easily understandable and reproducible
  3. rather than aiming for something "perfect", we should cover what 80% of researchers do 80% of the time

Suggestion

  • Each analysis step is defined by a JSON
    • i.e., the JSON has a field step: 'filter' ... and then some required parameter fields that depend on the value of the step field (e.g., hfreq, ...)
  • On top, there is always one "general parameters" JSON file for the whole pipeline
    • here, we can define things like output_dir: '~/Desktop/tmp', n_jobs: 3, jobs_over: 'subjects', ...
  • API-wise, we have a single function that takes as argument a tuple of JSON objects (or dictionaries, ...) --> this tuple of JSONs defines the immutable order of steps in the pipeline (see the sketch below this list)
  • Implementation-wise, users could interact with this "pipeline" in one of three ways:
    • directly from within python (passing a tuple of dicts)
    • from the python command line (passing paths to JSON files)
    • from the command line with a docker image (BIDS-app)
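
To make the suggestion more tangible, here is a minimal sketch of what the step dicts, the general-parameters dict, and the single entry-point function could look like. Everything in it (field names, parameter values, and the run_pipeline name) is hypothetical and only meant to show the shape of the API:

```python
import json

# Hypothetical step definitions: "step" selects the operation, the remaining
# fields are that step's parameters (all names and values are illustrative)
filter_step = {"step": "filter", "l_freq": 1.0, "h_freq": 40.0}
ica_step = {"step": "ica", "n_components": 0.99}

# Hypothetical "general parameters" that apply to the whole pipeline
general = {"output_dir": "~/Desktop/tmp", "n_jobs": 3, "jobs_over": "subjects"}

# Each dict maps one-to-one onto a JSON file, which is what the command-line
# and BIDS-app entry points would read and parse
print(json.dumps(filter_step, indent=2))


def run_pipeline(general, steps):
    """Hypothetical single entry point: the tuple order fixes the step order."""
    for step in steps:
        params = {k: v for k, v in step.items() if k != "step"}
        print("would run step", step["step"], "with parameters", params)


run_pipeline(general, steps=(filter_step, ica_step))
```

From within Python one would pass the dicts directly; the command-line wrappers would instead take paths to the corresponding JSON files and json.load them before calling the same function.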

Overall, the pipeline will be targeted at BIDS, because

  • the JSONs/step-wise-dicts will require fewer input arguments, because we know a lot about where to find which metadata etc.
  • we also know where to find the data
  • we can save the outputs in a derivatives/pipeline_name/XXX directory alongside the raw data (see the sketch below)
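
Because the inputs are BIDS, locating the data and its metadata and choosing an output location can be automated. A rough sketch, assuming mne-bids is installed; the dataset root, the entities, and the pipeline name "mne_pipeline" are placeholders:

```python
from pathlib import Path

from mne_bids import BIDSPath, read_raw_bids

# Placeholder BIDS root and entities, purely for illustration
bids_root = Path("/data/my_bids_dataset")
bids_path = BIDSPath(subject="01", session="01", task="rest",
                     datatype="eeg", root=bids_root)

# BIDS tells us where the data and its sidecar metadata live
raw = read_raw_bids(bids_path=bids_path)

# Outputs go under derivatives/<pipeline_name>/, next to the raw data
out_dir = bids_root / "derivatives" / "mne_pipeline" / f"sub-{bids_path.subject}"
out_dir.mkdir(parents=True, exist_ok=True)
```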

Example pipeline

One thing we could do, however, is demo this functionality with some example pipelines (maybe in a separate repo, as binder-examples does it).

Apart from the discussion on specific API implementations in #1, I want to suggest an "example pipeline" here, based on a discussion with Teon and Mainak at the coding sprint in Paris:

Preprocessing pipeline

  • given a BIDS dataset
  • allow an input that specifies which (i) subjects, (ii) tasks, (iii) sessions, (iv) runs, ... to analyze
  • get all the raw data
  • run autoreject global on each data file and save the rejection dictionary as a JSON in /derivatives/pipelines/<pipeline_name>/sub-X/...
  • use rejection dict to clean data, then filter, then run ICA
  • somehow save ICA result to /derivatives ... (best would be in a non-MNE-centric format ... combination of JSON and TSV? ... NPY format?)
  • Automatically identify stereotypical artifacts: blinks, horizontal eye movements, and heartbeat ... use one of the following
    • either VEOG, HEOG, and ECG channel names are provided in the input arguments ... then find artifacts via correlation
    • or "templates" (artifact topographies) are provided and we'll use Corrmap to auto-identify artifacts
  • Reject the artifact components and transform the raw data into ICA-cleaned data ... save that data
  • Then load the ICA-cleaned data and apply autoreject-local ... save a report and the clean data (a step-dict encoding of this pipeline is sketched below)
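
Tying this back to the API discussion in #1, the same pipeline could be written down as a tuple of step dicts. This is only a sketch of how the bullets above might be encoded; every step name and parameter in it is hypothetical:

```python
# Hypothetical general parameters for the example pipeline
general = {
    "bids_root": "/data/my_bids_dataset",
    "subjects": ["01", "02"],
    "tasks": ["rest"],
    "output_dir": "derivatives/<pipeline_name>",
    "n_jobs": 3,
}

# Hypothetical step dicts, one per step above; the tuple fixes their order
steps = (
    {"step": "autoreject_global", "save_reject_json": True},
    {"step": "filter", "l_freq": 1.0, "h_freq": 40.0},
    {"step": "ica", "n_components": 0.99, "save_as": ["json", "tsv"]},
    {"step": "find_artifacts", "veog": "VEOG", "heog": "HEOG", "ecg": "ECG"},
    {"step": "apply_ica", "save_cleaned": True},
    {"step": "autoreject_local", "save_report": True},
)
```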
