
meld_classifier's Introduction

MELD classifier

Neural network lesion classifier for the MELD project.

The manuscript describing the classifier can be found here: https://academic.oup.com/brain/advance-article/doi/10.1093/brain/awac224/6659752

Code authors: Hannah Spitzer, Mathilde Ripart, Sophie Adler, Konrad Wagstyl

Overview

This package comes with a pretrained model that can be used to predict new subjects. It also contains code for training neural network lesion classifiers on new data.

Disclaimer

The MELD surface-based FCD detection algorithm is intended for research purposes only and has not been reviewed or approved by the Medicines and Healthcare products Regulatory Agency (MHRA), the European Medicines Agency (EMA), or any other agency. Any clinical application of the software is at the sole risk of the party engaged in such application. There is no warranty of any kind that the software will produce useful results in any way. Use of the software is at the recipient's own risk.

Installation

Note: The MELD classifier currently cannot be installed on Apple silicon Macs (M1 and M2) because of incompatibilities between TensorFlow and these systems.

Prerequisites

For preprocessing, the MELD classifier requires FreeSurfer. It was trained on data from FreeSurfer versions 5.3 and 6, but it is compatible with FreeSurfer versions up to 7.2. Please follow the FreeSurfer instructions to install FreeSurfer.
New update! The MELD pipeline now also works with FastSurfer (a faster alternative to FreeSurfer). If you wish to use FastSurfer instead, please follow the instructions for a native install of FastSurfer. Note that FastSurfer still requires FreeSurfer 7.2 to be installed in order to work.
WARNING: The MELD pipeline has not been adapted for FreeSurfer 7.3 and above. Please install FreeSurfer 7.2 instead.

You will need to ensure that FreeSurfer is activated in your terminal (you should see FREESURFER paths printed when opening a terminal). Otherwise, you will need to manually activate FreeSurfer in each new terminal by running:

export FREESURFER_HOME=<freesurfer_installation_directory>/freesurfer
source $FREESURFER_HOME/SetUpFreeSurfer.sh

where <freesurfer_installation_directory> is the path to the directory in which FreeSurfer has been installed.

Conda installation

We use Anaconda to manage the environment and dependencies. Please follow the Anaconda instructions to install Anaconda.

Install MELD classifier and python dependencies:

# checkout and install the github repo 
git clone https://github.com/MELDProject/meld_classifier.git 

# enter the meld_classifier directory
cd meld_classifier
# create the meld classifier environment with all the dependencies 
# ! Note: if you have an Apple silicon (M1) Mac, create the dedicated environment using the second command below.
conda env create -f environment.yml    # For Linux and Intel Mac users
conda env create -f environment_MAC1.yml  # For Apple silicon (M1) Mac users
# activate the environment
conda activate meld_classifier
# install meld_classifier with pip (with `-e`, the development mode, to allow changes in the code to be immediately visible in the installation)
pip install -e .

Set up paths and download model

Before being able to use the classifier on your data, some paths need to be set up and the pretrained model needs to be downloaded. For this, run:

python scripts/prepare_classifier.py

This script will ask you for the location of your MELD data folder and download the pretrained model and test data to a folder inside your MELD data folder. Please provide the path to where you would like to store MRI data to run the classifier on.

Note: You can also skip downloading the test data. To do so, append the option --skip-download-data to the python call.
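For example, to set up the paths without downloading the test data:

python scripts/prepare_classifier.py --skip-download-data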

FAQs

Please see our FAQ for common installation problems.

Verify installation

We provide a test script to allow you to verify that you have installed all packages, set up paths correctly, and downloaded all data. This script runs the pipeline to predict lesions on a test patient. It takes approximately 15 minutes to run.

Note: Do not forget to activate FreeSurfer, as described above, before running the test.

cd <path_to_meld_classifier>
conda activate meld_classifier
pytest

Note: If you run into errors at this stage and need help, you can re-run the tests with the command below to save the terminal output to a text file and send it to us. We can then work with you to solve any problems.

pytest -s | tee pytest_errors.log

You will find this pytest_errors.log file in <path_to_meld_classifier>.

Usage

With this package, you can use the provided classifier to predict lesions in subjects from existing and new sites. For a new site, you will need to harmonise your data first. In addition, you can train your own classifier model. For more details, check out the guides linked below:

Contribute

If you'd like to contribute to this code base, have a look at our contribution guide

Manuscript

Please check out our manuscript to learn more.

An overview of the notebooks that we used to create the figures can be found here.

A guide to using the MELD surface-based FCD detection algorithm on a new patient is found here.

Acknowledgments

We would like to thank the MELD consortium for providing the data to train this classifier and their expertise to build this pipeline.
We would also like to thank Lennart Walger and Andrew Chen for their help testing and improving the MELD pipeline to v1.1.0.

meld_classifier's People

Contributors

mathrip, kwagstyl, sophieadler, hspitzer


meld_classifier's Issues

Pipeline runs even though key files are missing

error: MRISread(/.../meld_classifier_v1.1.0/data/fs_outputs/MELD_< ID >/surf/rh.pial): could not open file [ ok ]
error: No such file or directory
error: mris_curvature_stats: could not read surface file /.../meld_classifier_v1.1.0/data/fs_outputs/MELD_< ID >/surf/rh.pial

open files without context might lead to memory leak

Observation:

running the prediction for a single subject works fine, but the process is killed for 100+ subjects because of memory overflow

e.g. data_preprocessing.py line 160:

hdf5_file_context = h5py.File(hdf5_file, "r")

with hdf5_file_context as f:

is erroneous code: the file is opened before the 'with' statement.
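A minimal sketch of the safer pattern (the file path and dataset name below are hypothetical): opening the file directly in the with statement guarantees it is closed when the block exits.

import h5py

hdf5_file = "subject_features.hdf5"  # hypothetical path

# Open the file inside the `with` statement so it is always closed,
# even if an exception is raised while reading.
with h5py.File(hdf5_file, "r") as f:
    data = f["features"][()]  # hypothetical dataset name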

skip preprocessing if run already

run_script_preprocessing.py takes quite a while and has sometimes already been run, but there is no option to skip it (or to skip it automatically); a possible guard is sketched below.
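A minimal sketch of such a guard, using a hypothetical output file name rather than the actual MELD layout:

import os

def preprocessing_already_done(subject_dir):
    # Hypothetical check: treat the step as done if its expected output exists.
    return os.path.isfile(os.path.join(subject_dir, "features.hdf5"))

# Only rerun preprocessing for subjects that still need it, e.g.:
# subjects_to_run = [s for s in subject_dirs if not preprocessing_already_done(s)]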

Container that encapsulates the classifier and pre-processing steps

It would be convenient to have a container, such as a docker container, that includes all of the software and data needed to apply the MELD classifier to a subject.

An example of a Dockerfile that would build such an environment is here:

FROM ubuntu:jammy AS FCDDetection
ENV DEBIAN_FRONTEND="noninteractive"

# Update the ubuntu.
RUN apt-get -y update && \
    apt-get -y upgrade

# Install the prerequisite software
RUN apt-get install -y build-essential \
    apt-utils \
    vim \
    nano \
    curl \
    wget \
    pip \
    python3 \
    git

# Install freesurfer in /opt/freesurfer.
RUN mkdir -p /freesurfertar \
    && curl -SL https://surfer.nmr.mgh.harvard.edu/pub/dist/freesurfer/7.3.2/freesurfer-linux-ubuntu22_amd64-7.3.2.tar.gz \
    | tar -xzC /freesurfertar \
    && mv /freesurfertar/freesurfer /opt/ \
    && rmdir /freesurfertar

# This will also need to be configured in the running docker container;
# this modifies only the build environment.
ENV PATH=/opt/freesurfer/bin:$PATH
RUN echo "PATH=/opt/freesurfer/bin:$PATH" >> ~/.bashrc
ENV FREESURFER_HOME=/opt/freesurfer
RUN echo "FREESURFER_HOME=/opt/freesurfer" >> ~/.bashrc

# Install miniconda
ENV CONDA_DIR /opt/conda
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda

# Put conda in path so we can use conda activate
ENV PATH=$CONDA_DIR/bin:$PATH

# Update to the newest version of conda:
RUN conda update -n base -c defaults conda

RUN conda init bash

# Checkout and install the github repo
RUN cd / && git clone https://github.com/MELDProject/meld_classifier.git

# Enter the meld_classifier directory and create the meld classifier
# environment with all the dependencies
SHELL ["/bin/bash", "-c"]
RUN cd /meld_classifier && conda env create -f environment.yml
RUN cd /meld_classifier && conda run -n meld_classifier /bin/bash -c "pip install -e ."

# Use the bash shell this way because conda can't activate.
#SHELL ["conda", "run", "-n", "meld_classifier", "/bin/bash", "-c"]

# Configure MELD for this docker image:
COPY ./meld_config.ini /meld_classifier/meld_config.ini
RUN mkdir /MELD-data

RUN conda run --no-capture-output -n meld_classifier /bin/bash -c "python3 /meld_classifier/scripts/prepare_classifier.py --skip-config"

COPY user-freesurfer-license.txt ${FREESURFER_HOME}/.license

RUN cd /meld_classifier && conda run --no-capture-output -n meld_classifier /bin/bash -c "pytest"

Input data not BIDS-conformant

Option A:

provide a script to transform BIDS-conformant data to the MELD format

Option B [preferred?]:

include bids as a dependency and make all scripts expect BIDS-conformant data
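For Option A, a conversion script could use pybids to locate the inputs; a minimal sketch follows (pybids is an assumed extra dependency here, and the copy into the MELD input layout is omitted):

from bids import BIDSLayout  # provided by the pybids package

layout = BIDSLayout("/path/to/bids_dataset")

# Collect T1w and FLAIR images per subject; copying them into the MELD
# input layout would follow this step.
for sub in layout.get_subjects():
    t1w = layout.get(subject=sub, suffix="T1w", extension=".nii.gz", return_type="filename")
    flair = layout.get(subject=sub, suffix="FLAIR", extension=".nii.gz", return_type="filename")
    print(sub, t1w, flair)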

Fastsurfer as alternative for Freesurfer

especially recon-surf instead of recon-all, with similar performance at much shorter processing times?

--> would probably require validation first?

requires an update from Python 3.7

Using MP2RAGE images - New training necessary

Dear all,

If I understood correctly, when using images from a different MRI sequence such as MP2RAGE, which was not used in the original training, a new model needs to be trained and the already existing trained model cannot be used?

Best

Better checks for freesurfer output

Currently it just checks whether the subject folder exists in the fs_outputs directory

(in new_pt_pipeline_script1.py)
if os.path.isdir(os.path.join(fs_folder, subject)):

If not all outputs have been generated, the rest of the pipeline fails.
It would be better to check whether all necessary feature files exist and to create the missing ones specifically, as in the sketch below.
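A minimal sketch of a stricter check (the file list is illustrative, not the complete set of surfaces and features MELD needs):

import os

REQUIRED_SURF_FILES = ["lh.pial", "rh.pial", "lh.white", "rh.white",
                       "lh.thickness", "rh.thickness"]

def freesurfer_outputs_complete(fs_folder, subject):
    surf_dir = os.path.join(fs_folder, subject, "surf")
    missing = [f for f in REQUIRED_SURF_FILES
               if not os.path.isfile(os.path.join(surf_dir, f))]
    if missing:
        print(f"Missing FreeSurfer outputs for {subject}: {missing}")
    return not missing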

memory consumption of new_pt_pipeline_script3.py

It seems I cannot run the prediction on my full cohort at once, because it uses 120+ GB of RAM for 170 subjects.

It seems to build up a large "exp" MeldCohort object, which might be unnecessary?
At least it should internally split the pipeline or run subjects sequentially.
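A heavily hedged sketch of the sequential workaround; the script path and the -id flag are assumptions about the CLI and may differ in the actual repository:

import subprocess

subject_ids = ["MELD_H1_0001", "MELD_H1_0002"]  # hypothetical subject IDs

for sid in subject_ids:
    # Run the prediction script once per subject so memory is released between runs.
    # Both the script path and the "-id" flag are hypothetical here.
    subprocess.run(["python", "scripts/new_pt_pipeline_script3.py", "-id", sid], check=True)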

Make the project truly open-source

Hello MELD team,

The project seems very cool; however, it seems impossible to experiment with one's own MRI data without having a model dataset, i.e. without access to lots of EEG records of other patients to build a model.

In my opinion, no law would be broken if you published the model built on patient data publicly, in this or a separate repository, so that everyone could try the project without contacting, or being locked to, medical universities. Probably several models would need to be published, one per specific MRI scanner, I guess. As an example, OpenAI harvested data from all over the world and everything is fine so far. If you could give every individual the possibility to run the project, it would help the project's development overall.

Error generating report when no clusters are found

File "/.../meld_classifier_v1.1.0/scripts/manage_results/plot_prediction_report.py", line 180, in get_cluster_location
ind = pred_rois[np.where(pred_rois == pred_rois[:,1].max())[0]][0][0]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
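A minimal sketch of a guard that would avoid the crash when no clusters are predicted (shapes inferred from the traceback; not the actual fix in plot_prediction_report.py):

import numpy as np

def get_top_cluster_row(pred_rois):
    pred_rois = np.atleast_2d(pred_rois)
    # Nothing to locate if no clusters were predicted.
    if pred_rois.size == 0 or pred_rois.shape[1] < 2:
        return None
    return pred_rois[np.argmax(pred_rois[:, 1])]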

better checks if parts of pipeline have completed

Currently, processing skips a step if the main directory, which is created at the beginning, exists.

It would be better to check for files indicating a step of the pipeline has finished successfully.
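One common pattern, sketched here with a hypothetical marker-file name, is to write a small completion marker at the end of each step and test for that instead of the directory:

import os

def mark_step_done(output_dir, step_name):
    # Write an empty marker file once the step has produced all its outputs.
    open(os.path.join(output_dir, f".{step_name}.done"), "w").close()

def step_already_done(output_dir, step_name):
    return os.path.isfile(os.path.join(output_dir, f".{step_name}.done"))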

pipeline output location

the fs_outputs folder contains non-FreeSurfer outputs, such as prediction.mgh and everything under 'surf_meld'

Error messages indicating pipeline failure

A lot of the time, if files are not found or paths are incorrect, the pipeline just crashes without a specific error message.

Checks if a previous pipeline step was successful (i.e. all necessary files were produced) would be preferable, along with messages if that is not the case.

mix up between required and excluded features?

should 'required_subject_features' not be linked with 'subject_features_to_include' ?

_, required_subject_features = self._filter_features(
    subject_features_to_exclude,
    return_excluded=True,
)

same in

# e.g. use this to filter subjects with FLAIR features
_, undesired_subject_features = self._filter_features(
    subject_features_to_include,
    return_excluded=True,
)
