
meshlearn

AI model to predict computationally expensive local, vertex-wise descriptors like the local gyrification index from the local mesh neighborhood.

This includes a Python package and API (meshlearn) and two command line applications for training and predicting lGI, meshlearn_lgi_train and meshlearn_lgi_predict. End users are most likely interested only in the meshlearn_lgi_predict command, in combination with one of our pre-trained models.

Fig. 0 Left: Brain surface, faces drawn. Right: Visualization of predicted lGI per-vertex data on the mesh, using the viridis colormap.

About

Predict per-vertex descriptors like the local gyrification index (lGI) or other local descriptors for a mesh.

  • The local gyrification index is a brain morphometry descriptor used in computational neuroimaging. It describes the folding of the human cortex at a specific point, based on a mesh reconstruction of the cortical surface from a magnetic resonance image (MRI). See Schaer et al. 2008 for details.
  • The geodesic circle radius and related descriptors are described in my cpp_geodesics repo and in the references listed there. Ignore the global descriptors (like mean geodesic distance) in there.


Fig. 1 A mesh representing the human cortex, edges drawn.


Fig. 2 Close-up view of the triangular mesh, showing the vertices, edges and faces. Each vertex neighborhood (a training example for the ML model) describes the mesh structure within a sphere around the respective vertex. Vertex neighborhoods are computed from the mesh during pre-processing.
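The neighborhood extraction described above can be sketched with a k-d tree ball query. This is an illustrative snippet only, not the package's actual pre-processing code; the function name and radius are made up:

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical sketch: collect the coordinates of all vertices that fall
# within a fixed radius around a query vertex, as done during pre-processing.
def vertex_neighborhood(vertices, vertex_index, radius=10.0):
    """Return the coordinates of all vertices within `radius` of a vertex."""
    tree = cKDTree(vertices)
    neighbor_indices = tree.query_ball_point(vertices[vertex_index], r=radius)
    return vertices[neighbor_indices]

# Toy example: 4 vertices on a line, neighborhood of vertex 0 with radius 1.5.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [2.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
hood = vertex_neighborhood(verts, 0, radius=1.5)
print(hood.shape)  # (2, 3): the query vertex itself and its neighbor at x=1
```

In the real pre-processing, the neighborhood coordinates (and normals) become one feature row per vertex.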

This implementation uses Python, with tensorflow and lightgbm for the machine learning part. Mesh pre-processing is done with pymesh and igl.

Why

Computing lGI and some other mesh properties for brain surface meshes is slow, and the computation sometimes fails even for good-quality meshes, leading to the exclusion of the affected MRI scans. The lGI computation also requires Matlab, which is inconvenient and, due to the excessive licensing costs, prevents computing lGI on high performance computing clusters, which would otherwise be a way to deal with the long computation times. This project aims to provide a trained model that predicts the lGI for a vertex based on its mesh neighborhood. The aim is a faster and more robust method to compute lGI, based on free software.

Usage

Predicting using pre-trained models

Please keep in mind that meshlearn is in the alpha stage, use in production is not yet recommended. You are free to play around with it though!

Currently meshlearn comes with one pre-trained model for predicting the local gyrification index (lGI, Schaer et al.) for full-resolution, native space FreeSurfer meshes. These meshes are (a part of) the result of running FreeSurfer's recon-all pipeline on structural MRI scans of the human brain.

The model is a gradient-boosting machine as implemented in lightgbm, and it was trained on a diverse training set of about 60 GB of pre-processed mesh data, obtained from the publicly available, multi-site ABIDE I dataset. The model can be found at tests/test_data/models/lgbm_lgi/, and consists of the model file (ml_model.pkl, the pickled lightgbm model) and a metadata file (ml_model.json) that contains the pre-processing settings used to train the model. These settings must also be used when predicting for a new mesh.
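Loading such a model/metadata pair can be sketched as follows. This is an illustrative snippet, not the package's actual API; the function name is hypothetical, and only the file layout described above is assumed:

```python
import json
import os
import pickle

# Hedged sketch: load a pickled model together with its JSON metadata
# side-car (ml_model.pkl + ml_model.json), as laid out in the model
# directory described above. The function name is made up for illustration.
def load_model_and_metadata(model_dir):
    """Load the pickled model and the pre-processing metadata from a directory."""
    with open(os.path.join(model_dir, "ml_model.pkl"), "rb") as f:
        model = pickle.load(f)
    with open(os.path.join(model_dir, "ml_model.json"), "r") as f:
        metadata = json.load(f)
    # The metadata holds the pre-processing settings used during training;
    # they must be re-used when building features for a new mesh.
    return model, metadata
```

For the supplied model, `model_dir` would be `tests/test_data/models/lgbm_lgi/`.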

The meshlearn_lgi_predict command line application that is part of meshlearn can be used to predict lGI for your own FreeSurfer meshes using the supplied model or alternative models. After installation of meshlearn, run meshlearn_lgi_predict --help for available options. (For now, you will need to follow the installation instructions in the development section below, as there is no official release yet.)

Information on model performance can be found in the mentioned ml_model.json file, under the key model_info.evaluation. The model has not been fine-tuned yet.

Training your own model

If you want to train your own model instead of using one of our models, you will need suitable training data, Matlab and a powerful multi-core machine with 128+ GB of RAM. Please see the development instructions for more details.

meshlearn's People

Contributors: dfsp-spirit

meshlearn's Issues

Data Loader: add variables related to total brain size to feature columns

I see several options:

  • parse TBV (total brain volume) from FreeSurfer metadata files. Precise, but ugly, since we could no longer predict based on the mesh alone
  • add the min and max value of the x,y,z coords (or the x,y,z size) of the mesh.

I would suggest we go with the 2nd approach and compute feature importances to see whether it helps.
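The second approach could be sketched as below. This is an illustrative snippet with a made-up function name, not the data loader's actual code:

```python
import numpy as np

# Sketch of the second option: derive bounding-box features from the mesh
# coordinates alone, so prediction still needs nothing but the mesh itself.
def bounding_box_features(vertices):
    """Return min, max and extent along x, y, z as a flat feature vector."""
    v = np.asarray(vertices)
    mins, maxs = v.min(axis=0), v.max(axis=0)
    return np.concatenate([mins, maxs, maxs - mins])

verts = np.array([[0., 0., 0.], [2., 1., 3.]])
feats = bounding_box_features(verts)
print(feats)  # [0. 0. 0. 2. 1. 3. 2. 1. 3.]
```

These nine values would simply be appended to every feature row of the mesh.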

Data Loader: Load less data per file, and more files.

The reason is to train on more different meshes (people's brains), as they differ quite a bit in:

  • brain anatomy
  • image quality, due to
    • site effects like MRI scanner model and settings
    • artifacts from subject motion, etc.

Feature Engineering: Add more local (per-vertex) shape descriptors

See these publications for a first overview:

"It is clear that distance to plane performed the worst, with Gaussian curvature as the second worst. However, no
descriptor consistently performed better than the others. Mean curvature had the highest statistics, however, never had
the highest precision. On the other hand, while shape index had the highest precision for most recall values, it had the
lowest precision at high recall and only the third highest statistics. Overall, the best descriptors were mean curvature,
shape index, and curvature index."

The curvatures seem popular (and can be computed very quickly), but special face descriptors also exist, apparently typically used for object retrieval. We will have to check whether they are implemented in Python somewhere, and whether they are cheap enough to compute that we can afford to add them during data loading/pre-processing.

Implement post-processing: smooth data

Maybe we can improve our predictions by applying a post-processing step. Once all values for a mesh are known, some smoothing could actually be beneficial.

This would require the ability to perform smoothing of per-vertex data on the meshes, which may be slow in Python. But we could call into C++ for that, like in the haze package for R.
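A pure-Python version of such smoothing could look like the sketch below: one iteration replaces each vertex value by the mean over the vertex and its direct mesh neighbors. All names are hypothetical; a real implementation would likely call into C++ as noted above:

```python
import numpy as np

# Illustrative sketch: iterative nearest-neighbor smoothing of per-vertex
# data. The vertex adjacency is derived from the triangular faces.
def smooth_pervertex_data(values, faces, iterations=1):
    values = np.asarray(values, dtype=float)
    n = len(values)
    neighbors = [set() for _ in range(n)]
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    for _ in range(iterations):
        values = np.array([
            (values[i] + sum(values[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(n)
        ])
    return values

# One triangle: every vertex has the same two neighbors, so a single
# iteration averages all three values.
vals = smooth_pervertex_data([0.0, 3.0, 6.0], [(0, 1, 2)])
print(vals)  # [3. 3. 3.]
```

The Python loop over neighbors is exactly the part that would be slow for full-resolution meshes, which motivates the C++ route.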

Feature request: add number of neighbors in ball point queries with different radii as features?

We could add a new descriptor to each row: currently we have the neighborhoods (verts + normals) within a fixed radius, along with the count of neighbors (before filtering/limiting).

We could add as new features: for several supplied radii, just the number of neighbors within each radius (no coords/normals).

Intention: this would characterize the vertex density around a vertex at different scales, and it is fast to compute and implement.
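The proposed feature could be sketched like this; an illustrative snippet with made-up names, not the actual data loader:

```python
import numpy as np
from scipy.spatial import cKDTree

# Sketch of the proposed feature: for one query vertex, count the neighbors
# inside balls of several radii (no coordinates or normals, just counts).
def neighbor_counts(vertices, query_index, radii=(5.0, 10.0, 15.0)):
    tree = cKDTree(vertices)
    q = vertices[query_index]
    return [len(tree.query_ball_point(q, r=r)) for r in radii]

# Toy example: 4 vertices on a line, counted around vertex 0.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [4., 0., 0.], [9., 0., 0.]])
counts = neighbor_counts(verts, 0, radii=(2.0, 5.0, 10.0))
print(counts)  # [2, 3, 4]
```

One k-d tree per mesh suffices, so the extra cost per vertex is just a few additional ball queries.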

Model validation: Plot predicted and computed lGI

We should predict lGI for some meshes that are not part of the training dataset, compute it for them, and create figures that show:

  • the 2 different (predicted/computed) overlays next to each other
  • the difference (error), e.g., the per-vertex absolute error as an overlay, plus summary statistics like MAE or RMSE
  • we could also map the error for a group of subjects to fsaverage and show the mean over subjects at each vertex then.
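The error overlay and summary statistics from the list above could be computed as sketched here (illustrative names, not project code):

```python
import numpy as np

# Sketch for the comparison figures: per-vertex absolute error between the
# predicted and the computed lGI overlays, plus summary MAE and RMSE.
def overlay_errors(predicted, computed):
    predicted, computed = np.asarray(predicted), np.asarray(computed)
    abs_err = np.abs(predicted - computed)          # per-vertex error overlay
    mae = abs_err.mean()                            # mean absolute error
    rmse = np.sqrt(((predicted - computed) ** 2).mean())
    return abs_err, mae, rmse

abs_err, mae, rmse = overlay_errors([2.0, 3.0, 4.0], [2.0, 3.5, 3.0])
print(mae)  # 0.5
```

The `abs_err` array has one value per vertex and can be visualized on the mesh exactly like the lGI overlay itself.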

Feature Engineering: Add more global mesh descriptors

E.g.,:

  • We could approximate TBV by len_x * len_y * len_z, where len_x is: max(x_coords) - min(x_coords) of the mesh.
  • We could compute global measures like average curvature over all vertices, total edge count, total face count, etc.
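The TBV approximation from the first bullet is a one-liner over the mesh coordinates; a hypothetical sketch:

```python
import numpy as np

# Sketch of the TBV approximation described above: the product of the
# bounding-box edge lengths, e.g. len_x = max(x_coords) - min(x_coords).
def approximate_tbv(vertices):
    v = np.asarray(vertices)
    extents = v.max(axis=0) - v.min(axis=0)  # len_x, len_y, len_z
    return float(np.prod(extents))

verts = np.array([[0., 0., 0.], [2., 3., 4.]])
tbv = approximate_tbv(verts)
print(tbv)  # 24.0
```

Like the other global descriptors, this single value would be broadcast to every feature row of the mesh.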

Architecture: split function into submodules 'data' and 'model'

It would be nice to split them to get a clear separation. In the long term, I would like to split the whole package into a (PyPI-published, official) data-loading package for mesh neighborhoods and my private (as it is very problem-specific) modeling package.

Currently having 2 separate packages would be way too inconvenient though.
