giocoal / algonauts2023-image-fmri-encoding-model

Code for my Master's Thesis "Deep Neural Encoding Models of the Human Visual Cortex to Predict fMRI Responses to Natural Visual Scenes" and my submission for the "Algonauts Project 2023 Challenge".

Home Page: http://algonauts.csail.mit.edu/challenge.html

License: Creative Commons Zero v1.0 Universal

Python 100.00%
algonauts-project brain-encoding dinov2 fmri-data-analysis fmri-roi-analysis natural-scenes-dataset pytorch algonauts-challenge

algonauts2023-image-fmri-encoding-model's Introduction

Code for the Master's Thesis "Deep Neural Encoding Models of the Human Visual Cortex to Predict fMRI Responses to Natural Visual Scenes".

Research Internship - MSc in Data Science - University of Milano-Bicocca - Imaging and Vision Laboratory.

This repository contains my submission to the Algonauts Project 2023 Challenge (id: giorgiocarbone).

Introduction

One of the main objectives of computational neuroscience is to comprehend the biological mechanisms that enable humans to perceive, process, and understand complex visual scenes. Visual neural encoding models are computational models that mimic the hierarchical processes underlying the human visual system and aim to explain the relationship between visual stimuli and corresponding neural activations evoked in the human visual cortex. A visual encoder can serve as a structured system for testing biological hypotheses concerning how visual information is processed, represented, and organized in the human brain.

The main objective of this thesis is to develop a comprehensive voxel-based and subject-specific image-fMRI neural encoding model of the human visual cortex based on Deep Neural Networks (DNNs) and transfer learning for the prediction of local neural blood oxygen level-dependent (BOLD) functional magnetic resonance imaging (fMRI) responses to complex visual stimuli.

We applied a two-step linearizing strategy to visual encoding, using two separate computational models. The first performs a non-linear feature mapping of the stimulus image into its latent representations, employing pre-trained computer vision DNNs as feature extractors. The second performs a linear activity mapping of the visual features onto the BOLD response amplitudes of the individual voxels: Principal Component Analysis (PCA) reduces the dimensionality of the visual features, and independent ridge regression models map the PCA components onto the activity of each voxel.
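The linear activity mapping described above can be sketched as follows. This is a minimal NumPy-only illustration, not the code used in the thesis: the synthetic arrays, the number of components, the `alpha` value, and the helper names `fit_pca` and `fit_ridge` are all assumptions made for the example. In practice the features would come from a pre-trained DNN, and a library such as scikit-learn would typically provide the PCA and ridge steps.

```python
import numpy as np

def fit_pca(features, n_components):
    """Fit PCA via SVD; return the feature mean and the principal axes."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_ridge(x, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept.

    Each column of y (one voxel) gets its own weight vector, which is
    equivalent to fitting one independent ridge model per voxel with
    the same alpha.
    """
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - x_mean, y - y_mean
    gram = xc.T @ xc + alpha * np.eye(x.shape[1])
    weights = np.linalg.solve(gram, xc.T @ yc)
    intercept = y_mean - x_mean @ weights
    return weights, intercept

# Synthetic stand-ins for DNN features and voxel responses (assumed sizes).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 64, 10
train_feats = rng.standard_normal((n_train, n_features))
test_feats = rng.standard_normal((n_test, n_features))
true_map = rng.standard_normal((n_features, n_voxels))
train_fmri = train_feats @ true_map + 0.1 * rng.standard_normal((n_train, n_voxels))

# Step 1: reduce feature dimensionality with PCA fitted on training images only.
mean, axes = fit_pca(train_feats, n_components=32)
train_pcs = (train_feats - mean) @ axes.T
test_pcs = (test_feats - mean) @ axes.T

# Step 2: map PCA components to voxel responses with ridge regression.
weights, intercept = fit_ridge(train_pcs, train_fmri, alpha=1.0)
pred_fmri = test_pcs @ weights + intercept
print(pred_fmri.shape)
```

Fitting the PCA on the training images only, then applying the same projection to the test images, keeps the test set out of the model fit.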

To meet the criteria of mappability and predictivity that characterize a good encoding model, we adopted a ROI-wise, mixed encoding strategy: voxels belonging to different regions of interest (ROIs, groups of voxels that share functional properties) are modeled separately to achieve maximum accuracy across the entire visual cortex and within individual ROIs. To determine the best feature mapping method for each ROI, we tested the extraction of visual features from layers at varying depths of several pre-trained Convolutional Neural Networks (AlexNet, ZFNet, RetinaNet, EfficientNet-B2, VGG-16, VGG-19) and Vision Transformers (ViTs) that differ in training goal, training dataset, and learning method. During this testing phase, a similarity and functional alignment emerged between the hierarchical architecture of the pre-trained DNNs and the structure of the visual cortex, a result that motivated the use of the ROI-wise strategy.
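One way to picture the ROI-wise, mixed strategy is as a per-ROI model selection step followed by stitching the winning predictions together. The sketch below is only an illustration under stated assumptions: the ROI names, the candidate names (`alexnet_early`, `vit_late`), the validation scores, and both helper functions are hypothetical, and the selection criterion (median validation score within the ROI) is assumed rather than taken from the thesis.

```python
import numpy as np

def select_best_per_roi(val_scores, roi_masks):
    """For each ROI, pick the candidate with the highest median validation score."""
    best = {}
    for roi, mask in roi_masks.items():
        best[roi] = max(val_scores, key=lambda name: np.median(val_scores[name][mask]))
    return best

def assemble_mixed_prediction(predictions, roi_masks, best):
    """Fill each ROI's voxels with the predictions of that ROI's best candidate.

    Voxels that belong to no ROI are left at zero in this sketch.
    """
    n_images = next(iter(predictions.values())).shape[0]
    n_voxels = next(iter(roi_masks.values())).shape[0]
    mixed = np.zeros((n_images, n_voxels))
    for roi, mask in roi_masks.items():
        mixed[:, mask] = predictions[best[roi]][:, mask]
    return mixed

# Six voxels split into two hypothetical ROIs.
roi_masks = {
    "early_visual": np.array([True, True, True, False, False, False]),
    "higher_level": np.array([False, False, False, True, True, True]),
}
# Hypothetical per-voxel validation scores for two candidate feature extractors.
val_scores = {
    "alexnet_early": np.array([0.6, 0.5, 0.7, 0.2, 0.1, 0.3]),
    "vit_late": np.array([0.3, 0.4, 0.2, 0.6, 0.5, 0.7]),
}
# Placeholder test-set predictions (4 images x 6 voxels) from each candidate.
predictions = {
    "alexnet_early": np.full((4, 6), 1.0),
    "vit_late": np.full((4, 6), 2.0),
}

best = select_best_per_roi(val_scores, roi_masks)
mixed = assemble_mixed_prediction(predictions, roi_masks, best)
print(best)
```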

On the test set of the Algonauts Project 2023 Challenge dataset, the proposed model achieves an overall accuracy score of 0.52, expressed as the Median Noise Normalized Squared Correlation (MNNSC) across all voxels of the cortical surfaces of all subjects, outperforming the baseline model proposed by the challenge organizers (which scored 0.41). The results of this thesis demonstrate the effectiveness of mixed, ROI-wise, deep, and transfer learning-based approaches in the context of image-fMRI visual encoding modeling.
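As a rough sketch of how a score of this kind can be computed: take the Pearson correlation between predicted and measured responses for each voxel, square it, divide by that voxel's noise ceiling, and take the median over voxels. The function names, the use of a fractional (0 to 1) noise ceiling, and the choice to skip voxels with a zero noise ceiling are assumptions of this example. The official challenge evaluation may differ in these details.

```python
import numpy as np

def pearson_per_voxel(pred, true):
    """Pearson correlation between columns of pred and true (images x voxels)."""
    pred_c = pred - pred.mean(axis=0)
    true_c = true - true.mean(axis=0)
    num = (pred_c * true_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    return num / den

def median_nnsc(pred, true, noise_ceiling):
    """Median over voxels of the squared correlation divided by the noise ceiling.

    Voxels with a non-positive noise ceiling are excluded (an assumption
    of this sketch, made to avoid dividing by zero).
    """
    r = pearson_per_voxel(pred, true)
    valid = noise_ceiling > 0
    return float(np.median(r[valid] ** 2 / noise_ceiling[valid]))

# Synthetic example: noisy predictions for 100 images and 8 voxels.
rng = np.random.default_rng(0)
measured = rng.standard_normal((100, 8))
predicted = measured + 0.5 * rng.standard_normal((100, 8))
ceiling = np.full(8, 0.9)  # assumed per-voxel noise ceiling
print(median_nnsc(predicted, measured, ceiling))
```

Dividing by the noise ceiling expresses the explained variance relative to what is explainable given measurement noise, so voxels with noisy recordings are not penalized for variance no model could predict.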

Dataset

The thesis project was developed using the Algonauts Project 2023 Challenge dataset, a large collection of fMRI responses to visual scenes from eight subjects. During the fMRI scans, each subject viewed 9,000-10,000 color natural scenes, and the corresponding activations for the 39,548 voxels of the visual cortex were encoded as betas: single-value estimates of the amplitude of the BOLD fMRI response, which indirectly represent the activation or deactivation of the neurons in a specific voxel evoked by viewing a stimulus.

Requirements

  • Python 3.9.16
  • CUDA Toolkit 11.6
  • CuDNN 8302
  • Pillow 9.2.0
  • NiBabel 5.2.0
  • Nilearn 0.10.3
  • Plotly 5.14.1
  • torch 1.13.0
  • torchvision 0.14.0
  • Transformers 4.31.0
  • PyTorchCV 0.0.67
  • EfficientNet-PyTorch 0.7.1
  • matplotlib 3.5.2
  • numpy 1.22.4
  • pandas 1.5.3
  • scikit-learn 1.1.1
  • scipy 1.7.3
  • tqdm 4.64.1
  • torchmetrics 0.11.4
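To check an existing environment against these pins, a small script along the following lines may help. It is a sketch using only the standard library. The `PINNED` dictionary covers just a few of the packages above, and the distribution names are assumed to match the names used on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version

# A subset of the pinned requirements listed above (extend as needed).
PINNED = {
    "numpy": "1.22.4",
    "pandas": "1.5.3",
    "scipy": "1.7.3",
    "tqdm": "4.64.1",
}

def check_versions(pinned):
    """Return {name: (wanted, installed_or_None, matches)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report

for name, (wanted, installed, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {installed} [{status}]")
```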

Status

Project is: Done

Contact

Feel free to contact me!
