Scale-MAE 🛰️

This repository provides a reimplementation of the code for Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning (the original code was optimized for our distributed cluster).

@article{reed2022scale,
  title={Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning},
  author={Reed, Colorado J and Gupta, Ritwik and Li, Shufan and Brockman, Sarah and Funk, Christopher and Clipp, Brian and Candido, Salvatore and Uyttendaele, Matt and Darrell, Trevor},
  journal={arXiv preprint arXiv:2212.14532},
  year={2022}
}
  • This repo is a modification of the MAE repo. Installation and preparation follow that repo ;-).

  • As mentioned in the MAE repo, this repo is based on timm==0.3.2, which needs a fix to work with PyTorch 1.8.1+. In addition, install gdal, rasterio, and Shapely. The installation steps below usually work, though gdal is notoriously tricky to install:

Installation

conda create -n scalemae python=3.9 geopandas # geopandas should install gdal correctly
conda activate scalemae
# replace with your desired pytorch target (e.g. cuda version)
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install -e .

Data Preparation

Download the FMoW-rgb dataset as described here, then create a symlink named data in the root of this repo that points to the dataset directory. For example, if you downloaded the data to ~/data/fmow-rgb, run:

ln -s ~/data/fmow-rgb data
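
Before starting a long run, it can help to confirm that the link resolves to a real directory. The following is a minimal sketch and not part of this repo: the helper name check_data_link is hypothetical, and the only thing taken from the instructions above is that the link is called data.

```python
from pathlib import Path


def check_data_link(link="data"):
    """Return the resolved target of `link`, or raise if it is not a usable symlink.

    Hypothetical helper for illustration; not part of this repo.
    """
    path = Path(link)
    if not path.is_symlink():
        # Covers both "does not exist" and "exists but is a regular file/dir".
        raise FileNotFoundError(f"{link!r} is not a symlink")
    target = path.resolve()
    if not target.is_dir():
        # A dangling link or a link to a file ends up here.
        raise NotADirectoryError(f"{link!r} points to {target}, which is not a directory")
    return target
```

Calling check_data_link() from the repo root returns the dataset location if the link is set up as described, and raises otherwise.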

Pretraining

Datasets are defined by config files in config.

# set --nproc_per_node to the number of GPUs you have
python -m torch.distributed.launch --nproc_per_node=4 \
main_pretrain.py

Use -h to see details of all arguments.
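
If you launch runs from a script, the pretraining command above can be assembled as an argument list so that the GPU count is a parameter. This is a minimal sketch: the function name pretrain_command and its parameters are hypothetical, and it uses only the module name, flag, and script name that appear in the command above.

```python
import sys


def pretrain_command(num_gpus, script="main_pretrain.py", extra_args=()):
    """Build the launch command as a list suitable for subprocess.run.

    Hypothetical helper for illustration; not part of this repo.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # one process per GPU
        script,
        *extra_args,  # any further arguments are passed through unchanged
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repo root; for example, pretrain_command(1, extra_args=["-h"]) builds the command that prints the argument help.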

Pretrained Models

Evaluation

KNN Evaluation

python -m torch.distributed.launch --nproc_per_node=4 \
main_pretrain.py \
--resume <path-to-model-checkpoint.pth> \
--eval_only \
--eval_dataset <eval_dataset_name>  \
--eval_train_fnames <train_split_file>  \
--eval_val_fnames <val_split_file>

We support kNN evaluation on resisc (default), airound, mlrsnet, and fmow. All split files are provided in the splits folder. If --eval_train_fnames and --eval_val_fnames are specified, the contents of these two txt files are read as the train split and the test split, and the root folder of the dataset is assumed to be the parent folder of those txt files. Alternatively, you can specify --eval_path; in that case, 90% of the data is randomly selected as the training set and the remaining 10% as the test set. The dataset is assumed to follow the standard torchvision ImageFolder structure.
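
The two ways of choosing the evaluation data can be illustrated with a short sketch. This is not the repo's code: the function names, the fixed seed, and the choice to return (path, class name) pairs are assumptions made for illustration. It assumes only what is stated above, namely that the dataset root is the parent folder of the split files and that images are laid out as root/<class>/<image>.

```python
import random
from pathlib import Path


def root_from_split_file(split_file):
    """Dataset root is taken to be the folder that contains the split txt file."""
    return Path(split_file).resolve().parent


def random_split(root, train_fraction=0.9, seed=0):
    """Randomly split an ImageFolder-style tree into train and test lists.

    Returns two lists of (path, class_name) pairs. The seed is an assumption
    added here so the split is reproducible.
    """
    root = Path(root)
    # Collect every file one level below each class directory; sort first so
    # the result does not depend on filesystem listing order.
    samples = sorted(
        (path, class_dir.name)
        for class_dir in root.iterdir() if class_dir.is_dir()
        for path in class_dir.iterdir() if path.is_file()
    )
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]
```

With explicit split files, the same txt files define the split on every run, whereas a random split is only repeatable if the seed is fixed as in this sketch.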

Finetuning

python -m torch.distributed.launch --nproc_per_node=4 \
main_linprobe.py \
--checkpoint_path <path-to-model-checkpoint.pth>

Use the --finetune flag to enable full fine-tuning instead of linear probing.


Note: THIS SOFTWARE AND/OR DATA WAS DEPOSITED IN THE BAIR OPEN RESEARCH COMMONS REPOSITORY ON 2/8/23.

