
whu-sigma / hypersigma


The official repo for the paper "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"

Languages: Python 94.84%, Shell 0.13%, Jupyter Notebook 5.03%

Topics: computer-vision deep-learning hyperspectral-anomaly-detection hyperspectral-datasets hyperspectral-image-classification hyperspectral-image-denoising hyperspectral-image-segmentation hyperspectral-unmixing pytorch remote-sensing

hypersigma's Introduction

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

Di Wang¹∗, Meiqi Hu¹∗, Yao Jin¹∗, Yuchun Miao¹∗, Jiaqi Yang¹∗, Yichu Xu¹∗, Xiaolei Qin¹∗, Jiaqi Ma¹∗, Lingyu Sun¹∗, Chenxing Li¹∗, Chuan Fu², Hongruixuan Chen³, Chengxi Han¹†, Naoto Yokoya³, Jing Zhang¹†, Minqiang Xu⁴, Lin Liu⁴, Lefei Zhang¹, Chen Wu¹†, Bo Du¹†, Dacheng Tao⁵, Liangpei Zhang¹†

¹ Wuhan University, ² Chongqing University, ³ The University of Tokyo, ⁴ National Engineering Research Center of Speech and Language Information Processing, ⁵ Nanyang Technological University.

∗ Equal contribution, † Corresponding author


Update | Overview | Datasets | Pretrained Models | Usage | Statement

🔥 Update

2024.06.18: The paper and code are released; the preprint is available on arXiv (arXiv:2406.11519).

🌞 Overview

HyperSIGMA is the first billion-parameter foundation model specifically designed for hyperspectral image (HSI) interpretation. To tackle the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic building block of HyperSIGMA. The model integrates spatial and spectral features using a specially designed spectral enhancement module.

Figure 1. Framework of HyperSIGMA.
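To make the SSA idea concrete, here is a minimal, self-contained PyTorch sketch in the spirit of sparse sampling attention: each query attends to a small set of adaptively sampled positions rather than the full token grid. All names, shapes, and design choices below are illustrative assumptions, not the actual HyperSIGMA implementation (see the repo code for that).

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseSamplingAttentionSketch(nn.Module):
    """Toy single-head version: each query attends to n_samples positions
    sampled at predicted offsets instead of all H*W tokens."""

    def __init__(self, dim, n_samples=8):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        # Predict a 2D offset (x, y) for each of the sampled positions.
        self.offsets = nn.Linear(dim, 2 * n_samples)
        self.n_samples = n_samples
        self.scale = dim ** -0.5

    def forward(self, x):                      # x: (B, C, H, W)
        B, C, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        q = self.q(tokens)                     # (B, H*W, C)

        # Base grid in [-1, 1] plus learned offsets -> sampling locations.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H, device=x.device),
            torch.linspace(-1, 1, W, device=x.device), indexing="ij")
        base = torch.stack([xs, ys], -1).view(1, H * W, 1, 2)
        off = self.offsets(tokens).view(B, H * W, self.n_samples, 2).tanh()
        loc = (base + off).clamp(-1, 1)        # (B, H*W, S, 2)

        # Gather features at the sampled locations; form keys and values.
        sampled = F.grid_sample(x, loc, align_corners=True)  # (B, C, H*W, S)
        sampled = sampled.permute(0, 2, 3, 1)                # (B, H*W, S, C)
        k, v = self.kv(sampled).chunk(2, dim=-1)

        # Each query attends only over its own sampled set.
        attn = (q.unsqueeze(2) * k).sum(-1) * self.scale     # (B, H*W, S)
        attn = attn.softmax(-1)
        out = (attn.unsqueeze(-1) * v).sum(2)                # (B, H*W, C)
        return out.transpose(1, 2).view(B, C, H, W)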


Extensive experiments on a range of high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared with current state-of-the-art methods. It outperforms advanced models such as SpectralGPT, even those designed specifically for these tasks.

Figure 2. HyperSIGMA demonstrates superior performance across 16 datasets and 7 tasks, including both high-level and low-level hyperspectral tasks, as well as multispectral scenes.

📖 Datasets

To train the foundation model, we collected hyperspectral remote sensing images from around the globe and constructed a large-scale pre-training dataset named HyperGlobal-450K. Counted along the spectral dimension, HyperGlobal-450K is equivalent to more than 20 million three-band images, far exceeding the scale of existing hyperspectral datasets.

Figure 3. The distribution of HyperGlobal-450K samples across the globe, comprising 1,701 images (1,486 EO-1 and 215 GF-5B) with hundreds of spectral bands.

🚀 Pretrained Models

| Pretrain     | Backbone | Model Weights        |
|--------------|----------|----------------------|
| Spatial_MAE  | ViT-B    | Baidu & Hugging Face |
| Spatial_MAE  | ViT-L    | Baidu                |
| Spatial_MAE  | ViT-H    | Baidu                |
| Spectral_MAE | ViT-B    | Baidu & Hugging Face |
| Spectral_MAE | ViT-L    | Baidu                |
| Spectral_MAE | ViT-H    | Baidu                |
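The released checkpoints store the backbone weights under a 'model' key (visible, for example, in the torch.load usage quoted in the issues below). Here is a minimal loading sketch; the function name and checkpoint file name are placeholders:

import torch

def load_pretrained_backbone(model: torch.nn.Module, ckpt_path: str) -> torch.nn.Module:
    # The released .pth files keep the backbone weights under the 'model' key.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # strict=False tolerates task-specific heads absent from pretraining.
    missing, unexpected = model.load_state_dict(ckpt["model"], strict=False)
    print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
    return model

# Usage (path is a placeholder for a downloaded checkpoint):
# backbone = load_pretrained_backbone(backbone, "spat-vit-base-ultra-checkpoint-1599.pth")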

🔨 Usage

Pretraining

We pretrain HyperSIGMA with SLURM. Here is an example of pretraining the large version of the Spatial ViT:

srun -J spatmae -p xahdnormal --gres=dcu:4 --ntasks=64 --ntasks-per-node=4 --cpus-per-task=8 --kill-on-bad-exit=1 \
python main_pretrain_Spat.py \
--model 'spat_mae_l' --norm_pix_loss \
--data_path [pretrain data path] \
--output_dir [model saved path] \
--log_dir [log saved path] \
--blr 1.5e-4 --batch_size 32 --gpu_num 64 --port 60001

Another example of pretraining the huge version of Spectral ViT:

srun -J specmae -p xahdnormal --gres=dcu:4 --ntasks=128 --ntasks-per-node=4 --cpus-per-task=8 --kill-on-bad-exit=1 \
python main_pretrain_Spec.py \
--model 'spec_mae_h' --norm_pix_loss \
--data_path [pretrain data path] \
--output_dir [model saved path] \
--log_dir [log saved path] \
--blr 1.5e-4 --batch_size 16 --gpu_num 128 --port 60004  --epochs 1600 --mask_ratio 0.75 \
--use_ckpt 'True'
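The --use_ckpt 'True' flag in the huge-model run presumably enables activation (gradient) checkpointing so the larger backbone fits in memory. The sketch below shows the underlying PyTorch mechanism such a flag typically toggles; the class and its wiring are hypothetical:

import torch
from torch.utils.checkpoint import checkpoint

class BlockWithOptionalCheckpointing(torch.nn.Module):
    """Recomputes activations during backward to trade compute for memory."""

    def __init__(self, dim: int = 1280, use_ckpt: bool = True):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )
        self.use_ckpt = use_ckpt

    def forward(self, x):
        if self.use_ckpt and self.training:
            # Intermediate activations of self.mlp are not stored; they are
            # recomputed when gradients are needed during backward.
            return checkpoint(self.mlp, x, use_reentrant=False)
        return self.mlp(x)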

Training can be resumed from a saved checkpoint by adding the --resume flag:

--resume [path of saved model]

Finetuning

Image Classification:

Please refer to ImageClassification-README.

Target Detection & Anomaly Detection:

Please refer to HyperspectralDetection-README.

Change Detection:

Please refer to ChangeDetection-README.

Spectral Unmixing:

Please refer to HyperspectralUnmixing-README.

Denoising:

Please refer to Denoising-README.

Super-Resolution:

Please refer to SR-README.

Multispectral Change Detection:

Please refer to MultispectralCD-README.

⭐ Citation

If you find HyperSIGMA helpful, please consider giving this repo a ⭐ and citing:

@article{hypersigma,
  title={HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model},
  author={Wang, Di and Hu, Meiqi and Jin, Yao and Miao, Yuchun and Yang, Jiaqi and Xu, Yichu and Qin, Xiaolei and Ma, Jiaqi and Sun, Lingyu and Li, Chenxing and Fu, Chuan and Chen, Hongruixuan and Han, Chengxi and Yokoya, Naoto and Zhang, Jing and Xu, Minqiang and Liu, Lin and Zhang, Lefei and Wu, Chen and Du, Bo and Tao, Dacheng and Zhang, Liangpei},
  journal={arXiv preprint arXiv:2406.11519},
  year={2024}
}

🎺 Statement

For any other questions, please contact di.wang at gmail.com or whu.edu.cn, and chengxi.han at whu.edu.cn.

💖 Thanks

This project is based on MMCV, MAE, Swin Transformer, VSA, RVSA, DAT, HTD-IRN, GT-HAD, MSDformer, SST-Former, CNNAEU and DeepTrans. Thanks for their wonderful work!

hypersigma's People

Contributors

chengxihan, chenhongruixuan, dotwang, jqyang22, leonmakise, meiqihu, miaoyuchun, xiaoleiqinn, yichuxu


hypersigma's Issues

TypeError: spat_vit_b_rvsa() got an unexpected keyword argument 'args'

Hello, I have two questions.

(1) I entered the command below, but I don't know why the error below occurs.
$ bash train_complex.sh 1e-4 hypersigma 2 l2 100

====================Command Result Start======================
Namespace(prefix='hypersigma_gaussian', arch='hypersigma', batchSize=4, lr=0.0001, wd=0, loss='l2', testdir=None, sigma=None, training_dataset_path='./dataset/WDC/training/wdc.db', pretrain='/mnt/code/users/yuchunmiao/hypersigma-master/pre_train/checkpoint-400.pth', init='kn', no_cuda=False, from_scratch=False, pretrain_path='./pre_train/spat-vit-base-ultra-checkpoint-1599.pth', no_log=False, threads=1, seed=2018, resume=False, no_ropt=False, chop=False, resumePath=None, dataroot='/data/HSI_Data/ICVL64_31.db', clip=1000000.0, gpu_ids=[0], basedir='./output/original_hypersigma_1e-4_spat-vit-base-ultra-checkpoint-1599_batch4_warmup_l2_epoch_100_gaussian_new_fusion', epoch=100, update_lr=5e-05, meta_lr=5e-05, n_way=1, k_spt=2, k_qry=5, task_num=16, update_step=5, update_step_test=10)
Cuda Acess: 1
=> creating model 'hypersigma'
load our vit fusion_new_v5 final models
Traceback (most recent call last):
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_denoising_gaussian_wdc.py", line 22, in <module>
    engine = Engine(opt)
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_setup.py", line 146, in __init__
    self.__setup()
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_setup.py", line 173, in __setup
    self.net = hypersigma(args=self.opt)
TypeError: spat_vit_b_rvsa() got an unexpected keyword argument 'args'
====================Command Result End======================

Any help is highly appreciated ;)

Thanks!

(2) I converted the Washington DC Mall image (dc.tif) into train_0.mat and train_1.mat using utility/mat_data.py in ImageDenoising, and now I am trying to build the training dataset. How can I turn these two .mat files into a wdc.db file?

Or is there another way to create the wdc.db file expected by training_dataset_path?

Please suggest how to do this.
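(For reference, one plausible way to build such a .db file, assuming it is an LMDB database of pickled patch cubes, as the ICVL64_31.db path in the Namespace above suggests. The 'data' field name, key scheme, and patch layout are all assumptions; check the loader in ImageDenoising/utility for the exact format it expects.)

import pickle
import lmdb
import numpy as np
from scipy.io import loadmat

def mats_to_lmdb(mat_paths, db_path, field="data"):
    # Pack patches from .mat files into an LMDB database keyed by index.
    env = lmdb.open(db_path, map_size=int(1e11))  # generous address space
    idx = 0
    with env.begin(write=True) as txn:
        for path in mat_paths:
            patches = loadmat(path)[field].astype(np.float32)
            for patch in patches:  # assumes one (C, H, W) cube per row
                txn.put(str(idx).encode(), pickle.dumps(patch))
                idx += 1
    env.close()

mats_to_lmdb(["train_0.mat", "train_1.mat"], "wdc.db")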

About HyperGlobal-450K dataset download.

This is really excellent work, and I am trying to reproduce it, but I cannot find a download link for the pre-training dataset HyperGlobal-450K. Could you provide one?

Excellent work

I think you should also thank the OpenMMLab MMCV for your excellent work😋

Invalid load key

Hi,
I'm trying to run the segmentation demo on the Indian Pines dataset. For this, I'm running the ImageClassification/demo seg hypersigma.ipynb notebook with the spat-vit-base-ultra-checkpoint-1599.pth model from Hugging Face.

When loading the model I receive a pickle error:

UnpicklingError                           Traceback (most recent call last)
Cell In[31], line 2
      1 model_params = model.state_dict()
----> 2 spat_net = torch.load((r"spat-base.pth"), map_location=torch.device('cpu'))
      3 for k in list(spat_net['model'].keys()):
      4     if 'patch_embed.proj' in k:

File /opt/conda/lib/python3.10/site-packages/torch/serialization.py:1028, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1026 except RuntimeError as e:
   1027     raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1028 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File /opt/conda/lib/python3.10/site-packages/torch/serialization.py:1246, in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
   1240 if not hasattr(f, 'readinto') and (3, 8, 0) <= sys.version_info < (3, 8, 2):
   1241     raise RuntimeError(
   1242         "torch.load does not work with file-like objects that do not implement readinto on Python 3.8.0 and 3.8.1. "
   1243         f"Received object of type \"{type(f)}\". Please update to Python 3.8.2 or newer to restore this "
   1244         "functionality.")
-> 1246 magic_number = pickle_module.load(f, **pickle_load_args)
   1247 if magic_number != MAGIC_NUMBER:
   1248     raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '\xff'.

Could it be related to the PyTorch version (2.1.2)?
Thank you very much in advance!
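(An invalid load key of '\xff' usually means the file on disk is not a PyTorch checkpoint at all; for instance, an HTML error page or a Git LFS pointer was saved instead of the actual weights. A quick sanity check, with the file name as a placeholder:)

# Legacy torch pickles begin with b'\x80'; zip-format checkpoints with b'PK'.
# Any other first bytes (here b'\xff') indicate a corrupted or wrong download.
with open("spat-base.pth", "rb") as f:
    head = f.read(4)
print(head)
if head[:2] == b"PK" or head[:1] == b"\x80":
    print("looks like a torch checkpoint")
else:
    print("not a torch checkpoint; re-download the weights")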
