
whu-sigma / hypersigma


The official repo for the paper "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"

Languages: Python 94.84%, Shell 0.13%, Jupyter Notebook 5.03%

Topics: computer-vision deep-learning hyperspectral-anomaly-detection hyperspectral-datasets hyperspectral-image-classification hyperspectral-image-denoising hyperspectral-image-segmentation hyperspectral-unmixing pytorch remote-sensing

hypersigma's Introduction

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

Di Wang¹∗, Meiqi Hu¹∗, Yao Jin¹∗, Yuchun Miao¹∗, Jiaqi Yang¹∗, Yichu Xu¹∗, Xiaolei Qin¹∗, Jiaqi Ma¹∗, Lingyu Sun¹∗, Chenxing Li¹∗, Chuan Fu², Hongruixuan Chen³, Chengxi Han¹†, Naoto Yokoya³, Jing Zhang¹†, Minqiang Xu⁴, Lin Liu⁴, Lefei Zhang¹, Chen Wu¹†, Bo Du¹†, Dacheng Tao⁵, Liangpei Zhang¹†

¹ Wuhan University, ² Chongqing University, ³ The University of Tokyo, ⁴ National Engineering Research Center of Speech and Language Information Processing, ⁵ Nanyang Technological University.

∗ Equal contribution, † Corresponding author


Update | Overview | Datasets | Pretrained Models | Usage | Statement

🔥 Update

2024.06.18: The paper and code are released; the preprint is available on arXiv (arXiv:2406.11519).

🌞 Overview

HyperSIGMA is the first billion-parameter foundation model specifically designed for hyperspectral image (HSI) interpretation. To tackle the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic building block of HyperSIGMA. The model integrates spatial and spectral features using a specially designed spectral enhancement module.

Figure 1. Framework of HyperSIGMA.
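To make the SSA idea concrete, here is a minimal, self-contained PyTorch sketch in the spirit of sparse sampling attention: each query attends to a small set of adaptively sampled positions rather than the full token grid. All names, shapes, and design choices below are illustrative assumptions, not the actual HyperSIGMA implementation (see the repo code for that).

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseSamplingAttentionSketch(nn.Module):
    """Toy single-head version: each query attends to n_samples positions
    sampled at predicted offsets instead of all H*W tokens."""

    def __init__(self, dim, n_samples=8):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        # Predict a 2D offset (x, y) for each of the sampled positions.
        self.offsets = nn.Linear(dim, 2 * n_samples)
        self.n_samples = n_samples
        self.scale = dim ** -0.5

    def forward(self, x):                      # x: (B, C, H, W)
        B, C, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        q = self.q(tokens)                     # (B, H*W, C)

        # Base grid in [-1, 1] plus learned offsets -> sampling locations.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H, device=x.device),
            torch.linspace(-1, 1, W, device=x.device), indexing="ij")
        base = torch.stack([xs, ys], -1).view(1, H * W, 1, 2)
        off = self.offsets(tokens).view(B, H * W, self.n_samples, 2).tanh()
        loc = (base + off).clamp(-1, 1)        # (B, H*W, S, 2)

        # Gather features at the sampled locations; form keys and values.
        sampled = F.grid_sample(x, loc, align_corners=True)  # (B, C, H*W, S)
        sampled = sampled.permute(0, 2, 3, 1)                # (B, H*W, S, C)
        k, v = self.kv(sampled).chunk(2, dim=-1)

        # Each query attends only over its own sampled set.
        attn = (q.unsqueeze(2) * k).sum(-1) * self.scale     # (B, H*W, S)
        attn = attn.softmax(-1)
        out = (attn.unsqueeze(-1) * v).sum(2)                # (B, H*W, C)
        return out.transpose(1, 2).view(B, C, H, W)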


Extensive experiments on a range of high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared with current state-of-the-art methods. It outperforms advanced models such as SpectralGPT, even those designed specifically for these tasks.

Figure 2. HyperSIGMA demonstrates superior performance across 16 datasets and 7 tasks, including both high-level and low-level hyperspectral tasks, as well as multispectral scenes.

📖 Datasets

To train the foundation model, we collected hyperspectral remote sensing images from around the globe and constructed a large-scale pre-training dataset named HyperGlobal-450K. Counted along the spectral dimension, HyperGlobal-450K is equivalent to more than 20 million three-band images, far exceeding the scale of existing hyperspectral datasets.

Figure 3. The distribution of HyperGlobal-450K samples across the globe, comprising 1,701 images (1,486 EO-1 and 215 GF-5B) with hundreds of spectral bands.

🚀 Pretrained Models

| Pretrain     | Backbone | Model Weights        |
|--------------|----------|----------------------|
| Spatial_MAE  | ViT-B    | Baidu & Hugging Face |
| Spatial_MAE  | ViT-L    | Baidu                |
| Spatial_MAE  | ViT-H    | Baidu                |
| Spectral_MAE | ViT-B    | Baidu & Hugging Face |
| Spectral_MAE | ViT-L    | Baidu                |
| Spectral_MAE | ViT-H    | Baidu                |
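The released checkpoints store the backbone weights under a 'model' key (visible, for example, in the torch.load usage quoted in the issues below). Here is a minimal loading sketch; the function name and checkpoint file name are placeholders:

import torch

def load_pretrained_backbone(model: torch.nn.Module, ckpt_path: str) -> torch.nn.Module:
    # The released .pth files keep the backbone weights under the 'model' key.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # strict=False tolerates task-specific heads absent from pretraining.
    missing, unexpected = model.load_state_dict(ckpt["model"], strict=False)
    print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
    return model

# Usage (path is a placeholder for a downloaded checkpoint):
# backbone = load_pretrained_backbone(backbone, "spat-vit-base-ultra-checkpoint-1599.pth")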

🔨 Usage

Pretraining

We pretrain HyperSIGMA with SLURM. Here is an example of pretraining the large version of the Spatial ViT:

srun -J spatmae -p xahdnormal --gres=dcu:4 --ntasks=64 --ntasks-per-node=4 --cpus-per-task=8 --kill-on-bad-exit=1 \
python main_pretrain_Spat.py \
--model 'spat_mae_l' --norm_pix_loss \
--data_path [pretrain data path] \
--output_dir [model saved path] \
--log_dir [log saved path] \
--blr 1.5e-4 --batch_size 32 --gpu_num 64 --port 60001

Another example of pretraining the huge version of Spectral ViT:

srun -J specmae -p xahdnormal --gres=dcu:4 --ntasks=128 --ntasks-per-node=4 --cpus-per-task=8 --kill-on-bad-exit=1 \
python main_pretrain_Spec.py \
--model 'spec_mae_h' --norm_pix_loss \
--data_path [pretrain data path] \
--output_dir [model saved path] \
--log_dir [log saved path] \
--blr 1.5e-4 --batch_size 16 --gpu_num 128 --port 60004  --epochs 1600 --mask_ratio 0.75 \
--use_ckpt 'True'
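The --use_ckpt 'True' flag in the huge-model run presumably enables activation (gradient) checkpointing so the larger backbone fits in memory. The sketch below shows the underlying PyTorch mechanism such a flag typically toggles; the class and its wiring are hypothetical:

import torch
from torch.utils.checkpoint import checkpoint

class BlockWithOptionalCheckpointing(torch.nn.Module):
    """Recomputes activations during backward to trade compute for memory."""

    def __init__(self, dim: int = 1280, use_ckpt: bool = True):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim),
            torch.nn.GELU(),
            torch.nn.Linear(4 * dim, dim),
        )
        self.use_ckpt = use_ckpt

    def forward(self, x):
        if self.use_ckpt and self.training:
            # Intermediate activations of self.mlp are not stored; they are
            # recomputed when gradients are needed during backward.
            return checkpoint(self.mlp, x, use_reentrant=False)
        return self.mlp(x)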

Training can be resumed from a saved checkpoint by adding the --resume flag:

--resume [path of saved model]

Finetuning

Image Classification:

Please refer to ImageClassification-README.

Target Detection & Anomaly Detection:

Please refer to HyperspectralDetection-README.

Change Detection:

Please refer to ChangeDetection-README.

Spectral Unmixing:

Please refer to HyperspectralUnmixing-README.

Denoising:

Please refer to Denoising-README.

Super-Resolution:

Please refer to SR-README.

Multispectral Change Detection:

Please refer to MultispectralCD-README.

⭐ Citation

If you find HyperSIGMA helpful, please consider giving this repo a ⭐ and citing:

@article{hypersigma,
  title={HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model},
  author={Wang, Di and Hu, Meiqi and Jin, Yao and Miao, Yuchun and Yang, Jiaqi and Xu, Yichu and Qin, Xiaolei and Ma, Jiaqi and Sun, Lingyu and Li, Chenxing and Fu, Chuan and Chen, Hongruixuan and Han, Chengxi and Yokoya, Naoto and Zhang, Jing and Xu, Minqiang and Liu, Lin and Zhang, Lefei and Wu, Chen and Du, Bo and Tao, Dacheng and Zhang, Liangpei},
  journal={arXiv preprint arXiv:2406.11519},
  year={2024}
}

🎺 Statement

For any other questions, please contact di.wang at gmail.com or whu.edu.cn, and chengxi.han at whu.edu.cn.

💖 Thanks

This project is based on MMCV, MAE, Swin Transformer, VSA, RVSA, DAT, HTD-IRN, GT-HAD, MSDformer, SST-Former, CNNAEU and DeepTrans. Thanks for their wonderful work!

hypersigma's People

Contributors

chengxihan, chenhongruixuan, dotwang, jqyang22, leonmakise, meiqihu, miaoyuchun, xiaoleiqinn, yichuxu


hypersigma's Issues

TypeError: spat_vit_b_rvsa() got an unexpected keyword argument 'args'

Hello, I have two questions.

(1) I entered the command below, but I don't know why the error below occurs.
$ bash train_complex.sh 1e-4 hypersigma 2 l2 100

====================Command Result Start======================
Namespace(prefix='hypersigma_gaussian', arch='hypersigma', batchSize=4, lr=0.0001, wd=0, loss='l2', testdir=None, sigma=None, training_dataset_path='./dataset/WDC/training/wdc.db', pretrain='/mnt/code/users/yuchunmiao/hypersigma-master/pre_train/checkpoint-400.pth', init='kn', no_cuda=False, from_scratch=False, pretrain_path='./pre_train/spat-vit-base-ultra-checkpoint-1599.pth', no_log=False, threads=1, seed=2018, resume=False, no_ropt=False, chop=False, resumePath=None, dataroot='/data/HSI_Data/ICVL64_31.db', clip=1000000.0, gpu_ids=[0], basedir='./output/original_hypersigma_1e-4_spat-vit-base-ultra-checkpoint-1599_batch4_warmup_l2_epoch_100_gaussian_new_fusion', epoch=100, update_lr=5e-05, meta_lr=5e-05, n_way=1, k_spt=2, k_qry=5, task_num=16, update_step=5, update_step_test=10)
Cuda Acess: 1
=> creating model 'hypersigma'
load our vit fusion_new_v5 final models
Traceback (most recent call last):
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_denoising_gaussian_wdc.py", line 22, in <module>
    engine = Engine(opt)
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_setup.py", line 146, in __init__
    self.__setup()
  File "/data/jwjang/project/hsi_foundation/HyperSIGMA/ImageDenoising/hsi_setup.py", line 173, in __setup
    self.net = hypersigma(args=self.opt)
TypeError: spat_vit_b_rvsa() got an unexpected keyword argument 'args'
====================Command Result End======================

Any help is highly appreciated ;)

Thanks!

(2) I converted the Washington DC Mall image (dc.tif) into train_0.mat and train_1.mat using utility/mat_data.py in ImageDenoising, and now I am trying to build the training dataset. How can I turn these two .mat files into a wdc.db file?

Or is there another way to create the wdc.db file expected by training_dataset_path?

Please suggest how to do this.
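(For reference, one plausible way to build such a .db file, assuming it is an LMDB database of pickled patch cubes, as the ICVL64_31.db path in the Namespace above suggests. The 'data' field name, key scheme, and patch layout are all assumptions; check the loader in ImageDenoising/utility for the exact format it expects.)

import pickle
import lmdb
import numpy as np
from scipy.io import loadmat

def mats_to_lmdb(mat_paths, db_path, field="data"):
    # Pack patches from .mat files into an LMDB database keyed by index.
    env = lmdb.open(db_path, map_size=int(1e11))  # generous address space
    idx = 0
    with env.begin(write=True) as txn:
        for path in mat_paths:
            patches = loadmat(path)[field].astype(np.float32)
            for patch in patches:  # assumes one (C, H, W) cube per row
                txn.put(str(idx).encode(), pickle.dumps(patch))
                idx += 1
    env.close()

mats_to_lmdb(["train_0.mat", "train_1.mat"], "wdc.db")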

About HyperGlobal-450K dataset download.

This is really excellent work, and I am trying to reproduce it, but I cannot find a download link for the pre-training dataset HyperGlobal-450K. Could you provide one?

Excellent work

I think you should also thank the OpenMMLab MMCV for your excellent work😋

Invalid load key

Hi,
I'm trying to run the segmentation demo on the Indian Pines dataset. For this, I'm running the ImageClassification/demo seg hypersigma.ipynb notebook with the spat-vit-base-ultra-checkpoint-1599.pth model from Hugging Face.

When loading the model I receive a pickle error:

UnpicklingError                           Traceback (most recent call last)
Cell In[31], line 2
      1 model_params = model.state_dict()
----> 2 spat_net = torch.load((r"spat-base.pth"), map_location=torch.device('cpu'))
      3 for k in list(spat_net['model'].keys()):
      4     if 'patch_embed.proj' in k:

File /opt/conda/lib/python3.10/site-packages/torch/serialization.py:1028, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1026 except RuntimeError as e:
   1027     raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1028 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File /opt/conda/lib/python3.10/site-packages/torch/serialization.py:1246, in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
   1240 if not hasattr(f, 'readinto') and (3, 8, 0) <= sys.version_info < (3, 8, 2):
   1241     raise RuntimeError(
   1242         "torch.load does not work with file-like objects that do not implement readinto on Python 3.8.0 and 3.8.1. "
   1243         f"Received object of type \"{type(f)}\". Please update to Python 3.8.2 or newer to restore this "
   1244         "functionality.")
-> 1246 magic_number = pickle_module.load(f, **pickle_load_args)
   1247 if magic_number != MAGIC_NUMBER:
   1248     raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '\xff'.

Could it be related to the PyTorch version (2.1.2)?
Thank you very much in advance!
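(An invalid load key of '\xff' usually means the file on disk is not a PyTorch checkpoint at all; for instance, an HTML error page or a Git LFS pointer was saved instead of the actual weights. A quick sanity check, with the file name as a placeholder:)

# Legacy torch pickles begin with b'\x80'; zip-format checkpoints with b'PK'.
# Any other first bytes (here b'\xff') indicate a corrupted or wrong download.
with open("spat-base.pth", "rb") as f:
    head = f.read(4)
print(head)
if head[:2] == b"PK" or head[:1] == b"\x80":
    print("looks like a torch checkpoint")
else:
    print("not a torch checkpoint; re-download the weights")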
