`Awesome Remote Sensing Foundation Models`

🌟A collection of papers, datasets, benchmarks, code, and pre-trained weights for Remote Sensing Foundation Models (RSFMs).

🔥🔥🔥 Last Updated on 2024.03.17 🔥🔥🔥

Remote Sensing Vision Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
GeoKR	Geographical Knowledge-Driven Representation Learning for Remote Sensing Images	TGRS2021	GeoKR	link
-	Self-Supervised Learning of Remote Sensing Scene Representations Using Contrastive Multiview Coding	CVPRW2021	Paper	link
GASSL	Geography-Aware Self-Supervised Learning	ICCV2021	GASSL	link
SeCo	Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data	ICCV2021	SeCo	link
DINO-MM	Self-supervised Vision Transformers for Joint SAR-optical Representation Learning	IGARSS2022	DINO-MM	link
SatMAE	SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery	NeurIPS2022	SatMAE	link
RS-BYOL	Self-Supervised Learning for Invariant Representations From Multi-Spectral and SAR Images	JSTARS2022	RS-BYOL	null
GeCo	Geographical Supervision Correction for Remote Sensing Representation Learning	TGRS2022	GeCo	null
RingMo	RingMo: A remote sensing foundation model with masked image modeling	TGRS2022	RingMo	Code
RVSA	Advancing plain vision transformer toward remote sensing foundation model	TGRS2022	RVSA	link
RSP	An Empirical Study of Remote Sensing Pretraining	TGRS2022	RSP	link
MATTER	Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks	CVPR2022	MATTER	null
CSPT	Consecutive Pre-Training: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain	RS2022	CSPT	link
-	Self-supervised Vision Transformers for Land-cover Segmentation and Classification	CVPRW2022	Paper	link
BFM	A billion-scale foundation model for remote sensing images	Arxiv2023	BFM	null
TOV	TOV: The original vision model for optical remote sensing image understanding via self-supervised learning	JSTARS2023	TOV	link
CMID	CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding	TGRS2023	CMID	link
RingMo-Sense	RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling	TGRS2023	RingMo-Sense	null
IaI-SimCLR	Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery	CVPRW2023	IaI-SimCLR	null
CACo	Change-Aware Sampling and Contrastive Learning for Satellite Images	CVPR2023	CACo	link
SatLas	SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding	ICCV2023	SatLas	link
GFM	Towards Geospatial Foundation Models via Continual Pretraining	ICCV2023	GFM	link
Scale-MAE	Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning	ICCV2023	Scale-MAE	link
DINO-MC	DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops	Arxiv2023	DINO-MC	link
CROMA	CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders	NeurIPS2023	CROMA	link
Cross-Scale MAE	Cross-Scale MAE: A Tale of Multiscale Exploitation in Remote Sensing	NeurIPS2023	Cross-Scale MAE	link
DeCUR	DeCUR: decoupling common & unique representations for multimodal self-supervision	Arxiv2023	DeCUR	link
Presto	Lightweight, Pre-trained Transformers for Remote Sensing Timeseries	Arxiv2023	Presto	link
CtxMIM	CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding	Arxiv2023	CtxMIM	null
XGeo	Multisensory Geospatial Models via Cross-Sensor Pretraining	-	XGeo	null
FG-MAE	Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing	Arxiv2023	FG-MAE	link
Prithvi	Foundation Models for Generalist Geospatial Artificial Intelligence	Arxiv2023	Prithvi	link
RingMo-lite	RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework	Arxiv2023	RingMo-lite	null
-	A Self-Supervised Cross-Modal Remote Sensing Foundation Model with Multi-Domain Representation and Cross-Domain Fusion	IGARSS2023	Paper	null
EarthPT	EarthPT: a foundation model for Earth Observation	NeurIPS2023 CCAI workshop	EarthPT	link
USat	USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery	Arxiv2023	USat	link
FoMo-Bench	FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models	Arxiv2023	FoMo-Bench	Coming soon
AIEarth	Analytical Insight of Earth: A Cloud-Platform of Intelligent Computing for Geospatial Big Data	Arxiv2023	AIEarth	link
-	Self-Supervised Learning for SAR ATR with a Knowledge-Guided Predictive Architecture	Arxiv2023	Paper	null
Clay	Clay Foundation Model	-	null	link
U-BARN	Self-Supervised Spatio-Temporal Representation Learning of Satellite Image Time Series	JSTARS2024	Paper	null
GeRSP	Generic Knowledge Boosted Pre-training For Remote Sensing Images	Arxiv2024	GeRSP	GeRSP
SwiMDiff	SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image	Arxiv2024	SwiMDiff	null
OFA-Net	One for All: Toward Unified Foundation Models for Earth Vision	Arxiv2024	OFA-Net	null
SMLFR	Generative ConvNet Foundation Model With Sparse Modeling and Low-Frequency Reconstruction for Remote Sensing Image Interpretation	TGRS2024	SMLFR	link
SpectralGPT	SpectralGPT: Spectral Foundation Model	TPAMI2024	SpectralGPT	link
SkySense	SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery	CVPR2024	SkySense	Coming soon

Few-shot Remote Sensing Vision Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
METEOR	Meta-learning to address diverse Earth observation problems across resolutions	Communications Earth & Environment	METEOR	METEOR

Remote Sensing Vision-Language Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
RSGPT	RSGPT: A Remote Sensing Vision Language Model and Benchmark	Arxiv2023	RSGPT	link
RemoteCLIP	RemoteCLIP: A Vision Language Foundation Model for Remote Sensing	Arxiv2023	RemoteCLIP	link
GRAFT	Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment	ICLR2024	GRAFT	null
-	Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs	Arxiv2023	Paper	link
-	Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models	Arxiv2024	Paper	link
SkyEyeGPT	SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model	Arxiv2024	Paper	link
EarthGPT	EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain	Arxiv2024	Paper	null
GeoChat	GeoChat: Grounded Large Vision-Language Model for Remote Sensing	CVPR2024	GeoChat	link
LHRS-Bot	LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model	Arxiv2024	Paper	link

Remote Sensing Generative Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
DiffusionSat	DiffusionSat: A Generative Foundation Model for Satellite Imagery	Arxiv2023	DiffusionSat	null
Seg2Sat	Seg2Sat - Segmentation to aerial view using pretrained diffuser models	Github	null	link
-	Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps	NeurIPSW2023	Paper	link

Remote Sensing Vision-Location Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
CSP	CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations	ICML2023	CSP	link
GeoCLIP	GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization	NeurIPS2023	GeoCLIP	link
SatCLIP	SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery	Arxiv2023	SatCLIP	link

Remote Sensing Vision-Audio Foundation Models

Abbreviation	Title	Publication	Paper	Code & Weights
-	Self-supervised audiovisual representation learning for remote sensing data	JAG2022	Paper	link

Benchmarks for RSFMs

Abbreviation	Title	Publication	Paper	Link	Downstream Tasks
-	Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters	Arxiv2023	Paper	link	Classification
GEO-Bench	GEO-Bench: Toward Foundation Models for Earth Monitoring	Arxiv2023	Paper	link	Classification & Segmentation
FoMo-Bench	FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models	Arxiv2023	FoMo-Bench	Coming soon	Classification & Segmentation & Detection for forest monitoring
PhilEO	PhilEO Bench: Evaluating Geo-Spatial Foundation Models	Arxiv2024	Paper	link	Segmentation & Regression estimation
SkySense	SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery	CVPR2024	SkySense	Coming soon	Classification & Segmentation & Detection & Change detection & Multi-Modal Segmentation: Time-insensitive LandCover Mapping & Multi-Modal Segmentation: Time-sensitive Crop Mapping & Multi-Modal Scene Classification
VLEO-Bench	Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data	Arxiv2024	VLEO-bench	link	Location Recognition & Captioning & Scene Classification & Counting & Detection & Change detection

(Large-scale) Pre-training Datasets

Abbreviation	Title	Publication	Paper	Attribute	Link
fMoW	Functional Map of the World	CVPR2018	fMoW	Vision	link
SEN12MS	SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion	-	SEN12MS	Vision	link
BEN-MM	BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval	GRSM2021	BEN-MM	Vision	link
MillionAID	On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID	JSTARS2021	MillionAID	Vision	link
SeCo	Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data	ICCV2021	SeCo	Vision	link
fMoW-S2	SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery	NeurIPS2022	fMoW-S2	Vision	link
TOV-RS-Balanced	TOV: The original vision model for optical remote sensing image understanding via self-supervised learning	JSTARS2023	TOV	Vision	link
SSL4EO-S12	SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation	GRSM2023	SSL4EO-S12	Vision	link
SSL4EO-L	SSL4EO-L: Datasets and Foundation Models for Landsat Imagery	Arxiv2023	SSL4EO-L	Vision	link
SatlasPretrain	SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding	ICCV2023	SatlasPretrain	Vision (Supervised)	link
CACo	Change-Aware Sampling and Contrastive Learning for Satellite Images	CVPR2023	CACo	Vision	Coming soon
RSVG	RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data	TGRS2023	RSVG	Vision-Language	link
RS5M	RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model	Arxiv2023	RS5M	Vision-Language	link
GEO-Bench	GEO-Bench: Toward Foundation Models for Earth Monitoring	Arxiv2023	GEO-Bench	Vision (Evaluation)	link
RSICap & RSIEval	RSGPT: A Remote Sensing Vision Language Model and Benchmark	Arxiv2023	RSGPT	Vision-Language	Coming soon
Clay	Clay Foundation Model	-	null	Vision	link
SATIN	SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models	ICCVW2023	SATIN	Vision-Language	link
SkyScript	SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing	AAAI2024	SkyScript	Vision-Language	link
ChatEarthNet	ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing	Arxiv2024	ChatEarthNet	Vision-Language	Coming soon

Survey Papers

Title	Publication	Paper	Attribute
Self-Supervised Remote Sensing Feature Learning: Learning Paradigms, Challenges, and Future Works	TGRS2023	Paper	Vision & Vision-Language
Vision-Language Models in Remote Sensing: Current Progress and Future Trends	Arxiv2023	Paper	Vision-Language
The Potential of Visual ChatGPT For Remote Sensing	Arxiv2023	Paper	Vision-Language
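Remote Sensing Large Models: Progress and Prospects (遥感大模型：进展与前瞻)	Geomatics and Information Science of Wuhan University 2023	Paper	Vision & Vision-Language
Geographic Artificial Intelligence Samples: Models, Quality, and Services (地理人工智能样本：模型、质量与服务)	Geomatics and Information Science of Wuhan University 2023	Paper	-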
Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey	JSTARS2023	Paper	Vision & Vision-Language
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters	Arxiv2023	Paper	Vision
An Agenda for Multimodal Foundation Models for Earth Observation	IGARSS2023	Paper	Vision
Transfer learning in environmental remote sensing	RSE2024	Paper	Transfer learning
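Remote Sensing Foundation Models: A Review of Development and Future Outlook (遥感基础模型发展综述与未来设想)	National Remote Sensing Bulletin 2023	Paper	-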
On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications	Arxiv2023	Paper	Vision-Language

Cite

If you find this repository useful, please consider giving it a star ⭐ and citing:

@InProceedings{guo2023skysense,
      title={SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery}, 
      author={Xin Guo and Jiangwei Lao and Bo Dang and Yingying Zhang and Lei Yu and Lixiang Ru and Liheng Zhong and Ziyuan Huang and Kang Wu and Dingxiang Hu and Huimei He and Jian Wang and Jingdong Chen and Ming Yang and Yongjun Zhang and Yansheng Li},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      month     = {},
      year      = {2024},
      pages     = {}
}
