Awesome Remote Sensing Foundation Models

🌟 A collection of papers, datasets, benchmarks, code, and pre-trained weights for Remote Sensing Foundation Models (RSFMs).

🔥🔥🔥 Last Updated on 2024.03.17 🔥🔥🔥

Remote Sensing Vision Foundation Models

Abbreviation Title Publication Paper Code & Weights
GeoKR Geographical Knowledge-Driven Representation Learning for Remote Sensing Images TGRS2021 GeoKR link
- Self-Supervised Learning of Remote Sensing Scene Representations Using Contrastive Multiview Coding CVPRW2021 Paper link
GASSL Geography-Aware Self-Supervised Learning ICCV2021 GASSL link
SeCo Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data ICCV2021 SeCo link
DINO-MM Self-supervised Vision Transformers for Joint SAR-optical Representation Learning IGARSS2022 DINO-MM link
SatMAE SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery NeurIPS2022 SatMAE link
RS-BYOL Self-Supervised Learning for Invariant Representations From Multi-Spectral and SAR Images JSTARS2022 RS-BYOL null
GeCo Geographical Supervision Correction for Remote Sensing Representation Learning TGRS2022 GeCo null
RingMo RingMo: A remote sensing foundation model with masked image modeling TGRS2022 RingMo Code
RVSA Advancing plain vision transformer toward remote sensing foundation model TGRS2022 RVSA link
RSP An Empirical Study of Remote Sensing Pretraining TGRS2022 RSP link
MATTER Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks CVPR2022 MATTER null
CSPT Consecutive Pre-Training: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain RS2022 CSPT link
- Self-supervised Vision Transformers for Land-cover Segmentation and Classification CVPRW2022 Paper link
BFM A billion-scale foundation model for remote sensing images Arxiv2023 BFM null
TOV TOV: The original vision model for optical remote sensing image understanding via self-supervised learning JSTARS2023 TOV link
CMID CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding TGRS2023 CMID link
RingMo-Sense RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling TGRS2023 RingMo-Sense null
IaI-SimCLR Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery CVPRW2023 IaI-SimCLR null
CACo Change-Aware Sampling and Contrastive Learning for Satellite Images CVPR2023 CACo link
SatLas SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding ICCV2023 SatLas link
GFM Towards Geospatial Foundation Models via Continual Pretraining ICCV2023 GFM link
Scale-MAE Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning ICCV2023 Scale-MAE link
DINO-MC DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops Arxiv2023 DINO-MC link
CROMA CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders NeurIPS2023 CROMA link
Cross-Scale MAE Cross-Scale MAE: A Tale of Multiscale Exploitation in Remote Sensing NeurIPS2023 Cross-Scale MAE link
DeCUR DeCUR: decoupling common & unique representations for multimodal self-supervision Arxiv2023 DeCUR link
Presto Lightweight, Pre-trained Transformers for Remote Sensing Timeseries Arxiv2023 Presto link
CtxMIM CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding Arxiv2023 CtxMIM null
XGeo Multisensory Geospatial Models via Cross-Sensor Pretraining - XGeo null
FG-MAE Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing Arxiv2023 FG-MAE link
Prithvi Foundation Models for Generalist Geospatial Artificial Intelligence Arxiv2023 Prithvi link
RingMo-lite RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework Arxiv2023 RingMo-lite null
- A Self-Supervised Cross-Modal Remote Sensing Foundation Model with Multi-Domain Representation and Cross-Domain Fusion IGARSS2023 Paper null
EarthPT EarthPT: a foundation model for Earth Observation NeurIPS2023 CCAI workshop EarthPT link
USat USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery Arxiv2023 USat link
FoMo-Bench FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models Arxiv2023 FoMo-Bench Coming soon
AIEarth Analytical Insight of Earth: A Cloud-Platform of Intelligent Computing for Geospatial Big Data Arxiv2023 AIEarth link
- Self-Supervised Learning for SAR ATR with a Knowledge-Guided Predictive Architecture Arxiv2023 Paper null
Clay Clay Foundation Model - null link
U-BARN Self-Supervised Spatio-Temporal Representation Learning of Satellite Image Time Series JSTARS2024 Paper null
GeRSP Generic Knowledge Boosted Pre-training For Remote Sensing Images Arxiv2024 GeRSP GeRSP
SwiMDiff SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image Arxiv2024 SwiMDiff null
OFA-Net One for All: Toward Unified Foundation Models for Earth Vision Arxiv2024 OFA-Net null
SMLFR Generative ConvNet Foundation Model With Sparse Modeling and Low-Frequency Reconstruction for Remote Sensing Image Interpretation TGRS2024 SMLFR link
SpectralGPT SpectralGPT: Spectral Foundation Model TPAMI2024 SpectralGPT link
SkySense SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery CVPR2024 SkySense Coming soon

Few-shot Remote Sensing Vision Foundation Models

Abbreviation Title Publication Paper Code & Weights
METEOR Meta-learning to address diverse Earth observation problems across resolutions Communications Earth & Environment METEOR METEOR

Remote Sensing Vision-Language Foundation Models

Abbreviation Title Publication Paper Code & Weights
RSGPT RSGPT: A Remote Sensing Vision Language Model and Benchmark Arxiv2023 RSGPT link
RemoteCLIP RemoteCLIP: A Vision Language Foundation Model for Remote Sensing Arxiv2023 RemoteCLIP link
GRAFT Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment ICLR2024 GRAFT null
- Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs Arxiv2023 Paper link
- Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models Arxiv2024 Paper link
SkyEyeGPT SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model Arxiv2024 Paper link
EarthGPT EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain Arxiv2024 Paper null
GeoChat GeoChat: Grounded Large Vision-Language Model for Remote Sensing CVPR2024 GeoChat link
LHRS-Bot LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model Arxiv2024 Paper link

Remote Sensing Generative Foundation Models

Abbreviation Title Publication Paper Code & Weights
DiffusionSat DiffusionSat: A Generative Foundation Model for Satellite Imagery Arxiv2023 DiffusionSat null
Seg2Sat Seg2Sat - Segmentation to aerial view using pretrained diffuser models GitHub null link
- Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps NeurIPSW2023 Paper link

Remote Sensing Vision-Location Foundation Models

Abbreviation Title Publication Paper Code & Weights
CSP CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations ICML2023 CSP link
GeoCLIP GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization NeurIPS2023 GeoCLIP link
SatCLIP SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery Arxiv2023 SatCLIP link

Remote Sensing Vision-Audio Foundation Models

Abbreviation Title Publication Paper Code & Weights
- Self-supervised audiovisual representation learning for remote sensing data JAG2022 Paper link

Benchmarks for RSFMs

Abbreviation Title Publication Paper Link Downstream Tasks
- Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters Arxiv2023 Paper link Classification
GEO-Bench GEO-Bench: Toward Foundation Models for Earth Monitoring Arxiv2023 Paper link Classification & Segmentation
FoMo-Bench FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models Arxiv2023 FoMo-Bench Coming soon Classification & Segmentation & Detection for forest monitoring
PhilEO PhilEO Bench: Evaluating Geo-Spatial Foundation Models Arxiv2024 Paper link Segmentation & Regression estimation
SkySense SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery CVPR2024 SkySense Coming soon Classification & Segmentation & Detection & Change detection & Multi-Modal Segmentation: Time-insensitive LandCover Mapping & Multi-Modal Segmentation: Time-sensitive Crop Mapping & Multi-Modal Scene Classification
VLEO-Bench Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data Arxiv2024 VLEO-bench link Location Recognition & Captioning & Scene Classification & Counting & Detection & Change detection

(Large-scale) Pre-training Datasets

Abbreviation Title Publication Paper Attribute Link
fMoW Functional Map of the World CVPR2018 fMoW Vision link
SEN12MS SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion - SEN12MS Vision link
BEN-MM BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval GRSM2021 BEN-MM Vision link
MillionAID On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID JSTARS2021 MillionAID Vision link
SeCo Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data ICCV2021 SeCo Vision link
fMoW-S2 SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery NeurIPS2022 fMoW-S2 Vision link
TOV-RS-Balanced TOV: The original vision model for optical remote sensing image understanding via self-supervised learning JSTARS2023 TOV Vision link
SSL4EO-S12 SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation GRSM2023 SSL4EO-S12 Vision link
SSL4EO-L SSL4EO-L: Datasets and Foundation Models for Landsat Imagery Arxiv2023 SSL4EO-L Vision link
SatlasPretrain SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding ICCV2023 SatlasPretrain Vision (Supervised) link
CACo Change-Aware Sampling and Contrastive Learning for Satellite Images CVPR2023 CACo Vision Coming soon
RSVG RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data TGRS2023 RSVG Vision-Language link
RS5M RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model Arxiv2023 RS5M Vision-Language link
GEO-Bench GEO-Bench: Toward Foundation Models for Earth Monitoring Arxiv2023 GEO-Bench Vision (Evaluation) link
RSICap & RSIEval RSGPT: A Remote Sensing Vision Language Model and Benchmark Arxiv2023 RSGPT Vision-Language Coming soon
Clay Clay Foundation Model - null Vision link
SATIN SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models ICCVW2023 SATIN Vision-Language link
SkyScript SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing AAAI2024 SkyScript Vision-Language link
ChatEarthNet ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing Arxiv2024 ChatEarthNet Vision-Language Coming soon

Survey Papers

Title Publication Paper Attribute
Self-Supervised Remote Sensing Feature Learning: Learning Paradigms, Challenges, and Future Works TGRS2023 Paper Vision & Vision-Language
Vision-Language Models in Remote Sensing: Current Progress and Future Trends Arxiv2023 Paper Vision-Language
The Potential of Visual ChatGPT For Remote Sensing Arxiv2023 Paper Vision-Language
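Large Remote Sensing Models: Progress and Prospects Geomatics and Information Science of Wuhan University 2023 Paper Vision & Vision-Language
Geospatial Artificial Intelligence Samples: Models, Quality, and Services Geomatics and Information Science of Wuhan University 2023 Paper -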
Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey JSTARS2023 Paper Vision & Vision-Language
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters Arxiv2023 Paper Vision
An Agenda for Multimodal Foundation Models for Earth Observation IGARSS2023 Paper Vision
Transfer learning in environmental remote sensing RSE2024 Paper Transfer learning
A Review of Remote Sensing Foundation Model Development and Future Prospects National Remote Sensing Bulletin 2023 Paper -
On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications Arxiv2023 Paper Vision-Language

Cite

If you find this repository useful, please consider giving it a star ⭐ and citing:

@InProceedings{guo2023skysense,
      title     = {SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery},
      author    = {Xin Guo and Jiangwei Lao and Bo Dang and Yingying Zhang and Lei Yu and Lixiang Ru and Liheng Zhong and Ziyuan Huang and Kang Wu and Dingxiang Hu and Huimei He and Jian Wang and Jingdong Chen and Ming Yang and Yongjun Zhang and Yansheng Li},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year      = {2024}
}
