Comments (4)
The IMP weights have been released in the ViTAE-Transformer repo. Please see https://github.com/ViTAE-Transformer/ViTAE-Transformer/tree/main/Image-Classification
from vitae-transformer-remote-sensing.
Hi, @DotWang. Thank you very much, I got it. I still have a question about why RSP performs worse than IMP on semantic segmentation and change detection. You mentioned two reasons: the dataset volume and the task granularity. Intuitively, however, the data distribution of MillionAID is closer to the remote sensing datasets used for evaluation (e.g., Potsdam). Is the lower performance of RSP related to the transformer's heavy reliance on large numbers of training samples? On the other hand, the task granularity gap exists for both RSP and IMP, so I feel this reason may not explain the lower performance of RSP. Instead, the results seem to suggest that RSP may only be effective for classification and does not generalize well to segmentation. There may be some errors in my reasoning, so I would like to hear your opinion. Thank you very much.
Let me first explain the task granularity. We compare IMP and RSP on four tasks: classification, detection, segmentation, and change detection (CD). The experimental results show that RSP outperforms IMP on the first two tasks, not only on classification. Intuitively, classification and detection operate at the scene level and the object level, respectively, so the features they require are similar, which makes transferring the RSP weights easier. Segmentation operates at the pixel level and, compared with detection, requires more detailed semantic information. By its task definition, CD may lie between detection and segmentation.
As for the data volume, volume here means not only the number of images but also the number of categories. Our pretraining dataset, MillionAID, has only 51 classes, far fewer than ImageNet-1K. The limited number of categories reduces the dataset complexity, which restricts model performance. As you can see, our pretraining accuracy can reach 98%, which is almost impossible when training on ImageNet-1K. In this case, the model may not learn representations as universal and detailed as those from IMP. In my opinion, RSP may outperform IMP once the pretraining dataset becomes more challenging.
I also noticed that you mentioned the Potsdam dataset. For this dataset, spectral differences may also affect performance: since we use the IR-R-G channels, the gap between evaluation and pretraining is larger than for other RS datasets.
In summary:
- Task granularity: segmentation requires more detailed semantic information.
- Data volume: the current pretraining data does not yet let the model reach its potential.
- Spectral differences: they further widen the domain gap for the Potsdam dataset.
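To make the spectral point concrete: a backbone pretrained on three-channel RGB imagery expects red, green, and blue in its input slots, whereas an IR-R-G composite places near-infrared, red, and green there. The snippet below is only a minimal sketch of building such a composite, assuming a 4-band array stored in R, G, B, IR order. The helper name `to_irrg` and the band order are hypothetical and not taken from this repo, so check them against your own files.

```python
import numpy as np

# Assumed band order of the 4-band input (hypothetical; adjust to your data).
RGBIR_ORDER = ("R", "G", "B", "IR")


def to_irrg(image: np.ndarray, order=RGBIR_ORDER) -> np.ndarray:
    """Return an (H, W, 3) IR-R-G composite from an (H, W, C) multi-band array."""
    # Look up the position of each requested band in the assumed order.
    idx = [order.index(band) for band in ("IR", "R", "G")]
    # Index the last (channel) axis with the list to select and reorder bands.
    return image[..., idx]


# Toy 2x2 image in which each band holds a distinct constant value.
toy = np.zeros((2, 2, 4), dtype=np.uint8)
toy[..., 0] = 10  # R
toy[..., 1] = 20  # G
toy[..., 2] = 30  # B
toy[..., 3] = 40  # IR

irrg = to_irrg(toy)
print(irrg.shape)  # (2, 2, 3)
print(irrg[0, 0])  # [40 10 20]
```

The first input channel now carries near-infrared instead of red, which is one way to picture why weights pretrained on RGB imagery may transfer less cleanly to this setting.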
Hello, @DotWang. Thank you for your detailed explanation. I got it.
Related Issues (20)
- About Labels of Million-AID Dataset HOT 3
- Question about the change detection pretrained weights HOT 2
- About download the pretained model with change detection.
- KeyError: "EncoderDecoder: 'ViTAE_Window_NoShift_basic is not in the models registry'" HOT 7
- DIOR-R Benchmark question. HOT 1
- What are the differences between 'Your_ResNet' and MMCV's ResNet vb/vc/vd HOT 4
- Can't find the hrsc2016 in configs/_base_/datasets? HOT 1
- Reproduce the SeCo DOTA result. HOT 6
- use one image to test issues HOT 4
- Where is the train_labels_{}_{}.txt for scene recognition? HOT 2
- Semantic Segmentation: fairly large performance gap when reproducing results on the Potsdam dataset HOT 15
- Model registration problem HOT 5
- Model training problem HOT 3
- reproduce problem about swin-t in scene classification. HOT 1
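- What format is ann_file, and how do I convert eight-point labelTxt annotations to ann_file? HOT 1
- Label reading problem HOT 1
- Dataset processing problem HOT 3
- mmcv version problem HOT 1
- Where to download the pretrained model weights HOT 12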
- questions about exp. of semantic seg. HOT 15