About the IMP weights on change detection, about vitae-transformer/vitae-transformer-remote-sensing

Comments (4)

DotWang commented on May 30, 2024

The IMP weights have been released in the ViTAE-Transformer repo; please see https://github.com/ViTAE-Transformer/ViTAE-Transformer/tree/main/Image-Classification

from vitae-transformer-remote-sensing.

lauraset commented on May 30, 2024

Hi, @DotWang. Thank you very much, I got it. I still have a question about why RSP performs worse than IMP in semantic segmentation and change detection. You mentioned two reasons: dataset volume and task granularity. Intuitively, however, the data distribution of MillionAID is closer to that of remote sensing datasets such as Potsdam. Is the lower performance of RSP related to the transformer's heavy reliance on large numbers of samples? On the other hand, the task granularity gap exists for both RSP and IMP, so I feel this reason may not explain the lower performance of RSP. Rather, the results seem to show that RSP may only be effective for classification and may not generalize well to segmentation. There may be errors in my reasoning, so I would like to know your opinion. Thank you very much.

DotWang commented on May 30, 2024

Let me first explain task granularity. We evaluated IMP and RSP on four tasks: classification, detection, segmentation, and change detection (CD). The experimental results show that RSP performs better than IMP on the first two tasks, not only on classification. Intuitively, classification and detection operate at the scene level and the object level, respectively, so the features they require are similar, which makes the RSP weights easy to transfer. Segmentation operates at the pixel level and, compared with detection, requires more detailed semantic information. By its task definition, CD probably lies between detection and segmentation.

As for data volume, it refers not only to the number of images but also to the number of categories. Our pretraining dataset, MillionAID, has only 51 classes, far fewer than ImageNet-1K. Having few categories reduces the dataset's complexity, which restricts model performance. As you can see, our pretraining accuracy can reach 98%, which is almost impossible when training on ImageNet-1K. In that case, the model may not learn representations as universal and detailed as those from IMP. In my opinion, RSP may outperform IMP once the pretraining dataset becomes more challenging.
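To make the category-count point concrete, here is a minimal sketch (an added illustration, not part of the authors' experiments). It compares chance-level accuracy and the maximum label entropy for a 51-class and a 1000-class problem, assuming uniformly distributed labels, which neither MillionAID nor ImageNet-1K strictly satisfies. The function name `label_space_stats` is hypothetical.

```python
import math


def label_space_stats(num_classes):
    """Chance accuracy and maximum label entropy (bits), assuming uniform labels."""
    chance_accuracy = 1.0 / num_classes
    max_entropy_bits = math.log2(num_classes)
    return chance_accuracy, max_entropy_bits


# Class counts taken from the discussion: 51 (MillionAID) and 1000 (ImageNet-1K).
for name, num_classes in [("MillionAID", 51), ("ImageNet-1K", 1000)]:
    chance, entropy = label_space_stats(num_classes)
    print(f"{name}: chance accuracy = {chance:.4f}, "
          f"max label entropy = {entropy:.2f} bits")
```

The size of the label space is only a rough proxy for dataset difficulty; it says nothing about intra-class variation or image content.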

I also noticed that you mentioned the Potsdam dataset. For this dataset, spectral differences may also affect performance. Since we use the IR-R-G channels, the gap between evaluation and pretraining is larger than for other RS datasets.
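The spectral point can be illustrated with a small sketch that reorders a channel-first tile into IR-R-G. The band layout in `BAND_INDEX` (R, G, B, IR) and the helper `select_bands` are assumptions for illustration, not taken from the authors' code; check the band order of your own files before relying on it.

```python
# Assumed band layout of a 4-band, channel-first tile (hypothetical; verify for your data).
BAND_INDEX = {"R": 0, "G": 1, "B": 2, "IR": 3}


def select_bands(tile, names=("IR", "R", "G")):
    """Return the bands of a channel-first tile in the requested order."""
    return [tile[BAND_INDEX[name]] for name in names]


# Tiny 4-band tile of size 1x2; each band is a list of rows.
tile = [
    [[10, 11]],  # R
    [[20, 21]],  # G
    [[30, 31]],  # B
    [[40, 41]],  # IR
]

irrg = select_bands(tile)                    # IR-R-G composite, as used for evaluation
rgb = select_bands(tile, ("R", "G", "B"))    # natural-color composite, as in RGB pretraining data
print(irrg)
print(rgb)
```

A network pretrained on RGB imagery sees near-infrared in its first input channel when fed the IR-R-G composite, which is one way to picture the extra domain gap described above.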

In summary:

Task granularity: segmentation requires more detailed semantic information.
Data volume: the current pretraining data does not yet let the model reach its potential.
Spectral differences: these further widen the domain gap for the Potsdam dataset.

lauraset commented on May 30, 2024

Hello, @DotWang. Thank you for your detailed explanation. I got it.
