
Comments (5)

shyu4184 commented on August 20, 2024

@lionlai1989

Sorry for the late reply.
The authors said they generated a filter with an intensity value of 1 at the shifted location and 0 everywhere else. That is, if you shift by (x=1, y=2), the ground-truth filter has the value 1 at location (1, 2) and zeros everywhere else, like a shifted delta filter. As far as I understand, the ground-truth and trained filters in this work are treated as 2-dimensional convolution filters, and if the filter size is k*k, there are k^2 possible translations (classes).
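
For what it's worth, a minimal sketch of that construction in Python/NumPy (the convention that zero shift maps to the filter centre is my assumption, not something stated above):

    import numpy as np

    def ground_truth_filter(dx, dy, k=9):
        """Return a k x k delta filter: 1 at the shifted location, 0 elsewhere."""
        f = np.zeros((k, k), dtype=np.float32)
        c = k // 2  # assumption: zero shift maps to the filter centre
        f[c + dy, c + dx] = 1.0  # row offset = y shift, column offset = x shift
        return f

    # A shift of (x=1, y=2) puts a single 1 at row c+2, column c+1.
    gt = ground_truth_filter(dx=1, dy=2, k=9)
    # Flattened, this is a one-hot vector over the k**2 translation classes.
    assert gt.flatten().sum() == 1.0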


shyu4184 commented on August 20, 2024

@lionlai1989
Hello,
Did you solve this problem?
I'm implementing RegNet in PyTorch with my own dataset.
In my case, I translated the feature maps by a randomly selected integer amount and generated the ground-truth filters as described in the paper.
However, the filters I obtain are very different from the ground-truth ones.

Would you share your approach?


lionlai1989 commented on August 20, 2024

Hello @shyu4184
Right now I am using Arthur's TensorFlow code with my Sentinel-2 (LR) and SPOT (HR) dataset. I tried to avoid modifying the network code itself and instead just feed my own dataset into the training and testing process.

  1. In network.py, there are four lines of code whose inputs I don't know where to get. All I have is the input (LR) and the ground truth (HR). For me, the mask is not necessary because none of my images have clouds, and I don't know what the other three inputs are.
    #self.mask_y=tf.placeholder('float32',shape=[None,1,None,None,1],name='mask_y')
    #self.y_filters=tf.placeholder('float32',shape=[None,None,self.dyn_filter_size**2],name='y_filters')
    #self.fill_coeff=tf.placeholder(tf.float32,shape=[None,self.T_in,self.T_in,None,None,1],name='fill_coeff')
    #self.norm_baseline=tf.placeholder('float32',shape=[None,1],name='norm_baseline')
  2. Back to your questions: you said you translated the feature maps by a randomly selected integer translation. I think the feature maps here are the output of SISRNet, right? What I don't understand is why we can shift the feature maps with respect to the first one by a random integer number of pixels. This shift becomes the ground truth when pre-training the network, so how can we choose it randomly?

  3. I also don't understand how the registered shift relates to the one-hot softmax layer. The shift values are 1, 2, 3, ..., so how are they converted into the k*k (filter size) classes?

Sorry for bringing up more questions. I would appreciate any help in understanding the implementation.


shyu4184 commented on August 20, 2024

@lionlai1989
Since I am only trying to implement RegNet, I can't answer the usage of the four variables in your first question. Yes, you're right: the feature maps are the output of SISRNet, and the first one serves as the reference for the rest. Since the role of RegNet is to align the misaligned super-resolved feature maps, we need to translate the feature maps in order to train it. PyTorch has a function to shift a feature map, torch.roll; I'm not sure, but it may be tf.roll in TensorFlow.
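
A minimal sketch of that shifting step in PyTorch (the shapes, the shift range tied to the filter size, and all variable names are my assumptions):

    import torch

    feats = torch.randn(4, 64, 32, 32)  # (N, C, H, W) feature maps from SISRNet

    k = 9  # dynamic filter size (assumed)
    # Randomly pick an integer shift; restricting it to what a k x k
    # filter can represent is an assumption.
    dy = torch.randint(-(k // 2), k // 2 + 1, (1,)).item()
    dx = torch.randint(-(k // 2), k // 2 + 1, (1,)).item()

    # torch.roll shifts with wrap-around along height (dim 2) and width (dim 3).
    shifted = torch.roll(feats, shifts=(dy, dx), dims=(2, 3))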

I was also confused about how they generate the ground-truth filter. As mentioned in the paper, they synthesize it as a convolution filter whose intensity is 1 at the shifted pixel and 0 everywhere else, like a delta filter. Although I followed their explanation, I'm not sure whether my implementation is correct.


lionlai1989 commented on August 20, 2024

@shyu4184
Thank you for your feedback. It feels good to have someone to talk to about this topic.
My question is: how do we generate the ground truth for RegNet when pre-training?

For example, if an image's shift is (x=1, y=2) pixels (the paper says it has to be an integer) with respect to the reference image, what should the ground-truth vector look like? And how does it relate to the filter size k*k? That is the part that really confuses me.
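
For what it's worth, here is how the mapping to k*k classes might work, assuming the centre-as-zero-shift convention from the sketch earlier in the thread (the convention itself is an assumption):

    k = 9            # filter size (assumed), so k**2 = 81 classes
    dx, dy = 1, 2    # integer shift w.r.t. the reference image
    c = k // 2
    # Row-major index of the single 1 in the flattened k x k delta filter:
    class_index = (c + dy) * k + (c + dx)  # (4+2)*9 + (4+1) = 59
    # The one-hot ground-truth vector has its 1 at position 59 of 81,
    # which is what the softmax output should ideally match.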

