GuYuc / WS-DAN.PyTorch
A PyTorch implementation of WS-DAN (Weakly Supervised Data Augmentation Network) for FGVC (Fine-Grained Visual Classification)
License: MIT License
Excuse me, I would like to test inception.py: with a 3x229x229 input it outputs 3x17x17, which is the shape I need. But in wsdan.py, the same 3x229x229 input gives a 3x12x12 output? Thanks.
feature_center_batch = F.normalize(feature_center[y], dim=-1)
IndexError: tensors used as indices must be long, byte or bool tensors
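A minimal sketch of the usual fix for this error, assuming `y` is a label tensor that was loaded as float (the shapes here are placeholders): cast the labels to `long` before indexing.

```python
import torch
import torch.nn.functional as F

# Placeholder shapes: feature_center is (num_classes, feature_dim), y is a label batch.
feature_center = torch.randn(10, 32)
y = torch.tensor([1.0, 4.0, 7.0])   # labels loaded as float -> triggers the IndexError

# Casting the labels to long before indexing fixes it:
feature_center_batch = F.normalize(feature_center[y.long()], dim=-1)
print(feature_center_batch.shape)   # torch.Size([3, 32])
```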
How to test a single image
In the forward pass of the model (here) we have this line, which calculates the class logits:
# Classification
p = self.fc(feature_matrix * 100.)
I'm not sure where this multiply by 100 magic number is coming from. Can you tell me why this is here, and why it's necessary? When I remove it, learning seems to stall. The only thing I can think is that it's supposed to boost the gradient, I'm just not sure why this is necessary.
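One plausible reading (an assumption, not confirmed by the author): `feature_matrix` is L2-normalized, so its entries are on the order of 1/sqrt(dim); multiplying by 100 acts like an inverse temperature that keeps the logits, and therefore the gradients, at a usable scale. A small sketch of the effect:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
feat = F.normalize(torch.randn(1, 2048), dim=-1)   # entries ~ 0.02 in magnitude
fc = torch.nn.Linear(2048, 200, bias=False)        # stand-in for self.fc

logits_plain  = fc(feat)
logits_scaled = fc(feat * 100.)                    # same direction, 100x magnitude

print(logits_plain.abs().max())    # tiny -> softmax nearly uniform, weak gradients
print(logits_scaled.abs().max())   # 100x larger -> useful gradient signal
```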
What do 1, 3, and 5 stand for?
I don't understand what 1, 3, and 5 stand for. Why define epoch_acc = np.array([[0,0,0],[0,0,0],[0,0,0]])?
Hello, author.
I encountered the following problems while looking at your code:
Training on fgvc-aircraft is basically stable after 40 epochs; raw accuracy is now 82%, crop accuracy 74%, and drop accuracy 66%. The raw+crop accuracy is around 78%. Does this mean the attention has not been learned?
When using resnet50 as the backbone, I only reached 84.7% acc on the CUB dataset.
While training the model on a custom dataset I notice that I run into cases where nonzero_indices is empty and https://github.com/GuYuc/WS-DAN.PyTorch/blob/master/utils.py#L157 then throws an exception.
Is this behavior expected to happen? I'm guessing one could set:
height_min = 0
height_max = imgH
width_min = 0
width_max = imgW
when nonzero_indices is empty.
But I wanted to first confirm that I was understanding this code correctly and that nonzero_indices being empty wasn't symptomatic of a deeper issue.
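A hedged sketch of that guard, with `crop_bounds` as a hypothetical stand-in for the logic in utils.py: when no attention value exceeds the threshold, fall back to the full image instead of indexing an empty tensor.

```python
import torch

def crop_bounds(atten_map, theta, imgH, imgW):
    """Return (height_min, height_max, width_min, width_max) for cropping.

    Hypothetical helper: falls back to the whole image when the thresholded
    attention map has no nonzero entries.
    """
    nonzero_indices = torch.nonzero(atten_map > theta, as_tuple=False)
    if nonzero_indices.numel() == 0:
        return 0, imgH, 0, imgW          # fallback: keep the full image
    height_min = nonzero_indices[:, 0].min().item()
    height_max = nonzero_indices[:, 0].max().item()
    width_min  = nonzero_indices[:, 1].min().item()
    width_max  = nonzero_indices[:, 1].max().item()
    return height_min, height_max, width_min, width_max
```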
When I did model validation, I got three pictures (raw.jpg,raw_atten.jpg,heat_atten.jpg) I want to get the class label corresponding to each picture in the validation set. What should I do?
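A hedged sketch of mapping a prediction back to a class name, assuming the validation dataset exposes a list of class names (`class_names` is a placeholder; the repo's dataset classes differ) and the model returns logits first:

```python
import torch

# Placeholder mapping from label index to class name; substitute your dataset's list.
class_names = ["class_a", "class_b", "class_c"]

def label_for(logits):
    """logits: (1, num_classes) output for one validation image."""
    idx = logits.argmax(dim=-1).item()
    return class_names[idx]

print(label_for(torch.tensor([[0.1, 2.0, -1.0]])))   # class_b
```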
Hi mate,
I'm trying to reproduce experiment results using WS-DAN/Xception and I'm impressed by the implementation of the WS-DAN network.
However, in train-wsdan.py, when I iterate the dataloader with for i, (X, y) in enumerate(data_loader): and call batch_loss.backward(), it shows the following error:
** RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8]] is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
So I print out parameters in "net":
for name,parameters in net.named_parameters():
if parameters.size()[0]==8:
print(name,':',parameters.size())
which shows the so-called "[torch.cuda.FloatTensor [8]]" variables.
module.attentions.conv.weight : torch.Size([8, 2048, 1, 1])
module.attentions.bn.weight : torch.Size([8])
module.attentions.bn.bias : torch.Size([8])
So I find how the attention weights are built at the very beginning:
# Generate Attention Map
if self.training:
# Randomly choose one of attention maps Ak
attention_map = []
for i in range(batch_size):
# attention_weights = torch.sqrt(attention_maps[i].sum(dim=(1, 2)).detach() + EPSILON)
attention_weights = torch.sqrt(attention_maps[i].sum(dim=(1, 2)) + EPSILON)
attention_weights = F.normalize(attention_weights, p=1, dim=0)
# Does this block the gradient flow?
k_index = np.random.choice(self.M, 2, p=attention_weights.cpu().detach().numpy())
pdb.set_trace()
attention_map.append(attention_maps[i, k_index, ...])
attention_map = torch.stack(attention_map) # (B, 2, H, W) - one for cropping, the other for dropping
So my question is: since these parts use NumPy for the calculation, does that mean we are actually building two separate computation graphs?
k_index = np.random.choice(self.M, 2, p=attention_weights.cpu().detach().numpy())
pdb.set_trace()
attention_map.append(attention_maps[i, k_index, ...])
attention_map = torch.stack(attention_map)
Or should we just implement this part in PyTorch? The gradient computation error seems to be caused by it.
Thx for answering in advance!
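A pure-PyTorch sketch of the sampling step, using `torch.multinomial` instead of `np.random.choice` (shapes and the EPSILON value are illustrative). Note the sampling itself is non-differentiable either way; detaching the weights, as in the commented-out line above, just makes that explicit and avoids the in-place/version-counter error during backward.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
M, EPSILON = 32, 1e-12
attention_maps = torch.rand(M, 12, 12, requires_grad=True)   # maps for one image

# Sampling weights: detached, since index selection cannot carry gradients anyway.
attention_weights = torch.sqrt(attention_maps.sum(dim=(1, 2)).detach() + EPSILON)
attention_weights = F.normalize(attention_weights, p=1, dim=0)

# torch.multinomial replaces np.random.choice: no NumPy round-trip needed.
k_index = torch.multinomial(attention_weights, 2, replacement=False)
chosen = attention_maps[k_index]        # gradients still flow through this indexing
print(chosen.shape)                     # torch.Size([2, 12, 12])
```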
self.corrects * 100. — why is self.corrects multiplied by 100?
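Most likely (an assumption, not confirmed here) the correct count is divided by the sample count elsewhere, so the factor of 100 just converts the fraction into a percentage:

```python
# Illustrative values: 41 correct predictions out of 50 samples.
corrects, total = 41, 50
accuracy_percent = corrects * 100. / total
print(accuracy_percent)   # 82.0
```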
When I train the model on Stanford Cars, it shows 'Val Acc (0.53, 2.60)'. What do 0.53 and 2.60 mean? Thank you.
Hi
Can this implementation reproduce the performance in the original paper?
Thanks!
Sorry. It was my mistakes.
This is a piece of code in the wsdan.py program.
As I understand it, the M (32) and p (shape [1, 32]) arguments of np.random.choice() have different sizes, so why is there no error?
if self.training:
attention_map = []
for i in range(batch_size):
attention_weights = torch.sqrt(attention_maps[i].sum(dim=(1, 2)).detach() + EPSILON)
attention_weights = F.normalize(attention_weights, p=1, dim=0)
k_index = np.random.choice(self.M, 2, p=attention_weights.cpu().numpy())
attention_map.append(attention_maps[i, k_index, ...])
attention_map = torch.stack(attention_map)
else:
attention_map = torch.mean(attention_maps, dim=1, keepdim=True)
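For reference, when the first argument of np.random.choice is an integer M, it is treated as np.arange(M), so a length-M probability vector matches it and there is no size mismatch. A minimal check (with uniform probabilities for illustration):

```python
import numpy as np

M = 32
p = np.ones(M) / M                       # shape (32,): one probability per index
k_index = np.random.choice(M, 2, p=p)    # samples 2 indices from np.arange(32)
print(k_index.shape)                     # (2,)
```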
Hello, were you able to fully reproduce the paper's accuracy (89.4%) on the CUB dataset?
The val acc doesn't change for several epochs when I train the model. What's wrong with it?
Epoch 85/300: 100%|██████████| 8144/8144 [1:24:14<00:00, 1.61 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.7789, Val Acc (0.52, 2.70)]
Epoch 86/300: 100%|██████████| 8144/8144 [1:24:12<00:00, 1.61 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.8678, Val Acc (0.52, 2.70)]
Epoch 87/300: 100%|██████████| 8144/8144 [1:24:20<00:00, 1.61 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.6306, Val Acc (0.52, 2.72)]
Epoch 88/300: 100%|██████████| 8144/8144 [1:24:12<00:00, 1.61 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.8817, Val Acc (0.52, 2.71)]
Epoch 89/300: 100%|██████████| 8144/8144 [1:23:19<00:00, 1.63 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.9679, Val Acc (0.52, 2.71)]
Epoch 90/300: 100%|██████████| 8144/8144 [1:22:45<00:00, 1.64 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.4493, Val Acc (0.52, 2.71)]
Epoch 91/300: 100%|██████████| 8144/8144 [1:22:45<00:00, 1.64 batches/s, Loss 6.2739, Raw Acc (0.39, 2.49), Crop Acc (0.39, 2.49), Drop Acc (0.39, 2.49), Val Loss 44.8331, Val Acc (0.52, 2.67)]
Epoch 92/300: 97%|█████████▋| 7933/8144 [1:09:51<01:57, 1.80 batches/s, Loss 6.2741, Raw Acc (0.39, 2.53), Crop Acc (0.39, 2.53), Drop Acc (0.39, 2.53)]
I trained the network on Stanford Cars; the loss stays around 1.2 after about 90 epochs, and the accuracy on the test dataset is less than 70%.
The parameters I used are: batch size = 4, pretrained model = resnet50; the others are unchanged.
Thanks for sharing the implementation of the paper. May I know if the code is open sourced? If it is would you mind adding an open source license to it?
When I test the weights, how can I know the confidence score of the class? Thank you
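A minimal sketch: applying softmax to the model's logits gives per-class confidence scores, and the maximum is the confidence of the predicted class (the logits here are placeholders):

```python
import torch

logits = torch.tensor([[2.0, 0.5, -1.0]])     # placeholder logits for 3 classes
probs = torch.softmax(logits, dim=-1)         # per-class confidence scores, sum to 1
confidence, predicted = probs.max(dim=-1)
print(predicted.item(), confidence.item())
```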
Hello, I'd like to ask: in your eval code, how can I display the heatmap of each attention map?
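A minimal sketch of one way to render an attention heatmap, assuming `atten` is a single 2-D attention map array and `img` the corresponding PIL image (both names are placeholders, not from the repo's eval code):

```python
import numpy as np
from PIL import Image

def overlay_heatmap(img, atten, alpha=0.5):
    """Blend a normalized attention map onto an RGB image as a red overlay."""
    atten = (atten - atten.min()) / (atten.max() - atten.min() + 1e-12)
    heat = Image.fromarray(np.uint8(atten * 255)).resize(img.size)    # upsample to image size
    heat = np.asarray(heat, dtype=np.float32)[..., None]              # (H, W, 1)
    red = np.zeros((img.size[1], img.size[0], 3), dtype=np.float32)
    red[..., 0] = 255.0                                               # red channel carries the heat
    base = np.asarray(img, dtype=np.float32)
    out = base * (1 - alpha * heat / 255.0) + red * (alpha * heat / 255.0)
    return Image.fromarray(np.uint8(out))
```

Call `overlay_heatmap(img, attention_map[k]).save(...)` for each map index k; a proper colormap (e.g. via matplotlib) would look nicer, but this keeps the sketch dependency-free.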
As paper described:
For each training image, we randomly choose one of its attention map A k to guide the data augmentation process,
The attention map is randomly chosen, but in the code:
crop_images = batch_augment(X, attention_map[:, :1, :, :], mode='crop', theta=(0.4, 0.6), padding_ratio=0.1)
is only the first attention map used?
Thanks.
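For context, a NumPy sketch of why `attention_map[:, :1]` is still a random map: two map indices are sampled per image (weighted by map energy) before stacking, so channel 0 is the randomly chosen crop map and channel 1 the drop map (the values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 32
attention_maps = rng.random((M, 12, 12))          # M maps for one image

# Sample two indices weighted by each map's total energy, as in wsdan.py.
weights = attention_maps.sum(axis=(1, 2))
weights = weights / weights.sum()
k_index = rng.choice(M, 2, p=weights)             # random every forward pass

crop_map = attention_maps[k_index[0]]             # what attention_map[:, :1] holds
drop_map = attention_maps[k_index[1]]             # what attention_map[:, 1:] holds
print(crop_map.shape)                             # (12, 12)
```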