megvii-research / dpgn
[CVPR 2020] DPGN: Distribution Propagation Graph Network for Few-shot Learning.
License: MIT License
Hi,
Interesting work and well done!
If I understand the code correctly, the best model during training is selected by its accuracy on the test set. Shouldn't the validation set be used for that instead?
Cheers
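The selection protocol asked about here can be sketched as follows. This is a minimal illustration, not the repository's code: the `history` structure, its keys, and all numbers are hypothetical. The checkpoint is chosen by validation accuracy only, and the test accuracy is read off at that step rather than maximized directly.

```python
def select_best_step(history):
    """Return the entry with the highest validation accuracy.

    `history` is a list of dicts with hypothetical keys
    'step', 'val_acc' and 'test_acc'.
    """
    return max(history, key=lambda entry: entry["val_acc"])


# Made-up numbers, purely for illustration.
history = [
    {"step": 1000, "val_acc": 0.52, "test_acc": 0.50},
    {"step": 2000, "val_acc": 0.61, "test_acc": 0.57},
    {"step": 3000, "val_acc": 0.59, "test_acc": 0.58},
]

best = select_best_step(history)
# The reported number is the test accuracy at the step chosen on validation,
# even though a later step happens to score higher on the test set.
print(best["step"], best["test_acc"])
```

Picking the step with the highest test accuracy instead would report 0.58 in this toy example, which is the optimistic bias the question is about.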
Hey,
Your team's work is very impressive, and I would like to run some comparison experiments against your approach. Could you please share a link to the pre-trained Conv4 model for miniImageNet?
I hope to hear from you soon. Thank you.
Thanks so much for your extraordinary work! I get very low accuracy using your code, and the only change I made was setting num_workers from 8 to 0.
This is my log.txt for 5way_1shot_resnet12_mini-imagenet:
[2020-07-08 00:57:47,863] [main] test_acc : 0.5358599852919579 step : 48000
[2020-07-08 00:57:47,864] [main] test_best_acc : 0.5789199849367141 step : 13000
......
[2020-07-08 01:09:22,572] [main] -------------command line arguments-------------
[2020-07-08 01:09:22,572] [main] Namespace(checkpoint_dir='./checkpoints', config='config/5way_1shot_resnet12_mini-imagenet.py', dataset_root='dataset', device='cuda:0', display_step=100, exp_name='5way_1shot_resnet12_mini-imagenet', log_dir='./logs', log_step=100, mode='eval', num_gpu=1, seed=222)
[2020-07-08 01:09:22,572] [main] -------------configs-------------
[2020-07-08 01:09:22,572] [main] OrderedDict([('dataset_name', 'mini-imagenet'), ('num_generation', 6), ('num_loss_generation', 3), ('generation_weight', 0.2), ('point_distance_metric', 'l2'), ('distribution_distance_metric', 'l2'), ('emb_size', 128), ('backbone', 'resnet12'), ('train_config', OrderedDict([('num_ways', 5), ('num_shots', 1), ('batch_size', 25), ('iteration', 100000), ('lr', 0.001), ('weight_decay', 1e-05), ('dec_lr', 17000), ('dropout', 0.1), ('loss_indicator', [1, 1, 0]), ('lr_adj_base', 0.1), ('num_queries', 1)])), ('eval_config', OrderedDict([('num_ways', 5), ('num_shots', 1), ('batch_size', 10), ('iteration', 1000), ('interval', 1000), ('num_queries', 1)]))])
[2020-07-08 01:09:22,793] [main] find a checkpoint, loading checkpoint from ./checkpoints/5way_1shot_resnet12_mini-imagenet
[2020-07-08 01:09:27,521] [main] best model pack loaded
[2020-07-08 01:09:27,548] [main] current best test accuracy is: 0.5789199849367141, at step: 13000
[2020-07-08 01:10:52,549] [main] ------------------------------------
[2020-07-08 01:10:52,551] [main] step : 13000 test_edge_loss : 2.522496324658394 test_node_acc : 0.5824999854266644
[2020-07-08 01:10:52,551] [main] evaluation: total_count=999, accuracy: mean=58.25%, std=8.37%, ci95=0.52%
And for 5way_5shot_resnet12_mini-imagenet:
[2020-07-07 13:46:28,276] [main] -------------configs-------------
[2020-07-07 13:46:28,277] [main] OrderedDict([('dataset_name', 'mini-imagenet'), ('backbone', 'resnet12'), ('emb_size', 128), ('num_generation', 6), ('num_loss_generation', 6), ('generation_weight', 0.2), ('point_distance_metric', 'l2'), ('distribution_distance_metric', 'l2'), ('train_config', OrderedDict([('num_ways', 5), ('num_shots', 5), ('batch_size', 8), ('iteration', 100000), ('lr', 0.001), ('weight_decay', 1e-05), ('dec_lr', 15000), ('dropout', 0.1), ('lr_adj_base', 0.1), ('loss_indicator', [1, 1, 1]), ('num_queries', 1)])), ('eval_config', OrderedDict([('num_ways', 5), ('num_shots', 5), ('batch_size', 4), ('iteration', 1000), ('interval', 1000), ('num_queries', 1)]))])
[2020-07-07 13:46:28,445] [main] find a checkpoint, loading checkpoint from ./checkpoints/5way_5shot_resnet12_mini-imagenet
[2020-07-07 13:46:34,063] [main] best model pack loaded
[2020-07-07 13:46:34,090] [main] current best test accuracy is: 0.7321500130593777, at step: 17000
[2020-07-07 13:48:42,438] [main] ------------------------------------
[2020-07-07 13:48:42,439] [main] step : 17000 test_edge_loss : 4.024368638753891 test_node_acc : 0.7245500129163265
[2020-07-07 13:48:42,440] [main] evaluation: total_count=999, accuracy: mean=72.46%, std=12.81%, ci95=0.79%
What should I do to get higher accuracy?
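A quick sanity check on the logs above: the reported ci95 values match the usual normal-approximation half-width, 1.96 * std / sqrt(n). That this is the formula the evaluation code uses is an assumption; the sketch only shows that the logged numbers are consistent with it.

```python
import math


def ci95_half_width(std_percent, num_episodes):
    """Normal-approximation 95% confidence half-width, in percentage points."""
    return 1.96 * std_percent / math.sqrt(num_episodes)


# std and total_count are taken from the two evaluation log lines above.
one_shot = ci95_half_width(8.37, 999)
five_shot = ci95_half_width(12.81, 999)
print(f"{one_shot:.2f} {five_shot:.2f}")
```

If that reading is right, the intervals are well under one percentage point, so evaluation noise alone seems unlikely to explain a gap of several points from the paper.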
Hello, I have followed the file's instructions but was unable to download the 'tiered-imagenet' dataset due to permission issues. Could you please share this dataset with me?
Line 316 in b940111
self.pred_loss(query_node_pred_generation, query_label.long()).mean()
For example, with 5-way 1-shot and num_queries=1, query_node_pred_generation has shape [batch_size, 5, 5] and query_label has shape [batch_size, 5].
With 5-way 1-shot and num_queries=2, query_node_pred_generation has shape [batch_size, 10, 5] and query_label has shape [batch_size, 10].
In query_node_pred_generation, which dimension is the class dimension (i.e., the N ways)?
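Going only by the shapes quoted above, the dimension that stays at 5 when num_queries changes looks like the class dimension, i.e. the last one. The sketch below is plain Python, not the repository's code, and the function name is hypothetical. It computes a per-query cross-entropy under that assumption. For comparison, `torch.nn.CrossEntropyLoss` expects the class dimension at index 1 when the input has more than two dimensions, so a [batch_size, num_queries, num_ways] tensor would need its last two axes swapped first. Whether the repository does this was not checked here.

```python
import math


def cross_entropy_class_last(logits, labels):
    """Per-query cross-entropy for logits laid out as [batch][query][class]."""
    losses = []
    for sample_logits, sample_labels in zip(logits, labels):
        row = []
        for query_logits, label in zip(sample_logits, sample_labels):
            # log-sum-exp over the class (last) dimension
            log_norm = math.log(sum(math.exp(v) for v in query_logits))
            row.append(log_norm - query_logits[label])
        losses.append(row)
    return losses


# Toy example: one task, three queries, two classes.
logits = [[[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]]
labels = [[0, 1, 1]]
losses = cross_entropy_class_last(logits, labels)
print(len(losses[0]))  # one loss value per query
```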
Hi:
In the 5-way 1-shot setting, the query-set labels of every batch during training and testing are [0,1,2,3,4], with no shuffling. Could this let the network memorize the ordering and thereby inflate the accuracy?
In other few-shot learning papers I have read (GNN, Relation Network), the query-set labels are in random order. Following that idea, I changed the source code only to randomly shuffle the query-set labels of each test batch, e.g. [1,4,2,0,3], [1,2,4,0,3], [2,0,4,3,1], and so on. init_edge is also generated from the modified labels and is still a 10*10 symmetric matrix. The accuracy is only about 43%, far below the 66.27% I get with the original source code. I also tried shuffling in both the training and testing phases, and the result was again about 43%.
My initial understanding of graph networks was that the order of the node labels should not affect accuracy, because the data are organized as a graph, structured and relational. This huge accuracy gap confuses me. Did I set something up incorrectly?
Thank you very much!
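One thing worth ruling out in an experiment like this is a mismatch between the shuffled labels and the query samples they belong to. The sketch below is not taken from the repository and all names in it are hypothetical. It applies a single permutation to both lists so they stay aligned. Anything derived from the labels, such as init_edge, would have to use the same permuted order. It does not by itself explain the 43% figure.

```python
import random


def shuffle_queries(query_data, query_label, rng):
    """Permute query samples and their labels with one shared permutation."""
    order = list(range(len(query_label)))
    rng.shuffle(order)
    shuffled_data = [query_data[i] for i in order]
    shuffled_label = [query_label[i] for i in order]
    return shuffled_data, shuffled_label


rng = random.Random(0)
# Placeholder strings stand in for image tensors.
query_data = [f"img_class{c}" for c in range(5)]
query_label = [0, 1, 2, 3, 4]

shuffled_data, shuffled_label = shuffle_queries(query_data, query_label, rng)
# Each sample still carries its own label after shuffling.
print(list(zip(shuffled_data, shuffled_label)))
```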
Can you provide the code for WRN and ResNet18 in backbone.py? Thanks a lot.
Hi,
I want to use my own dataset with this model,
but I also want to display the query image that is currently being processed (e.g. with image.show()).
Do you have any idea how to show the image and its label? I can't find the place in the code where the query data is processed. Thank you.
Dear Yang,
I would like to ask for help. When I open the Google Drive link to download the pretrained model mentioned in README.md, the folder is empty. I don't know what to do. Could you share it with me again?
Thank you! Looking forward to your reply.
Dear Yang:
I am really impressed with your work. It gives me a new angle and significantly raises the benchmark for few-shot learning tasks. However, when I tried to reproduce your results with the public code, I found that the test accuracy for the 5-way 5-shot miniImageNet (ConvNet) task is around 78%, and the final test accuracy is about 76%. I guess there must be some tricks involved. Could you kindly help me?
Thank you.
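When using DPGN for miniImageNet classification with the ConvNet backbone in the 5-way 5-shot setting, the accuracy I get differs by 5% from the accuracy reported in the paper for the same setting. What could be the reason?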
I want to reproduce your results, and I believe different configurations have a big impact on the final numbers. Could you release the Conv4 config files for miniImageNet and CIFAR-FS? Thank you very much.
Hello, thanks for creating this wonderful work! I want to reproduce your result on CUB with Conv4 but cannot find the config file. Could you please release it? Thanks a lot for your help!
Hi,
I wanted to check this model out and test it on a dataset of images that I have.
Is that currently possible?
Regards.
Hi,
I am training on CUB with the following command:
python3 main.py --dataset_root dataset --config config/5way_1shot_resnet12_cub-200.py --num_gpu 1 --mode train
And I got this error:
File "main.py", line 579, in
main()
File "main.py", line 570, in main
trainer.train()
File "main.py", line 105, in train
last_layer_data, second_last_layer_data = backbone_two_stage_initialization(all_data, self.enc_module)
File "/data/add_disk0/vhnguyen/cvpr21/DPGN/utils.py", line 197, in backbone_two_stage_initialization
encoded_result = encoder(data.squeeze(1))
File "/home/vhnguyen/anaconda2/envs/py36_torch1_7/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/add_disk0/vhnguyen/cvpr21/DPGN/backbone.py", line 101, in forward
x = self.avgpool(x)
File "/home/vhnguyen/anaconda2/envs/py36_torch1_7/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/vhnguyen/anaconda2/envs/py36_torch1_7/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 595, in forward
self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: Given input size: (512x6x6). Calculated output size: (512x0x0). Output size is too small
Could you please let me know how to fix this?
Thank you very much.
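Hello!
Have you tried test episodes such as (1,1,2,3,4) or (0,0,2,2,3), where the queries are not fixed at exactly one per class? When I test your model this way, the accuracy drops considerably. In fact, EGNN's accuracy also drops considerably under this kind of test. Could the model have learned some prior knowledge?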
Hello, I am reading your code and have a few questions.
Thanks for sharing.
I encountered a problem with the size of avgpool.
When I run your code, I find that the input to avgpool has size 512x6x6 for the miniImageNet dataset. However, since the avgpool kernel size is 7x7, the output size is too small (i.e., 512x0x0).
Could you help me solve this problem?
Thank you.
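The arithmetic behind this error can be checked with the standard pooling output-size formula, floor((in + 2*padding - kernel) / stride) + 1. The sketch below is plain Python with a hypothetical helper name, and the stride of 1 is an assumption. With PyTorch's default stride, which equals the kernel size, the result for a 6x6 input and a 7x7 kernel is the same.

```python
def pool_output_size(input_size, kernel_size, stride=None, padding=0):
    """Output length of a pooling layer along one spatial dimension."""
    if stride is None:
        stride = kernel_size  # PyTorch pooling layers default to stride = kernel_size
    return (input_size + 2 * padding - kernel_size) // stride + 1


print(pool_output_size(6, 7, stride=1))  # 0: a 7x7 window does not fit a 6x6 map
print(pool_output_size(6, 6, stride=1))  # 1: a 6x6 window reduces the map to 1x1
print(pool_output_size(7, 7, stride=1))  # 1: a 7x7 map works with the 7x7 window
```

So either the feature map reaching avgpool has to be at least 7x7, which would mean a larger input image than the one used here, or the pooling window has to fit the 6x6 map. Which of the two the authors intended is not stated in this thread.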
Wonderful work, but I cannot reproduce your results. Specifically, I tried using ResNet12 as the backbone for the 5-way 1-shot task on CIFAR-FS, but the final accuracy I got was 70.63%.
[2020-07-16 09:08:10,015] [main] step : 67000 test_edge_loss : 2.035566942691803 test_node_acc : 0.7063199849426747
[2020-07-16 09:08:10,016] [main] evaluation: total_count=999, accuracy: mean=70.63%, std=7.18%, ci95=0.44%
My program environment is
CUDA Version: 10.0
Python : 3.6.7
Does the environment have such a big impact on accuracy?
Dear Yang,
Thank you for releasing the repo. Could you give me detailed instructions on how to build the datasets used in the paper?
Best,
Zhongshan Bao
Could you provide the code for generating the pickle file of miniimagenet?
I tried running your code on my own machine (2080 Ti, 11 GB), but only the 5way-1shot ConvNet configuration runs. With 5-shot or with ResNet12, I get an error like:
RuntimeError: CUDA out of memory. Tried to allocate 36.00 MiB (GPU 0; 10.76 GiB total capacity; 9.33 GiB already allocated; 56.69 MiB free; 219.71 MiB cached)
Which GPU did you run on? I looked through your paper and it does not mention the GPU.
Hello team DPGN:
I noticed that you report DPGN results with WRN as the backbone; however, there is no class named WRN in your backbone.py. Could you please show us how WRN is used in DPGN? It would be even better if you could share the config file as well. Thanks a lot!
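Hello:
In both the EGNN paper and your DPGN, under the 5-way 1-shot setting, the query-set labels of every batch during training and testing are [0,1,2,3,4], with no shuffling. Could this let the network memorize the setting and thereby inflate the accuracy? In other few-shot learning papers I have read (GNN, Relation Network), the query-set labels are in random order. Following that idea, I changed the source code only to randomly shuffle the query-set labels of each test batch, e.g. [1,4,2,0,3], [1,2,4,0,3], [2,0,4,3,1], and so on. init_edge is also generated from the modified label order and is still a 10*10 symmetric matrix. The accuracy only reaches about 43%, far below the 66.27% I get with your original source code. I also tried shuffling in both the training and testing phases, and the result was again about 43%. My initial understanding of graph networks was that the order of the node labels should not affect accuracy, because the data are organized as a graph, structured and relational. This huge accuracy gap confuses me. Did I set something up incorrectly?
Thank you very much!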
Thanks so much for your extraordinary work! I have two questions about your code.
Using the dataset link provided, I followed the steps in 'download_CUB.sh' and found that there is no file named 'split' in my directory. The code cannot run on the CUB-200 dataset without this file.
Hi, DPGN team:
I am interested in your paper, so I ran your code on my computer.
When I ran the 5-way 1-shot task, I found that test_acc rose until it reached a maximum of 57.8% at 18000 steps, then declined slowly until it finally reached 52.7%. What is the reason? Is it because I set num_workers in dataloader.py to 0?
Hi, can you help me? Thank you very much.
Traceback (most recent call last):
File "main.py", line 580, in
main()
File "main.py", line 571, in main
trainer.train()
File "main.py", line 84, in train
for iteration, batch in enumerate(self.data_loader['train']()):
File "/home/wuchenxi/Desktop/DPGN-master/venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
return self._process_next_batch(batch)
File "/home/wuchenxi/Desktop/DPGN-master/venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/home/wuchenxi/Desktop/DPGN-master/venv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/wuchenxi/Desktop/DPGN-master/venv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/wuchenxi/Desktop/DPGN-master/venv/lib/python3.7/site-packages/torchnet/dataset/listdataset.py", line 54, in __getitem__
return self.load(self.list[idx])
File "/home/wuchenxi/Desktop/DPGN-master/dataloader.py", line 285, in load_function
support_data, support_label, query_data, query_label = self.get_task_batch()
File "/home/wuchenxi/Desktop/DPGN-master/dataloader.py", line 259, in get_task_batch
task_class_list = random.sample(self.full_class_list, self.num_ways)
File "/usr/local/python3.7.5/lib/python3.7/random.py", line 321, in sample
raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
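The final ValueError is what `random.sample` raises when asked for more items than the population holds, so in this traceback `self.full_class_list` presumably has fewer than `num_ways` entries. One possible cause, not confirmed here, is that the dataset was not found or not loaded under `dataset_root`. The sketch below reproduces the condition with a clearer message. The helper name and its check are hypothetical, not part of the repository.

```python
import random


def sample_task_classes(full_class_list, num_ways):
    """Sample num_ways classes, failing early with a descriptive message."""
    if len(full_class_list) < num_ways:
        raise ValueError(
            f"only {len(full_class_list)} classes available, but num_ways={num_ways}; "
            "check that the dataset was found and loaded"
        )
    return random.sample(full_class_list, num_ways)


try:
    sample_task_classes([], 5)  # an empty class list triggers the error
except ValueError as err:
    message = str(err)

print(message)
print(len(sample_task_classes(list(range(10)), 5)))  # 5
```

Printing `len(self.full_class_list)` right before the failing call would confirm or rule out this reading.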
Dear Yang:
Thank you for releasing the repo. Could you share the detailed learning schedule for Cifar-fs and TieredImagenet datasets for reproducing the results reported in the paper?
Best,
TANG, shixiang
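Hello! DPGN is excellent work. While reading your code, however, I ran into some questions about the initial edges fed into the GNN, namely the two matrices edge_feature_gd and edge_feature_gp. For 5way-1shot with 1 query, each is a 10×10 matrix, and you initialize its lower-right 5×5 block to the 5×5 identity matrix. Is this reasonable? Doesn't it introduce prior knowledge, namely that any two query samples belong to different classes? In addition, using your code with a batch size of 25 and the ConvNet backbone, my final 5way-1shot test result is 64.42±0.52, which is quite far from the result in your paper (66.01±0.36). How should I change the training procedure to reach the result you report?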
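To make the structure under discussion concrete, here is a small sketch of a 10×10 initial edge matrix for 5 support and 5 query nodes. Only the identity block for the query nodes comes from the description above. The function name, the label-equality rule for the support block, and the zeros between support and query nodes are assumptions for illustration, not the repository's implementation.

```python
def init_edge(support_label, num_queries):
    """Build an (S+Q) x (S+Q) initial edge matrix as nested lists."""
    num_support = len(support_label)
    n = num_support + num_queries
    edge = [[0.0] * n for _ in range(n)]
    # Support-support block: 1 where two support nodes share a label (assumption).
    for i in range(num_support):
        for j in range(num_support):
            edge[i][j] = 1.0 if support_label[i] == support_label[j] else 0.0
    # Query-query block: identity, as described in the issue.
    for q in range(num_support, n):
        edge[q][q] = 1.0
    # Support-query entries stay 0.0 (assumption).
    return edge


edge = init_edge([0, 1, 2, 3, 4], num_queries=5)
for row in edge:
    print(row)
```

In this layout the off-diagonal zeros in the query block carry the same value as "different class" does in the support block, which is the prior the question is concerned about.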
Hi,
A strange bug occurred when the code reached total_loss.backward():
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
Have you ever come across this problem?