bohao-lee / cme


Python 84.87% Makefile 0.28% C 7.98% Cuda 6.60% Shell 0.27%


cme's Issues

RuntimeError: stack expects each tensor to be equal size, but got [3, 512, 512] at entry 0 and [3, 320, 320] at entry 4

Thanks for your project; it is very helpful to me. However, when I run the fine-tuning part,
I get RuntimeError: stack expects each tensor to be equal size, but got [3, 512, 512] at entry 0 and [3, 320, 320] at entry 4. I set the height and width to 512. How can I solve this problem? Waiting for your reply. The full traceback follows:
Traceback (most recent call last):
File "tool/train_decoupling_disturbance.py", line 405, in <module>
train(epoch,repeat_time,mask_ratio)
File "tool/train_decoupling_disturbance.py", line 242, in train
for batch_idx, (data, target) in enumerate(train_loader):
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
data = self._next_data()
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1179, in _next_data
return self._process_data(data)
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
data.reraise()
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 5.
Original Traceback (most recent call last):
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/zjx/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 512, 512] at entry 0 and [3, 320, 320] at entry 4
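
The message says that the samples collated into one batch did not all have the same resolution. A minimal sketch for locating the offending entries, assuming you can collect each sample's shape (for example `tuple(tensor.shape)`) before collation; `find_mismatched_shapes` is a hypothetical helper, not part of the repository or of PyTorch:

```python
def find_mismatched_shapes(shapes):
    """Return (index, shape) for every entry whose shape differs from entry 0."""
    if not shapes:
        return []
    reference = tuple(shapes[0])
    return [(i, tuple(s)) for i, s in enumerate(shapes) if tuple(s) != reference]


# Shapes as reported in the error: entry 0 is 512x512, entry 4 is 320x320.
# The other entries are filled in for illustration only.
batch_shapes = [(3, 512, 512)] * 4 + [(3, 320, 320)] + [(3, 512, 512)] * 3
print(find_mismatched_shapes(batch_shapes))
```

If entries are flagged, the dataset's resize step is the first place to look; passing a custom `collate_fn` to the `DataLoader` that resizes every sample to one size is another option.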

How to train on coco

Thank you for sharing the code of your great work.
May I ask how to train on the COCO dataset? I can only find the tutorial for VOC. Could you provide a tutorial for preparing COCO training?
Thanks in advance!

Some questions about the MPSR baseline

Nice paper, and thanks for sharing the source code.
I noticed that CME is combined with both the one-stage detector (Meta YOLO)
and the two-stage detector (MPSR).

As you know, MPSR uses a multi-scale data augmentation strategy.
Did you use it as well to obtain the experimental results in the paper?

Hi, can you provide the t-SNE code?

Questions about model size

Hello, can you tell me the number of parameters and the storage size of the model you built? Thank you very much.

Code of base training.

Could you upload the code for base model training, i.e., the code of Class Max-margin?

pre-trained weights

I can't find darknet19_448.conv.23. Can you tell me where to find it in the code?

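Asking for advice on showing classes in the detection result images!

I just realized the author is Chinese, great!
Thank you for your excellent work and for open-sourcing the code. However, after running valid_decoupling.py, only the detected classes and precision are shown in the terminal; there are no detection result images like the ones in your paper. Could you open-source that part of the code? Thank you very much!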

What is the function of the parameter 'repeat_time' of train in tool/train_decoupling_disturbance.py?

Thanks for your great work! I am modifying the source code to train on my own novel dataset, and I want to use 20/30-shot training in the fine-tuning phase. However, the function train has a parameter 'repeat_time', which is set by the formula repeat_time = 13 - shot_num. With 20 or 30 shots, repeat_time becomes negative, so the train function runs no forward or backward pass. How should I change the 'repeat_time' parameter, and why did you use the formula repeat_time = 13 - shot_num to calculate it?
Looking forward to your answer, and thanks very much!
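
With the formula quoted above, any shot count of 13 or more gives a value of zero or below. One possible workaround is to clamp the value to at least 1. This is a sketch only: the clamp and the helper name `compute_repeat_time` are assumptions made for illustration, not the author's recommendation, and they do not explain why 13 was chosen.

```python
def compute_repeat_time(shot_num, min_repeat=1):
    """Apply the formula from the issue (13 - shot_num), clamped to min_repeat.

    The clamp is an assumption for illustration; it is not taken from the
    repository.
    """
    return max(min_repeat, 13 - shot_num)


for shot in (1, 10, 20, 30):
    print(shot, compute_repeat_time(shot))
```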

How to train on VOC, can you tell me more in detail? Many errors are reported

RuntimeError: shape '[64, 15, 845]' is invalid for input of size 1622400

Hello, I followed the README step by step, but at the end I get an error about mismatched dimensions. What causes this, and is there a way to fix it?
2023-04-15 02:09:51 epoch 0/353, processed 0 samples, lr 0.000033
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
/opt/conda/lib/python3.6/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
Traceback (most recent call last):
File "tool/train_decoupling_disturbance.py", line 405, in <module>
train(epoch,repeat_time,mask_ratio)
File "tool/train_decoupling_disturbance.py", line 286, in train
loss = region_loss(output, target, dynamic_weights, target_cls_ids)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(input, **kwargs)
File "/code/2023/CME/core/region_loss.py", line 417, in forward
cls = cls.view(bs, cs, nA*nC*nH*nW).transpose(1,2).contiguous().view(bs*nA*nC*nH*nW, cs)
RuntimeError: shape '[64, 15, 845]' is invalid for input of size 1622400
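
A quick arithmetic check can narrow this down. The sketch below computes which leading dimension would make the requested shape fit the reported element count; `implied_leading_dim` is a hypothetical helper written for this illustration, not part of the repository.

```python
from math import prod


def implied_leading_dim(numel, target_shape):
    """Return the leading dimension that would make target_shape hold numel
    elements, or None if the trailing dimensions do not divide numel."""
    trailing = prod(target_shape[1:])
    if trailing == 0 or numel % trailing != 0:
        return None
    return numel // trailing


# Numbers taken from the error message above.
print(implied_leading_dim(1622400, (64, 15, 845)))  # 128
```

The tensor holds exactly twice as many elements as a leading dimension of 64 allows (128 = 2 x 64), so it is worth checking whether the batch size, GPU count, or class count in the configuration matches what `region_loss` assumes.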

Question about .weights

Nice paper, and thanks for sharing the code.
When I run the code, I get an error: FileNotFoundError: [Errno 2] No such file or directory
Later, I found that there are no .weights files in the backup folder. How can I get these files?

How to visualize?

How do you visualize the predictions and draw the boxes on the image?
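
As a starting point, a minimal sketch of the coordinate conversion needed before drawing. It assumes detections come as YOLO-style normalized (cx, cy, w, h) boxes, which is an assumption about the output format and should be checked against the repository; `to_pixel_box` is a hypothetical helper.

```python
def to_pixel_box(cx, cy, w, h, img_w, img_h):
    """Convert a normalized center-format box to integer pixel corners
    (x1, y1, x2, y2), clipped to the image bounds."""

    def clip(value, size):
        # Round to the nearest pixel and keep the result inside [0, size - 1].
        return int(min(max(round(value), 0), size - 1))

    x1 = clip((cx - w / 2) * img_w, img_w)
    y1 = clip((cy - h / 2) * img_h, img_h)
    x2 = clip((cx + w / 2) * img_w, img_w)
    y2 = clip((cy + h / 2) * img_h, img_h)
    return x1, y1, x2, y2


# A box centered in a 416x416 image and covering half of each side.
print(to_pixel_box(0.5, 0.5, 0.5, 0.5, 416, 416))
```

The resulting corners can be passed to any drawing routine, for example Pillow's `ImageDraw.Draw(image).rectangle(...)`.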

How long does training take with the two Nvidia Tesla V100 GPUs mentioned in the paper and the GitHub issue?

Batch_size

Where can I reduce batch_size? Can you tell me where it is set?

The role of feature disturbance.

Hi, thanks for this impressive work. I have a question about the feature disturbance module. As I understand it, the process keeps erasing the most salient region of the support images, where saliency is determined by the gradient. The less salient versions of the support images are used during training, so the trained model becomes more robust because it can make use of every part of the target object. I think this is supported by the very large improvement in APL on COCO. So how is this related to margin equilibrium?

How to change the framework from YOLO to Faster R-CNN?

Hello, I would like to know how to use CME with Faster R-CNN, please 🙏🏻🙏🏻

How many GPUs will be used for training the code?

Thank you for your work. When I use your code for training, how many GPUs will be used?

Feature disturbance applied on both base&novel?

Thanks for the nice paper and code.
As shown in Table 3, you get 47.5 nAP with disturbance only on base classes.
In Table 5, I found 47.5 nAP as well.
Is disturbance only on base classes the default setting?

Why are the image and mask concatenated rather than multiplied?

NotImplementedError: [] not recognized

Traceback (most recent call last):
File "/backup1/S320080026/Fewshot_Detection-master/train_meta.py", line 114, in <module>
test_metaset = dataset.MetaDataset(metafiles=metadict, train=True)
File "/backup1/S320080026/Fewshot_Detection-master/dataset.py", line 323, in __init__
raise NotImplementedError('{} not recognized'.format(pair))
NotImplementedError: [] not recognized

I can't find the cause of this error. Can you help me?

About the implementation of the MPSR-based method

For the first question, we use CME to improve MPSR's classification, so we apply it in the refinement branch rather than in a separate branch.
For the second question, we view the margin as the inter-class distance. Hope this paper can help you. (https://arxiv.org/pdf/2005.13826.pdf)

Sorry, I still don't understand. CME is a meta-learning-based method, whereas MPSR is a fine-tuning method without a support branch, so how do you implement Feature Disturbance when there is no support mask?

Sorry, we use Feature Disturbance to disturb the features. Because the mask reflects the object position, we use it. You can disturb the corresponding object in the image or the corresponding feature. Because of limited time, we have not run more experiments on this. We'll try to do some research on it.

Originally posted by @Bohao-Lee in #1 (comment)

Hello!
I noticed that the MPSR-based method has better performance. Will you release the code of that? 😁
I am also interested in how Feature Disturbance works in MPSR. Do you mean that the structure of MPSR was left unchanged and the network was fine-tuned directly with the disturbed features?
Looking forward to your reply

As the training time increases, Proposals=0, is this normal?

Thanks for your code and article, but I'm running into some problems. There are tens of thousands of Proposals at first, but then there are fewer and fewer or even 0. Is this normal?

is this normal?

Do you need to train all the time? What should I do if I keep training and it still looks like this?

About the category and confidence after the test

Hello, your code inspired me. Thank you for your excellent work.
However, after I run your test code, only the detection result data appears; there are no detection boxes with their categories and confidences, although your paper shows such result figures. Could you please open-source the relevant code? Thank you.

Inferencing with query and support images

Thanks for your great project.
I can't figure out how to run inference with query and support images to generate results. How do you run the model with query and support images? Thank you.

NaN occurs when training the network on Pascal VOC

64: nGT 42, recall 1, proposals 35613, loss: x 27.029242, y 28.630236, w 177730.062500, h 1531226.000000, conf 2786.200684, cls 108.162247, class_contrast 0.000000, total 1711906.125000
coord_mask: tensor(27.0292, device='cuda:0')
80: nGT 40, recall 13, proposals 39993, loss: x 23.945053, y 22.210649, w 2029862.625000, h 16542059.000000, conf nan, cls 111.281578, class_contrast 0.000000, total nan
coord_mask: tensor(23.9451, device='cuda:0')
96: nGT 30, recall 0, proposals 0, loss: x nan, y nan, w nan, h nan, conf nan, cls nan, class_contrast nan, total nan
coord_mask: tensor(nan, device='cuda:0')
112: nGT 36, recall 0, proposals 0, loss: x nan, y nan, w nan, h nan, conf nan, cls nan, class_contrast nan, total nan
coord_mask: tensor(nan, device='cuda:0')
128: nGT 39, recall 0, proposals 0, loss: x nan, y nan, w nan, h nan, conf nan, cls nan, class_contrast nan, total nan
coord_mask: tensor(nan, device='cuda:0')
144: nGT 54, recall 0, proposals 0, loss: x nan, y nan, w nan, h nan, conf nan, cls nan, class_contrast nan, total nan
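
To pinpoint where a run first diverges, the log can be scanned automatically. This is a minimal sketch that assumes the line format shown above; `parse_loss_line` and `first_nan` are hypothetical helpers, not part of the repository.

```python
import math
import re

# Matches "name value" pairs such as "w 177730.062500" or "conf nan".
LOSS_TERM = re.compile(r"(\w+) (nan|-?\d+(?:\.\d+)?)")


def parse_loss_line(line):
    """Parse one training-log line into (seen_samples, {term: value})."""
    head, _, tail = line.partition(", loss: ")
    seen = int(head.split(":", 1)[0])
    terms = {name: float(value) for name, value in LOSS_TERM.findall(tail)}
    return seen, terms


def first_nan(lines):
    """Return (seen_samples, NaN term names) for the first line with a NaN."""
    for line in lines:
        seen, terms = parse_loss_line(line)
        bad = [name for name, value in terms.items() if math.isnan(value)]
        if bad:
            return seen, bad
    return None


log = [
    "64: nGT 42, recall 1, proposals 35613, loss: x 27.029242, y 28.630236, "
    "w 177730.062500, h 1531226.000000, conf 2786.200684, cls 108.162247, "
    "class_contrast 0.000000, total 1711906.125000",
    "80: nGT 40, recall 13, proposals 39993, loss: x 23.945053, y 22.210649, "
    "w 2029862.625000, h 16542059.000000, conf nan, cls 111.281578, "
    "class_contrast 0.000000, total nan",
]
print(first_nan(log))
```

In the log above, the `w` and `h` terms grow by roughly a factor of ten between samples 64 and 80, and `conf` turns into NaN at the same point, so the width and height losses are worth inspecting first.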

all are nan

Hello, I followed your process, but after training for a while, all the values become NaN. I'm very upset.

How to use another dataset?

Have you trained on other datasets? What should I do if I want to use my own dataset?

The NaN problem has not been solved

Thanks for your great project, and thanks for sharing the code of your great work!
I tried the following two settings:

steps=-1,500,40000,60000
scales=0.1,10,.1,.1

steps=-1,500,40000,60000
scales=0.01,100,.1,.1

but the problem remains. I don't know how to solve it. Can you give me some advice?
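
To see what the two settings actually change, here is a minimal sketch of a Darknet-style "steps" schedule, in which every scale whose step has been reached is multiplied into the base learning rate. That the training script interprets `steps` and `scales` this way is an assumption, and `BASE_LR` is a hypothetical value; read the real one from your .cfg file.

```python
def lr_at(batch, base_lr, steps, scales):
    """Learning rate at `batch`: multiply base_lr by each scale whose step
    has already been reached."""
    lr = base_lr
    for step, scale in zip(steps, scales):
        if batch >= step:
            lr *= scale
    return lr


BASE_LR = 0.001  # hypothetical; not taken from the repository
STEPS = (-1, 500, 40000, 60000)
SETTINGS = {
    "first": (0.1, 10, 0.1, 0.1),
    "second": (0.01, 100, 0.1, 0.1),
}

for name, scales in SETTINGS.items():
    print(name, [lr_at(b, BASE_LR, STEPS, scales) for b in (0, 500, 40000, 60000)])
```

Under this reading, both settings give the same learning rate from batch 500 onward (0.1 x 10 and 0.01 x 100 both equal 1); only the first 500 batches differ. If the NaN appears after batch 500, lowering the base learning rate itself may be the more relevant change.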

Could you share the t-SNE code and explain how to use it? Thanks.

Where is the code for 'Feature disturbance'

Thanks for the nice paper and source code!
I'd like to know how you implement 'Feature disturbance'.
Could you point me to it?
Thanks,

Questions about nGPU

If I set ngpus to 4 and the batch size to 64, the region loss raises an error.
output.size(0) is 1920. Do you have any idea how to solve this error?

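Problem: Get result. If you want to get the result of model, run:

Hello, thank you very much for the code. I ran into the following problem while reproducing it.
Problem: I want to get the results.
I ran the command below:

The following problem appeared. How should I solve it? Thanks.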

ModuleNotFoundError: No module named 'core'

What can I do about this problem?
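
This error usually means Python cannot see the `core` package on its module search path. Tracebacks elsewhere on this page show `core/region_loss.py` and `tool/...` as sibling directories under the repository root, so running scripts from that root (for example with `PYTHONPATH=.`) is the simplest thing to try. Alternatively, a script under `tool/` can add the root itself. A minimal sketch, where `add_repo_root_to_path` is a hypothetical helper and the directory layout is an assumption:

```python
import os
import sys


def add_repo_root_to_path(script_file):
    """Prepend the parent of the script's directory (the assumed repository
    root) to sys.path and return it."""
    repo_root = os.path.dirname(os.path.dirname(os.path.abspath(script_file)))
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# Inside a script under tool/, call add_repo_root_to_path(__file__) before
# any "import core" or "from core import ..." statement.
print(add_repo_root_to_path(os.path.join("CME", "tool", "valid_decoupling.py")))
```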

No such file or directory:

Thanks for your great work. I get this error when I run train_model.sh:
Traceback (most recent call last):
File "tool/valid_decoupling.py", line 218, in <module>
valid(datacfg, darknet, learnet, weightfile, outfile, use_baserw)
File "tool/valid_decoupling.py", line 36, in valid
m.load_weights(weightfile)
File "/home/zjx/CME/tool/darknet/darknet_decoupling.py", line 379, in load_weights
fp = open(weightfile, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'backup/split1_10shot/000010.weights'

How can I solve it? Waiting for your reply.
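
The validation step tries to open `backup/split1_10shot/000010.weights`, so the error suggests that this checkpoint was never written, for instance because training stopped early or saved under a different name. A minimal sketch for checking which checkpoints exist before validating; `list_weight_files` is a hypothetical helper, not part of the repository:

```python
from pathlib import Path


def list_weight_files(backup_dir):
    """Return the sorted names of *.weights files in backup_dir, or an empty
    list if the directory does not exist."""
    directory = Path(backup_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.weights"))


# Directory name taken from the traceback above.
print(list_weight_files("backup/split1_10shot"))
```

If the list is empty, check the training output for errors before running validation; if it contains other file names, point the validation command at one of those.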

RuntimeError: shape ‘[30, 5, 6, 13, 13]’ is invalid for input of size 5070

Hi Lee, I want to train on VOC2007 and VOC2012 with their 20 classes as base classes and the 7 classes of my own dataset as novel classes.
However, I run into trouble after one epoch of training. Can you give me some suggestions?
By the way, I run "python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23".

About dataset processing

When I run gen_fewlist.py, why are there only 2 supports in the 3_shot_train.txt file?