yehli / imagenetmodel Goto Github PK
Official ImageNet Model repository
License: Apache License 2.0
Is there a pre-trained model? Thank you
Hello, could you please provide the label_top5_train_nfnet data to make reproduction easier?
Thanks for providing us with such excellent work! But could you please provide the visualization code? I've tried to visualize my model but still have difficulty. Thanks very much!
2022-08-26 19:04:36,938 - mmdet - INFO - workflow: [('train', 1)], max: 150 epochs
2022-08-26 19:04:36,938 - mmdet - INFO - Checkpoints will be saved to G:\Code\mmdetection-2.25.1\work_dirs\Version2\RetinaNet_WaveVit by HardDiskBackend.
Traceback (most recent call last):
  File "tools/train.py", line 242, in <module>
    main()
  File "tools/train.py", line 238, in main
    meta=meta)
  File "g:\code\mmdetection-2.25.1\mmdet\apis\train.py", line 244, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\mmcv\runner\epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\mmcv\runner\epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\mmcv\runner\epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\mmcv\parallel\data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "g:\code\mmdetection-2.25.1\mmdet\models\detectors\base.py", line 248, in train_step
    losses = self(**data)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\mmcv\runner\fp16_utils.py", line 110, in new_func
    return old_func(*args, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\detectors\base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\detectors\single_stage.py", line 82, in forward_train
    x = self.extract_feat(img)
  File "g:\code\mmdetection-2.25.1\mmdet\models\detectors\single_stage.py", line 43, in extract_feat
    x = self.backbone(img)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 490, in forward
    x = self.forward_features(x)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 483, in forward_features
    x = blk(x, H, W)
  File "E:\Anaconda3\envs\new_water\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 321, in forward
    x = x + self.drop_path(self.attn(self.norm1(x), H, W))
  File "E:\Anaconda3\envs\new_water\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 224, in forward
    x_dwt = self.dwt(self.reduce(x))
  File "E:\Anaconda3\envs\new_water\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 123, in forward
    return DWT_Function.apply(x, self.w_ll, self.w_lh, self.w_hl, self.w_hh)
  File "g:\code\mmdetection-2.25.1\mmdet\models\backbones\wavevit.py", line 23, in forward
    x_ll = torch.nn.functional.conv2d(x, w_ll.expand(dim, -1, -1, -1), stride=2, groups=dim)
RuntimeError: expected scalar type Float but found Half
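The error arises because fp16 training sends Half-precision activations into the custom DWT_Function, whose fixed wavelet kernels are still stored as Float, and a custom autograd Function bypasses mmcv's automatic fp16 casting. A minimal sketch of one possible fix, assuming a standalone helper named dwt_forward (hypothetical name, not the repo's own function): cast each kernel to the input's dtype before the grouped convolution.

```python
import torch
import torch.nn.functional as F

def dwt_forward(x, w_ll, w_lh, w_hl, w_hh):
    """Single-level 2D DWT as grouped stride-2 convolutions.

    Casting each fixed wavelet kernel to x.dtype lets the op accept
    fp16 (Half) activations instead of raising a dtype mismatch.
    Kernels are expected with shape (1, 1, 2, 2).
    """
    dim = x.shape[1]
    subbands = []
    for w in (w_ll, w_lh, w_hl, w_hh):
        # Match the (possibly Half) activation dtype, one kernel per channel.
        w = w.to(x.dtype).expand(dim, -1, -1, -1)
        subbands.append(F.conv2d(x, w, stride=2, groups=dim))
    return torch.cat(subbands, dim=1)  # (B, 4*dim, H/2, W/2)
```

Alternatively, disabling fp16 in the config, or wrapping the backbone's forward with mmcv's force_fp32 decorator, should avoid the mismatch at some cost in speed and memory.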
Thank you for your great work! I am interested in the wavelet transform in Wave-ViT.
In my opinion, when the wavelet transform is applied directly to an image, we can obtain high and low frequency components. However, when we perform wavelet transform on high-dimensional features (which may mainly contain semantic features), do the output results still correspond to high-frequency components and low-frequency components?
I wonder how you understand the application of the wavelet transform in feature space, or could you recommend some references on using wavelet transforms in feature space? I would be very grateful if you could help resolve my doubts. Thanks!
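One way to see it: the DWT is applied channel-wise, so "low frequency" and "high frequency" refer to the spatial smoothness of each activation map, not to semantics; the LL subband keeps slowly varying responses while LH/HL/HH isolate sharp spatial changes, whatever the channel encodes. A minimal illustrative sketch (Haar wavelet, written from scratch rather than taken from the repo):

```python
import torch
import torch.nn.functional as F

def haar_dwt(feat):
    """Channel-wise single-level Haar DWT of a feature map.

    Each channel is treated as an independent 2D signal, so the
    low/high split describes spatial variation of that channel's
    activations. Returns (LL, LH, HL, HH), each (B, C, H/2, W/2).
    """
    b, c, h, w = feat.shape
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
    hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1)  # (4, 1, 2, 2)
    kernels = kernels.repeat(c, 1, 1, 1).to(feat.dtype)   # (4c, 1, 2, 2)
    out = F.conv2d(feat, kernels, stride=2, groups=c)     # (b, 4c, h/2, w/2)
    return out.view(b, c, 4, h // 2, w // 2).unbind(2)
```

On a spatially constant feature map the three high-frequency subbands vanish, which makes the low/high decomposition concrete even when the channels carry semantic responses.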
How can I load the checkpoint from the pretrained weights provided on the Baidu cloud disk? Thank you very much!
Thank you for the wonderful work!
I am confused about the training resolution of Dual-ViT. From the code, the training resolution is 224 (see https://github.com/YehLi/ImageNetModel/blob/main/classification/main.py#L59), but the paper says the resolution is 254. Is that a typo in the paper?
Thank you so much.
Please share the code to compute the attention map using Score-CAM in Figure 4 of "Wave-ViT: Unifying Wavelets and Transformers for Visual Representation Learning".
Thanks.
Nice work!
Is there any potential to use Wave-ViT on video-related tasks?
I notice that the Multiscale Vision Transformer (MViT) directly applies its pooling attention to video with dimensions TxHxW for video recognition tasks.
Considering that Wave-ViT only adopts the wavelet transform in the spatial domain, I think the current version is not suitable for video tasks, which need time-domain modeling.
What is your opinion on this?
Your research is excellent.
I would like to download the Wave-ViT pre-training model, but I don't have a Baidu account, so could you please share it with me on Google Drive?
Hello, I want to ask: what is label_top5_train_nfnet?
How can I combine the Wave-ViT model with a CNN like YOLOv5? I am interested in this work and want to try this approach on infrared images with YOLOv5. Can you give me some advice on how to proceed?