jaminfong / DenseNAS
Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR 2020)
Home Page: https://arxiv.org/abs/1906.09607
License: Apache License 2.0
Hi, I have some questions about the code. In the search.py script there are calls like super_model.module.display_arch_params():
betas, head_alphas, stack_alphas = super_model.module.display_arch_params()
Where are these functions defined? I could not find a corresponding API among the nn.Module methods.
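For context, this is likely not an nn.Module API but a method defined on DenseNAS's own supernet class; the .module attribute appears because the model is wrapped in nn.DataParallel (or DistributedDataParallel), which exposes the wrapped model through .module. A minimal sketch of that pattern (the SuperNet class and the body of display_arch_params here are hypothetical, not the repo's actual code):

```python
import torch
import torch.nn as nn

class SuperNet(nn.Module):
    """Toy stand-in for the DenseNAS supernet; the real method lives on its model class."""
    def __init__(self):
        super().__init__()
        # hypothetical architecture parameters, mirroring the names in the question
        self.betas = nn.Parameter(torch.zeros(3))
        self.head_alphas = nn.Parameter(torch.zeros(3))
        self.stack_alphas = nn.Parameter(torch.zeros(3))

    def display_arch_params(self):
        # return normalized architecture weights, e.g. for logging
        return (torch.softmax(self.betas, -1),
                torch.softmax(self.head_alphas, -1),
                torch.softmax(self.stack_alphas, -1))

super_model = nn.DataParallel(SuperNet())
# DataParallel wraps the model, so custom methods are reached through .module
betas, head_alphas, stack_alphas = super_model.module.display_arch_params()
```

The method itself would be found on the model class in the repo's models package rather than in PyTorch.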
Hi, thanks for your great work.
Could you please provide the sub loss factors corresponding to DenseNAS-R1 and DenseNAS-R2, respectively, for the chained cost estimation? I cannot find them in the paper, and the default value in the code is 0.2.
Thank you.
Hi, a question: why does the Network class in search_space_mbv2 not define a forward function? Without a forward function, how are the different blocks of the network chained together?
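For reference, one common pattern in NAS supernets (a guess at what such code might do, not a statement about DenseNAS's actual implementation) is to hold the blocks in an nn.ModuleList and chain them in a forward defined on a subclass or a companion class:

```python
import torch
import torch.nn as nn

class ChainedNetwork(nn.Module):
    """Hypothetical illustration: blocks stored in a ModuleList and chained in forward."""
    def __init__(self, dims):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)
        )

    def forward(self, x):
        # each block feeds the next; a searchable supernet would additionally
        # mix candidate paths here using architecture weights
        for block in self.blocks:
            x = block(x)
        return x

net = ChainedNetwork([8, 16, 4])
out = net(torch.randn(2, 8))
```

A base class can also define only the blocks and leave forward to a subclass (e.g. the searchable vs. derived network), which would explain a Network class with no forward of its own.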
Hi, another question: the stages in DenseNAS do not seem to correspond to the stages in MobileNetV2?
MobileNetV2 has 7 stages and 19 residual blocks, which differs from what Figure 4 in the paper shows.
Does "MobileNetV2-based" in your paper mean that the operations in the network include MBConv? If so, how is the number of stages in DenseNAS determined?
Hi, thanks for your interesting work.
How do you organize your training data? In the folder imagenet/train there are sub-folders like nxxx. But when using mk_split_img_list.py to split the training data, it may skip the folders as follows:
if not os.path.isdir(split_path):
    continue
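For reference, that guard typically just skips non-directory entries while iterating the dataset root; a minimal sketch of how such a split-list script might collect the class folders (function name and paths here are illustrative, not the script's actual code):

```python
import os

def list_class_folders(image_path):
    """Collect per-class sub-folders (e.g. ImageNet synset dirs like n01440764)."""
    classes = []
    for entry in sorted(os.listdir(image_path)):
        split_path = os.path.join(image_path, entry)
        if not os.path.isdir(split_path):
            # skip stray files (archives, list files, etc.) next to the class dirs
            continue
        classes.append(entry)
    return classes
```

If whole class folders are being skipped, it is worth checking that they really are directories directly under imagenet/train and not, say, symlinks or files at that level.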
Hi!
I am interested in the shape-alignment layer, and I would like to know if you could share a code snippet for this layer. This part is unfortunately not detailed in the paper.
Thank you very much!
Hello,
Do you use Gumbel-Softmax?
I wonder why you don't add random noise in the Gumbel-Softmax, and why the temperature doesn't change during the whole training process.
Thanks.
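For context, the standard Gumbel-Softmax recipe the question refers to adds Gumbel noise to the logits and usually anneals the temperature over training; a minimal sketch of that common recipe (not DenseNAS's actual choice, and the schedule constants are made up):

```python
import torch
import torch.nn.functional as F

def sample_gumbel_softmax(logits, tau):
    # standard Gumbel-Softmax: add Gumbel(0, 1) noise, then temperature-scaled softmax
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / tau, dim=-1)

logits = torch.zeros(4)
# a common schedule: exponentially anneal tau from 5.0 down toward 0.5
for epoch in range(3):
    tau = max(0.5, 5.0 * (0.9 ** epoch))
    weights = sample_gumbel_softmax(logits, tau)
```

PyTorch also ships this as torch.nn.functional.gumbel_softmax(logits, tau=..., hard=...); omitting the noise and keeping tau fixed reduces it to a plain temperature softmax.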
In another issue you mentioned "The code of the search space is released." How do I start the search process? Do you have a script similar to run_apis.retrain?
Btw, I am the same guy who posted a question on another work of yours, FNA. :-)
Could you please open-source the whole training process? I think that would be more valuable than this fixed model. Thank you for your time and consideration.
Hi, if I want to retrain my own model, how do I generate the net_config file?
Thanks.
Hi, I am currently trying to reproduce the search process for the default configuration. Could you please provide the following information:
Hi,
Thanks for open-sourcing this work.
I was wondering how to search for a DenseNAS model on CIFAR-10 based on ResNet. I have resolved the data loading issue, but I was struggling to get the specification of the search space correct. Because the input size of CIFAR-10 is 32*32, I think I need to modify net_scale, init_dim, and last_dim in imagenet_search_cfg_resnet.yaml. So I removed seven entries in each list in net_scale, but I encountered this error:
Original Traceback (most recent call last):
File "/usr/local/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/usr/local/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/xx/code/DenseNAS/models/dropped_model.py", line 155, in forward
betas = [branch_weights[block_id][beta_id]
File "/home/xx/code/DenseNAS/models/dropped_model.py", line 155, in <listcomp>
betas = [branch_weights[block_id][beta_id]
IndexError: list index out of range
I would greatly appreciate it if you could tell me how to resolve this issue and perform the search on CIFAR-10.
Hi, I really appreciate your work. There are two points I don't quite understand:
1. The paper says there are two sets of parameters to update: one in the head layers for choosing which preceding block's channels to receive input from, and one in the stack layers for choosing operations. But the code has three parameters: betas, head_alphas, and stack_alphas. The extra one concerns which blocks the current block's output can go to. What is the purpose of this setting?
2. The dataset is split into train and valid parts. The weight-update step uses the train data and the architecture-update step uses the valid data, but inference also uses the valid data. Hasn't that data already been learned by the network? Shouldn't inference use other data?
These are the two things I cannot figure out; I would appreciate your help, and please correct me wherever my understanding is wrong.
Hi!
Thanks for sharing your great work! I have some questions:
How do I execute your code, and in what order should the scripts be run? If I use your code, which paper should be cited?
Thank you very much!
Best regards,
Liu Jiaqi
Hi, a question: after finishing the search stage I am preparing to run retrain. Is the net_config file used at that point generated automatically in the first stage? My output directory only contains files like excel_record and weights_*.pt, and I cannot find a net_config file. Did something go wrong in my search stage?
Hello, brother from HUST. Knowing you are **, I'll just write in Chinese. I have read the source code, and it seems it only trains architectures that have already been searched, right? Could you share the source code of the search process with me?
Hello, thanks for your work.
When I test your code, I found a problem that confuses me a lot.
When I run the search-related scripts, I found that the super_model tends to choose the small kernel & ratio operations in the end, in class 'ArchGenerate'. I wonder if I did something wrong during my test. Here are some parts of the log files that demonstrate the problem.
####################
11/19 11:16:15 Derived arch:
[[16, 16], 'mbconv_k3_t1', [], 0, 1]|
[[16, 24], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k7_t6'], 2, 2]|
[[24, 48], 'mbconv_k7_t6', ['mbconv_k7_t6'], 1, 2]|
[[48, 72], 'mbconv_k5_t6', ['mbconv_k5_t3', 'mbconv_k7_t6', 'mbconv_k7_t6'], 3, 2]|
[[72, 128], 'mbconv_k3_t6', ['mbconv_k7_t6', 'mbconv_k7_t3'], 2, 1]|
[[128, 192], 'mbconv_k7_t6', ['mbconv_k5_t3', 'mbconv_k7_t6', 'mbconv_k7_t6'], 3, 2]|
[[192, 384], 'mbconv_k5_t3', [], 0, 1]|
[[384, 1984], 'conv1_1']
11/19 11:16:15 Total 19 layers.
11/19 11:16:15 Derived Model Mult-Adds = 424.00MB
11/19 11:16:15 Derived Model Num Params = 5.42MB
11/19 11:16:16 epoch 6 weight_lr 1.992126e-01
#######################
It seems normal before the architecture updates begin (before epoch 50), and tends to choose the smallest kernel & ratio after epoch 50.
#######################
11/20 02:14:01 Derived arch:
[[16, 16], 'mbconv_k3_t1', [], 0, 1]|
[[16, 24], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[24, 48], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[48, 72], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[72, 128], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 1]|
[[128, 192], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[192, 384], 'mbconv_k3_t3', [], 0, 1]|
[[384, 1984], 'conv1_1']
11/20 02:14:01 Total 23 layers.
11/20 02:14:01 Derived Model Mult-Adds = 306.19MB
11/20 02:14:01 Derived Model Num Params = 4.46MB
11/20 02:14:02 epoch 54 arch_lr 3.000000e-04
########################
And it almost doesn't change until the end.
I wonder how this phenomenon happens. Hoping for your reply!
Hi Jamin,
Thanks for releasing search code.
I ran the search process with my custom latency table. After the search finished, it printed three network configs with 465M, 437M, and 463M MACs. How do I generate lower-latency models like the ones you got for DenseNAS-A, B, C (13 ms, 15 ms, 17 ms)? My objective is to get the best model for a specific latency.
Also, does this work have any notion of a target latency, as in ProxylessNAS, where the loss contains a penalty term for exceeding the target latency? That ensured that when the search process ended, the searched model's latency was close to the target:
loss = Loss_CE + λ1·‖w‖² + λ2·E[latency]
Thanks in advance.
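For reference, the ProxylessNAS-style objective in that formula can be sketched as follows: the expected latency is made differentiable by weighting a per-op latency table with the architecture probabilities. All names and constants here are illustrative, not DenseNAS's actual API:

```python
import torch
import torch.nn.functional as F

def latency_regularized_loss(logits, targets, arch_logits, op_latency,
                             weight_sq_norm, lambda1=1e-4, lambda2=0.1):
    """ProxylessNAS-style objective: CE + L2 weight decay + expected-latency penalty."""
    ce = F.cross_entropy(logits, targets)
    # expected latency: architecture probabilities weighted by a per-op latency table
    probs = F.softmax(arch_logits, dim=-1)
    expected_latency = (probs * op_latency).sum()
    return ce + lambda1 * weight_sq_norm + lambda2 * expected_latency

# toy usage
logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
arch_logits = torch.zeros(2, 3)                # 2 layers, 3 candidate ops each
op_latency = torch.tensor([[1.0, 2.0, 3.0],    # ms per op, per layer (made-up numbers)
                           [1.5, 2.5, 3.5]])
loss = latency_regularized_loss(logits, targets, arch_logits, op_latency,
                                weight_sq_norm=torch.tensor(0.0))
```

Raising lambda2 steers the search toward faster architectures; a target latency can be expressed by penalizing only the excess, e.g. relu(E[latency] - target).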
Hi, a question: DenseNAS uses only 100 classes of data in the search stage but all 1000 classes in the retrain stage. Is the reason for this setting to make the search stage run quickly? Is my understanding correct?
Hi, sorry to bother you, I just have a question:
how do I deal with the "list index out of range" problem in the second part of the search and in the MobileNetV2 search part?
Hello,
Currently, only the code to retrain or validate is provided in this repository.
Will the NAS code be released soon?
Best Regards,
Atul
In traditional DARTS methods, weights and architecture are optimized step by step, but in your paper weights and architecture are optimized by epoch. Does that mean the model optimizes the weights in one epoch and the architecture in the next epoch? I am quite confused.
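For context, the two alternation granularities the question contrasts can be sketched like this (purely illustrative control flow, not the repo's actual training loop; the warm-up boundary arch_start is a made-up parameter):

```python
def train_step_level(num_steps, weight_step, arch_step):
    """DARTS-style: alternate weight and architecture updates every step."""
    for step in range(num_steps):
        weight_step(step)  # one batch from the train split
        arch_step(step)    # one batch from the valid split

def train_epoch_level(num_epochs, weight_epoch, arch_epoch, arch_start=1):
    """Epoch-level alternation: a whole epoch of weight updates, then a whole
    epoch of architecture updates (often only after a warm-up period)."""
    for epoch in range(num_epochs):
        weight_epoch(epoch)
        if epoch >= arch_start:
            arch_epoch(epoch)

calls = []
train_epoch_level(3,
                  lambda e: calls.append(("w", e)),
                  lambda e: calls.append(("a", e)),
                  arch_start=1)
```

Under the epoch-level scheme both kinds of parameters are still updated repeatedly over training; the difference is only how coarsely the two phases are interleaved.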
Thanks for releasing such great research work!
I ran into difficulties while using your modules, and I would like to ask about the Search section:
1. Prepare the image set for search which contains 100 classes of the original ImageNet dataset. And 20% images are used as the validation set and 80% are used as the training set.
1). Generate the split list of the image data.
python dataset/mk_split_img_list.py --image_path 'the path of your ImageNet data' --output_path 'the path to output the list file'
2). Use the image list obtained above to make the lmdb file.
python dataset/img2lmdb.py --image_path 'the path of your ImageNet data' --list_path 'the path of your image list generated above' --output_path 'the path to output the lmdb file' --split 'split folder (train/val)'
...
Could you provide the detailed path configuration and the structure and contents of each folder? I downloaded the datasets from ImageNet, but after trying various setups, none of them succeeded.
Thanks.