
densenas's People

Contributors

jaminfong

densenas's Issues

some questions about the code

Hello, I have some questions about the code. In the search.py script there are calls like super_model.module.display_arch_params():
betas, head_alphas, stack_alphas = super_model.module.display_arch_params()
Where are these functions defined? I could not find a corresponding API among the nn.Module methods.
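In PyTorch, `nn.DataParallel` wraps a model and exposes the original object through its `.module` attribute, so `display_arch_params()` is a custom method defined on the DenseNAS supernet class itself, not part of the `nn.Module` API. A torch-free sketch of the pattern (class names below are simplified stand-ins, not the repo's actual classes):

```python
# nn.DataParallel stores the wrapped user model as `.module`, so any
# custom method must be reached via super_model.module.<method>().

class SuperModel:  # stand-in for the DenseNAS supernet class
    def display_arch_params(self):
        # the real code would return the architecture parameter tensors
        return "betas", "head_alphas", "stack_alphas"

class DataParallelWrapper:  # stand-in for torch.nn.DataParallel
    def __init__(self, module):
        self.module = module  # the wrapped user model

super_model = DataParallelWrapper(SuperModel())
betas, head_alphas, stack_alphas = super_model.module.display_arch_params()
```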

Sub loss factor in ResNet-based search space

Hi, thanks for your great work.

Could you please provide the sub-loss factors corresponding to DenseNAS-R1 and DenseNAS-R2 respectively for the chained cost estimation? I cannot find them in the paper, and the default value in the code is 0.2.

Thank you.

some questions about search_space_mbv2.py

Hello, I have a question: why does the Network class in search_space_mbv2 not define a forward function? Without a forward function, how are the different blocks in the network chained together?
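For context, a common pattern when a container class defines no `forward` is that the blocks are held in a list and chained by whichever model class actually consumes them; a minimal torch-free sketch of such chaining (all names illustrative, not the repo's actual classes):

```python
# Two toy "blocks" standing in for MBConv blocks.
def block_a(x):
    return x + 1

def block_b(x):
    return x * 2

class Network:
    """Holds an ordered list of blocks; a consumer chains them in order."""
    def __init__(self, blocks):
        self.blocks = blocks

    def forward(self, x):
        # feed each block's output into the next block
        for blk in self.blocks:
            x = blk(x)
        return x

net = Network([block_a, block_b])
```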

A question about the stages in DenseNAS

Hello, I have a question: the stages in DenseNAS do not seem to correspond to the stages in MobileNetV2?
MobileNetV2 has 7 stages and 19 residual blocks, which is different from what Figure 4 in the paper shows.

Does "MobileNetV2-based" in your paper mean that the operations in the network structure include MBConv? If so, how is the number of stages in DenseNAS determined?

Preparation of the dataset

Hi, thanks for your interesting work.
How do you organize your training data? In the folder imagenet/train there are sub-folders like nxxx. But when using mk_split_img_list.py to split the training data, it may skip folders, as follows:

if not os.path.isdir(split_path):
    continue
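For reference, the `isdir` check above simply skips stray non-directory entries (e.g. leftover archive files) while walking the class folders; a self-contained sketch of the surrounding loop (the function name is hypothetical, not the script's actual code):

```python
import os

def list_class_dirs(image_path):
    """Collect only the class sub-folders (e.g. n01440764) under
    imagenet/train, skipping stray files such as tar archives."""
    class_dirs = []
    for name in sorted(os.listdir(image_path)):
        split_path = os.path.join(image_path, name)
        if not os.path.isdir(split_path):
            continue  # skip non-directory entries
        class_dirs.append(name)
    return class_dirs
```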

Shape alignment code

Hi!
I am interested in the shape-alignment layer, and I would like to know if you could share a snippet of code for this layer. This part is unfortunately not detailed in the paper.
Thank you very much!

No random noise in Gumbel-softmax?

Hello,
do you use Gumbel-softmax?
I wonder why you don't add random noise in the Gumbel-softmax, and why the temperature doesn't change during the whole training process?
Thanks.
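For reference, the standard Gumbel-softmax adds Gumbel(0,1) noise to each logit and divides by a temperature `tau`, which is usually annealed toward 0 during training; a minimal NumPy sketch (not the repo's implementation):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample a relaxed one-hot vector: softmax((logits + Gumbel noise) / tau).
    Lower tau makes the output closer to a hard one-hot choice."""
    rng = rng or np.random.default_rng(0)
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))          # Gumbel(0, 1) noise
    z = (logits + g) / tau
    z = z - z.max()                  # numerical stability
    e = np.exp(z)
    return e / e.sum()
```

Omitting the noise term reduces this to a plain temperature-scaled softmax over the logits, which is a deterministic relaxation rather than a Gumbel-softmax sample.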

Script for search

In another issue you mentioned "The code of the search space is released." How do I start the search process? Do you have a script similar to run_apis.retrain?

Btw, I am the same guy who posted a question on your other work, FNA. :-)

net_config

Hello, if I want to retrain my own model, how should the net_config file be generated?

Thanks.

Some questions about the search stage

Hi, I am currently trying to reproduce the search process for the default configuration. Could you please provide the following information:

  1. The names of the 100 classes and, if possible, the full names of the files used in the train and validation splits, since obtaining them in the code is platform-dependent
  2. The specific version of PyTorch used, since version 1.6.0 has a bug related to DataParallel and the replication of architecture parameters to other GPUs (for example, pytorch/pytorch#36035). Because of this, the search stage does not work in a multi-GPU environment
  3. Log files (log.txt) for the search and train stages with the default config file (if I understand correctly, it matches the DenseNAS-C model)

Search on CIFAR10

Hi,

Thanks for open-sourcing this work.
I was wondering how to search for a DenseNAS model on CIFAR10 based on ResNet. I have resolved the data-loading issue, but for the specification of the search space I was struggling to get the configuration right. Because the input size of CIFAR10 is 32×32, I think I need to modify net_scale, init_dim, and last_dim in imagenet_search_cfg_resnet.yaml. So I removed seven entries from each list in net_scale, but I encountered this error:

Original Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xx/code/DenseNAS/models/dropped_model.py", line 155, in forward
    betas = [branch_weights[block_id][beta_id]
  File "/home/xx/code/DenseNAS/models/dropped_model.py", line 155, in <listcomp>
    betas = [branch_weights[block_id][beta_id]
IndexError: list index out of range

I would greatly appreciate it if you could tell me how to resolve this issue and perform the search on CIFAR10.
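The `IndexError` suggests that after the edit the per-stage lists in the config no longer agree in length, so an index valid for one list runs past the end of another. A quick sanity check one could run on the parsed `net_scale` dict (field names are hypothetical examples):

```python
def check_net_scale(net_scale):
    """Assert that every list in net_scale has the same length, since each
    index is assumed to correspond to one stage of the search space."""
    lengths = {key: len(values) for key, values in net_scale.items()}
    assert len(set(lengths.values())) == 1, (
        f"net_scale lists have inconsistent lengths: {lengths}")
```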

Some questions about the code

Hello, I really appreciate your work. There are two things I don't quite understand and would like to ask about.
1. The paper says there are two sets of parameters to update: one in the head layers for choosing which preceding block to take input from, and one for the choice of operations in the stack layers. But in the code there are three parameters: betas, head_alphas, stack_alphas. The extra one concerns which blocks the current block's output can go to. What is the purpose of this setting?
2. The dataset is split into train and valid parts. The weight-update step uses the train data and the arch-update step uses the valid data, but inference also uses the valid data. Hasn't that data already been seen by the network? Shouldn't inference be done on other data?
These are the two questions I cannot figure out; please correct me wherever my understanding is wrong.

How to execute your code?

Hi!
Thanks for sharing your great work! I have some questions to ask you.

How do I execute your code? In what order should the scripts be run? And if I use your code, which paper should be cited?

Thank you very much!
Best regards,
Liu Jiaqi

net_config question at the retrain stage

Hello, I have a question: after finishing the search stage I am preparing to run retrain. Is the net_config file used at run time generated automatically in the first stage? My output directory only contains files like excel_record and weights_*.pt, and I cannot find a net_config file. Did something go wrong in my search stage?

About source code.

Hello, fellow from HUST. Since I know you are **, I'll use Chinese. I read the source code, and it seems this release only trains architectures that have already been searched, is that right? Could you share the source code for the search process with me?

architecture derivation problem

Hello, thanks for your work.
When I tested your code, I found a problem that confused me a lot.
When I ran the search-related scripts, I noticed that the super_model tends to choose the small kernel & ratio operations in the end, in the class 'ArchGenerate'. I wonder if I did something wrong during my test. Here are some parts of the log files that demonstrate the problem.

####################
11/19 11:16:15 Derived arch:
[[16, 16], 'mbconv_k3_t1', [], 0, 1]|
[[16, 24], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k7_t6'], 2, 2]|
[[24, 48], 'mbconv_k7_t6', ['mbconv_k7_t6'], 1, 2]|
[[48, 72], 'mbconv_k5_t6', ['mbconv_k5_t3', 'mbconv_k7_t6', 'mbconv_k7_t6'], 3, 2]|
[[72, 128], 'mbconv_k3_t6', ['mbconv_k7_t6', 'mbconv_k7_t3'], 2, 1]|
[[128, 192], 'mbconv_k7_t6', ['mbconv_k5_t3', 'mbconv_k7_t6', 'mbconv_k7_t6'], 3, 2]|
[[192, 384], 'mbconv_k5_t3', [], 0, 1]|
[[384, 1984], 'conv1_1']
11/19 11:16:15 Total 19 layers.
11/19 11:16:15 Derived Model Mult-Adds = 424.00MB
11/19 11:16:15 Derived Model Num Params = 5.42MB
11/19 11:16:16 epoch 6 weight_lr 1.992126e-01
#######################

It seems normal before the architecture updates begin (before epoch 50),
and it tends to choose the smallest kernel & ratio after epoch 50.

#######################
11/20 02:14:01 Derived arch:
[[16, 16], 'mbconv_k3_t1', [], 0, 1]|
[[16, 24], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[24, 48], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[48, 72], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[72, 128], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 1]|
[[128, 192], 'mbconv_k3_t3', ['mbconv_k3_t3', 'mbconv_k3_t3', 'mbconv_k3_t3'], 3, 2]|
[[192, 384], 'mbconv_k3_t3', [], 0, 1]|
[[384, 1984], 'conv1_1']
11/20 02:14:01 Total 23 layers.
11/20 02:14:01 Derived Model Mult-Adds = 306.19MB
11/20 02:14:01 Derived Model Num Params = 4.46MB
11/20 02:14:02 epoch 54 arch_lr 3.000000e-04
########################

And it almost doesn't change until the end.
I wonder why this phenomenon happens. Hoping for your reply!

How to get models with different latencies using search process?

Hi Jamin,

Thanks for releasing the search code.
I ran the search process with my custom latency table. After the search process finished, it printed three network configs with 465M, 437M and 463M MACs. How do I generate lower-latency models like the ones you obtained for DenseNAS-A, B, C (13 ms, 15 ms, 17 ms)? My objective is to get the best model for a specific latency.

Also, does this work have any notion of a target latency, as in ProxylessNAS, where the loss contains a penalty for latencies higher than the target? That ensures that when the search process ends, the searched model's latency is close to the target latency:
loss = L_CE + λ1·‖w‖² + λ2·E[latency]
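For reference, the expected-latency term in a ProxylessNAS-style loss is typically computed by taking the softmax over the architecture logits and weighting per-op latencies from a lookup table; a minimal NumPy sketch of that term only (not DenseNAS's actual cost term, names illustrative):

```python
import numpy as np

def expected_latency(arch_logits, latency_table):
    """Expected latency of one choice point: softmax probabilities over
    candidate ops, dotted with their measured latencies (ms)."""
    e = np.exp(arch_logits - arch_logits.max())  # stable softmax
    probs = e / e.sum()
    return float(probs @ latency_table)
```

The full loss would then add this term, scaled by λ2, to the cross-entropy and weight-decay terms, steering the search toward cheaper ops.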

Thanks in advance.

About Search

Hello, I have a question: DenseNAS uses only 100 classes of data in the search stage but all 1000 classes in the retrain stage. Is the reason for this setting that you want the search stage to run quickly? Am I understanding this correctly?

index out of range

Hi, sorry to bother you, I just have a question:
how do I deal with the "list index out of range" problem in the second part of the search and in the MobileNetV2 search part?

Could you please tell me something about your strategy

In traditional DARTS methods, the weights and the architecture are optimized step by step, but in your paper they are optimized epoch by epoch. Does this mean that in one epoch the model optimizes the weights and in the next epoch it optimizes the architecture? I am quite confused.
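To make the question concrete, here is a schematic contrast between step-wise (DARTS-style) and epoch-wise alternation; the epoch-wise schedule shown (weights-only warm-up, then alternating epochs) is a hypothetical reading of the paper, not the repo's actual training loop:

```python
def darts_style_schedule(num_steps):
    """DARTS-style: one weight update and one arch update in every step."""
    return [("weights", "arch") for _ in range(num_steps)]

def epoch_wise_schedule(num_epochs, warmup=2):
    """Hypothetical epoch-wise scheme: weights-only warm-up epochs,
    then whole epochs alternate between weight and arch updates."""
    phases = []
    for epoch in range(num_epochs):
        if epoch < warmup or epoch % 2 == 0:
            phases.append("weights")
        else:
            phases.append("arch")
    return phases
```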

What is the correct path configuration?

Thank you for releasing such excellent work!

I ran into difficulties while using your modules, so I would like to ask about the Search section:

1. Prepare the image set for search which contains 100 classes of the original ImageNet dataset. And 20% images are used as the validation set and 80% are used as the training set.

1). Generate the split list of the image data.
python dataset/mk_split_img_list.py --image_path 'the path of your ImageNet data' --output_path 'the path to output the list file'

2). Use the image list obtained above to make the lmdb file.
python dataset/img2lmdb.py --image_path 'the path of your ImageNet data' --list_path 'the path of your image list generated above' --output_path 'the path to output the lmdb file' --split 'split folder (train/val)'
...

Could you give the exact path configuration and the structure and contents of each folder?

I downloaded the datasets from ImageNet and tried various configurations, but none of them succeeded.

Thanks.
