Giter VIP home page Giter VIP logo

enas's Introduction

Efficient Neural Architecture Search via Parameter Sharing

Authors' implementation of "Efficient Neural Architecture Search via Parameter Sharing" (2018) in TensorFlow.

Includes code for CIFAR-10 image classification and Penn Tree Bank language modeling tasks.

Paper: https://arxiv.org/abs/1802.03268

Authors: Hieu Pham*, Melody Y. Guan*, Barret Zoph, Quoc V. Le, Jeff Dean

This is not an official Google product.

Penn Treebank

The Penn Treebank dataset is included at data/ptb. Depending on the system, you may want to run the script data/ptb/process.py to create the pkl version. All hyper-parameters are specified in these scripts.

To run the ENAS search process on Penn Treebank, please use the script

./scripts/ptb_search.sh

To run ENAS with a determined architecture, you have to specify the archiecture using a string. The following is an example script for using the architecture we described in our paper.

./scripts/ptb_final.sh

A sequence of architecture for a cell with N nodes can be specified using a sequence a of 2N + 1 tokens

  • a[0] is a number in [0, 1, 2, 3], specifying the activation function to use at the first cell: tanh, ReLU, identity, and sigmoid.
  • For each i, a[2*i] specifies a previous index and a[2*i+1] specifies the activation function at the i-th cell.

For a concrete example, the following sequence specifies the architecture we visualize in our paper

0 0 0 1 1 2 1 2 0 2 0 5 1 1 0 6 1 8 1 8 1 8 1

CIFAR-10

To run the experiments on CIFAR-10, please first download the dataset. Again, all hyper-parameters are specified in the scripts that we descibe below.

To run the ENAS experiments on the macro search space as described in our paper, please use the following scripts:

./scripts/cifar10_macro_search.sh
./scripts/cifar10_macro_final.sh

A macro architecture for a neural network with N layers consists of N parts, indexed by 1, 2, 3, ..., N. Part i consists of:

  • A number in [0, 1, 2, 3, 4, 5] that specifies the operation at layer i-th, corresponding to conv_3x3, separable_conv_3x3, conv_5x5, separable_conv_5x5, average_pooling, max_pooling.
  • A sequence of i - 1 numbers, each is either 0 or 1, indicating whether a skip connection should be formed from a the corresponding past layer to the current layer.

A concrete example can be found in our script ./scripts/cifar10_macro_final.sh.

To run the ENAS experiments on the micro search space as described in our paper, please use the following scripts:

./scripts/cifar10_micro_search.sh
./scripts/cifar10_micro_final.sh

A micro cell with B + 2 blocks can be specified using B blocks, corresponding to blocks numbered 2, 3, ..., B+1, each block consists of 4 numbers

index_1, op_1, index_2, op_2

Here, index_1 and index_2 can be any previous index. op_1 and op_2 can be [0, 1, 2, 3, 4], corresponding to separable_conv_3x3, separable_conv_5x5, average_pooling, max_pooling, identity.

A micro architecture can be specified by two sequences of cells concatenated after each other, as shown in our script ./scripts/cifar10_micro_final.sh

Citations

If you happen to use our work, please consider citing our paper.

@article{enas,
  title   = {Efficient Neural Architecture Search via Parameter Sharing},
  author  = {Pham, Hieu and
             Guan, Melody Y. and
             Zoph, Barret and
             Le, Quoc V. and
             Dean, Jeff
  },
  journal   = {Arxiv, 1802.03268},
  year      = {2018}
}

enas's People

Contributors

hyhieu avatar melodyguan avatar

Watchers

James Cloos avatar Xiaoteng Zhang avatar paper2code - bot avatar

Forkers

starstylesky

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.