negrinho / deep_architect

A general, modular, and programmable architecture search framework

Home Page: https://deep-architect.readthedocs.io/en/latest/

License: MIT License

Languages: Python 98.05%, Shell 1.95%

Topics: architecture-search, neural-architecture-search, neural-networks, hyperparameter-optimization, auto-ml, machine-learning, automatic-machine-learning, deep-neural-networks, deep-learning, colab

deep_architect's Introduction

Overview

[CODE] [DOCUMENTATION] [PAPER] [BLOG POST] [GOOGLE GROUP] [COLAB]

DeepArchitect: Architecture search so easy you'll think it's magic!

Check the Colab to play around with the framework and run the examples.

Introduction

DeepArchitect is a framework for automatically searching over computational graphs in arbitrary domains, designed with a focus on modularity, ease of use, reusability, and extensibility. DeepArchitect has the following main components:

  • a language for writing composable and expressive search spaces over computational graphs in arbitrary domains (e.g., TensorFlow, Keras, PyTorch, and even non-deep-learning frameworks such as scikit-learn and preprocessing pipelines);
  • search algorithms that can be used for arbitrary search spaces;
  • logging functionality to easily track search results;
  • visualization functionality to explore search results.

For researchers, DeepArchitect aims to make architecture search research more reusable and reproducible by providing a modular framework that they can use to implement new search algorithms and new search spaces while reusing code. For practitioners, DeepArchitect aims to augment their workflow by providing a tool to easily write search spaces encoding a large number of design choices and to use search algorithms to automatically find good architectures.

Installation

We recommend playing with the code on Colab first.

For a local installation, run the following code snippet:

git clone git@github.com:negrinho/deep_architect.git deep_architect
cd deep_architect
conda create --name deep_architect python=3.6
conda activate deep_architect
pip install -e .

Run one of the examples to check that the installation works, e.g., python examples/framework_starters/main_keras.py or python examples/mnist_with_logging/main.py --config_filepath examples/mnist_with_logging/configs/debug.json.

We have included utils.sh with useful development functionality, e.g., to build documentation, extract code snippets from documentation, and build Singularity containers.

A minimal DeepArchitect example with Keras

We adapt this Keras example by defining a search space of models and sampling a random model from it. The original example has a single fixed three-layer neural network with ReLU activations in the hidden layers and dropout with rate 0.2. We construct a search space by relaxing the number of layers the network can have, the choice between sigmoid and ReLU activations, and the number of units of each dense layer. Check the search space below:

import keras
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Dense, Dropout, Input
from keras.optimizers import RMSprop

import deep_architect.helpers.keras_support as hke
import deep_architect.modules as mo
import deep_architect.hyperparameters as hp
import deep_architect.core as co
import deep_architect.visualization as vi
from deep_architect.searchers.common import random_specify

batch_size = 128
num_classes = 10
epochs = 20

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# model = Sequential()
# model.add(Dense(512, activation='relu', input_shape=(784,)))
# model.add(Dropout(0.2))
# model.add(Dense(512, activation='relu'))
# model.add(Dropout(0.2))
# model.add(Dense(num_classes, activation='softmax'))

D = hp.Discrete


def dense(h_units, h_activation):
    return hke.siso_keras_module_from_keras_layer_fn(Dense, {
        'units': h_units,
        'activation': h_activation
    })


def dropout(h_rate):
    return hke.siso_keras_module_from_keras_layer_fn(Dropout, {'rate': h_rate})


def cell(h_units, h_activation, h_rate, h_opt_drop):
    return mo.siso_sequential([
        dense(h_units, h_activation),
        mo.siso_optional(lambda: dropout(h_rate), h_opt_drop)
    ])


def model_search_space():
    h_activation = D(['relu', 'sigmoid'])
    h_rate = D([0.0, 0.25, 0.5])
    h_num_repeats = D([1, 2, 4])
    return mo.siso_sequential([
        mo.siso_repeat(
            lambda: cell(
                D([256, 512, 1024]), h_activation, D([0.2, 0.5, 0.7]), D([0, 1])
            ), h_num_repeats),
        dense(D([num_classes]), D(['softmax']))
    ])


(inputs, outputs) = mo.SearchSpaceFactory(model_search_space).get_search_space()
random_specify(outputs)
inputs_val = Input((784,))
co.forward({inputs["in"]: inputs_val})
outputs_val = outputs["out"].val
vi.draw_graph(outputs, draw_module_hyperparameter_info=False)
model = Model(inputs=inputs_val, outputs=outputs_val)
model.summary()

model.compile(
    loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])

history = model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

This example shows how to introduce minimal architecture search capabilities into an existing Keras example. Our search space encodes that the network is composed of a sequence of 1, 2, or 4 cells, followed by a final dense module that outputs probabilities over classes. Each cell is a sub-search space (underlining the modularity and composability of DeepArchitect). The choice of activation for the dense layer in the cell search space is shared among all occurrences of the cell; all other hyperparameters of the cell search space are chosen independently for each occurrence of the cell in the sequence.
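The sharing behavior comes from object identity: passing the same hyperparameter object to several modules ties their choices together, while constructing a fresh object per module keeps the choices independent. In model_search_space above, h_activation is created once and captured by the lambda, whereas the unit and rate hyperparameters are created anew on each call. A minimal sketch of the same pattern, reusing the cell function above with illustrative values:

import deep_architect.hyperparameters as hp

# One shared activation hyperparameter: both cells receive the same value.
h_activation = hp.Discrete(['relu', 'sigmoid'])

# Fresh hyperparameter objects per cell: units, rates, and the optional
# dropout are chosen independently for each cell.
cell_a = cell(hp.Discrete([256, 512]), h_activation, hp.Discrete([0.2, 0.5]), hp.Discrete([0, 1]))
cell_b = cell(hp.Discrete([256, 512]), h_activation, hp.Discrete([0.2, 0.5]), hp.Discrete([0, 1]))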

The original single Keras model is commented out in the code above to emphasize how little code is needed to support a nontrivial search space. We encourage the reader to think about supporting the same search space using existing hyperparameter optimization tools or in an ad-hoc manner (e.g. how much code would be necessary to encode the search space and sample a random architecture from it).

The tutorials and examples cover additional aspects of DeepArchitect not shown in the code above. See examples/mnist_with_logging for a slightly more complex example using searchers and logging, and examples/framework_starters for minimal architecture search examples across deep learning frameworks. They should be straightforward to adapt to your use cases.

Framework components

The main concepts in DeepArchitect are:

  • Search spaces: Search spaces are constructed by arranging modules (both basic and substitution) and hyperparameters (independent and dependent). Modules are composed of inputs, outputs, and hyperparameters. A search space is passed around as a dictionary of inputs and a dictionary of outputs, allowing us to seamlessly deal with search spaces with multiple modules. Substitution modules rely heavily on delayed evaluation. Search space transitions result from value assignments to independent hyperparameters. Relevant code references to read for these ideas are deep_architect/core.py and deep_architect/modules.py.

  • Searchers: Searchers interact with search spaces through a simple API. A searcher samples a model from the search space by assigning values to independent hyperparameters, until no unassigned independent hyperparameters are left. A searcher is instantiated with a search space. The base searcher API has two methods: sample, which samples an architecture from the search space, and update, which takes the results for a sampled architecture and updates the state of the searcher. Examples of the searcher API can be found in deep_architect/searchers/common.py, deep_architect/searchers/random.py, and deep_architect/searchers/smbo.py. It is also worth looking at deep_architect/core.py for the traversal functionality used to iterate over the independent hyperparameters in the search space. A minimal usage sketch is given after this list.

  • Evaluators: Evaluators take a sampled architecture from the search space and compute performance metrics for it. Evaluators often have a single method named eval that takes an architecture and returns a dictionary with evaluation results. In the simplest case, there is a single performance metric of interest (e.g., validation accuracy). See here for an example implementation of an evaluator.

  • Logging: When we run an architecture search workload, we evaluate multiple architectures in the search space. We maintain a folder per evaluation to keep track of the generated results (e.g., validation accuracy, number of parameters, example predictions, and model checkpoints). Code for logging can be found in deep_architect/search_logging.py. A simple example using logging is found here.

  • Visualization: Visualization allows us to inspect the structure of a search space and to visualize search space transitions. These visualizations can be useful for debugging, e.g., checking whether a search space was correctly encoded. There are also visualizations to calibrate the evaluation effort needed to recover the correct performance ordering of architectures in the search space, e.g., how many epochs we need to invest to identify the best architecture (or one that lies in the top 5). Code for visualization can be found in deep_architect/visualization.py.
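To make the searcher and evaluator contracts concrete, here is a minimal sketch of a random search loop, reusing model_search_space from the Keras example above. RandomSearcher comes from deep_architect/searchers/random.py; evaluate is a hypothetical user-supplied evaluator, and the exact return signature of sample may differ slightly across versions:

import deep_architect.modules as mo
import deep_architect.searchers.random as se

def evaluate(inputs, outputs):
    # Hypothetical evaluator: compile and train the sampled architecture,
    # then return metrics, e.g., {'validation_accuracy': 0.97}.
    ...

searcher = se.RandomSearcher(
    mo.SearchSpaceFactory(model_search_space).get_search_space)
for _ in range(16):  # number of architectures to sample
    # sample assigns a value to every independent hyperparameter.
    inputs, outputs, hyperp_value_lst, searcher_eval_token = searcher.sample()
    results = evaluate(inputs, outputs)
    # update feeds results back to the searcher (a no-op for random search).
    searcher.update(results['validation_accuracy'], searcher_eval_token)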

Main folder structure

The most important source files live in the deep_architect folder. The tutorials cover much of the information needed to extend the framework. See below for a high-level tour of the repo.

  • core.py: Most important classes to define search spaces.
  • hyperparameters.py: Basic hyperparameters and auxiliary hyperparameter sharer class.
  • modules.py: Definition of substitution modules along with auxiliary functionality to connect modules or construct larger search spaces from simpler search spaces.
  • search_logging.py: Functionality to keep track of the results of the architecture search workload, allowing us to maintain structured folders for each search experiment.
  • utils.py: Utility functions not directly related to architecture search, but useful in many related contexts such as logging and visualization.
  • visualization.py: Simple visualizations to inspect search spaces as graphs or sequences of graphs.

There are also a few folders in the deep_architect folder.

  • contrib: Useful code that may or may not be maintained over time. Contributions by the community will live in this folder. See here for an in-depth explanation for the rationale behind the project organization and the contrib folder.
  • helpers: Helpers for the current frameworks that we support. This allows us to take the base functionality defined in core.py and expand it to provide compilation functionality for computational graphs across frameworks. It should be instructive to compare support for different frameworks. One file per framework.
  • searchers: Searchers that can be used for search spaces defined in DeepArchitect. One searcher per file.
  • surrogates: Surrogate functions over architectures in the search space, used frequently by searchers based on sequential model-based optimization (SMBO) in DeepArchitect.

Roadmap for the future

The community will have a fundamental role in extending DeepArchitect. For example, authors of existing architecture search algorithms can reimplement them in DeepArchitect, allowing the community to use them widely. This alone will allow progress on architecture search to be measured more reliably. New search spaces for new tasks can be implemented, allowing users to use them (either directly or in the construction of new search spaces) in their experiments. New evaluators and visualizations can be implemented.

Willing contributors should reach out and check the contributing guide. We expect to continue extending and maintaining DeepArchitect and to use it for our research.

Reaching out

You can reach me at [email protected] or at @rmpnegrinho. If you tweet about DeepArchitect, please use the tag #DeepArchitect and/or mention me (@rmpnegrinho) in the tweet. For bug reports, questions, and suggestions, use GitHub issues. Use the Google group for more casual usage questions.

License

DeepArchitect is licensed under the MIT license as found here. Contributors agree to license their contributions under the MIT license.

Contributors and acknowledgments

The lead researcher for DeepArchitect is Renato Negrinho. Daniel Ferreira played an important initial role in designing APIs through discussions and contributions. This work benefited immensely from the involvement and contributions of talented CMU undergraduate students (Darshan Patil, Max Le, Kirielle Singajarah, Zejie Ai, Yiming Zhao, Emilio Arroyo-Fang). This work benefited greatly from discussions with faculty (Geoff Gordon, Matt Gormley, Graham Neubig, Carolyn Rose, Ruslan Salakhutdinov, Eric Xing, and Xue Liu), and fellow PhD students (Zhiting Hu, Willie Neiswanger, Christoph Dann, and Matt Barnes). This work was partially done while Renato Negrinho was a research scientist at Petuum. This work was partially supported by NSF grant IIS 1822831. We thank a generous GCP grant for both CPU and TPU compute.

References

If you use this work, please cite:

@article{negrinho2017deeparchitect,
  title={Deeparchitect: Automatically designing and training deep architectures},
  author={Negrinho, Renato and Gordon, Geoff},
  journal={arXiv preprint arXiv:1704.08792},
  year={2017}
}

@article{negrinho2019towards,
  title={Towards modular and programmable architecture search},
  author={Negrinho, Renato and Patil, Darshan and Le, Nghia and Ferreira, Daniel and Gormley, Matthew and Gordon, Geoffrey},
  journal={Neural Information Processing Systems},
  year={2019}
}

The code for negrinho2017deeparchitect can be found here. The ideas and implementation of negrinho2017deeparchitect evolved into the work of negrinho2019towards, found in this repo. See the paper, documentation, and blog post. The code for the experiments reported in negrinho2019towards can be found here, but it will not be actively maintained. For your work, please build on top of the deep_architect repo instead.

deep_architect's People

Contributors

dapatil211, dcferreira, fizzxed, negrinho, nle18


deep_architect's Issues

An issue about enas_searcher.py

Hi, I am trying to run your code and I get an error in deep_architect-master\dev\enas\searcher\enas_searcher.py. I wonder if you see the same bug. At line 90 of enas_searcher.py, the code is: outputs, list(hs.values()))): and PyCharm notes: Unresolved reference 'hs'.
I sincerely hope that you can help solve this problem. Thanks!

Performance issue in /dev/evaluators (by P3)

Hello! I've found a performance issue in estimator_classification.py: dataset.batch(batch_size) (line 51) should be called before dataset.map(augmentation, num_parallel_calls=8) (line 50), which could make your program more efficient.

Here is the tensorflow document to support it.

Besides, you need to check whether the function augmentation called in dataset.map(augmentation, num_parallel_calls=8) is affected, to make the changed code work properly. For example, if augmentation needs data with shape (x, y, z) as its input before the fix, it will receive data with shape (batch_size, x, y, z) after the fix.
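A sketch of the suggested reordering (a minimal illustration assuming a tf.data pipeline like the one described; whether augmentation must change depends on whether it already handles a leading batch dimension):

# Before: augmentation runs once per example, then examples are batched.
dataset = dataset.map(augmentation, num_parallel_calls=8)
dataset = dataset.batch(batch_size)

# After: batching first amortizes the per-call overhead of map, but
# augmentation now receives tensors with a leading batch dimension.
dataset = dataset.batch(batch_size)
dataset = dataset.map(augmentation, num_parallel_calls=8)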

Looking forward to your reply. Btw, I would be glad to create a PR to fix it if you are too busy.

A little question about ENAS

Hi, I'm new to your project, and this is the first time I've asked a question on GitHub, so please bear with me if there are any mistakes. I found something that may be wrong in your code for ENAS. When I finished writing code for ENAS by imitating your main_tensorflow.py and ran it, I got the following error:

Traceback (most recent call last):
  File "C:/Users/JssI/Desktop/deep_architect-masterPZJ/ENAS.py", line 322, in <module>
    main()
  File "C:/Users/JssI/Desktop/deep_architect-masterPZJ/ENAS.py", line 312, in main
    inputs, outputs, _, searcher_eval_token = searcher.sample()
  File "C:\Users\JssI\Desktop\deep_architect-masterPZJ\dev\enas\searcher\enas_searcher.py", line 104, in sample
    outputs, list(hs.values()))):
NameError: name 'hs' is not defined

So I looked into enas_searcher.py and found that one variable, named hs, is never declared:


 def sample(self):
    arc = self._sample()
    idx = 0
    hyp_values = {}
    for i in range(1, self.num_layers + 1):
        hyp_values['op_' + str(i)] = arc[idx]
        idx += 1
        for j in range(i - 1):
            hyp_values['H.skip_' + str(j) + '_' + str(i) + '-0'] = arc[idx]
            idx += 1

    inputs, outputs = self.search_space_fn()
    vs = []
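    # BUG: 'hs' in the loop below is never defined anywhere in sample();
    # this is the source of the NameError reported above.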
    for i, h in enumerate(
            unassigned_independent_hyperparameter_iterator(
                outputs, list(hs.values()))):
        if h.get_name() in hyp_values:
            v = h.vs[hyp_values[h.get_name()]]
            h.assign_value(v)
            vs.append(v)
        else:
            v = random_specify_hyperparameter(h)
            vs.append(v)
    return inputs, outputs, vs, {'arc': arc}


What you have done is really nice work, and I want to use your code to finish my experiment during my holiday. I would very much appreciate it if you could help me solve the problem. Thank you!

Some issues when combining `deep_architect` and `ray.tune`

Hi, first of all, I'd like to thank you for building and releasing deep_architect.

I am opening this issue because I'd like to use deep_architect together with ray.tune to get the best of both worlds, but I encountered some issues. Feel free to close this if you think it is out of the scope of the project.

My goal is to use the sampling capabilities of deep_architect and the tools for multiprocessing and logging of ray and ray.tune. Therefore I'm using tune.run and tune.Trainable with the searchers, helpers and modules of deep_architect.

If I write my code with the call to the sampling function inside the _setup method of a tune.Trainable

https://gist.github.com/iacolippo/1262c8afbfd9f5e491add5fbae105afa (line 124)

then I have an issue with ray (tensorboard) logging. I'd say this is not an issue of deep_architect, and it shouldn't be too hard to fix in the source code of ray if need be.

If I write my code as ray wants it (the config["model"] is the model object, in this case a PytorchModel from deep_architect), then I get a different error:

RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment

https://gist.github.com/iacolippo/3f815fa90c254f7a065bdc446406233a (note that the () disappeared at line 124)

This might be an issue with deep_architect and multiprocessing, or with PyTorch itself; I didn't dig into it much for lack of time. Here is the traceback.

traceback.log

I am using

-e git+git@github.com:negrinho/deep_architect.git@3427c5d45b0cbdc9c2fe1f4e5213f6961ef41749#egg=deep_architect
ray==0.8.4
torch==1.5.0
torchvision==0.6.0a0+82fd1c8

Stay safe!

Can it generate an architecture from search space without defining a fixed network?

Hi! It's nice work! However, according to the documentation, I find that the search space is a super-net: you have already defined the connections between the modules or blocks. I am just curious whether the framework can generate an architecture without prior knowledge of the architecture structure. I just want to get modules from the search space and automatically connect them together.

Question about RNN search spaces

Hi, I was playing around a little with your framework and wanted to create a search space for an RNN, but I failed to get a working RNN out of the box. I could create a cell and plug that cell manually into a vanilla RNN that encapsulates the cell and iterates over the temporal dimension.

Now I'm wondering if there is any way to get a complete RNN (not just the cell) as the result of an architecture search with your framework.

I tried various approaches, like expressing the outer temporal loop as a substitution module and trying to instantiate the cell within the compile function. That didn't work (probably because not all variables of the cell's search space have been set before the call to compile_fn()).

Another idea (quickly rejected) was to connect the cell's output to the cell's input, but that would produce a cyclic graph...

To me it seems that with the current framework it is not possible to create truly recursive search spaces. The kind of recursion you show in your paper is not a truly recursive search space; the result is a linear/sequential one (referring to Fig. 18 in your paper).

Is what I'm trying to do possible?
Could you help me to make this work somehow (in an elegant, non-hacky way)?
And how would you define, e.g., a multi-layer RNN with individual, independent cell search spaces for each layer, so you don't reuse the same cell on every layer?

FYI: I have also posted this on the Google Groups board.
