AOGNets: Deep AND-OR Grammar Networks for Visual Recognition

This repository contains the MXNet code for the paper "AOGNets: Deep AND-OR Grammar Networks for Visual Recognition" by Xilai Li, Tianfu Wu*, Xi Song and Hamid Krim. (* Corresponding author)

Citation

If you find our project useful in your research, please consider citing:

@article{li2017aognet,
  title={AOGNets: Deep AND-OR Grammar Networks for Visual Recognition},
  author={Xilai Li and Tianfu Wu and Xi Song and Hamid Krim},
  journal={arXiv preprint arXiv:1711.05847},
  year={2017}
}

Contents

  1. Introduction
  2. Usage
  3. Results
  4. Contacts

Introduction

An AOGNet consists of a number of stages, each of which is composed of a number of AOG building blocks. An AOG building block is designed based on a principled AND-OR grammar and is represented by a hierarchical and compositional AND-OR graph. The graph has three types of nodes: an AND-node explores composition, and its input is computed by concatenating the features of its child nodes; an OR-node represents alternative ways of composition in the spirit of exploitation, and its input is the element-wise sum of the features of its child nodes; a Terminal-node takes as input a channel-wise slice of the input feature map of the AOG building block. AOGNets aim to harness the best of both worlds (grammar models and deep neural networks) in representation learning with end-to-end training.
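
The three input rules described above can be illustrated with a minimal pure-Python sketch. It is not the repository's MXNet implementation: the function names (terminal_input, and_input, or_input) and the list-of-channels feature map are assumptions made for illustration, and the per-node operation applied to each input is omitted.

```python
from typing import List

# A feature map is modelled as a list of channels; each channel is a flat
# list of floats. Real feature maps would be 4-D tensors.
FeatureMap = List[List[float]]


def terminal_input(block_input: FeatureMap, start: int, end: int) -> FeatureMap:
    """Terminal-node: a channel-wise slice [start, end) of the block input."""
    return block_input[start:end]


def and_input(children: List[FeatureMap]) -> FeatureMap:
    """AND-node: concatenate the children's features along the channel axis."""
    return [channel for child in children for channel in child]


def or_input(children: List[FeatureMap]) -> FeatureMap:
    """OR-node: element-wise sum of the children's features (same shape)."""
    first = children[0]
    if any(len(child) != len(first) for child in children):
        raise ValueError("OR-node children must have the same number of channels")
    return [
        [sum(values) for values in zip(*channels)]
        for channels in zip(*children)
    ]


# Hypothetical block input with 4 channels of 2 values each.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
left = terminal_input(x, 0, 2)    # channels 0-1
right = terminal_input(x, 2, 4)   # channels 2-3
whole = terminal_input(x, 0, 4)   # all 4 channels

composed = and_input([left, right])   # 2 + 2 = 4 channels
merged = or_input([composed, whole])  # element-wise sum, still 4 channels
```

In this toy example the AND-node recombines the two half-width slices into a full-width feature, and the OR-node then sums that composition with the full-width Terminal-node, which is one alternative way of producing the same sub-feature.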

Usage

Install MXNet

Please follow the official instructions to install MXNet.

Train on CIFAR-10/100 dataset

As an example, use the following command to train an AOGNet on CIFAR-10 with the training setup and network configuration defined in cfgs/cifar10/aognet_cifar10_ps_4_bottleneck_1M.yaml, using two GPUs (gpu_id=0,1).

python main.py --cfg cfgs/cifar10/aognet_cifar10_ps_4_bottleneck_1M.yaml --gpus 0,1
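
For readers unfamiliar with this style of command line, the sketch below shows one way the --cfg and --gpus arguments could be parsed and the GPU list turned into integer ids. It is a hypothetical illustration using only the Python standard library; the helper names parse_args and gpu_ids are assumptions, and the actual main.py in this repository may handle its arguments differently.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two flags used in the command above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument("--cfg", required=True, help="path to a YAML config file")
    parser.add_argument("--gpus", default="0", help="comma-separated GPU ids, e.g. 0,1")
    return parser.parse_args(argv)


def gpu_ids(spec: str) -> List[int]:
    """Turn a string such as '0,1' into [0, 1], ignoring empty entries."""
    return [int(part) for part in spec.split(",") if part.strip()]


args = parse_args([
    "--cfg", "cfgs/cifar10/aognet_cifar10_ps_4_bottleneck_1M.yaml",
    "--gpus", "0,1",
])
ids = gpu_ids(args.gpus)  # one integer id per requested GPU
```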

Train on ImageNet-1K dataset

To prepare the training dataset (.rec file) for ImageNet-1K, please follow the MXNet image classification repo or the MXNet ResNet implementation by Tornadomeet.

Use the following command to train an AOGNet on ImageNet-1K with the training setup and network configuration defined in cfgs/imagenet/aognet_imagenet_1k_v1.yaml, using four GPUs and memonger. Memonger [1] is an effective way to save GPU memory when GPU resources are limited.

python main.py --cfg cfgs/imagenet/aognet_imagenet_1k_v1.yaml --gpus 0,1,2,3 --memonger
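
The memory saving behind memonger comes from trading computation for memory: instead of keeping every layer's activations for the backward pass, only a subset (checkpoints) is kept and the rest is recomputed. The sketch below uses a deliberately simplified cost model, which is an assumption and not memonger's actual planning algorithm: with n layers split into segments of k layers, roughly n/k checkpoints plus the k activations of one segment are held at a time, so the cost is minimized near k = sqrt(n).

```python
import math


def checkpoint_cost(n_layers: int, segment: int) -> int:
    """Simplified count of activations held at once:
    one checkpoint per segment plus the activations of a single segment."""
    return math.ceil(n_layers / segment) + segment


def best_segment(n_layers: int) -> int:
    """Segment length that minimizes the simplified cost (brute force)."""
    return min(range(1, n_layers + 1), key=lambda k: checkpoint_cost(n_layers, k))


n = 100
k = best_segment(n)            # close to sqrt(n)
cost = checkpoint_cost(n, k)   # compare with n when storing every activation
```

Under this model, a 100-layer chain needs about 20 stored activations instead of 100, at the price of recomputing the forward pass within each segment.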

Results

Results on CIFAR

Model                 Params   CIFAR-10 Err. (%)   CIFAR-100 Err. (%)
AOGNet-4-(1,1,1)      1.0M     5.29                25.98
AOGNet-4-(1,1,1)      8.1M     4.02                20.64
AOGNet-4-(1,1,1)      16.0M    3.79                19.50
AOGNet-BN-4-(1,1,1)   1.0M     4.74                22.81
AOGNet-BN-4-(1,1,1)   8.0M     3.99                18.71
AOGNet-BN-4-(1,2,1)   16.0M    3.78                17.82

The training is done with standard random crop and flip data augmentation.
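
As a rough illustration of random crop and flip augmentation, the sketch below zero-pads an image, takes a random crop of the original size, and mirrors it horizontally with probability 0.5. The padding width, the single-channel list-of-lists image, and the function name random_crop_flip are assumptions for illustration; the exact settings used for the results above are defined by the repository's configs and data pipeline, not by this sketch.

```python
import random
from typing import List

Image = List[List[float]]  # H x W, single channel for brevity


def random_crop_flip(img: Image, pad: int, rng: random.Random) -> Image:
    """Zero-pad, take a random crop of the original size, then maybe mirror."""
    h, w = len(img), len(img[0])
    # Zero-pad every side by `pad` pixels.
    padded = [[0.0] * (w + 2 * pad) for _ in range(pad)]
    padded += [[0.0] * pad + list(row) + [0.0] * pad for row in img]
    padded += [[0.0] * (w + 2 * pad) for _ in range(pad)]
    # Pick a random h x w window inside the padded image.
    top = rng.randint(0, 2 * pad)
    left = rng.randint(0, 2 * pad)
    crop = [row[left:left + w] for row in padded[top:top + h]]
    # Mirror horizontally with probability 0.5.
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]
    return crop


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
augmented = random_crop_flip(image, pad=1, rng=rng)
```

A common choice for 32x32 CIFAR images is a padding of 4 pixels, but that value is an assumption here rather than something this README states.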

Results on ImageNet

Model                   Params   Top-1 Err.   Top-5 Err.   MXNet Model
AOGNet-BN-4-(1,1,1,1)   79.5M    21.49        5.76         Download
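
For reference, top-k error is the percentage of samples whose true label is not among the k highest-scoring classes. The sketch below computes it on a tiny made-up example; the function name topk_error and the scores are illustrative assumptions, not the evaluation code or outputs of this repository.

```python
from typing import Sequence


def topk_error(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Percentage of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return 100.0 * misses / len(labels)


# Made-up scores for 4 samples over 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
    [0.6, 0.3, 0.1],  # highest score: class 0
]
labels = [1, 1, 2, 2]
top1 = topk_error(scores, labels, k=1)
top2 = topk_error(scores, labels, k=2)
```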

Training logs and pretrained models

Our trained models and training logs are available for download on Google Drive.

References

  1. Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin. Training Deep Nets with Sublinear Memory Cost. arXiv:1604.06174. Code: https://github.com/dmlc/mxnet-memonger

Contacts

email: [email protected]

Any discussion and contributions are welcome!
