awesome-emdl's Introduction

EMDL

Embedded and mobile deep learning research notes.

Papers

Survey

  1. EfficientDNNs [Repo]
  2. Awesome ML Model Compression [Repo]
  3. TinyML Papers and Projects [Repo]
  4. TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers [IEEE '21]
  5. Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better [arXiv '21]
  6. Benchmarking TinyML Systems: Challenges and Direction [arXiv '20]
  7. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey [IEEE '20]
  8. The Deep Learning Compiler: A Comprehensive Survey [arXiv '20]
  9. Recent Advances in Efficient Computation of Deep Convolutional Neural Networks [arXiv '18]
  10. A Survey of Model Compression and Acceleration for Deep Neural Networks [arXiv '17]

Model

  1. SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems [MLSys '20, IBM]
  2. Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets [arXiv '20, Huawei]
  3. Once for All: Train One Network and Specialize it for Efficient Deployment [arXiv '19, MIT]
  4. GhostNet: More Features from Cheap Operations [arXiv '19, Huawei]
  5. Searching for MobileNetV3 [arXiv '19, Google]
  6. MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation [arXiv '18, Google]
  7. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware [arXiv '18, MIT]
  8. DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices [AAAI'18, Samsung]
  9. NASNet: Learning Transferable Architectures for Scalable Image Recognition [arXiv '17, Google]
  10. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices [arXiv '17, Megvii]
  11. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications [arXiv '17, Google]
  12. CondenseNet: An Efficient DenseNet using Learned Group Convolutions [arXiv '17]

System

  1. DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications [MobiSys '17]
  2. DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware [MobiSys '17]
  3. MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU [EMDL '17]
  4. DeepSense: A GPU-based deep convolutional neural network framework on commodity mobile devices [WearSys '16]
  5. DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices [IPSN '16]
  6. EIE: Efficient Inference Engine on Compressed Deep Neural Network [ISCA '16]
  7. MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints [MobiSys '16]
  8. DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit [MobiCASE '16]
  9. Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables [SenSys '16]
  10. An Early Resource Characterization of Deep Learning on Wearables, Smartphones and Internet-of-Things Devices [IoT-App '15]
  11. CNNdroid: GPU-Accelerated Execution of Trained Deep Convolutional Neural Networks on Android [MM '16]
  12. fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs [NIPS '17]

Quantization

  1. Quantizing deep convolutional networks for efficient inference: A whitepaper [arXiv '18]
  2. LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks [ECCV'18]
  3. The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning [ICML'17]
  4. Compressing Deep Convolutional Networks using Vector Quantization [arXiv'14]
  5. Quantized Convolutional Neural Networks for Mobile Devices [CVPR '16]
  6. Fixed-Point Performance Analysis of Recurrent Neural Networks [ICASSP'16]
  7. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations [arXiv'16]
  8. Loss-aware Binarization of Deep Networks [ICLR'17]
  9. Towards the Limit of Network Quantization [ICLR'17]
  10. Deep Learning with Low Precision by Half-wave Gaussian Quantization [CVPR'17]
  11. ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [arXiv'17]
  12. Training and Inference with Integers in Deep Neural Networks [ICLR'18]

Pruning

  1. Awesome-Pruning [Repo]
  2. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [CVPR'19]
  3. Learning both Weights and Connections for Efficient Neural Networks [NIPS'15]
  4. Pruning Filters for Efficient ConvNets [ICLR'17]
  5. Pruning Convolutional Neural Networks for Resource Efficient Inference [ICLR'17]
  6. Soft Weight-Sharing for Neural Network Compression [ICLR'17]
  7. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding [ICLR'16]
  8. Dynamic Network Surgery for Efficient DNNs [NIPS'16]
  9. Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning [CVPR'17]
  10. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression [ICCV'17]
  11. To prune, or not to prune: exploring the efficacy of pruning for model compression [ICLR'18]

Approximation

  1. Efficient and Accurate Approximations of Nonlinear Convolutional Networks [CVPR'15]
  2. Accelerating Very Deep Convolutional Networks for Classification and Detection (extended version of the paper above)
  3. Convolutional neural networks with low-rank regularization [arXiv'15]
  4. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation [NIPS'14]
  5. Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications [ICLR'16]
  6. High performance ultra-low-precision convolutions on mobile devices [NIPS'17]

Characterization

  1. A First Look at Deep Learning Apps on Smartphones [WWW'19]
  2. Machine Learning at Facebook: Understanding Inference at the Edge [HPCA'19]
  3. NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications [ECCV'18]
  4. Latency and Throughput Characterization of Convolutional Neural Networks for Mobile Computer Vision [MMSys'18]

Libraries

Inference Framework

  1. alibaba/MNN
  2. TensorFlow Lite GPU
  3. TensorFlow Lite
  4. XiaoMi/mace: MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.
  5. Tencent/ncnn: ncnn is a high-performance neural network inference framework optimized for the mobile platform
  6. baidu/paddle-mobile
  7. BERT and GPT-2 on iPhone
  8. Apple CoreML
  9. Snapdragon Neural Processing Engine
  10. ARM-software/ComputeLibrary: The ARM Computer Vision and Machine Learning library is a set of functions optimised for both ARM CPUs and GPUs using SIMD technologies, Intro
  11. Microsoft Embedded Learning Library
  12. MXNet Amalgamation
  13. OAID/Tengine: Tengine is a lite, high-performance, modular inference engine for embedded devices
  14. xmartlabs/Bender: Easily craft fast Neural Networks on iOS! Use TensorFlow models. Metal under the hood.
  15. JDAI-CV/dabnn: dabnn is an accelerated binary neural networks inference framework for mobile platform

Optimization Tools

  1. Neural Network Distiller
  2. An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications

Research Demos

  1. RSTensorFlow: GPU Accelerated TensorFlow for Commodity Android Devices

Web

  1. mil-tokyo/webdnn: Fastest DNN Execution Framework on Web Browser

Tutorials

General

  1. Squeezing Deep Learning Into Mobile Phones
  2. Deep Learning – Tutorial and Recent Trends
  3. Tutorial on Hardware Architectures for Deep Neural Networks
  4. Efficient Convolutional Neural Network Inference on Mobile GPUs

NEON

  1. NEON™ Programmer’s Guide

OpenCL

  1. ARM® Mali™ GPU OpenCL Developer Guide, pdf
  2. Optimal Compute on ARM Mali™ GPUs
  3. GPU Compute for Mobile Devices
  4. Compute for Mobile Devices: Performance Focused
  5. Hands On OpenCL
  6. Adreno OpenCL Programming Guide
  7. Better OpenCL Performance on Qualcomm Adreno GPU

Courses

  1. UW Deep learning systems
  2. Berkeley Machine Learning Systems

General

  1. TensorFlow Android Camera Demo
  2. TensorFlow iOS Example
  3. Caffe2 AICamera

Vulkan

  1. Vulkan API Examples and Demos
  2. Neural Machine Translation on Android

OpenCL

  1. DeepMon

RenderScript

  1. Mobile_ConvNet: RenderScript CNN for Android

Tools

GPU

  1. Bifrost GPU architecture and ARM Mali-G71 GPU
  2. Midgard GPU Architecture, ARM Mali-T880 GPU
  3. Mobile GPU market share

Driver

  1. [Adreno] csarron/qcom_vendor_binaries: Common Proprietary Qualcomm Binaries
  2. [Mali] Fevax/vendor_samsung_hero2ltexx: Blobs from s7 Edge G935F

Related Repos
