CudaInference

Neural network inference in CUDA. For an example, see the ResNet18 implementation in source/main.cpp.

Functionality implemented:

  • Convolution (via im2col) - with/without bias, arbitrary padding, arbitrary stride. Uses cuBLAS and Thrust.
  • Linear - with/without bias. Uses cuBLAS and Thrust.
  • BatchNorm.
  • ReLU.
  • MaxPool - arbitrary padding, arbitrary stride.
  • AvgPool - arbitrary padding, arbitrary stride.
  • Tensor operations:
    • common operations (+, -, *, /).
    • transpose - arbitrary number of dimensions, arbitrary axes permutation.
    • reshape.
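
The im2col approach named in the convolution item can be illustrated with a small CPU-only sketch. This is not the repository's code (which runs on the GPU with cuBLAS and Thrust); the function names `conv_out_size` and `im2col` and the CHW memory layout are assumptions made for illustration. The idea is to unfold every receptive field into a column, so that the convolution becomes a single matrix multiplication of the (out_channels) x (C*KH*KW) weight matrix with the unfolded matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output size of a convolution (or pooling) window along one axis.
inline int conv_out_size(int in, int kernel, int pad, int stride) {
    assert(stride > 0 && in + 2 * pad >= kernel);
    return (in + 2 * pad - kernel) / stride + 1;
}

// Unfold one CHW image into a row-major (C*KH*KW) x (OH*OW) matrix.
// Positions that fall into the zero padding keep the value 0.
// Hypothetical helper: the layout is an assumption, not the repo's API.
inline std::vector<float> im2col(const std::vector<float>& img,
                                 int C, int H, int W,
                                 int KH, int KW, int pad, int stride) {
    assert(img.size() == static_cast<std::size_t>(C) * H * W);
    const int OH = conv_out_size(H, KH, pad, stride);
    const int OW = conv_out_size(W, KW, pad, stride);
    std::vector<float> cols(static_cast<std::size_t>(C) * KH * KW * OH * OW, 0.0f);
    for (int c = 0; c < C; ++c) {
        for (int kh = 0; kh < KH; ++kh) {
            for (int kw = 0; kw < KW; ++kw) {
                // One row per (channel, kernel row, kernel column) triple.
                const std::size_t row = (static_cast<std::size_t>(c) * KH + kh) * KW + kw;
                for (int oh = 0; oh < OH; ++oh) {
                    for (int ow = 0; ow < OW; ++ow) {
                        const int ih = oh * stride - pad + kh;
                        const int iw = ow * stride - pad + kw;
                        if (ih < 0 || ih >= H || iw < 0 || iw >= W) continue;
                        cols[(row * OH + oh) * OW + ow] =
                            img[(static_cast<std::size_t>(c) * H + ih) * W + iw];
                    }
                }
            }
        }
    }
    return cols;
}
```

With this layout, a 7x7 convolution with padding 3 and stride 2 on a 224x224 input yields `conv_out_size(224, 7, 3, 2) == 112` output positions per axis. On the GPU the same unfolded matrix would be handed to a GEMM routine instead of being multiplied by hand.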

Features:

  • Inference works with arbitrary batch size.
  • Network weights are read from files on disk. The python directory contains the weights and the scripts used to save pretrained weights to disk.
  • Any ResNet can be implemented with this functionality.
  • The result is fully equivalent to the PyTorch forward pass.
  • Input image must:
    • be RGB image with 3 channels
    • be in PPM format
    • be exactly 224x224
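
The input constraints above can be checked before any GPU work starts. The following is a minimal sketch of such a check, not the repository's loader: `PpmHeader`, `read_ppm_header`, `is_valid_input` and `p6_payload_bytes` are hypothetical names, and accepting both the binary ("P6") and ASCII ("P3") PPM variants is an assumption, since the README does not say which one is expected.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Header fields of a PPM file (hypothetical struct for illustration).
struct PpmHeader {
    std::string magic;  // "P6" (binary RGB) or "P3" (ASCII RGB)
    int width = 0;
    int height = 0;
    int maxval = 0;
};

// Skip whitespace and '#' comment lines between header tokens.
inline void skip_ws_and_comments(std::istream& in) {
    for (;;) {
        const int ch = in.peek();
        if (ch == std::char_traits<char>::eof()) return;
        if (ch == '#') {
            std::string comment;
            std::getline(in, comment);
        } else if (std::isspace(ch)) {
            in.get();
        } else {
            return;
        }
    }
}

// Returns true only for RGB PPM headers; "P5" (grayscale PGM) is rejected.
inline bool read_ppm_header(std::istream& in, PpmHeader& h) {
    if (!(in >> h.magic)) return false;
    skip_ws_and_comments(in);
    if (!(in >> h.width)) return false;
    skip_ws_and_comments(in);
    if (!(in >> h.height)) return false;
    skip_ws_and_comments(in);
    if (!(in >> h.maxval)) return false;
    return h.magic == "P6" || h.magic == "P3";
}

// The network input must be exactly 224x224.
inline bool is_valid_input(const PpmHeader& h) {
    return h.width == 224 && h.height == 224;
}

// Pixel payload size of a binary (P6) file with one byte per sample.
inline std::size_t p6_payload_bytes(const PpmHeader& h) {
    assert(h.width > 0 && h.height > 0 && h.maxval > 0 && h.maxval < 256);
    return static_cast<std::size_t>(h.width) * h.height * 3;
}
```

For a valid binary input the payload is 224 * 224 * 3 = 150528 bytes, which gives a cheap sanity check against the file size.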

Build:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release .. 
make -j

Usage:

./Release/cuda_proj --input ../images/cat.ppm --weights_dir ../python/weights/ --batch_size 16 --iters 100

The program fills every input in the batch with the image ../images/cat.ppm and performs 100 forward passes. The predicted labels and the FPS are printed.
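
How the reported FPS relates to --batch_size and --iters is not spelled out here. A plausible reading, shown below as a sketch only, is that FPS counts images per second, i.e. batch_size * iters divided by the elapsed wall-clock time. The helper names `images_per_second` and `benchmark_fps` are hypothetical, and `forward` stands in for whatever runs one forward pass.

```cpp
#include <cassert>
#include <chrono>

// Images processed per second, assuming FPS counts images rather than batches.
inline double images_per_second(int batch_size, int iters, double elapsed_seconds) {
    assert(batch_size > 0 && iters > 0 && elapsed_seconds > 0.0);
    return static_cast<double>(batch_size) * iters / elapsed_seconds;
}

// Time `iters` calls of a callable that runs one forward pass over a batch.
// `forward` is a hypothetical stand-in for the real inference call.
template <typename Forward>
double benchmark_fps(Forward&& forward, int batch_size, int iters) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        forward();
    }
    // For GPU work, the device must be synchronized before the clock is read,
    // because kernel launches return before the work has finished.
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return images_per_second(batch_size, iters, elapsed.count());
}
```

Under this reading, 100 iterations at batch size 16 that take 4 seconds in total correspond to 400 FPS.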

Benchmarks:

All benchmarks were run with batch_size == 16.

FPS:

Mode                                                                     FPS
CPU (Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz) (PyTorch, 4 threads)      48
CPU (Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz) (PyTorch, 16 threads)     81
GPU (GeForce GTX 1080 Ti) (PyTorch)                                     2050
GPU (GeForce GTX 1080 Ti) (This repo)                                    445

Memory:

Mode         Memory usage
PyTorch      1317 MB
This repo    2571 MB
