lucasmazz / deploying_cnn_on_fpga_using_opencl

This project forked from er1cz/deploying_cnn_on_fpga_using_opencl

SqueezeNet v1.1 on a Cyclone V SoC-FPGA at 450 ms/image, 20x faster than the ARM Cortex-A9 processor alone. A project for the 2017 Innovate FPGA design contest.

License: MIT License

C++ 1.20% Python 0.01% C 0.11% Objective-C 97.56% Jupyter Notebook 1.12%

Deploying CNN on FPGA using OpenCL

This is a project for the 2017 Innovate FPGA design contest. We hope it helps those who want to accelerate CNNs on resource-limited embedded systems with an FPGA using OpenCL. Original project link: PR065.

Prerequisites:

  • Board: Terasic DE10-Nano with Cyclone V SoC-FPGA (800 MHz dual-core Cortex-A9 processor and 110K LE FPGA)
  • Software: Intel FPGA SDK for OpenCL 17.1

System diagram:

(System diagram image)

The whole system runs on the DE10-Nano board. Another version with a GUI will be released later.

To use:

  • Copy the 2 files in the /bin/v2.0 folder and /src/common/synset_words.txt to /your_path on the TF card of the DE10-Nano, which must be running the Terasic official OpenCL BSP image
  • Set up a UART connection between the DE10-Nano and the PC
  • Log in as root
  • Run the following commands:
    • cd ~
    • source ./init_opencl.sh
    • cd /your_path/
    • aocl program /dev/acl0 squeezenet.aocx
    • chmod +x squeezenet
    • ./squeezenet
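
The command sequence above can also be chained so that a failure in any step stops the run. The following is only a sketch, not part of the original project: the helper name `build_run_command` is hypothetical, and it assumes a Python 3 interpreter is available wherever the command string is generated (the board image may or may not ship one). It only builds and prints the command line and does not execute anything.

```python
import shlex


def build_run_command(work_dir, aocx="squeezenet.aocx",
                      binary="squeezenet", device="/dev/acl0"):
    """Join the README's manual steps with '&&' so the chain stops on the first error.

    Hypothetical helper: file and device names default to the ones used in the
    steps above; work_dir is the /your_path placeholder chosen by the user.
    """
    steps = [
        "cd ~",
        "source ./init_opencl.sh",  # sets up the OpenCL environment for this shell
        f"cd {shlex.quote(work_dir)}",
        f"aocl program {shlex.quote(device)} {shlex.quote(aocx)}",  # load the FPGA image
        f"chmod +x {shlex.quote(binary)}",
        f"./{shlex.quote(binary)}",
    ]
    return " && ".join(steps)


# Print the one-liner; nothing is run here.
print(build_run_command("/your_path"))
```

The printed line can be pasted into the root shell on the board. Because `source` only affects the shell that runs it, the steps have to stay in a single shell invocation, which is why they are joined into one command rather than issued as separate processes.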

Input image:

The output should look like this:

SqueezeNet on FPGA start:
kernel version 2.0

conv1 takes: 57.173 ms
block1 takes: 84.526 ms
block2 takes: 81.311 ms
block3 takes: 113.345 ms
classifier takes: 115.184 ms
total: 451.539 ms

predicted label: n02106662 German shepherd, German shepherd dog, German police dog, alsatian

done
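
To see where the time goes, the per-stage lines in this output can be parsed and compared with the reported total. The sketch below is an illustration only: it assumes the log format is exactly as printed above, the names `SAMPLE_LOG` and `parse_timings` are hypothetical, and the throughput figure assumes images are processed one after another at the reported latency.

```python
import re

# Sample output copied from the run shown above.
SAMPLE_LOG = """\
SqueezeNet on FPGA start:
kernel version 2.0

conv1 takes: 57.173 ms
block1 takes: 84.526 ms
block2 takes: 81.311 ms
block3 takes: 113.345 ms
classifier takes: 115.184 ms
total: 451.539 ms

predicted label: n02106662 German shepherd, German shepherd dog, German police dog, alsatian

done
"""

STAGE_RE = re.compile(r"^(\w+) takes: ([\d.]+) ms$", re.MULTILINE)
TOTAL_RE = re.compile(r"^total: ([\d.]+) ms$", re.MULTILINE)


def parse_timings(log):
    """Return ({stage: milliseconds}, total_milliseconds) from the program output."""
    stages = {name: float(ms) for name, ms in STAGE_RE.findall(log)}
    total = float(TOTAL_RE.search(log).group(1))
    return stages, total


stages, total = parse_timings(SAMPLE_LOG)
for name, ms in stages.items():
    # Each stage's share of the end-to-end latency.
    print(f"{name:<11}{ms:8.3f} ms  {100 * ms / total:5.1f}%")
print(f"sum of stages: {sum(stages.values()):.3f} ms (reported total: {total:.3f} ms)")
print(f"throughput: {1000 / total:.2f} images/s")
```

For the sample above, the five stage times add up to the reported total, and the classifier and block3 stages are the two largest contributors at roughly a quarter of the latency each.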

Resource usage:

+--------------------------------------------------------------------+
; Estimated Resource Usage Summary                                   ;
+----------------------------------------+---------------------------+
; Resource                               + Usage                     ;
+----------------------------------------+---------------------------+
; Logic utilization                      ;   86%                     ;
; ALUTs                                  ;   57%                     ;
; Dedicated logic registers              ;   34%                     ;
; Memory blocks                          ;   68%                     ;
; DSP blocks                             ;   50%                     ;
+----------------------------------------+---------------------------;  

We believe the Cyclone V FPGA on the DE10-Nano board can be an excellent solution for deploying CNNs on embedded systems: the ARM processor serves as a conventional controller and the FPGA as a low-power, low-latency accelerator. Our simple 120-line OpenCL implementation of a CNN accelerator demonstrates how accessible and effective FPGAs can be in high-performance computing. We also hope our tutorial helps those who want to get into this topic and are having a hard time getting started.

For more details, please read A getting started tutorial on FPGA implement of CNN using OpenCL.
