
finn's Introduction

Fast, Scalable Quantized Neural Network Inference on FPGAs



FINN is an experimental framework from the Integrated Communications and AI Lab of AMD Research & Advanced Development for exploring deep neural network inference on FPGAs. It specifically targets quantized neural networks, with an emphasis on generating dataflow-style architectures customized for each network. The resulting FPGA accelerators are highly efficient, yielding high throughput and low latency. The framework is fully open source to give a higher degree of flexibility, and is intended to enable neural network research spanning several layers of the software/hardware abstraction stack.

We have a separate repository finn-examples that houses pre-built examples for several neural networks. For more general information about FINN, please visit the project page and check out the publications.

Getting Started

Please see the Getting Started page for more information on requirements, installation, and how to run FINN in different modes. Due to the complex nature of the dependencies of the project, we only support Docker-based execution of the FINN compiler at this time.

Documentation

You can view the documentation on readthedocs. Additionally, there is a series of Jupyter notebook tutorials, which we recommend running from inside Docker for a better experience.

Community

We have GitHub discussions where you can ask questions. You can use the GitHub issue tracker to report bugs, but please don't file issues to ask questions as this is better handled in GitHub discussions.

We also heartily welcome contributions to the project; please check out the contribution guidelines and the list of open issues. Don't hesitate to get in touch via GitHub discussions to discuss your ideas.

In the past we also had a Gitter channel. Please be aware that it is no longer maintained by us, but it can still be searched for questions previous users have asked.

Citation

The current implementation of the framework is based on the following publications. Please consider citing them if you find FINN useful.

@article{blott2018finn,
  title={FINN-R: An end-to-end deep-learning framework for fast exploration of quantized neural networks},
  author={Blott, Michaela and Preu{\ss}er, Thomas B and Fraser, Nicholas J and Gambardella, Giulio and O'Brien, Kenneth and Umuroglu, Yaman and Leeser, Miriam and Vissers, Kees},
  journal={ACM Transactions on Reconfigurable Technology and Systems (TRETS)},
  volume={11},
  number={3},
  pages={1--23},
  year={2018},
  publisher={ACM New York, NY, USA}
}

@inproceedings{finn,
author = {Umuroglu, Yaman and Fraser, Nicholas J. and Gambardella, Giulio and Blott, Michaela and Leong, Philip and Jahre, Magnus and Vissers, Kees},
title = {FINN: A Framework for Fast, Scalable Binarized Neural Network Inference},
booktitle = {Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays},
series = {FPGA '17},
year = {2017},
pages = {65--74},
publisher = {ACM}
}

Old version

We previously released an early-stage prototype of a toolflow that took in Caffe-HWGQ binarized network descriptions and produced dataflow architectures. You can find it in the v0.1 branch in this repository. Please be aware that this version is deprecated and unsupported, and the main branch does not share history with that branch so it should be treated as a separate repository for all purposes.

finn's People

Contributors

adityasrinivas24, alinavalinav, auphelia, azizb-xlnx, dependabot[bot], fionnodonohoe-xlnx, fpjentzsch, giuseppe5, heborras, hleblevec, i-colbert, iksnagreb, jalezeta, jmduarte, jterry-x, linusjungemann, maltanar, mgehre-amd, mmrahorovic, neilkimn, patrickgeel, preusser, quetric, rstar900, shashwat1198, surangamh, timkpaine, tobi-alonso, umav1511, volcacius


finn's Issues

Question about layers of cnv network

Hello,

I have two questions regarding the layers of cnv network.

  1. Why is there a max-pool layer after the last 256-filter convolution layer (conv6)?
    As I understand it, there is no padding in the convolution layers, so the layer input sizes would be: 32x32 (conv1) -> 30x30 (conv2) -> 28x28 (max-pool1) -> 14x14 (conv3) -> 12x12 (conv4) -> 10x10 (max-pool2) -> 5x5 (conv5) -> 3x3 (conv6) -> 1x1 (max-pool3) -> 1x1 (FullyConnected1). Isn't max-pool3 redundant, since there is only one input?

  2. Why is there no Softmax layer at the end? Wouldn't it make the final results more clearly defined?
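The size arithmetic in question 1 can be checked with a short sketch, assuming 3x3 stride-1 convolutions with no padding and 2x2 max-pooling as described above:

```python
# Sketch checking the feature-map sizes in question 1, assuming 3x3
# stride-1 convolutions with no padding and 2x2 max-pooling.
def conv3x3(size):
    return size - 2          # out = in - kernel + 1

def maxpool2x2(size):
    return size // 2

s = 32                       # CIFAR-10 input
for layer in (conv3x3, conv3x3, maxpool2x2,   # conv1, conv2, maxpool1
              conv3x3, conv3x3, maxpool2x2,   # conv3, conv4, maxpool2
              conv3x3, conv3x3):              # conv5, conv6
    s = layer(s)
print(s)  # -> 1: max-pool3 indeed operates on a single 1x1 input
```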

Thank you!

cannot stat file when running the repo synthesis example

When running:

$ python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth

I get the following error:

cp: cannot stat '/tmp/finn-build-DpC6wt/finnaccel/finnaccel.runs/impl_1/procsys_wrapper.bit': No such file or directory
Traceback (most recent call last):
  File "FINN/bin/finn", line 192, in <module>
    process_args(args)
  File "FINN/bin/finn", line 92, in process_args
    generate_hardware(net, dev, gen_bitfile=True)
  File "FINN/bin/finn", line 161, in generate_hardware
    ret.synthesis_bitfile()
  File "/mnt/terabyte/pmousoul_data/sw/FINN/FINN/backend/fpga/backend_util.py", line 71, in synthesis_bitfile
    subprocess.check_call(["sh", self.getBitfileSynthScriptPath()], cwd=self.path)
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sh', '/tmp/finn-build-DpC6wt/make_pynq_bitfile.sh']' returned non-zero exit status 1

Any idea on how to resolve this?

Problem with Vivado

Hello,

I am trying to test the following command:
"python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth"

However I am having problems getting it to run, here are the details from the terminal:
"
Most computationally expensive layer is #1, a FPGABipolarConvThresholdLayer with 57802752 operations.
Achieved fps of 7086.167800 with 11.216347% LUT utilisation and 54.239471% BRAM utilisation
LUTS: 5967/53200, BRAM 151/279
Outputting to: /tmp/finn-build-xbtloK
/tmp/finn-build-xbtloK/make_pynq_bitfile.sh: 11: /tmp/finn-build-xbtloK/make_pynq_bitfile.sh: vivado_hls: not found
/tmp/finn-build-xbtloK/make_pynq_bitfile.sh: 13: /tmp/finn-build-xbtloK/make_pynq_bitfile.sh: vivado: not found
cp: cannot stat '/tmp/finn-build-xbtloK/finnaccel/finnaccel.runs/impl_1/procsys_wrapper.bit': No such file or directory
Traceback (most recent call last):
File "FINN/bin/finn", line 192, in
process_args(args)
File "FINN/bin/finn", line 92, in process_args
generate_hardware(net, dev, gen_bitfile=True)
File "FINN/bin/finn", line 161, in generate_hardware
ret.synthesis_bitfile()
File "/app/FINN/FINN/backend/fpga/backend_util.py", line 71, in synthesis_bitfile
subprocess.check_call(["sh", self.getBitfileSynthScriptPath()], cwd=self.path)
File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sh', '/tmp/finn-build-xbtloK/make_pynq_bitfile.sh']' returned non-zero exit status 1
root@4d94b8fd7427:/app/FINN#
"

From my understanding, this is because Vivado cannot be found on the PATH.
I'm using Ubuntu 18.04 with Vivado 2018.2 installed.

I am very new to Linux, so my lack of knowledge in this area may be the real problem here.

Kind regards

Input data format

Hi,

a while ago I downloaded what is now version 0.1 and finally found some time to have a look at it, only to find that it is more or less deprecated by now. But I do hope you can still help me with my problem...

Some background: I tried to implement the CNV example on a ZED board. To understand how to feed data into the network, I started from the generated HLS code and wrote a little function that reads in test data, creates the input data array, and calls the generated function, all still in software/C++. The program runs without errors; however, the results I read back are always the same numbers, no matter whether I provide real test image data or randomly generated values.

I had a closer look at the code and stumbled across this part in wrapper.h:

// convert to bipolar encoding, mapping {-1, 1} -> {0, 1}
stream<T> FPGABipolarConvThresholdLayer_0_bipolarenc;
ScaleShiftByConstant(singleInStrm, FPGABipolarConvThresholdLayer_0_bipolarenc, 0.500000f, 0.500000f, 3072);
// cast to single input element data type
stream<ap_uint<1> > FPGABipolarConvThresholdLayer_0_iconvert;
Cast(FPGABipolarConvThresholdLayer_0_bipolarenc, FPGABipolarConvThresholdLayer_0_iconvert, 3072);

Can you please explain that? It seems as if the expected input (for CNV 32x32x3=3072 elements) is already constrained to the values -1 and 1, instead of e.g. 8-bit integers. From the papers I would have expected the input layer to handle integers instead of binarised values.

So my question is: What is the input data supposed to look like?
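For what it's worth, the ScaleShiftByConstant/Cast pair quoted above looks like a plain affine re-encoding; a minimal sketch of that reading (an interpretation, not actual FINN code):

```python
# Sketch of one reading of the generated code above: ScaleShiftByConstant
# computes x * 0.5 + 0.5, which maps a bipolar value in {-1, +1} to {0, 1},
# and Cast then truncates the result to a 1-bit ap_uint.
def bipolar_to_binary(x):
    return int(x * 0.5 + 0.5)  # -1 -> 0, +1 -> 1

encoded = [bipolar_to_binary(x) for x in (-1, 1, -1, 1)]
print(encoded)  # -> [0, 1, 0, 1]
```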

P.S.: In case you are wondering, that's the output values I read back each and every time:

Reading output
0: -64
1: 106
2: 16
3: 34
4: -98
5: 86
6: 42
7: -76
8: 218
9: 76

"Example" folder isn't here

Hi, when I try to run make-sw.sh, I get an error because the script contains:

export XILINX_RPNN_ROOT=$FINN_ROOT/backend/fpga

RPNN_PATH=$XILINX_RPNN_ROOT/example

but if you look at the repository, the example folder does not exist.

Vs BNN-PYNQ

What is the difference between BNN-PYNQ and this repository?

What is the current status of this project?

Is there any successful example that can be run on a board?

Thanks

Using the guide from Xilinx

Hi, I'm using this guide/tutorial from Xilinx:

https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18841949/Zynq+UltraScale+MPSoC+Accelerated+Image+Classification+via+Binary+Neural+Network+TechTip

This tutorial uses a CNVW1A1 network for road signs. I'm trying to replace the weights and thresholds with those from another training run, for example the parameters from BNN-PYNQ, but I ran into problems using them.

The parameters in this project (/aic/src_hw/bnn-pynq-mpsoc/params/road-signs) have a different format from those used in the BNN-PYNQ repository (https://github.com/Xilinx/BNN-PYNQ/tree/master/bnn/params/road-signs/cnvW1A1). So, how can I run a training for this tutorial, or reuse the parameters from BNN-PYNQ?

Create the drivers for the bitstream implementation

Hi. I have the final bitstream that I will use to program the FPGA, adapted to the Ultra96. I saw that you have a driver folder in FINN/backend/fpga/driver.
I want to know how I can use it to run FINN on my Ultra96.
Let me know if my question is unclear.
Thanks.

FC

float model using FINN

Hi,

I would like to know what the design flow is when a floating-point model (.prototxt and .caffemodel) is given.
How can I automatically quantize it using HWGQ Caffe?
And then implement it using FINN?

Thanks

Set clock frequency during PYNQ shell integration

Currently, no clock period target is passed to the PYNQ shell integration scripts when creating the Vivado project for building an overlay. This should be made a parameter in the generated ip_config.tcl file, and then the shell creation scripts here must be updated to actually use the specified value.
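As a rough sketch of the fix, the generator could emit one extra variable into ip_config.tcl for the shell scripts to consume (the variable name clk_period_ns and the helper are assumptions, not the actual FINN parameter):

```python
# Sketch: emit a clock-period target into the generated ip_config.tcl so the
# PYNQ shell-creation scripts can read it. The tcl variable name and the
# helper function are illustrative assumptions, not actual FINN code.
def make_ip_config(ip_name, clk_period_ns):
    lines = [
        "set ip_name %s" % ip_name,
        "set clk_period_ns %.2f" % clk_period_ns,  # proposed new parameter
    ]
    return "\n".join(lines)

print(make_ip_config("finn_accel", 10.0))
```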

Python (PYNQ-Z1-Board) driver support for custom network

Hi,
Besides this repository there is another FINN repository (BNN-PYNQ) with an overlay driver for the example networks. According to some issues here, it is possible to use the bitstream with the built-in weights in combination with the Python driver by commenting out the loading of the parameters in the bnn.py module. So far so good.

Currently I have trained my own network (own topology) with HWGQ-Caffe and processed it with FINN, so I now have the bitstream and tcl file belonging to the network. With the standard PYNQ classes I can load these files onto the Zynq of the Z1 board, but without any driver support. In bnn.py, dynamic libraries are loaded; I guess I can't use my network with these libraries, and any attempt to use the existing .so files with my own network was unsuccessful.

Can this repo or the BNN-PYNQ repo be used to generate the required dynamic library using the "make-sw.sh" scripts? Is there a way to automatically generate the necessary PYNQ drivers for any network topologies?

If you have any further questions, please do not hesitate to contact me and I look forward to your response.

How to run inference from python interface

Hi,
I have successfully generated the bitstream and the tcl file on my own by executing the following command (suggested in the guide of this repo):
"python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth"

I thought to load the bitstream on PYNQ-Z1 and try to run inference.
Can I use BNN-PYNQ (https://github.com/Xilinx/BNN-PYNQ), changing the loading of bitstream, to make inference of any network or you provide a python interface, like BNN-PYNQ, to run inference on a network?

Thanks,
Sara

Error while running FINN in estimate mode

Hi,
I am trying to understand the FINN framework and have tried to run the example mentioned on FINN's GitHub page: estimating the performance of the LFC MLP network. I have run into the following error. Could you please help me understand the reason for the error, and also let me know which directory I should look in for the output of the estimate flow? My use case is to get an estimate of the number of LUTs or hardware modules for the example BNN network.

Error :-
jayanth5178@bohr3:~/mywork/finn/FINN$ python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/lfc-w1a1.prototxt --mode=estimate
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format caffe.NetParameter: 32:15: Message type "caffe.LayerParameter" has no field named "quant_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0122 06:25:26.846015 3394 upgrade_proto.cpp:90] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: FINN/inputs/lfc-w1a1.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

FYI: I have cloned the HWGQ repo (https://github.com/zhaoweicai/hwgq) and installed Caffe following the instructions from the Caffe page (http://caffe.berkeleyvision.org/installation.html). Please let me know if my installation procedure is wrong and whether I should follow other steps in order to be compatible with FINN.

Can you share with us the roadmap of this FINN?

I am working on a network with bias, and it seems that the current compiler does not support convolution with bias.

If possible, could I have some documentation on which layers are currently supported and their corresponding limitations?

installation instruction ambiguity

It's not clear to me whether "Install HWGQ Caffe in the same directory as FINN: https://github.com/zhaoweicai/hwgq" means my directories should be organized as user/FINN and user/hwgq, or as user/FINN/hwgq... Please advise. In the meantime I will attempt to figure this out by trial and error. I am using an LXD container, if that matters.

reference design implementation failed

When I used "python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth" to try the framework with Vivado 2018.2, it reported a place_design error.
log:
Phase 1.2 IO Placement/ Clock Placement/ Build Placer Device
ERROR: [Place 30-640] Place Check : This design requires more RAMB36/FIFO cells than are available in the target device. This design requires 144 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [Place 30-640] Place Check : This design requires more RAMB18 and RAMB36/FIFO cells than are available in the target device. This design requires 384 of such cell types but only 280 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [Place 30-640] Place Check : This design requires more RAMB36E1 cells than are available in the target device. This design requires 144 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
INFO: [Timing 38-35] Done setting XDC timing constraints.
Phase 1.2 IO Placement/ Clock Placement/ Build Placer Device | Checksum: 1497648bb

Time (s): cpu = 00:00:22 ; elapsed = 00:00:15 . Memory (MB): peak = 3569.887 ; gain = 0.000 ; free physical = 7360 ; free virtual = 478279
Phase 1 Placer Initialization | Checksum: 1497648bb

Time (s): cpu = 00:00:22 ; elapsed = 00:00:15 . Memory (MB): peak = 3569.887 ; gain = 0.000 ; free physical = 7361 ; free virtual = 478279
ERROR: [Place 30-99] Placer failed with error: 'Implementation Feasibility check failed, Please see the previously displayed individual error or warning messages for more details.'
Please review all ERROR, CRITICAL WARNING, and WARNING messages during placement to understand the cause for failure.
Ending Placer Task | Checksum: 1497648bb

Time (s): cpu = 00:00:22 ; elapsed = 00:00:15 . Memory (MB): peak = 3569.887 ; gain = 0.000 ; free physical = 7367 ; free virtual = 478285
INFO: [Common 17-83] Releasing license: Implementation
60 Infos, 21 Warnings, 0 Critical Warnings and 5 Errors encountered.
place_design failed
ERROR: [Common 17-69] Command failed: Placer could not place all instances
INFO: [Common 17-206] Exiting Vivado at Mon Apr 22 05:28:05 2019...
[Mon Apr 22 05:28:06 2019] impl_1 finished
wait_on_run: Time (s): cpu = 00:13:35 ; elapsed = 00:15:15 . Memory (MB): peak = 1694.383 ; gain = 0.000 ; free physical = 7590 ; free virtual = 478508
INFO: [Common 17-206] Exiting Vivado at Mon Apr 22 05:28:06 2019...
cp: cannot stat '/tmp/finn-build-7wMqOe/finnaccel/finnaccel.runs/impl_1/procsys_wrapper.bit': No such file or directory
Traceback (most recent call last):
File "FINN/bin/finn", line 192, in
process_args(args)
File "FINN/bin/finn", line 92, in process_args
generate_hardware(net, dev, gen_bitfile=True)
File "FINN/bin/finn", line 161, in generate_hardware
ret.synthesis_bitfile()
File "/app/FINN/FINN/backend/fpga/backend_util.py", line 71, in synthesis_bitfile
subprocess.check_call(["sh", self.getBitfileSynthScriptPath()], cwd=self.path)
File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sh', '/tmp/finn-build-7wMqOe/make_pynq_bitfile.sh']' returned non-zero exit status 1

Post-synthesis resource report:

Resource  Utilization  Available  Utilization %
LUT       40210        53200      75.58
LUTRAM    2148         17400      12.34
FF        49600        106400     46.62
BRAM      192          140        137.14
DSP       38           220        17.27

By the way, FINN/bin/finn indicates that it supports the VU9P device, which as far as I know has more BRAM, but when I tried to use this parameter it didn't work. Vivado HLS created the VU9P BlackBoxJam IP successfully, but "FINN/backend/fpga/scripts/make-pynq-vivado-proj.tcl" only supports pynqz1.

run-docker.sh cannot run if UNAME contains uppercase characters

A very small issue, but when running sh run-docker.sh, it failed because my username contains uppercase characters.

invalid argument "finn_AdminCOOP" for --tag=finn_AdminCOOP: Error parsing reference: "finn_AdminCOOP" is not a valid repository/tag: repository name must be lowercase
See 'docker build --help'.
C:\Program Files\Docker\Docker\Resources\bin\docker.exe: Error parsing reference: "finn_AdminCOOP" is not a valid repository/tag: repository name must be lowercase.

I fixed it by simply setting DOCKER_UNAME=admin_coop in run-docker.sh. Perhaps this case should be caught by run-docker.sh?
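A possible guard, sketched here with illustrative names (taken from the error message, not necessarily the script itself), would be to lowercase the username before building the tag, since Docker repository names must be lowercase:

```python
# Sketch of a guard for the tag construction in run-docker.sh: Docker
# repository names must be lowercase, so normalize the username first.
# The function and tag prefix are illustrative assumptions.
def docker_tag_from_uname(uname):
    return "finn_" + uname.lower()

tag = docker_tag_from_uname("AdminCOOP")
print(tag)  # -> finn_admincoop
assert tag == tag.lower()
```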

Nice work and would like to explore more

Hi,

I have some question on this repo

  1. Which vivado version is tested/supported?
  2. Training should be done using HWGQ or ordinary Caffe?
  3. How to simulate the inference result before going to long synthesis cycle?
  4. While working on cnv with vu9p in vivado 2018.2, the compile script failed.
  5. Not enough BRAM when running cnv with pynqz1 in vivado 2018.2

Thanks

What level of precision does FINN support?

Hi, I'm wondering what level of bit precision this version of FINN supports. I've noticed that a lot of the templates for various layer types are parameterized by the precision, but I recall reading in FINN-R that arbitrary bit precision was a new contribution and that code hasn't been released yet, implying that this FINN shouldn't support arbitrary precision.

Building the SDx

Hi. I'm following the steps in FINN/FINN/backend/fpga/README.md. I'm building for SDx by executing these steps:

  1. " ./make-sw.sh lfc-max bnn hlsweights" choosing HLS (h) [OK]

  2. "./example/lfc-max_bnn_hlsweights/rawhls-lfc-max" choosing 10 and 1 as test. [OK]

  3. "./make-sw.sh lfc-max bnn sdx " choosing SDx (x) [ERROR]

HLS Simulation or host for HW accelerator or SDx (S/H/X)? x
Building host
0
Building xclbin
make SDA_FLOW=hw run_hw_int -f sdaccel.mk
make[1]: Entering directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
make SDA_FLOW=hw xbin -f sdaccel.mk
make[2]: Entering directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
make[2]: Nothing to be done for 'xbin'.
make[2]: Leaving directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
source setup.sh;/ ../BlackBoxJam_hw.xclbin
**/bin/bash: setup.sh: No such file or directory**
/bin/bash: /: Is a directory
common.mk:106: recipe for target 'run_hw_int' failed
make[1]: *** [run_hw_int] Error 126
make[1]: Leaving directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
common.mk:100: recipe for target 'run_hw' failed
make: *** [run_hw] Error 2
Output file: /home/franco/Downloads/GIT/FINN/FINN/backend/fpga/example/lfc-max_bnn_sdx//sdx-lfc-max
Done!

So I tried commenting out line 104 in /FINN/FINN/backend/fpga/scripts/common.mk, because I don't see any setup.sh in the folders; I also checked some commits (I do source the SDx settings from /opt/Xilinx/SDxSoc/SDx/2018.2/settings64.sh):

run_hw_int : host xbin_hw
	source ${BOARD_SETUP_FILE};${HOST_EXE_DIR}/${HOST_EXE} ${HOST_ARGS}

Without that line I got:

HLS Simulation or host for HW accelerator or SDx (S/H/X)? x
Building host
0
Building xclbin
make SDA_FLOW=hw run_hw_int -f sdaccel.mk
make[1]: Entering directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
make SDA_FLOW=hw xbin -f sdaccel.mk
make[2]: Entering directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
make[2]: Nothing to be done for 'xbin'.
make[2]: Leaving directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
make[1]: Leaving directory '/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/scripts'
Output file: /home/franco/Downloads/GIT/FINN/FINN/backend/fpga/example/lfc-max_bnn_sdx//sdx-lfc-max
Done!

But when I run the executable:

`
fpga git:(f4e48af) ✗ ./example/lfc-max_bnn_sdx/sdx-lfc-max
Building network...
Setting network weights and thresholds in accelerator...
Enter number of images from MNIST test set to check:
10
Running prebinarized test for 10 images...
Enter number of times to repeat test:
1
ERROR: No devices found
Failed to detect platform
[1] 10191 segmentation fault (core dumped) ./example/lfc-max_bnn_sdx/sdx-lfc-max

`

Implementation fail using the CNV network

Hi, I ran the estimate and synthesis for the CNV network as the README.md says at line 28:
python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth

But the implementation failed, and as I'm new to this I can't find my error.
(I have Caffe installed, I ran its tests, runtest, etc., and I have the other prerequisites.)

Here is the terminal text:
Time (s): cpu = 00:00:39 ; elapsed = 00:00:22 . Memory (MB): peak = 3556.273 ; gain = 0.000 ; free physical = 5047 ; free virtual = 20775
INFO: [Common 17-83] Releasing license: Implementation
60 Infos, 21 Warnings, 0 Critical Warnings and 5 Errors encountered.
place_design failed
ERROR: [Common 17-69] Command failed: Placer could not place all instances
INFO: [Common 17-206] Exiting Vivado at Tue Oct 30 12:48:13 2018...
[Tue Oct 30 12:48:14 2018] impl_1 finished
wait_on_run: Time (s): cpu = 00:21:10 ; elapsed = 00:22:14 . Memory (MB): peak = 1659.152 ; gain = 0.000 ; free physical = 5081 ; free virtual = 20809
INFO: [Common 17-206] Exiting Vivado at Tue Oct 30 12:48:14 2018...
cp: cannot stat '/tmp/finn-build-v4KuS_/finnaccel/finnaccel.runs/impl_1/procsys_wrapper.bit': No such file or directory
Traceback (most recent call last):
  File "FINN/bin/finn", line 192, in <module>
    process_args(args)
  File "FINN/bin/finn", line 92, in process_args
    generate_hardware(net, dev, gen_bitfile=True)
  File "FINN/bin/finn", line 161, in generate_hardware
    ret.synthesis_bitfile()
  File "/home/franco/Downloads/GIT/FINN/FINN/backend/fpga/backend_util.py", line 71, in synthesis_bitfile
    subprocess.check_call(["sh", self.getBitfileSynthScriptPath()], cwd=self.path)
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sh', '/tmp/finn-build-v4KuS_/make_pynq_bitfile.sh']' returned non-zero exit status 1

I don't want to make this a huge post, so the full implementation log is posted at the link below.
Full implementation log:
https://textuploader.com/dw3uo

HWGQ Caffe is necessary for deployment?

Hi,
I would use your framework only for running inference on networks. Is it necessary to install HWGQ Caffe? I'm just going to use prototxt and caffemodel provided.
Thanks,
Sara

Check for PE_(L-1) = SIMD(L) requirement

The current implementation of IP stitching assumes the widths of all internal streams are equal, which implicitly requires that the PE count of layer L-1 equals the SIMD count of layer L. The stitching will silently fail otherwise; this should be caught during the transform with an assertion.
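The requested check could look roughly like this (a sketch with illustrative layer records; the real transform would walk the model graph):

```python
# Sketch of the requested assertion: with equal internal stream widths,
# each layer's PE count must equal the next layer's SIMD count. The dict
# representation of layers is illustrative, not FINN's model format.
def check_stream_widths(layers):
    for i in range(len(layers) - 1):
        pe, simd = layers[i]["PE"], layers[i + 1]["SIMD"]
        assert pe == simd, (
            "Stream width mismatch: layer %d has PE=%d but layer %d has SIMD=%d"
            % (i, pe, i + 1, simd)
        )

# Consistent pipeline: PE of layer L-1 matches SIMD of layer L.
check_stream_widths([{"PE": 4, "SIMD": 3}, {"PE": 8, "SIMD": 4}, {"PE": 1, "SIMD": 8}])
```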

Example network cnv-w1a1 fails on BNN-PYNQ

Hello,

I have tried to implement the example convolution network (cnv-w1a1.caffemodel) on PYNQ, but it always fails.
I used the BNN-PYNQ project (commits: 1365efa233fa8a66b1c9e68e2f1d00e3bf834f8b and 2fe7f6ad3968bcb4b496203b3e7e73e325d3b4a2) to deploy the .bit file created by FINN.
Steps to reproduce:

  1. Start PYNQ board
  2. Comment out line self.interface.load_parameters(params.encode()) in bnn/bnn.py of BNN-PYNQ.
  3. Replace bnn/bitstreams/cnv-pynq-pynq.bit (or CNV-W1A1-pynq-Z1-Z2 in latest releases) with FINN generated .bit file
  4. Start Cifar10 notebook example
  5. Launch BNN in hardware
  6. Change pictures in Launch BNN in hardware

The result for every picture is always "Deer".

Use smaller accumulators

Currently, FINN always uses (U)INT32 accumulators when instantiating HLS layers. Since the accumulator width is configurable, we should instead generate code that instantiates accumulators that are smaller but still wide enough to cover the dynamic range of the dot product. This will save resources in synthesis.
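The minimum width follows from the dot product's dynamic range; a sketch of the calculation, assuming unsigned weights and activations (the signed case needs an adjustment, and the helper name is illustrative, not FINN's API):

```python
import math

# Sketch: smallest accumulator width that covers the dynamic range of a
# dot product of n terms with unsigned wbits-bit weights and abits-bit
# activations. Assumes unsigned operands and no bias; the helper name is
# illustrative, not FINN's API.
def acc_bits(n, wbits, abits):
    max_sum = n * (2 ** wbits - 1) * (2 ** abits - 1)
    return max(1, math.ceil(math.log2(max_sum + 1)))

# A 1024-term binary (1-bit x 1-bit) dot product needs only 11 bits,
# far less than the 32-bit accumulators currently instantiated.
print(acc_bits(1024, 1, 1))  # -> 11
```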

Difference between the two tests

Hi,

I have a question, can you please explain to me what's the difference exactly between the FPGA Deployment (CIFAR 10 Layer) and the FPGA Deployment (Quick Test Function)? Also the same question for LeNet.

Yours sincerely.

Failed to parse NetParameter file

Hello
When I tried to synthesize one of the network examples
root@d9322fb83d76:/app/FINN# python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/sfc-w1a1.prototxt --caffemodel=FINN/inputs/sfc-w1a1.caffemodel --mode=synth

I got that error

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0227 14:36:23.989655 52 upgrade_proto.cpp:95] Check failed: ReadProtoFromBinaryFile(param_file, param) Failed to parse NetParameter file: FINN/inputs/sfc-w1a1.caffemodel
*** Check failure stack trace: ***
Aborted (core dumped)

Any recommendations ?
Note that the Vivado tools are installed.

New network and overlay deployment on PYNQ

Hi,

I have modified the existing CIFAR-10 network to detect 5 new classes and generated a .bit file.
How can I deploy the network on PYNQ?

I have executed make_pynq_standalone_exe.sh on PYNQ board however it returned this error:
/usr/lib/gcc/arm-linux-gnueabihf/5/../../../arm-linux-gnueabihf/crt1.o: In function `_start': (.text+0x28): undefined reference to `main'
collect2: error: ld returned 1 exit status
I added the -c option at the end of make_pynq_standalone_exe.sh, but that only produces the on-device object file.

YOLO accelerated with FINN

Dear FINN developers,

I am trying to accelerate the YOLO network (this one: https://github.com/xiaohu2015/DeepLearning_tutorials/blob/master/ObjectDetections/yolo/yolo_tf.py) on a PYNQ FPGA to achieve real-time object classification.
I found the FINN framework and am very excited about it. However, I am not sure how exactly I can implement the accelerator, and whether it is possible with FINN for the mentioned network.

Could you please suggest the way I can tackle this?

P.S. I don't have much experience in working with NN, so if you could answer it in simple terms, I very appreciate this.

Thanks in advance,
Aliaksei

Check Vivado version

FINN has a Vivado version requirement, e.g. 2019.1 for the 0.2b release. The available Vivado version should be checked before any Vivado-related commands are launched, and an assertion should be raised if there is a version mismatch.
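A version guard could be sketched like this (illustrative only; both the path convention and the helper name are assumptions, and a real check might instead query the vivado binary itself):

```python
import re

# Sketch of the requested guard: parse the version out of the Vivado install
# path (e.g. /opt/Xilinx/Vivado/2019.1) and assert it matches what FINN
# expects. Both the path convention and the helper name are assumptions.
def assert_vivado_version(expected, vivado_path):
    m = re.search(r"(\d{4}\.\d)", vivado_path)
    found = m.group(1) if m else None
    assert found == expected, (
        "FINN requires Vivado %s but found %s" % (expected, found)
    )

assert_vivado_version("2019.1", "/opt/Xilinx/Vivado/2019.1")  # passes silently
```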

HLS-Code

I cannot make sense of this macro definition:

#define CASSERT_DATAFLOW(x) ;

I searched the internet and did not find an answer to what this does exactly.

Thank you for your help.

Kind regards

Non-int16 thresholding not yet supported

Hi,

I am working on a custom network and I ran into the non-int16 problem.

My question is: what is the relationship between the prototxt and the thresholding bitwidth?
As far as I understand, the only parameters I can play with are num_output and kernel_size.
How is the bitwidth of the threshold value calculated?

Thanks

Multilabel Classification using FINN

Hi,

I would like to obtain some help in training a custom network with SigmoidCrossEntropyLoss as loss function.
It seems that the BinaryInnerProduct layer cannot feed into the SigmoidCrossEntropyLoss layer.

Thanks

XNOR-popcount

Hello,

I have a basic question about the implementation of the dot product. I am trying to understand the XNOR-popcount operation using an example.

1 -1 1 -1 1 1
1 1 -1 1 1 -1

The result should be -2. By simply using XNOR and popcount I get a result of 2.

I found the formula
2*p - N
which, applied to the XNOR-popcount result, gives exactly the expected -2. Did I overlook something in the FINN paper? I could not find an explanation there. Maybe I misunderstood something completely.
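The 2*p - N mapping can be sanity-checked with a small Python sketch (the function name is illustrative, not FINN code):

```python
def xnor_popcount_dot(a, b):
    """Dot product of two bipolar (+1/-1) vectors via XNOR-popcount."""
    assert len(a) == len(b)
    n = len(a)
    # encode bipolar -> binary: +1 -> 1, -1 -> 0
    a_bits = [(v + 1) // 2 for v in a]
    b_bits = [(v + 1) // 2 for v in b]
    # XNOR is 1 where the bits agree; popcount p counts those positions
    p = sum(1 for x, y in zip(a_bits, b_bits) if x == y)
    # each matching position contributes +1 to the dot product, each
    # mismatching one contributes -1, hence dot = p - (n - p) = 2*p - n
    return 2 * p - n


a = [1, -1, 1, -1, 1, 1]
b = [1, 1, -1, 1, 1, -1]
print(xnor_popcount_dot(a, b))           # -2
print(sum(x * y for x, y in zip(a, b)))  # -2 (direct dot product, for comparison)
```

So the raw popcount of 2 is not the dot product itself; the 2*p - N correction is what recovers the signed bipolar result.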

Thank you for your help

Kind regards

Better structure for generated PYNQ driver code

Currently, the generated PYNQ drivers are quite rudimentary, implementing an imperative program that flashes the bitfile, reads in input.npy, calls the packing functions, launches the DMA, unpacks the output, then writes output.npy.

The generated driver should be better organized and more modular. The driver itself should have key functionality (init/bitfile flashing, data packing/unpacking, DMA launch) organized into separate callable functions. The code that handles user-level I/O (e.g. for the current remote execution protocol with input.npy/output.npy) should be generated as a separate file.

This will permit more flexible use of the generated driver, e.g. flashing the bitfile only once, time measurements including or excluding data (un)packing, and so on.
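A rough sketch of what such a modular driver could look like. The class and method names are purely illustrative, and all hardware interaction is stubbed out so the sketch stays runnable off-board:

```python
class FINNAccelDriver:
    """Hypothetical modular PYNQ driver; names are illustrative and the
    hardware calls are stubbed so the sketch runs without a board."""

    def __init__(self, bitfile):
        # a real driver would flash the bitfile exactly once here,
        # e.g. via pynq.Overlay(bitfile)
        self.bitfile = bitfile

    def pack_input(self, ibuf):
        # stand-in for the bit-packing step the generated driver performs
        return list(ibuf)

    def unpack_output(self, obuf):
        # stand-in for the corresponding unpacking step
        return list(obuf)

    def execute(self, ibuf):
        # a real driver would copy packed data into a DMA buffer,
        # start the transfer and wait for completion
        return self.unpack_output(self.pack_input(ibuf))


# user-level I/O (reading input.npy, writing output.npy) would live in a
# separate generated file, so timing can include or exclude (un)packing
driver = FINNAccelDriver("resizer.bit")
result = driver.execute([0] * 784)
```

With this split, the bitfile is flashed once at construction time, and callers can time `execute` with or without the packing steps.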

Non-int16 thresholding not yet supported

Hello,

I have been trying some simple experiments: creating other networks using the hwgq tool and then asking FINN to create hardware for them, but I am not getting very far. For example, with a binarized AlexNet from hwgq I get: Exception("Threshold does not preserve max")

To try something simple, I took the provided CIFAR CNV W1A1 example and increased the width by doubling the number of channels in each layer. This fits perfectly in a VU9P device, but when running:
python FINN/bin/finn --device=vu9p --prototxt=FINN/inputs/alexnet2.prototxt --mode=estimate
I get:
Exception: Non-int16 thresholding not yet supported.
Can someone clarify whether anything can be done about these errors?

Error while synthesizing lenet-hwgq-w1a2

Hi,

I am trying to run python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/lenet-hwgq-w1a2.prototxt --caffemodel=FINN/inputs/lenet-hwgq-w1a2.caffemodel --mode=synth, but it gives two errors while synthesizing:

error: no matching function for call to 'ConvLayerMMV_BNN_Batch'

error: no matching function for call to 'MatrixVector_Precision_Batch'

Any ideas how to solve this?

Thank You!

Release date of FINN-R

Hello,

Would you be able to give an estimated release date for the second generation of FINN?

Kind regards

finn

Hello,
How does FINN transform a CNN into an FPGA circuit structure?

Model simulation fails during synthesis

Hello,
I tried running https://github.com/Xilinx/FINN/blob/master/FINN/test/test_hwgq_cnv_w1a1.py to simulate my model. During the synthesis part I get the following error:
$ python FINN/test/test_hwgq_cnv_w1a1.py
test_fpgabackend_rawhls (main.TestHWGQCNVw1a1) ... [ConvolutionLayer, LinearLayer, PoolingLayer, LinearLayer, LinearLayer, BipolarThresholdingLayer, ConvolutionLayer, LinearLayer, PoolingLayer, LinearLayer, LinearLayer, BipolarThresholdingLayer, ConvolutionLayer, LinearLayer, LinearLayer, LinearLayer, BipolarThresholdingLayer, FullyConnectedLayer, LinearLayer, LinearLayer, LinearLayer, BipolarThresholdingLayer, FullyConnectedLayer, LinearLayer, LinearLayer, LinearLayer, BipolarThresholdingLayer, FullyConnectedLayer, LinearLayer, LinearLayer, LinearLayer]
[ConvolutionLayer, BipolarThresholdingLayer, PoolingLayer, ConvolutionLayer, BipolarThresholdingLayer, PoolingLayer, ConvolutionLayer, BipolarThresholdingLayer, FullyConnectedLayer, BipolarThresholdingLayer, FullyConnectedLayer, BipolarThresholdingLayer, FullyConnectedLayer, LinearLayer]
Synthesising
ERROR

======================================================================
ERROR: test_fpgabackend_rawhls (main.TestHWGQCNVw1a1)

Traceback (most recent call last):
File "FINN/test/test_hwgq_cnv_w1a1.py", line 90, in test_fpgabackend_rawhls
hlslayers, self.net, self.dev, res_alloc_predetermined, dirpath, "sfcall-")
File "/home/steven/FINN/FINN/backend/fpga/backend_fpga.py", line 363, in synthesize
pipeline = convert(pipeline_in, net, dev, res_alloc, pipeline_ibits)
File "/home/steven/FINN/FINN/backend/fpga/backend_fpga.py", line 340, in convert
pipeline = prepare(pipeline, pipeline_ibits)
File "/home/steven/FINN/FINN/backend/fpga/backend_fpga.py", line 163, in prepare
pipeline = trns.apply_repeated(pipeline, passConvertToFPGALayers)
File "/home/steven/FINN/FINN/transforms/transformations.py", line 48, in apply_repeated
(ret, numChanges) = pass_to_apply(ret)
File "/home/steven/FINN/FINN/backend/fpga/backend_fpga.py", line 76, in passConvertToFPGALayers
raise Exception("Unsupported layer type in FPGA backend: %s" % L.get_type())
Exception: Unsupported layer type in FPGA backend: BipolarThresholdingLayer


The same error occurs if I try to simulate the cnv-w1a1 model. However, both models synthesize successfully and produce .bit files when FINN is called with --mode=synth.

Streamline input name mismatch in transformed matmul nodes

The Streamline transform generates strange input names for the transformed MatMul nodes; those tensors are missing from the graph, leaving the network disconnected.
For instance, the following code generates a disconnected graph:

    lfc = LFC(weight_bit_width=1, act_bit_width=1, in_bit_width=1)
    checkpoint = torch.load(trained_lfc_checkpoint, map_location="cpu")
    lfc.load_state_dict(checkpoint["state_dict"])
    bo.export_finn_onnx(lfc, (1, 1, 28, 28), export_onnx_path)
    model = ModelWrapper(export_onnx_path)
    model = model.transform(InferShapes())
    model = model.transform(FoldConstants())
    model = model.transform(GiveUniqueNodeNames())
    model = model.transform(GiveReadableTensorNames())
    model = model.transform(Streamline())

the corresponding net shown in netron is:
(Netron screenshot showing the disconnected graph)

The problem seems to happen in BatchNormToAffine().

Fresh docker install: synthesis example fails

Hello,

I have just installed the Docker image, strictly following the instructions, and then installed Vivado 2018.3 from the command line (https://www.xilinx.com/support/answers/70452.html).

Running the estimate command example works fine (python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/lfc-w1a1.prototxt --mode=estimate) but the synthesis one fails as shown here below:

root@1c985b9d0fd9:/app/FINN# python FINN/bin/finn --device=pynqz1 --prototxt=FINN/inputs/cnv-w1a1.prototxt --caffemodel=FINN/inputs/cnv-w1a1.caffemodel --mode=synth
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0430 12:38:47.166438    36 upgrade_proto.cpp:95] Check failed: ReadProtoFromBinaryFile(param_file, param) Failed to parse NetParameter file: FINN/inputs/cnv-w1a1.caffemodel
*** Check failure stack trace: ***
Aborted (core dumped)

I performed this process twice to be sure I did not do something wrong, but the problem persists. What could the problem be?

Thanks

Do not instantiate DWCs in PYNQ shell if not required

The Xilinx AXI stream data width converter IPs raise an error if the widths of the input and output streams are the same. This causes FINN designs whose input or output width equals the DMA stream width to fail during PYNQ shell Vivado project generation, e.g. when SIMD is set to 16 for the first layer of the tfc_w1a2 test. The shell stitcher script should detect this condition and not instantiate any DWCs in that case.

Note: not having DWCs may raise endianness issues in the generated driver, needs to be checked.
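The condition itself is simple to express; a minimal sketch of the proposed check (the function name is illustrative, not part of the stitcher script):

```python
def needs_dwc(stream_width, dma_width):
    # the Xilinx AXI-S data width converter errors out when input and
    # output widths are equal, so only instantiate one when they differ
    return stream_width != dma_width


print(needs_dwc(16, 64))   # True  -> instantiate a DWC
print(needs_dwc(64, 64))   # False -> connect the streams directly
```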

Check status after script/command execution

Many transformations and utility functions in FINN call external commands such as g++, vivado and vivado_hls without checking the return status. If the called command fails, the FINN transformation won't be aware of this. Assertions should be introduced on the return status whenever an external command is called.
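A minimal sketch of such a wrapper (the helper name is hypothetical; FINN itself may structure this differently):

```python
import subprocess


def run_checked(cmd, workdir="."):
    """Run an external command and assert on its return status.

    A sketch of the proposed check; any non-zero exit code raises an
    AssertionError carrying the command's stderr for debugging.
    """
    result = subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)
    assert result.returncode == 0, (
        "Command %s failed with code %d:\n%s"
        % (cmd, result.returncode, result.stderr)
    )
    return result.stdout


print(run_checked(["echo", "ok"]))  # prints "ok"
```

Routing calls to g++, vivado and vivado_hls through a wrapper like this would surface failures at the point they occur instead of much later in the flow.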

LSTM with the new version of FINN

Hi,
I'm currently trying to implement an LSTM on an FPGA using the "old" version (0.1) of FINN. I'm happy to see that the new version is on its way, but I was wondering whether it will support LSTM networks. Having looked at brevitas and finn-hlslib, there do not seem to be any references to LSTM or RNN networks in general.

Do you plan to support RNN / LSTM networks in the future?

Thank you for your time,

Phillip Geier

Add ModelWrapper methods to get network global input and output tensor names

Currently, when we need the names of the global input/output tensors in the graph (e.g. where the input image is fed in, and where the classification outputs are read out) we use code like this:

global_in = model.graph.input[0].name
global_out = model.graph.output[0].name

Instead, we should have helper methods in ModelWrapper similar to:

global_in = model.get_global_in()
global_out = model.get_global_out()
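A sketch of what these helpers might look like (method names as proposed above; the stand-in graph object only mimics an ONNX GraphProto for demonstration):

```python
from types import SimpleNamespace


class ModelWrapperSketch:
    """Sketch of the proposed helpers; not part of the current API."""

    def __init__(self, graph):
        self.graph = graph

    def get_global_in(self):
        # name of the graph-level input tensor (where the image is fed in)
        return self.graph.input[0].name

    def get_global_out(self):
        # name of the graph-level output tensor (where results are read out)
        return self.graph.output[0].name


# stand-in for an ONNX GraphProto, just to make the sketch runnable
graph = SimpleNamespace(
    input=[SimpleNamespace(name="global_in")],
    output=[SimpleNamespace(name="global_out")],
)
model = ModelWrapperSketch(graph)
print(model.get_global_in(), model.get_global_out())  # global_in global_out
```

Centralizing the lookup in ModelWrapper would also prevent exactly the input/output copy-paste mistakes that the raw-access pattern invites.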

Annotate resources on nodes after synthesis

After synthesis is complete, pull out per-layer resource utilization from the synthesized Vivado project. This can be implemented as a new analysis pass, or the resource counts can be annotated on the ONNX nodes as attributes.
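One possible shape for the annotation step, sketched with plain dicts standing in for ONNX nodes (the function name and report format are assumptions):

```python
def annotate_resources(nodes, util_report):
    """Hypothetical analysis-pass sketch: attach per-layer resource counts
    onto node metadata.

    `nodes` is a list of dicts standing in for ONNX NodeProtos; real code
    would set node attributes instead. `util_report` maps layer name to
    resource counts, e.g. as parsed from a Vivado utilization report.
    """
    for node in nodes:
        res = util_report.get(node["name"])
        if res is not None:
            node.setdefault("attributes", {})["resources"] = res
    return nodes


layers = [{"name": "StreamingFCLayer_0"}, {"name": "StreamingFCLayer_1"}]
report = {"StreamingFCLayer_0": {"LUT": 1200, "BRAM": 4}}
annotate_resources(layers, report)
```

Layers missing from the report are simply left unannotated, which keeps the pass safe to run on partially synthesized designs.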

output.npy not found in End-to-end-flow notebook

Under the section Deployment and Remote Execution, at

import numpy as np
from finn.core.onnx_exec import execute_onnx
iname = parent_model.graph.input[0].name
oname = parent_model.graph.output[0].name
ishape = parent_model.get_tensor_shape(iname)
input_dict = {iname: x.reshape(ishape)}
ret = execute_onnx(parent_model, input_dict, True)

I got the following error:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-63-30954cff648b> in <module>
      5 ishape = parent_model.get_tensor_shape(iname)
      6 input_dict = {iname: x.reshape(ishape)}
----> 7 ret = execute_onnx(parent_model, input_dict, True)

/workspace/finn/src/finn/core/onnx_exec.py in execute_onnx(model, input_dict, return_full_exec_context)
    142         # topologically sorted
    143         for node in graph.node:
--> 144             execute_node(node, execution_context, graph)
    145     elif model_exec_mode == "remote_pynq":
    146         # use remote exec metadata built into model to execute on a remote PYNQ

/workspace/finn/src/finn/core/onnx_exec.py in execute_node(node, context, graph)
     47         sdp_node = getCustomOp(node)
     48         model = ModelWrapper(sdp_node.get_nodeattr("model"))
---> 49         ret = execute_onnx(model, context, True)
     50         context.update(ret)
     51     else:

/workspace/finn/src/finn/core/onnx_exec.py in execute_onnx(model, input_dict, return_full_exec_context)
    145     elif model_exec_mode == "remote_pynq":
    146         # use remote exec metadata built into model to execute on a remote PYNQ
--> 147         remote_exec(model, execution_context)
    148     elif model_exec_mode == "rtlsim":
    149         # use stitched IP for rtlsim

/workspace/finn/src/finn/core/remote_exec.py in remote_exec(model, execution_context)
     51     process_compile = subprocess.Popen(bash_command, stdout=subprocess.PIPE)
     52     process_compile.communicate()
---> 53     outp = np.load("{}/output.npy".format(deployment_dir))
     54     execution_context[model.graph.output[0].name] = outp

/opt/conda/lib/python3.6/site-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
    420         own_fid = False
    421     else:
--> 422         fid = open(os_fspath(file), "rb")
    423         own_fid = True
    424 

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/finn_own3d/pynq_deployment_473cmbj1/output.npy'

I can see that the notebook is connected to the PYNQ board, since ! sshpass -p {password} ssh {username}@{ip} 'ls -l {target_dir}/*' lists the directory contents. When I look into the PYNQ directory, output.npy is indeed absent. I also read that output.npy should be generated by driver.py.
