
hls4ml-tutorial's Introduction

hls4ml-tutorial: Tutorial notebooks for hls4ml


There are several ways to run the tutorial notebooks:

Online

Binder

Conda

The Python environment used for the tutorials is specified in the environment.yml file. It can be set up like this:

conda env create -f environment.yml
conda activate hls4ml-tutorial

Docker without Vivado

Pull the prebuilt image from the GitHub Container Registry:

docker pull ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0:latest

Or build the image yourself (without Vivado) for use locally or on a JupyterHub instance. You can build directly from the repository URL:

docker build https://github.com/fastmachinelearning/hls4ml-tutorial -f docker/Dockerfile

Alternatively, you can clone the repository and build locally:

git clone https://github.com/fastmachinelearning/hls4ml-tutorial
cd hls4ml-tutorial
docker build -f docker/Dockerfile -t ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0:latest .

Then to start the container:

docker run -p 8888:8888 ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0:latest

When the container starts, the Jupyter notebook server is started, and the link to open it in your browser is printed. You can clone the repository inside the container and run the notebooks.

Docker with Vivado

Pull the prebuilt image from the GitHub Container Registry:

docker pull ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0-vivado-2019.2:latest

To build the image with Vivado, run (Warning: takes a long time and requires a lot of disk space):

docker build -f docker/Dockerfile.vivado -t ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0-vivado-2019.2:latest .

Then to start the container:

docker run -p 8888:8888 ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0-vivado-2019.2:latest

Companion material

We have prepared a set of slides with some introduction and more details on each of the exercises. Please find them here.

Notebooks

hls4ml-tutorial's People

Contributors

dependabot[bot], hftsoi, jmduarte, jmitrevs, minnivan, msneubauer, nhanvtran, pre-commit-ci[bot], thaarres, thesps


hls4ml-tutorial's Issues

hls4ml doesn't work for me

My model is:

model = Sequential()
model.add(BatchNormalization(input_shape=(1408, 1)))
model.add(Conv1D(3, kernel_size=100, strides=2))
model.add(Activation("relu"))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Conv1D(50, 10))
model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(Activation("relu"))
model.add(Conv1D(30, 30))
model.add(MaxPooling1D(pool_size=2))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dropout(0.25))
model.add(Dense(4, activation='softmax'))

and my hls4ml configuration:

config = hls4ml.utils.config_from_keras_model(model, granularity='model')

print("-----------------------------------")
print("Configuration")
print_dict(config)
print("-----------------------------------")

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='model_1/hls4ml_prj_2',
    part='xcvu9p-flgb2104-2-i',
    io_type='io_stream',
)

hls_model.build(csim=False)

The build fails with this error:

[XFORM 203-504] Stop unrolling loop

Why does this happen? Any help would be appreciated.
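
One commonly suggested workaround for this kind of unrolling error is to give the large Conv1D/Dense layers the Resource strategy and a ReuseFactor greater than 1, so their multiplication loops are not fully unrolled past the 4096-element limit. A minimal sketch, assuming the model defined above; the ReuseFactor value of 16 is only a placeholder:

import hls4ml

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
for layer_name in config['LayerName'].keys():
    # Placeholder values: tune the strategy and ReuseFactor per layer for your own model
    config['LayerName'][layer_name]['Strategy'] = 'Resource'
    config['LayerName'][layer_name]['ReuseFactor'] = 16

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='model_1/hls4ml_prj_2',
    part='xcvu9p-flgb2104-2-i',
    io_type='io_stream',
)
hls_model.build(csim=False)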

Model Accuracy

Hi,

How can I find out the accuracies of the example models you provide?
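
A minimal sketch of how the accuracies are typically computed, assuming the trained Keras model, the converted hls_model, and the X_test/y_test arrays from part 1 of the tutorial:

import numpy as np
from sklearn.metrics import accuracy_score

# Keras accuracy on the test set
y_keras = model.predict(X_test)
print('Keras accuracy:', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_keras, axis=1)))

# hls4ml accuracy: compile the HLS model, then run its bit-accurate C++ emulation
hls_model.compile()
y_hls = hls_model.predict(np.ascontiguousarray(X_test))
print('hls4ml accuracy:', accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_hls, axis=1)))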

Stuck at hls_model.compile()

Hello, I am trying to run the tutorial notebooks and on the first tutorial, the notebook gets stuck at:
hls_model.compile()

I have tried monitoring the resource usage and I do not see Vivado being used, so could that be the problem?

Tutorial on pytorch MLP conversion?

I have been using PyTorch a lot more than TensorFlow, so writing in PyTorch and converting with hls4ml would be more convenient. I tried to simply translate some of the code from the Jupyter notebook TensorFlow tutorial, but to no avail.

Are you thinking of introducing a PyTorch tutorial? Even a quick one that does the same thing as the Jupyter notebook tutorial would be great, so I could start experimenting immediately.

Thanks in advance

Resource vs Latency strategy

Hi all,
I am wondering what the difference is between the Resource and the Latency optimization strategies, because I suspect my grasp of these concepts is wrong.
Initially, I thought that the Latency strategy is used to obtain the lowest latency possible at the expense of resource usage (use more resources to obtain the lowest possible latency). To fully parallelize the model, you need ReuseFactor = 1, and the number of parameters in a single layer can't exceed the Vivado unroll limit of 4096 (right?). But is it also possible to use an RF > 1 with the Latency strategy? What would the result be?
On the other hand, I would expect that selecting the Resource strategy implies using the fewest resources at the expense of latency (use fewer resources, but accept a higher latency).

But now comes the problem: in tutorial 7 a model is deployed on a PYNQ-Z2 board using the VivadoAccelerator backend. The strategy is not explicitly set, so by default it uses Latency; the RF is set to 64. To test, I changed the strategy to Resource using the following code:

for layer in ['fc1', 'fc2', 'fc3', 'output']:
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 64

But to my surprise, it uses MORE resources than the original build (which used the Latency strategy):

================================================================
== Utilization Estimates
================================================================
* Summary: 
+-----------------+---------+-------+--------+-------+-----+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  | URAM|
+-----------------+---------+-------+--------+-------+-----+
|DSP              |        -|      -|       -|      -|    -|
|Expression       |        -|      -|      40|   5483|    -|
|FIFO             |        -|      -|       -|      -|    -|
|Instance         |       16|     21|   17842|  41920|    -|
|Memory           |        -|      -|       -|      -|    -|
|Multiplexer      |        -|      -|       -|    128|    -|
|Register         |        0|      -|    2485|    352|    -|
+-----------------+---------+-------+--------+-------+-----+
|Total            |       16|     21|   20367|  47883|    0|
+-----------------+---------+-------+--------+-------+-----+
|Available        |      280|    220|  106400|  53200|    0|
+-----------------+---------+-------+--------+-------+-----+
|Utilization (%)  |        5|      9|      19|     90|    0|
+-----------------+---------+-------+--------+-------+-----+

Could someone please shed some light onto this? Are my intuitions of these concepts wrong?
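
For reference, a sketch showing where the strategy can be set in the config dict, either once at the model level or per layer (the layer names below are the ones from the snippet above; the values are only illustrative):

config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Model-level setting applies to every layer unless overridden
config['Model']['Strategy'] = 'Resource'
config['Model']['ReuseFactor'] = 64

# Per-layer overrides take precedence over the model-level setting
for layer in ['fc1', 'fc2', 'fc3', 'output']:
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 64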

AttributeError: module 'hls4ml.model.optimizer' has no attribute 'OutputRoundingSaturationMode'

Hello

I've just built hls4ml from the main branch (fe9d3e71b03e0422c7643027880310bd2cc02cb1) and wanted to run the tutorial

https://github.com/fastmachinelearning/hls4ml-tutorial/blob/master/part6_cnns.ipynb

However, I get this attribute error:

AttributeError: module 'hls4ml.model.optimizer' has no attribute 'OutputRoundingSaturationMode'

for this syntax:

hls4ml.model.optimizer.OutputRoundingSaturationMode.layers = ['Activation']

The attribute OutputRoundingSaturationMode does not seem to be defined. This is how my hls4ml looks:

import hls4ml
hls4ml.__version__
'0.6.0.dev196+gfe9d3e71'
dir(hls4ml.model.optimizer)
['ConfigurableOptimizerPass', 'GlobalOptimizerPass', 'LayerOptimizerPass', 'ModelOptimizerPass', 'OptimizerPass', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'extract_optimizers_from_object', 'extract_optimizers_from_path', 'get_available_passes', 'get_backend_passes', 'get_optimizer', 'layer_optimizer', 'model_optimizer', 'opt', 'opt_name', 'optimize_model', 'optimizer', 'optimizer_pass', 'os', 'passes', 'qkeras', 'register_flow', 'register_pass']

There is no attribute OutputRoundingSaturationMode.

Can anyone tell me what I am doing wrong?

cheers

Manny
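
In case it helps others hitting the same error: newer hls4ml releases appear to expose this setting through the optimizer registry rather than as a module attribute. A hedged sketch, where the optimizer name 'output_rounding_saturation_mode' is an assumption to verify against the installed version:

import hls4ml

# Assumed newer-style API; check the optimizer name against your hls4ml version
hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(
    layers=['Activation'],
    rounding_mode='AP_RND',
    saturation_mode='AP_SAT',
)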

HLS model will not compile

Hello,

When I am attempting to run the Jupyter tutorial notebooks (using the provided environment), everything works up until the line hls_model.compile() where it produces the exception


Exception Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 hls_model.compile()
2 X_test = np.ascontiguousarray(X_test)
3 y_hls = hls_model.predict(X_test)

File ~\Anaconda3\envs\hls4ml-tutorial\lib\site-packages\hls4ml\model\hls_model.py:534, in HLSModel.compile(self)
532 ret_val = os.system('bash build_lib.sh')
533 if ret_val != 0:
--> 534 raise Exception('Failed to compile project "{}"'.format(self.config.get_project_name()))
535 lib_name = 'firmware/{}-{}.so'.format(self.config.get_project_name(), self.config.get_config_value('Stamp'))
536 if self._top_function_lib is not None:

Exception: Failed to compile project "myproject"

Would you happen to know whether this can be rectified? Thanks

Bad result

I just ran part1_getting_started.ipynb, but it does not seem to work well.

[attached plot not shown]

Should the result of part 1 look like the plot above?

Pytorch converter doesn't work.

I trained a VGG16 model on the CIFAR-100 dataset in PyTorch. When I run:

import hls4ml
import plotting

config = hls4ml.utils.config_from_pytorch_model(model, granularity='layer')
print("-----------------------------------")
print("Configuration")
plotting.print_dict(config)
print("-----------------------------------")
hls_model = hls4ml.converters.convert_from_pytorch_model(
    model, hls_config=config, output_dir='model_3/hls4ml_prj', part='xcu250-figd2104-2L-e'
)

I get the error on the last line:
TypeError: cannot unpack non-iterable NoneType object

In contrast, when I ran the pre-trained Keras VGG16 model through hls4ml, it ran smoothly without any error. The cause of the error, as far as I can tell, is the config generated by config = hls4ml.utils.config_from_pytorch_model(model, granularity='layer'). When I print this variable config, it shows:
{'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1, 'Strategy': 'Latency'}}
which contains no information about the layers. In the Keras case, i.e. config = hls4ml.utils.config_from_keras_model(model, granularity='layer'), the following output is generated:


Interpreting Model
Topology:
Layer name: input_1, layer type: InputLayer, input shapes: [[None, 224, 224, 3]], output shape: [None, 224, 224, 3]
Layer name: block1_conv1, layer type: Conv2D, input shapes: [[None, 224, 224, 3]], output shape: [None, 224, 224, 64]
Layer name: block1_conv2, layer type: Conv2D, input shapes: [[None, 224, 224, 64]], output shape: [None, 224, 224, 64]
Layer name: block1_pool, layer type: MaxPooling2D, input shapes: [[None, 224, 224, 64]], output shape: [None, 112, 112, 64]
Layer name: block2_conv1, layer type: Conv2D, input shapes: [[None, 112, 112, 64]], output shape: [None, 112, 112, 128]
Layer name: block2_conv2, layer type: Conv2D, input shapes: [[None, 112, 112, 128]], output shape: [None, 112, 112, 128]
Layer name: block2_pool, layer type: MaxPooling2D, input shapes: [[None, 112, 112, 128]], output shape: [None, 56, 56, 128]
Layer name: block3_conv1, layer type: Conv2D, input shapes: [[None, 56, 56, 128]], output shape: [None, 56, 56, 256]
Layer name: block3_conv2, layer type: Conv2D, input shapes: [[None, 56, 56, 256]], output shape: [None, 56, 56, 256]
Layer name: block3_conv3, layer type: Conv2D, input shapes: [[None, 56, 56, 256]], output shape: [None, 56, 56, 256]
Layer name: block3_pool, layer type: MaxPooling2D, input shapes: [[None, 56, 56, 256]], output shape: [None, 28, 28, 256]
Layer name: block4_conv1, layer type: Conv2D, input shapes: [[None, 28, 28, 256]], output shape: [None, 28, 28, 512]
Layer name: block4_conv2, layer type: Conv2D, input shapes: [[None, 28, 28, 512]], output shape: [None, 28, 28, 512]
Layer name: block4_conv3, layer type: Conv2D, input shapes: [[None, 28, 28, 512]], output shape: [None, 28, 28, 512]
Layer name: block4_pool, layer type: MaxPooling2D, input shapes: [[None, 28, 28, 512]], output shape: [None, 14, 14, 512]
Layer name: block5_conv1, layer type: Conv2D, input shapes: [[None, 14, 14, 512]], output shape: [None, 14, 14, 512]
Layer name: block5_conv2, layer type: Conv2D, input shapes: [[None, 14, 14, 512]], output shape: [None, 14, 14, 512]
Layer name: block5_conv3, layer type: Conv2D, input shapes: [[None, 14, 14, 512]], output shape: [None, 14, 14, 512]
Layer name: block5_pool, layer type: MaxPooling2D, input shapes: [[None, 14, 14, 512]], output shape: [None, 7, 7, 512]
Layer name: flatten, layer type: Reshape, input shapes: [[None, 7, 7, 512]], output shape: [None, 25088]
Layer name: fc1, layer type: Dense, input shapes: [[None, 25088]], output shape: [None, 4096]
Layer name: fc2, layer type: Dense, input shapes: [[None, 4096]], output shape: [None, 4096]
Layer name: predictions, layer type: Dense, input shapes: [[None, 4096]], output shape: [None, 1000]
{'Model': {'Precision': 'fixed<16,6>', 'ReuseFactor': 1, 'Strategy': 'Latency', 'BramFactor': 1000000000, 'TraceOutput': False}}

Please resolve this issue.
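
A possible culprit, sketched under the assumption that the installed hls4ml still needs an explicit input shape for PyTorch models (unlike Keras models, they do not carry one); the input_shape argument name is an assumption to verify against your version:

import hls4ml

config = hls4ml.utils.config_from_pytorch_model(model, granularity='layer')

hls_model = hls4ml.converters.convert_from_pytorch_model(
    model,
    input_shape=[None, 3, 32, 32],  # assumed argument; CIFAR-100 images, channels-first
    hls_config=config,
    output_dir='model_3/hls4ml_prj',
    part='xcu250-figd2104-2L-e',
)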

Unable to generate bitfile

I am now trying to reproduce this tutorial; however, I am unable to generate a bitfile, and I get an error when I run the following code in part 7a:

hls_model.build(csim=False, export=True, bitfile=True)

ERROR:

INFO: [BD 41-1029] Generation completed for the IP Integrator block hier_0/myproject_axi_0 .
Exporting to file /home/jovyan/hls4ml-tutorial/model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh
Generated Block Design Tcl file /home/jovyan/hls4ml-tutorial/model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1_bd.tcl
Generated Hardware Definition File /home/jovyan/hls4ml-tutorial/model_3/hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/synth/design_1.hwdef
INFO: [IP_Flow 19-5642] Done with IP cache export for multiple IPs
realloc(): invalid old size
Abnormal program termination (6)
Please check '/home/jovyan/hls4ml-tutorial/model_3/hls4ml_prj_pynq/hs_err_pid21926.log' for details

I am using Ubuntu and the provided Docker image with Vivado.

Here is the log file:

#
# An unexpected error has occurred (6)
#
Stack:
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f8d3cc1a520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7f8d3cc6ea7c]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7f8d3cc1a476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f8d3cc007f3]
/lib/x86_64-linux-gnu/libc.so.6(+0x896f6) [0x7f8d3cc616f6]
/lib/x86_64-linux-gnu/libc.so.6(+0xa0d7c) [0x7f8d3cc78d7c]
/lib/x86_64-linux-gnu/libc.so.6(realloc+0x36c) [0x7f8d3cc7db2c]
/lib/x86_64-linux-gnu/libudev.so.1(+0x15707) [0x7f8d3eeab707]
/lib/x86_64-linux-gnu/libudev.so.1(+0x1bb1b) [0x7f8d3eeb1b1b]
/lib/x86_64-linux-gnu/libudev.so.1(+0x75ff) [0x7f8d3ee9d5ff]
/lib/x86_64-linux-gnu/libudev.so.1(+0x7b6b) [0x7f8d3ee9db6b]
/lib/x86_64-linux-gnu/libudev.so.1(+0x10192) [0x7f8d3eea6192]
/lib/x86_64-linux-gnu/libudev.so.1(+0x105d3) [0x7f8d3eea65d3]
/lib/x86_64-linux-gnu/libudev.so.1(udev_enumerate_scan_devices+0x2a1) [0x7f8d3eea7341]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(+0x10f927) [0x7f8d33d0f927]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(xilinxd_52bd858d5acf2fc4+0x9) [0x7f8d33d0fd89]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(+0xc6566) [0x7f8d33cc6566]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(xilinxd_52bd853912de43c2+0xc8) [0x7f8d33cc6098]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(+0xb33a2) [0x7f8d33cb33a2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(xilinxd_52bd995765656b48+0x2a) [0x7f8d33cbd5da]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libXil_lmgr11.so(xilinxd_52bd700d1bd3c616+0x73) [0x7f8d33cbd6c3]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonxillic.so(XilReg::Utils::GetHostInfo[abi:cxx11](XilReg::Utils::HostInfoType, bool) const+0x208) [0x7f8d376674f8]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonxillic.so(XilReg::Utils::GetHostInfoFormatted[abi:cxx11](XilReg::Utils::HostInfoType, bool) const+0x52) [0x7f8d3766b2c2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonxillic.so(XilReg::Utils::GetHostInfo[abi:cxx11]() const+0x183) [0x7f8d3766b583]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonxillic.so(XilReg::Utils::GetRegInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool)+0xc6) [0x7f8d37675e06]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonxillic.so(XilReg::Utils::GetRegInfoWebTalk(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x60) [0x7f8d37676090]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_project.so(HAPRWebtalkHelper::getRegistrationId[abi:cxx11]() const+0x3a) [0x7f8d02e37a2a]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_project.so(HAPRWebtalkHelper::HAPRWebtalkHelper(HAPRProject*, HAPRDesign*, HWEWebtalkMgr*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xb0) [0x7f8d02e37ea0]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_tcltasks.so(+0x180cbc6) [0x7f8d2e20cbc6]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_tcltasks.so(+0x1817914) [0x7f8d2e217914]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_common.so(+0x86eca2) [0x7f8d3e26eca2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(+0x334af) [0x7f8d36a334af]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(+0x34b38) [0x7f8d36a34b38]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(Tcl_EvalEx+0x13) [0x7f8d36a350a3]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(Tcl_FSEvalFileEx+0x1da) [0x7f8d36a99c5a]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commontasks.so(+0x2b1d8d) [0x7f8d30ab1d8d]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_common.so(+0x86eca2) [0x7f8d3e26eca2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(+0x334af) [0x7f8d36a334af]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(Tcl_EvalObjv+0x32) [0x7f8d36a335e2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(TclEvalObjEx+0x322) [0x7f8d36a35402]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commontasks.so(+0x2de870) [0x7f8d30ade870]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commontasks.so(+0x2e011e) [0x7f8d30ae011e]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_common.so(+0x86eca2) [0x7f8d3e26eca2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(+0x334af) [0x7f8d36a334af]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(Tcl_EvalObjv+0x32) [0x7f8d36a335e2]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(TclEvalObjEx+0x322) [0x7f8d36a35402]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_commonmain.so(+0x7af3) [0x7f8d3d607af3]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/libtcl8.5.so(Tcl_Main+0x1d0) [0x7f8d36aa0210]
/opt/Xilinx/Vivado/2019.2/lib/lnx64.o/librdi_common.so(+0x8b30cb) [0x7f8d3e2b30cb]
/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f8d3cc6cb43]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x44) [0x7f8d3ccfdbb4]

docker build error

Hello,
I ran into a build error in a RHEL 8 environment (pulling the image works fine).

$sudo docker build -f docker/Dockerfile.vivado -t ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.7.1-vivado-2019.2:latest .
Sending build context to Docker daemon 67.18GB
Error response from daemon: unexpected error reading Dockerfile: read /var/lib/docker/tmp/docker-builder548070090/docker/Dockerfile.vivado: is a directory

Part7b Deployment failing

nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)
The call above fails with the following error:

---------------------------------------------------------------------------
error Traceback (most recent call last)
in ()
----> 1 nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)

/home/xilinx/jupyter_notebooks/package/axi_stream_driver.py in __init__(self, bitfile_name, x_shape, y_shape, dtype, dtbo, download, ignore_version, device)
9 self, bitfile_name, x_shape, y_shape, dtype=np.float32, dtbo=None, download=True, ignore_version=False, device=None
10 ):
---> 11 super().__init__(bitfile_name, dtbo=None, download=True, ignore_version=False, device=None)
12 self.sendchannel = self.hier_0.axi_dma_0.sendchannel
13 self.recvchannel = self.hier_0.axi_dma_0.recvchannel

/usr/local/lib/python3.6/dist-packages/pynq/overlay.py in __init__(self, bitfile_name, dtbo, download, ignore_version, device)
323
324 if download:
--> 325 self.download()
326
327 self.__doc__ = _build_docstring(self._ip_map._description,

/usr/local/lib/python3.6/dist-packages/pynq/overlay.py in download(self, dtbo)
372 Clocks.set_pl_clk(i)
373
--> 374 super().download(self.parser)
375 if dtbo:
376 super().insert_dtbo(dtbo)

/usr/local/lib/python3.6/dist-packages/pynq/bitstream.py in download(self, parser)
141
142 """
--> 143 self.device.download(self, parser)
144
145 def remove_dtbo(self):

/usr/local/lib/python3.6/dist-packages/pynq/pl_server/device.py in download(self, bitstream, parser)
568 def download(self, bitstream, parser=None):
569 if not bitstream.binfile_name:
--> 570 _preload_binfile(bitstream)
571
572 if not bitstream.partial:

/usr/local/lib/python3.6/dist-packages/pynq/pl_server/device.py in _preload_binfile(bitstream)
507 bitstream.firmware_path = os.path.join('/lib/firmware',
508 bitstream.binfile_name)
--> 509 bit_dict = parse_bit_header(bitstream.bitfile_name)
510 if bit_dict != bitstream.bit_data:
511 bitstream.bit_data = bit_dict

/usr/local/lib/python3.6/dist-packages/pynq/pl_server/device.py in parse_bit_header(bitfile)
457
458 # Strip the (2+n)-byte first field (2-bit length, n-bit data)
--> 459 length = struct.unpack('>h', contents[offset:offset + 2])[0]
460 offset += 2 + length
461

error: unpack requires a buffer of 2 bytes

My device is the Pynq-z2 board running the 4.19.0-xilinx-v2019.1 kernel

Compiled on the host using Vivado 2019.2 on Ubuntu 20.04.1.

What is the password for root privileges?

In the Docker image without Vivado, I want to install sqlite3, so I run "sudo apt-get install sqlite3". But I don't know the password for the "jovyan" user. Could you please tell me the password?

conda error

I can't tell if this is a problem on my end. My connectivity to GitHub is fine.

conda env create -f environment.yml 
Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: \ 
\ 
done
Installing pip dependencies: \ Ran pip subprocess with arguments:
['/home/jgwohlbier/devel/packages/anaconda3/envs/hls4ml-tutorial-0.4.0/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/jgwohlbier/devel/DSSoC/EPOCHS/hls4ml-tutorial/condaenv.l7k4mvnp.requirements.txt']
Pip subprocess output:
Collecting qkeras
  Cloning git://github.com/google/qkeras.git to /tmp/pip-install-kv0qrtrv/qkeras_c6c16b1e77134d2d8963d09e53618ebd
Collecting jupyter
  Using cached jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting tensorflow==2.3.1
  Using cached tensorflow-2.3.1-cp37-cp37m-manylinux2010_x86_64.whl (320.4 MB)
Collecting hls4ml[profiling]==0.4.0
  Using cached hls4ml-0.4.0-py3-none-any.whl (215 kB)

Pip subprocess error:
  Running command git clone -q git://github.com/google/qkeras.git /tmp/pip-install-kv0qrtrv/qkeras_c6c16b1e77134d2d8963d09e53618ebd
  fatal: unable to connect to github.com:
  github.com[0: 140.82.113.3]: errno=Connection timed out

WARNING: Discarding git+git://github.com/google/qkeras.git#egg=qkeras. Command errored out with exit status 128: git clone -q git://github.com/google/qkeras.git /tmp/pip-install-kv0qrtrv/qkeras_c6c16b1e77134d2d8963d09e53618ebd Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement qkeras (unavailable)
ERROR: No matching distribution found for qkeras (unavailable)

failed

CondaEnvException: Pip failed
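
The failing step is the git:// clone of qkeras, which is consistent with GitHub having retired the unauthenticated git:// protocol; any environment file pinning qkeras that way fails exactly like this. A hedged sketch of the change, assuming the qkeras pin lives in the pip section of environment.yml:

# replace the git:// URL with https:// in the pip requirements
git+https://github.com/google/qkeras.git#egg=qkeras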

ImportError: cannot import name 'all_callbacks'

Hi!
I was going through the hls4ml tutorial found here: https://github.com/fastmachinelearning/hls4ml-tutorial/blob/master/part1_getting_started.ipynb

Upon doing:
from callbacks import all_callbacks
I got
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
in
3 from tensorflow.keras.optimizers import Adam
4 from tensorflow.keras.regularizers import l1
----> 5 from callbacks import all_callbacks

ImportError: cannot import name 'all_callbacks'

Please help me solve this issue.
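
Note that callbacks.py is part of the hls4ml-tutorial repository itself rather than of hls4ml, so the import only resolves when the notebook runs with the repository directory on the Python path. A minimal sketch, where the clone path is a placeholder:

import sys

# Point Python at your clone of hls4ml-tutorial (placeholder path)
sys.path.append('/path/to/hls4ml-tutorial')

from callbacks import all_callbacks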

Incompatible environment

With the current environment.yml,

conda env create -f environment.yml

fails with

Pip subprocess error:
  Running command git clone --filter=blob:none --quiet https://github.com/thesps/conifer.git /tmp/pip-req-build-wgx0dt_a
ERROR: Package 'conifer' requires a different Python: 3.7.12 not in '>=3.8'

due to a recent update in conifer: thesps/conifer@1087153#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7L20-R21

I think the simplest solution is to update our environment.yml to use python 3.8.

cc: @quinnanm @thesps

part 2 error: HLSModel must have tracing on

In the second part of the tutorial, for this line

hls4ml.model.profiling.numerical(model=model, hls_model=hls_model, X=X_test[:1000])

I see this error

RuntimeError: HLSModel must have tracing on for at least 1 layer (this can be set in its config)

When the config is generated, do I need to pass a flag to enable tracing?

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
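
A hedged sketch of how per-layer tracing is usually switched on in the config before conversion (the 'Trace' key should be checked against the installed hls4ml version):

config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Enable tracing for every layer so profiling can capture intermediate outputs
for layer in config['LayerName'].keys():
    config['LayerName'][layer]['Trace'] = True

hls_model = hls4ml.converters.convert_from_keras_model(model, hls_config=config, output_dir='model_1/hls4ml_prj')
hls_model.compile()
hls4ml.model.profiling.numerical(model=model, hls_model=hls_model, X=X_test[:1000])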

Difference between the encoded and linebuffer implementations of the 2D CNN

In nnet_conv2d_stream, there are two methods for implementing the 2D CNN layers: linebuffer and encoded. I read the code of both methods; however, I did not understand the implementation of the encoded method or what the difference between the two methods is.

Reports not found

I was running the sample code found at https://github.com/fastmachinelearning/hls4ml-tutorial, and the following error occurred. The specs of my setup are:
OS: Ubuntu 18.04.2
Vivado HLS: 2019.2
Python: 3.9

Why is the co-simulation report not found?
Can someone help?

I have added the last portion of my output.

INFO: [HLS 200-42] -- Implementing module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'dense_latency.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 2.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 260.08 seconds; current allocated memory: 430.198 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Starting global binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 10.79 seconds; current allocated memory: 503.324 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'relu<ap_fixed<16, 6, 5, 3, 0>, ap_fixed<16, 6, 5, 3, 0>, relu_config3>'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 1.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 6.99 seconds; current allocated memory: 506.867 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.62 seconds; current allocated memory: 508.371 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'dense_latency.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.1'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 2.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 6.03 seconds; current allocated memory: 536.746 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Starting global binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 36.55 seconds; current allocated memory: 616.398 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'relu<ap_fixed<16, 6, 5, 3, 0>, ap_fixed<16, 6, 5, 3, 0>, relu_config5>'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 1.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 16.48 seconds; current allocated memory: 621.579 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.32 seconds; current allocated memory: 622.325 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'dense_latency.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 2.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 2.94 seconds; current allocated memory: 636.112 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Starting global binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 12.28 seconds; current allocated memory: 715.576 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'relu<ap_fixed<16, 6, 5, 3, 0>, ap_fixed<16, 6, 5, 3, 0>, relu_config7>'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 1.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 8.24 seconds; current allocated memory: 718.534 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.32 seconds; current allocated memory: 719.281 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'dense_latency<ap_fixed,ap_fixed<16,6,5,3,0>,config8>.0.0.0.0.0.0'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 2.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 0.72 seconds; current allocated memory: 721.840 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Starting global binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 1.48 seconds; current allocated memory: 726.397 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'softmax_stable<ap_fixed,ap_fixed<16,6,5,3,0>,softmax_config9>'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 5.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 1.46 seconds; current allocated memory: 727.396 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.3 seconds; current allocated memory: 727.998 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'myproject'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining function 'myproject'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 1, Depth = 14.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 0.4 seconds; current allocated memory: 728.779 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 10.84 seconds; current allocated memory: 745.543 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-104] Estimated max fanout for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s' is 14096 from HDL expression: (1'b0 == ap_block_pp0_stage0)
INFO: [RTGEN 206-100] Finished creating RTL model for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s'.
INFO: [HLS 200-111] Elapsed time: 6.66 seconds; current allocated memory: 787.382 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s'.
INFO: [HLS 200-111] Elapsed time: 11.03 seconds; current allocated memory: 862.080 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-104] Estimated max fanout for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1' is 35936 from HDL expression: (1'b0 == ap_block_pp0_stage0)
INFO: [RTGEN 206-100] Finished creating RTL model for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1'.
INFO: [HLS 200-111] Elapsed time: 2.47 seconds; current allocated memory: 923.258 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s'.
INFO: [HLS 200-111] Elapsed time: 27.32 seconds; current allocated memory: 1.043 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-104] Estimated max fanout for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0' is 18736 from HDL expression: (1'b0 == ap_block_pp0_stage0)
INFO: [RTGEN 206-100] Finished creating RTL model for 'dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0'.
INFO: [HLS 200-111] Elapsed time: 2.71 seconds; current allocated memory: 1.073 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s'.
INFO: [HLS 200-111] Elapsed time: 13.95 seconds; current allocated memory: 1.143 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Finished creating RTL model for 'dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0'.
INFO: [HLS 200-111] Elapsed time: 3.03 seconds; current allocated memory: 1.150 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-100] Generating core module 'myproject_mul_mul_18s_17ns_26_1_1': 5 instance(s).
INFO: [RTGEN 206-100] Finished creating RTL model for 'softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s'.
INFO: [HLS 200-111] Elapsed time: 4.58 seconds; current allocated memory: 1.163 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'myproject'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/input_1_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/layer9_out_0_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/layer9_out_1_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/layer9_out_2_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/layer9_out_3_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on port 'myproject/layer9_out_4_V' to 'ap_vld'.
INFO: [RTGEN 206-500] Setting interface mode on function 'myproject' to 'ap_ctrl_hs'.
INFO: [RTGEN 206-100] Finished creating RTL model for 'myproject'.
INFO: [HLS 200-111] Elapsed time: 3.5 seconds; current allocated memory: 1.184 GB.
INFO: [HLS 200-789] **** Estimated Fmax: 250.92 MHz
INFO: [RTMG 210-279] Implementing memory 'softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s_exp_table1_rom' using block ROMs.
INFO: [RTMG 210-279] Implementing memory 'softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s_invert_table2_rom' using auto ROMs.
INFO: [HLS 200-111] Finished generating all RTL models Time (s): cpu = 00:07:23 ; elapsed = 00:07:47 . Memory (MB): peak = 5205.551 ; gain = 4772.445 ; free physical = 2071 ; free virtual = 7091
INFO: [VHDL 208-304] Generating VHDL RTL for myproject.
INFO: [VLOG 209-307] Generating Verilog RTL for myproject.
***** C/RTL SYNTHESIS COMPLETED IN 0h7m44s *****
INFO: [HLS 200-112] Total elapsed time: 467.13 seconds; peak allocated memory: 1.184 GB.
INFO: [Common 17-206] Exiting vivado_hls at Fri Jul 5 17:34:54 2024...
Vivado synthesis report not found.
Cosim report not found.
Timing report not found.
Found 1 solution(s) in my-hls-test/myproject_prj.
Reports for solution "solution1":

C SIMULATION RESULT:
INFO: [SIM 2] *************** CSIM start ***************
INFO: [SIM 4] CSIM will launch GCC as the compiler.
make: 'csim.exe' is up to date.
INFO: Unable to open input/predictions file, using default input.
0.0302734 0.799805 0.0576172 0.147461 0.0390625
INFO: Saved inference results to file: tb_data/csim_results.log
INFO: [SIM 1] CSim done with 0 errors.
INFO: [SIM 3] *************** CSIM finish ***************

SYNTHESIS REPORT:

== Vivado HLS Report for 'myproject'

  • Date: Fri Jul 5 17:34:39 2024

  • Version: 2019.2 (Build 2704478 on Wed Nov 06 22:10:23 MST 2019)

  • Project: myproject_prj

  • Solution: solution1

  • Product family: virtexuplus

  • Target device: xcvu13p-flga2577-2-e

================================================================
== Performance Estimates

  • Timing:

    • Summary:
      +--------+---------+----------+------------+
      | Clock | Target | Estimated| Uncertainty|
      +--------+---------+----------+------------+
      |ap_clk | 5.00 ns | 3.985 ns | 0.62 ns |
      +--------+---------+----------+------------+
  • Latency:

    • Summary:
      +---------+---------+-----------+-----------+-----+-----+----------+
      | Latency (cycles) | Latency (absolute) | Interval | Pipeline |
      | min | max | min | max | min | max | Type |
      +---------+---------+-----------+-----------+-----+-----+----------+
      | 13| 13| 65.000 ns | 65.000 ns | 1| 1| function |
      +---------+---------+-----------+-----------+-----+-----+----------+
    • Detail:
      • Instance:
        +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+---------+-----------+-----------+-----+-----+----------+
        | | | Latency (cycles) | Latency (absolute) | Interval | Pipeline |
        | Instance | Module | min | max | min | max | min | max | Type |
        +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+---------+-----------+-----------+-----+-----+----------+
        |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1_fu_97 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1 | 1| 1| 5.000 ns | 5.000 ns | 1| 1| function |
        |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_fu_165 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0 | 1| 1| 5.000 ns | 5.000 ns | 1| 1| function |
        |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s_fu_201 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s | 1| 1| 5.000 ns | 5.000 ns | 1| 1| function |
        |grp_dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0_fu_207 |dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0 | 1| 1| 5.000 ns | 5.000 ns | 1| 1| function |
        |call_ret1_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s_fu_243 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s | 0| 0| 0 ns | 0 ns | 1| 1| function |
        |grp_softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s_fu_311 |softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s | 4| 4| 20.000 ns | 20.000 ns | 1| 1| function |
        |call_ret3_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s_fu_324 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s | 0| 0| 0 ns | 0 ns | 1| 1| function |
        |call_ret5_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s_fu_360 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s | 0| 0| 0 ns | 0 ns | 1| 1| function |
        +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+---------+-----------+-----------+-----+-----+----------+

      • Loop:
        N/A

================================================================
== Utilization Estimates

  • Summary:
    +---------------------+---------+-------+---------+---------+------+
    | Name | BRAM_18K| DSP48E| FF | LUT | URAM |
    +---------------------+---------+-------+---------+---------+------+
    |DSP | -| -| -| -| -|
    |Expression | -| -| 0| 6| -|
    |FIFO | -| -| -| -| -|
    |Instance | 4| 3317| 11970| 107013| -|
    |Memory | -| -| -| -| -|
    |Multiplexer | -| -| -| 36| -|
    |Register | -| -| 3424| -| -|
    +---------------------+---------+-------+---------+---------+------+
    |Total | 4| 3317| 15394| 107055| 0|
    +---------------------+---------+-------+---------+---------+------+
    |Available SLR | 1344| 3072| 864000| 432000| 320|
    +---------------------+---------+-------+---------+---------+------+
    |Utilization SLR (%) | ~0 | 107| 1| 24| 0|
    +---------------------+---------+-------+---------+---------+------+
    |Available | 5376| 12288| 3456000| 1728000| 1280|
    +---------------------+---------+-------+---------+---------+------+
    |Utilization (%) | ~0 | 26| ~0 | 6| 0|
    +---------------------+---------+-------+---------+---------+------+
  • Detail:
    • Instance:
      +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+-------+------+-------+-----+
      | Instance | Module | BRAM_18K| DSP48E| FF | LUT | URAM|
      +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+-------+------+-------+-----+
      |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_fu_165 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0 | 0| 806| 3041| 24665| 0|
      |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1_fu_97 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1 | 0| 1500| 5915| 50468| 0|
      |grp_dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s_fu_201 |dense_latency_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_s | 0| 878| 2297| 23681| 0|
      |grp_dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0_fu_207 |dense_latency_ap_fixed_ap_fixed_16_6_5_3_0_config8_0_0_0_0_0_0 | 0| 128| 481| 3951| 0|
      |call_ret1_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s_fu_243 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config3_s | 0| 0| 0| 1792| 0|
      |call_ret3_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s_fu_324 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config5_s | 0| 0| 0| 896| 0|
      |call_ret5_relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s_fu_360 |relu_ap_fixed_16_6_5_3_0_ap_fixed_16_6_5_3_0_relu_config7_s | 0| 0| 0| 896| 0|
      |grp_softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s_fu_311 |softmax_stable_ap_fixed_ap_fixed_16_6_5_3_0_softmax_config9_s | 4| 5| 236| 664| 0|
      +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+-------+------+-------+-----+
      |Total | | 4| 3317| 11970| 107013| 0|
      +----------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+---------+-------+------+-------+-----+

Co-simulation report not found.

bitstream tutorial model build error

Hi all,

I'm currently facing this issue even after installing the required IP for the PYNQ-Z2 FPGA, which is the exact setup detailed in the tutorial. My Vivado version, 2020.1, is also capable of detecting the board files.

Here is a snippet from Jupyter notebook tutorial 7a on bitstreams; I have left everything else untouched apart from changing the FPGA part.
In particular, the error stems from this line in the tutorial: hls_model.build(csim=False, export=True, bitfile=True).

Slave segment '/processing_system7_0/S_AXI_HP0/HP0_DDR_LOWOCM' is being assigned into address space '/axi_dma_0/Data_S2MM' at <0x0000_0000 [ 512M ]>.
# startgroup
# create_bd_cell -type ip -vlnv xilinx.com:hls:${project_name}_axi:1.0 ${project_name}_axi_0
ERROR: [Common 17-39] 'create_bd_cell' failed due to earlier errors.

    while executing
"create_bd_cell -type ip -vlnv xilinx.com:hls:${project_name}_axi:1.0 ${project_name}_axi_0"
    (file "design.tcl" line 39)
INFO: [Common 17-17] undo 'startgroup'
INFO: [Common 17-206] Exiting Vivado at Tue Jul  4 11:26:48 2023...

ERROR: [BD 5-390] IP definition not found for VLNV: xilinx.com:hls:myproject_axi:1.0 

Vivado synthesis report not found.
Cosim report not found.
Timing report not found. 


I'm relatively new to Vivado and the FPGA build process, although I suspect it's something to do with where I source the files. Do let me know if there are any files I can provide to help!

Can't launch csim and cosim on the tutorial scripts

Greetings,

I was trying to run the build with C and RTL simulations on the first "getting started" tutorial, using the build function:

hls_model.build(csim=True,cosim=True)

however, I get the following output with an error:

****** Vivado(TM) HLS - High-Level Synthesis from C, C++ and SystemC v2019.2 (64-bit)
  **** SW Build 2708876 on Wed Nov  6 21:39:14 MST 2019
  **** IP Build 2700528 on Thu Nov  7 00:09:20 MST 2019
    ** Copyright 1986-2019 Xilinx, Inc. All Rights Reserved.

source /opt/Xilinx/Vivado/2019.2/scripts/vivado_hls/hls.tcl -notrace
INFO: [HLS 200-10] Running '/opt/Xilinx/Vivado/2019.2/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'jovyan' on host 'jupyter-sbasam' (Linux_x86_64 version 5.4.129+) on Mon Dec 13 12:24:29 UTC 2021
INFO: [HLS 200-10] In directory '/home/jovyan/model_1/hls4ml_prj'
Sourcing Tcl script 'build_prj.tcl'
INFO: [HLS 200-10] Creating and opening project '/home/jovyan/model_1/hls4ml_prj/myproject_prj'.
INFO: [HLS 200-10] Adding design file 'firmware/myproject.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'myproject_test.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'firmware/weights' to the project
INFO: [HLS 200-10] Adding test bench file 'tb_data' to the project
INFO: [HLS 200-10] Creating and opening solution '/home/jovyan/model_1/hls4ml_prj/myproject_prj/solution1'.
INFO: [XFORM 203-101] Allowed max sub elements number after partition is 4096.
INFO: [XFORM 203-1161] The maximum of name length is set into 60.
INFO: [HLS 200-10] Setting target device to 'xcu250-figd2104-2L-e'
INFO: [SYN 201-201] Setting up clock 'default' with a period of 5ns.
***** C SIMULATION *****
INFO: [SIM 211-2] *************** CSIM start ***************
INFO: [SIM 211-4] CSIM will launch GCC as the compiler.
   Compiling ../../../../myproject_test.cpp in debug mode
csim.mk:78: recipe for target 'obj/myproject_test.o' failed
In file included from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/x86_64-pc-linux-gnu/bits/os_defines.h:39:0,
                 from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/x86_64-pc-linux-gnu/bits/c++config.h:495,
                 from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/iosfwd:38,
                 from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/ios:38,
                 from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/istream:38,
                 from /opt/Xilinx/Vivado/2019.2/tps/lnx64/gcc-6.2.0/include/c++/6.2.0/fstream:38,
                 from ../../../../myproject_test.cpp:19:
/usr/include/features.h:424:25: fatal error: sys/cdefs.h: No such file or directory
 #  include <sys/cdefs.h>
                         ^
compilation terminated.
make: *** [obj/myproject_test.o] Error 1
ERROR: [SIM 211-100] 'csim_design' failed: compilation error(s).
INFO: [SIM 211-3] *************** CSIM finish ***************
4
    while executing
"source build_prj.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel \#0 [list source $arg] "

INFO: [Common 17-206] Exiting vivado_hls at Mon Dec 13 12:24:33 2021...
Synthesis report not found.

This used to work with the earlier release (0.5.0); I hope this can be resolved soon.

Best regards,

Sami,

2nd tutorial does not seem to work

Hello,

I am using TensorFlow 2.3.1. I completed the first tutorial, but the second tutorial fails at the config line. It seems that granularity 'name' is the problem:

config = hls4ml.utils.config_from_keras_model(model, granularity='Name')
File "/hdd2/programs/prunning/tensorflow/lib/python3.6/site-packages/hls4ml/utils/config.py", line 200, in config_from_keras_model
layer_config = make_layer_config(layer)
File "/hdd2/programs/prunning/tensorflow/lib/python3.6/site-packages/hls4ml/utils/config.py", line 147, in make_layer_config
if layer['config']['activation'] == 'softmax':
KeyError: 'config'

Also, in tutorial 1 the network uses 16-bit precision:

{'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1, 'Strategy': 'Latency'}}

but the accuracy reported is very low for hls4ml:

Keras Accuracy: 0.7500542168674699
hls4ml Accuracy: 0.20060240963855422

Inconsistent TensorFlow version

Hi,

In environment.yml, TensorFlow version 2.3.1 is required. However, when I run the notebooks on JupyterHub, this happens:

print(tf.__version__)
2.1.0

This is the cause of a weird problem I see when I run the code locally on my machine, i.e. a different number of training samples compared to what is used on JupyterHub.

Vivado installation in Dockerfile

I would like to make my own Dockerfile, so I am looking for the specific line that installs Vivado. Does it exist in the Dockerfile, or do I have to install it manually?

Thanks

Integrating generated ip core to vivado

Hi, for my final year project I'm implementing an MNIST CNN classifier model. I found your framework very interesting. I tried the first notebook, but I couldn't figure out how to integrate the generated IP core into my hardware design in Vivado. Could someone help me with this, please?

Docker pull doesn't work, unauthorized error

Docker pull doesn't seem to work:

 docker pull ghcr.io/fastmachinlearning/hls4ml-tutorial/hls4ml-0.7.1:latest
Error response from daemon: Head "https://ghcr.io/v2/fastmachinlearning/hls4ml-tutorial/hls4ml-0.7.1/manifests/latest": denied

Following the link in the browser:

{"errors":[{"code":"UNAUTHORIZED","message":"authentication required"}]}

'xcku115-flvb2104-2-i'

Hello! I am not an expert, but I am trying to run a notebook inside this docker container. I have 2 issues.

  1. When I run the "build" function, I get this output:
    ERROR: [HLS 200-70] Part 'xcku115-flvb2104-2-i' is not installed.
    command 'ap_source' returned error code
    while executing
    "source build_prj.tcl"
    ("uplevel" body line 1)
    invoked from within
    "uplevel #0 [list source $arg] "
import hls4ml
from hls4ml.converters import convert_from_keras_model
import plotting

Then the QKeras model

'''hls4ml.model.optimizer.OutputRoundingSaturationMode.layers = ['Activation']
hls4ml.model.optimizer.OutputRoundingSaturationMode.rounding_mode = 'AP_RND'
hls4ml.model.optimizer.OutputRoundingSaturationMode.saturation_mode = 'AP_SAT'
'''

reuse_model = 256

q_hls_config = hls4ml.utils.config_from_keras_model(qmodel_pruned, granularity='name')
q_hls_config['Model']['ReuseFactor'] = reuse_model
q_hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
q_hls_config['Model']['Strategy'] = 'Resource'

# q_hls_config['LayerName']['output_softmax']['Strategy'] = 'Resource'
q_hls_config['LayerName']['Input_layer']['Precision'] = 'ap_fixed<16,16>'

for layer in qmodel_pruned.layers:
    if ('CONV' in layer.name.upper()) or ('DENSE' in layer.name.upper()):
        q_hls_config['LayerName'][layer.name]['ReuseFactor'] = reuse_model
    # if 'POOL' in layer.name.upper():
    #     q_hls_config['LayerName'][layer.name]['Precision'] = 'ap_fixed<32,16>'

q_hls_config['LayerName']['output_dense']['ReuseFactor'] = reuse_model
q_hls_config['LayerName']['output_softmax']['ReuseFactor'] = reuse_model
q_hls_config['LayerName']['output_dense_linear']['ReuseFactor'] = reuse_model

q_cfg = hls4ml.converters.create_config(backend='Vivado')
q_cfg['IOType'] = 'io_stream'  # Must set this if using CNNs!
q_cfg['HLSConfig'] = q_hls_config
q_cfg['KerasModel'] = qmodel_pruned
q_cfg['OutputDir'] = 'q_cnn_pruned/'
q_cfg['XilinxPart'] = 'xczu7ev-ffvc1156-2-e'

q_hls_model = hls4ml.converters.keras_to_hls(q_cfg)
q_hls_model.compile()

'''q_hls_model_test = convert_from_keras_model(
    qmodel_pruned, hls_config=q_cfg, output_dir='model_final/hls4ml_model', part='xcu250-figd2104-2L-e'
)'''
print("----------------------------------------------------------------------------------")
# q_hls_model_test.compile()
print("---------------------------")
os.environ['PATH'] = os.environ['XILINX_VIVADO'] + '/bin:' + os.environ['PATH']
q_hls_model.build(csim=False, synth=True, vsynth=True)
2. I saw that in tutorial part 6 you use the keras_to_hls function, but in all the other parts you use convert_from_keras_model. Why? And when I tried to convert my CNN model with the second one, hoping this could magically fix the problem, the compilation ran forever. Could anyone help me, please?
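
On the second question: convert_from_keras_model is essentially a convenience wrapper that fills in the same configuration dictionary that keras_to_hls consumes. A rough equivalent of the conversion above (reusing q_hls_config and qmodel_pruned from the snippet; not verified against this exact model) might look like:

import hls4ml

q_hls_model = hls4ml.converters.convert_from_keras_model(
    qmodel_pruned,
    hls_config=q_hls_config,   # note: the per-layer config dict, not the full q_cfg dictionary
    io_type='io_stream',       # must be io_stream for CNNs
    output_dir='q_cnn_pruned/',
    part='xczu7ev-ffvc1156-2-e',
    backend='Vivado',
)
q_hls_model.compile()

In the commented-out attempt above, the full q_cfg dictionary was passed as hls_config, which is one possible reason the conversion behaved differently.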

bit file not generated & wait_on_run -timeout 360 impl_1

Hello everyone, I'm new to this field and I need help using hls4ml with the VivadoAccelerator backend (I'm using Vivado 2019.2). After building my CNN following tutorial 6 and trying to generate the bitfile as in part 7 of the tutorial, I get stuck at impl_1: the process hangs and the bitfile is not generated. When I read the log file I found this:

EXPORT IP COMPLETED IN 0h0m23s *****
INFO: [HLS 200-112] Total elapsed time: 1634.17 seconds; peak allocated memory: 2.233 GB.
INFO: [Common 17-206] Exiting vivado_hls at Sat Dec  2 23:28:50 2023...
Vivado synthesis report not found.
Cosim report not found.
Timing report not found.

and also this:

[Sun Dec 10 22:37:01 2023] Launched impl_1...
Run output will be captured here: /home/abdo/PycharmProjects/lenet5/qmodel/model_hls4ml/myproject_vivado_accelerator/project_1.runs/impl_1/runme.log
launch_runs: Time (s): cpu = 00:00:21 ; elapsed = 00:00:23 . Memory (MB): peak = 2100.492 ; gain = 219.098 ; free physical = 915 ; free virtual = 3406
# wait_on_run -timeout 360 impl_1
[Sun Dec 10 22:37:01 2023] Waiting for impl_1 to finish (timeout in 360 minutes)...

This is the configuration that I use:


import hls4ml
import tensorflow as tf
from tensorflow.keras.models import load_model
from qkeras.utils import _add_supported_quantized_objects

# Load the model
co = {}
_add_supported_quantized_objects(co)
model = load_model('LeNet5_MNIST_model_n.h5', custom_objects=co)

# Convert the model to HLS using hls4ml
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

config['Model']['ReuseFactor'] = 1
config['Model']['Strategy'] = 'Resource'
config['Model']['Precision'] = 'ap_fixed<16,6>'

#print("-----------------------------------")
#plotting.print_dict(config)
#print("-----------------------------------")


hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='model_hls4ml', backend='VivadoAccelerator', board='pynq-z2',io_type='io_stream',
)

#hls_model.compile()
hls_model.build(csim=False, export=True, bitfile=True)

Also, this is the function I used to define my model:


input_shape = (28, 28, 1)

x = x_in = Input(input_shape)

for i, f in enumerate(filters_per_conv_layer):
    print(('Adding convolutional block {} with N={} filters').format(i, f))
    x = Conv2D(
        int(f),
        kernel_size=(3, 3),
        strides=(1, 1),
        kernel_initializer='lecun_uniform',
        kernel_regularizer=l1(0.0001),
        use_bias=False,
        name='conv_{}'.format(i),
    )(x)
    x = BatchNormalization(name='bn_conv_{}'.format(i))(x)
    x = Activation('relu', name='conv_act_%i' % i)(x)
    x = MaxPooling2D(pool_size=(2, 2), name='pool_{}'.format(i))(x)
x = Flatten()(x)

for i, n in enumerate(neurons_per_dense_layer):
    print(('Adding dense block {} with N={} neurons').format(i, n))
    x = Dense(n, kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001), name='dense_%i' % i, use_bias=False)(x)
    x = BatchNormalization(name='bn_dense_{}'.format(i))(x)
    x = Activation('relu', name='dense_act_%i' % i)(x)
x = Dense(10, name='output_dense')(x)
x_out = Activation('softmax', name='output_softmax')(x)

model = Model(inputs=[x_in], outputs=[x_out], name='LeNet5_MNIST')

# Print model summary
model.summary()
+---------------------+------------+---------+--------+
|        Layer        | Parameters | Weights | Biases |
+---------------------+------------+---------+--------+
|       input_1       |      0     |    0    |   0    |
|   fused_convbn_0    |      88    |    80   |   8    |
|       pool_0        |      0     |    0    |   0    |
|   fused_convbn_1    |     1184   |   1168  |   16   |
|       pool_1        |      0     |    0    |   0    |
|   fused_convbn_2    |     1488   |   1472  |   16   |
|       pool_2        |      0     |    0    |   0    |
|      flatten        |      0     |    0    |   0    |
|       dense_0       |     3468   |   3456  |   12   |
|      bn_dense_0     |      24    |    0    |   24   |
|     dense_act_0     |      0     |    0    |   0    |
|       dense_1       |      624   |   576   |   48   |
|      bn_dense_1     |      96    |    0    |   96   |
|     dense_act_1     |      0     |    0    |   0    |
|     output_dense    |      114   |   96    |   18   |
+---------------------+------------+---------+--------+


log+model.zip

part6 autoqkeras error

I'm getting the following error after epoch 15 finishes in this code block:

from qkeras.autoqkeras import AutoQKeras

autoqk = AutoQKeras(baseline_model, output_dir="autoq_cnn", metrics=["acc"], custom_objects={}, **run_config)
autoqk.fit(train_data, validation_data=val_data, epochs=15)

aqmodel = autoqk.get_best_model()
print_qmodel_summary(aqmodel)

# Train for the full epochs
callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=10, verbose=1),
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1),
]

start = time.time()
history = aqmodel.fit(train_data, epochs=n_epochs, validation_data=val_data, callbacks=callbacks, verbose=1)
end = time.time()
print('\n It took {} minutes to train!\n'.format((end - start) / 60.0))
Epoch 1/15

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/keras_tuner/src/engine/base_tuner.py", line 273, in _try_run_and_update_trial
    self._run_and_update_trial(trial, *fit_args, **fit_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras_tuner/src/engine/base_tuner.py", line 238, in _run_and_update_trial
    results = self.run_trial(trial, *fit_args, **fit_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras_tuner/src/engine/tuner.py", line 314, in run_trial
    obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras_tuner/src/engine/tuner.py", line 233, in _build_and_fit_model
    results = self.hypermodel.fit(hp, model, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras_tuner/src/engine/hypermodel.py", line 149, in fit
    return model.fit(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_file18rk8csp.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
tensorflow.python.autograph.pyct.error_utils.MultilineMessageKeyError: in user code:

    File "/opt/conda/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function  *
        return step_function(self, iterator)
    File "/opt/conda/lib/python3.10/site-packages/keras/engine/training.py", line 1233, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/opt/conda/lib/python3.10/site-packages/keras/engine/training.py", line 1222, in run_step  **
        outputs = model.train_step(data)
    File "/opt/conda/lib/python3.10/site-packages/keras/engine/training.py", line 1027, in train_step
        self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
        self.apply_gradients(grads_and_vars)
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
        return super().apply_gradients(grads_and_vars, name=name)
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
        iteration = self._internal_apply_gradients(grads_and_vars)
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
        return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
        distribution.extended.update(
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1213, in apply_grad_to_update_var  **
        return self._update_step(grad, var)
    File "/opt/conda/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 216, in _update_step
        raise KeyError(

    KeyError: 'The optimizer cannot recognize variable conv_0/kernel:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop or use legacy optimizer `tf.keras.optimizers.legacy.{self.__class__.__name__}.'

I'm using this container: ghcr.io/fastmachinelearning/hls4ml-tutorial/hls4ml-0.8.0-vivado-2019.1
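
Possibly relevant: the KeyError above is what TensorFlow's newer optimizer raises when it is asked to update variables it was not built for, and the error text itself points to the legacy optimizer as a workaround. A rough, unverified sketch (assuming the baseline model is compiled in your own code before being handed to AutoQKeras; the loss, metric, and learning rate below are placeholders):

import tensorflow as tf

# Compile the starting model with the legacy Adam optimizer, as the error message
# suggests, before passing it to AutoQKeras. Whether this is enough for the
# keras-tuner trials AutoQKeras runs internally has not been verified.
baseline_model.compile(
    optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=1e-3),
    loss='categorical_crossentropy',
    metrics=['acc'],
)

autoqk = AutoQKeras(baseline_model, output_dir="autoq_cnn", metrics=["acc"], custom_objects={}, **run_config)
autoqk.fit(train_data, validation_data=val_data, epochs=15)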

Build Process has been interrupted in 'part6_cnn.ipynb'

Thanks for your support, hls4ml team.

When building with the code below from 'part6_cnn.ipynb' of hls4ml-tutorial,
synth = True  # Only if you want to synthesize the models yourself (>1h per model) rather than look at the provided reports.
if synth:
    hls_model.build(csim=False, synth=True, vsynth=True)
    hls_model_q.build(csim=False, synth=True, vsynth=True)
the build process for both hls_model and hls_model_q was interrupted, with messages like the following:
    ...
    INFO: [XFORM 203-101] Partitioning array 'kernel_data.V.1'  in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'res_out'  in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'b10.V'  in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'mult.V' (firmware/nnet_utils/nnet_dense_latency.h:17) in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'acc.V' (firmware/nnet_utils/nnet_dense_latency.h:18) in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'shift_buffer.V' (firmware/nnet_utils/nnet_conv_stream.h:229) in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'kernel_data.V'  in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'res_out'  in dimension 1 completely.
    INFO: [XFORM 203-101] Partitioning array 'b6.V'  in dimension 1 completely.
    /opt/Xilinx/Vivado/2019.2/bin/rdiArgs.sh: line 280:  2524 Killed                  "$RDI_PROG" "$@"
    CSynthesis report not found.
    Vivado synthesis report not found.
    Cosim report not found.
    Timing report not found.

When building the models, the original setting in project.tcl was part = 'xcvu13p-flga2577-2-e'. That resulted in an error, so I changed the part to 'xcu250-figd2104-2L-e', which works OK.

Here is some info about the dev environment.

Environment

Docker Image: 0.8.0 2019.1: latest

Board Files
    (base) jovyan@a67776abed7b:~/work$ ll /opt/Xilinx/Vivado/2019.2/data/boards/board_files/
    total 44
    drwxr-xr-x 11 root root 4096 Jan  3 08:03 ./
    drwxr-xr-x  4 root root 4096 Jan  3 08:02 ../
    drwxr-xr-x  5 root root 4096 Jan  3 08:02 ac701/
    drwxr-xr-x  6 root root 4096 Jan  3 07:55 au200/
    drwxr-xr-x  6 root root 4096 Jan  3 07:55 au250/
    drwxr-xr-x  2 root root 4096 Jan  3 07:57 li-imx274-mipi/
    drwxr-xr-x  3 root root 4096 Jun 27  2018 pynq-z2/
    drwxr-xr-x  3 root root 4096 Jan  3 07:52 sp701/
    drwxr-xr-x  3 root root 4096 Jan  3 07:57 xm105/
    drwxr-xr-x  5 root root 4096 Jan  3 08:03 zc702/
    drwxr-xr-x  4 root root 4096 Jan  3 08:03 zed/

    (base) jovyan@a67776abed7b:~/work$ ll /opt/Xilinx/Vivado/2019.2/data/boards/board_files/pynq-z2/A.0/
    total 324
    drwxr-xr-x 2 root root   4096 Jun 27  2018 ./
    drwxr-xr-x 3 root root   4096 Jun 27  2018 ../
    -rw-r--r-- 1 root root  54479 May 31  2018 board.xml
    -rw-r--r-- 1 root root   9007 Jun 27  2018 part0_pins.xml
    -rw-r--r-- 1 root root  74492 Jun 19  2018 preset.xml
    -rw-r--r-- 1 root root 175776 Jun 27  2018 pynq_z2.jpg

Please let me know anything I can do to handle this issue.

Thanks in advance.
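
One observation, in case it is useful: the "Killed" line from rdiArgs.sh usually means the operating system terminated Vivado HLS, most often because it ran out of memory while fully partitioning the arrays listed above, so giving the container or VM more RAM is the first thing to check. A rough sketch of a configuration that reduces the amount of unrolling (following the Resource/ReuseFactor pattern used in the other CNN snippets in this document; the values are illustrative and not guaranteed to fix this particular build):

import hls4ml

config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Resource strategy with a larger ReuseFactor keeps conv/dense layers from being
# fully unrolled, which is what drives the 'Partitioning array ... completely'
# messages (and the memory use) seen in the log above.
config['Model']['Strategy'] = 'Resource'
config['Model']['ReuseFactor'] = 64  # illustrative value; trades latency for resources

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, io_type='io_stream', part='xcu250-figd2104-2L-e', output_dir='model_hls4ml'
)
hls_model.compile()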

vivado

Hello, I'm new to using Docker. I've downloaded the without-Vivado version, and now I want to use the Vivado installed on my host machine to synthesize the code. Is that possible? I mean, how should we change this line of code "os.environ['PATH'] = os.environ['XILINX_VIVADO'] + '/bin:' + os.environ['PATH']" when we download the without-Vivado version?
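
For what it's worth, a minimal sketch of one common way to do this (assuming Vivado 2019.2 is installed on the host under /opt/Xilinx and that directory is bind-mounted into the container, e.g. docker run -v /opt/Xilinx:/opt/Xilinx -p 8888:8888 <image>; the path is an assumption, adjust to your host install):

import os

# Point XILINX_VIVADO at the mounted host installation, then prepend its bin/
# directory to PATH exactly as the notebooks do. The without-Vivado image does not
# set XILINX_VIVADO for you, so define it here first.
os.environ['XILINX_VIVADO'] = '/opt/Xilinx/Vivado/2019.2'  # assumed host install path
os.environ['PATH'] = os.environ['XILINX_VIVADO'] + '/bin:' + os.environ['PATH']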
