xilinx / vitis-ai-tutorials
License: MIT License
When I quantized the YOLOv3 model following this tutorial, I found that the output nodes in this code block should be conv2d_59/BiasAdd,conv2d_67/BiasAdd,conv2d_75/BiasAdd rather than conv2d_59/convolution,conv2d_67/convolution,conv2d_75/convolution.
!vai_q_tensorflow quantize \
--input_frozen_graph model_data/yolov3_voc.pb \
--input_nodes input_1 \
--input_shapes ?,416,416,3 \
--output_nodes conv2d_59/convolution,conv2d_67/convolution,conv2d_75/convolution \
--input_fn input_fn.calib_input \
--method 1 \
--gpu 0 \
--calib_iter 100
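A minimal sketch of the change proposed above, assuming the reporter is right that the BiasAdd nodes are the intended outputs. It only rewrites the node names and assembles the same command line with the flags already shown; the helper name `to_bias_add` is hypothetical and nothing here inspects the frozen graph.

```python
import shlex


def to_bias_add(node_name):
    """Swap the last path component of a node name for 'BiasAdd'.

    Assumes names of the form '<scope>/<op>', as in the issue above.
    """
    scope, _, _ = node_name.rpartition("/")
    return scope + "/BiasAdd"


# Output nodes as they appear in the tutorial command.
reported_nodes = [
    "conv2d_59/convolution",
    "conv2d_67/convolution",
    "conv2d_75/convolution",
]
output_nodes = ",".join(to_bias_add(n) for n in reported_nodes)

# Same flags and values as the command above; only --output_nodes changes.
command = [
    "vai_q_tensorflow", "quantize",
    "--input_frozen_graph", "model_data/yolov3_voc.pb",
    "--input_nodes", "input_1",
    "--input_shapes", "?,416,416,3",
    "--output_nodes", output_nodes,
    "--input_fn", "input_fn.calib_input",
    "--method", "1",
    "--gpu", "0",
    "--calib_iter", "100",
]
print(" ".join(shlex.quote(arg) for arg in command))
```

Whether BiasAdd is the right cut point depends on the graph, so it is worth confirming the node names in the frozen .pb before relying on this.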
Hi everybody,
I first downloaded the Xilinx ZCU102 image and flashed it to an SD card. Then I ran the MNIST-Classification-TensorFlow tutorial with the default configuration, and the resulting program runs successfully on my ZCU102 board.
Now I want to run the MNIST-Classification-TensorFlow example on a custom DPU, so I created a DPU image via the DPU-TRD Vitis flow. To use the custom DPU, I copied the BOOT.BIN and dpu.xclbin files to the BOOT partition (of the previously working Xilinx image) and also copied dpu.xclbin to /usr/lib. My DPU is recognized correctly, which I verified with the dexplorer -w command.
To run the MNIST program on my custom DPU, I took the .hwh file (from the DPU generation) and used the dlet command to generate a .dcf file. In step 6 of the tutorial I added --options "{'dcf':'<my dcf file>'}" to the vai_c_tensorflow command to get an .elf file that matches my custom DPU. I then copied the .elf file to my board and ran the MNIST program. I do not get an error; here is the output:
Command line options:
--image_dir : images
--threads : 1
--model : model_B512_LowPerformance/dpu_customcnn.elf
Pre-processing 10000 images...
Starting 1 threads...
FPS=2822.20, total frames = 10000 , time=3.5433 seconds
Correct: 980 Wrong: 9020 Accuracy: 0.098
The accuracy is very low, which is not normal. Where is the problem?
When I compare the configuration of my custom DPU with the output of the ddump command, the .elf file should match my DPU exactly.
Running the MNIST program built with the default configuration on my custom DPU produces an error, whereas running the program built for my custom DPU does not. So I assume the .elf file is correct for my custom DPU.
So why is the accuracy of the 'custom' .elf file on the custom DPU so poor? What am I doing wrong?
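One observation that may help narrow this down: MNIST has 10 digit classes, so random guessing gives an accuracy of about 0.1, and the reported 0.098 is essentially that. The sketch below only redoes the arithmetic from the output above; the function name and the tolerance are illustrative assumptions, and the check does not identify the cause.

```python
def looks_like_chance(accuracy, num_classes, tolerance=0.02):
    """Return True if accuracy is within `tolerance` of random guessing."""
    return abs(accuracy - 1.0 / num_classes) <= tolerance


# Figures copied from the program output above.
correct, wrong = 980, 9020
total = correct + wrong
accuracy = correct / total

num_classes = 10  # MNIST digits 0-9
print(f"accuracy={accuracy:.3f}, chance={1.0 / num_classes:.3f}")
print("chance-level:", looks_like_chance(accuracy, num_classes))
```

Chance-level accuracy usually means the network output carries no information about the input, so comparing raw DPU outputs for a few images against the quantized model on the host is a reasonable next step.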
https://github.com/Xilinx/Vitis-AI-Tutorials/tree/VAI-KERAS-FCN8-SEMSEG
Could the "requisites" section also state that Ubuntu 18.04 is supported?
thanks
gg
Hello,
It appears that the dropout layers are saved by train_save.py in the tutorial, and the quantizer seems to crash when it finds them. It would be nice if the tutorial could be updated.
Traceback (most recent call last):
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 501, in _import_graph_def_internal
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'exponential_avg_factor' not in Op<name=FusedBatchNormV3; signature=x:T, scale:U, offset:U, mean:U, variance:U -> y:T, batch_mean:U, batch_variance:U, reserve_space_1:U, reserve_space_2:U, reserve_space_3:U; attr=T:type,allowed=[DT_HALF, DT_BFLOAT16, DT_FLOAT]; attr=U:type,allowed=[DT_FLOAT]; attr=epsilon:float,default=0.0001; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]; attr=is_training:bool,default=true>; NodeDef: {{node batch_normalization/FusedBatchNormV3}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/vai_q_tensorflow", line 11, in <module>
sys.exit(run_main())
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/contrib/decent_q/python/decent_q.py", line 1061, in run_main
app.run(main=my_main, argv=[sys.argv[0]] + unparsed)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/contrib/decent_q/python/decent_q.py", line 1060, in <lambda>
my_main = lambda unused_args: main(unused_args, FLAGS)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/contrib/decent_q/python/decent_q.py", line 676, in main
flags.skip_check, flags.dump_as_xir)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/contrib/decent_q/python/decent_q.py", line 375, in quantize_frozen
check_float_graph(input_graph_def, input_fn, q_config, s_config)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/contrib/decent_q/python/decent_q.py", line 275, in check_float_graph
importer.import_graph_def(input_graph_def, name='')
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 505, in import_graph_def_internal
raise ValueError(str(e))
ValueError: NodeDef mentions attr 'exponential_avg_factor' not in Op<name=FusedBatchNormV3; signature=x:T, scale:U, offset:U, mean:U, variance:U -> y:T, batch_mean:U, batch_variance:U, reserve_space_1:U, reserve_space_2:U, reserve_space_3:U; attr=T:type,allowed=[DT_HALF, DT
As the subject says, the links to WAA (Whole App Acceleration) on the pages below are broken. Is that tutorial still available?
https://github.com/Xilinx/Vitis-AI-Tutorials/blob/1.4/Design_Tutorials/13-vdpu-pre-post-pl-acc/README.md
https://github.com/Xilinx/Vitis-AI-Tutorials/blob/1.4/Design_Tutorials/18-mpsocdpu-pre-post-pl-acc/README.md
Hi, I get the following error during quantization. I wrote the following script:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from tensorflow.python.keras.utils.data_utils import get_file
from tensorflow.python.util.tf_export import keras_export
with np.load('data_capture_qpsk/frame_esn0_0-0.npz') as f:
    rx_data_input_real, rx_data_input_imag = f['rx_data_map_real'], f['rx_data_map_imag']
    tx_pilot_input_real, tx_pilot_input_imag = f['tx_pilot_map_real'], f['tx_pilot_map_imag']
    raw_ch_est_input_real, raw_ch_est_input_imag = f['raw_ch_map_real'], f['raw_ch_map_imag']
rx_data_input = np.concatenate((rx_data_input_real, rx_data_input_imag), axis=-1)
tx_pilot_input = np.concatenate((tx_pilot_input_real, tx_pilot_input_imag), axis=-1)
raw_ch_est_input = np.concatenate((raw_ch_est_input_real, raw_ch_est_input_imag), axis=-1)
inputs = [rx_data_input, tx_pilot_input, raw_ch_est_input]
from tensorflow import keras
float_model = keras.models.load_model('float/deep_rx.h5')
from tensorflow_model_optimization.quantization.keras import vitis_quantize
quantizer = vitis_quantize.VitisQuantizer(float_model)
quantized_model = quantizer.quantize_model(calib_dataset=inputs, calib_step=100, calib_batch_size=1)
Response
(vitis-ai-tensorflow2) Vitis-AI /workspace/models/ChEstModel > python3 Deep_RX_ptq.py
2022-05-04 08:28:37.972104: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/xilinx/xrt/lib:/usr/lib:/usr/lib/x86_64-linux-gnu:/usr/local/lib:/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib
2022-05-04 08:28:37.972124: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-05-04 08:28:39.737146: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/xilinx/xrt/lib:/usr/lib:/usr/lib/x86_64-linux-gnu:/usr/local/lib:/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib
2022-05-04 08:28:39.737213: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-05-04 08:28:39.737257: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (dre-elbe-s04): /proc/driver/nvidia/version does not exist
2022-05-04 08:28:39.737915: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "Deep_RX_ptq.py", line 23, in <module>
float_model = keras.models.load_model('float/deep_rx.h5')
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/saving/save.py", line 201, in load_model
compile)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/saving/hdf5_format.py", line 199, in load_model_from_hdf5
training_config, custom_objects), from_serialized=True)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/saving/saving_utils.py", line 202, in compile_args_from_training_config
optimizer = optimizers.deserialize(optimizer_config)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/optimizers.py", line 99, in deserialize
printable_module_name='optimizer')
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/utils/generic_utils.py", line 660, in deserialize_keras_object
config, module_objects, custom_objects, printable_module_name)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.7/site-packages/keras/utils/generic_utils.py", line 561, in class_and_config_for_serialized_keras_object
.format(printable_module_name, class_name))
ValueError: Unknown optimizer: Addons>LAMB. Please ensure this object is passed to the custom_objects argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details
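The traceback shows the failure happens while Keras restores the saved training configuration (the LAMB optimizer from TensorFlow Addons), not while it restores the layers. Since post-training quantization does not use the optimizer, one possible workaround is to load the model without compiling it. The sketch below assumes that approach; `load_float_model` is a hypothetical helper name, and `compile=False` is a standard argument of `keras.models.load_model`.

```python
def load_float_model(path):
    """Load a Keras .h5 model without restoring its optimizer.

    TensorFlow is imported lazily so that defining this helper does not
    require TensorFlow to be installed.
    """
    from tensorflow import keras

    # compile=False skips deserializing the training configuration,
    # so the unknown 'Addons>LAMB' optimizer is never looked up.
    return keras.models.load_model(path, compile=False)


# Usage inside the Vitis AI TensorFlow 2 environment (not executed here):
#   float_model = load_float_model('float/deep_rx.h5')
```

An alternative, if the tensorflow_addons package is available in the environment, is to import it before calling load_model so that its optimizers are registered with Keras.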
Hello,
I am getting the following error:
[INFO] parse raw model : 12%|█▎ | 1/8 [00:00<00:00, 2714.76it/s]
[INFO] Namespace(inputs_shape=None, layout='NHWC', model_files=['./build/quantize/deploy_model.pb'], model_type='tensorflow', out_filename='./build/compile_zcu102/customcnn_org.xmodel', proto=None)
[INFO] tensorflow model: build/quantize/deploy_model.pb
Traceback (most recent call last):
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/xnnc-run", line 33, in <module>
sys.exit(load_entry_point('xnnc==1.3.0', 'console_scripts', 'xnnc-run')())
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/main.py", line 194, in main
normal_run(args)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/main.py", line 178, in normal_run
in_shapes=in_shapes if len(in_shapes) > 0 else None,
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/xconverter.py", line 131, in run
xmodel = CORE.make_xmodel(model_files, model_type, _layout, in_shapes)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/core.py", line 104, in make_xmodel
model_files, layout, in_shapes=in_shapes, model_type=model_t
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 97, in to_xmodel
model_name, raw_nodes, layout, in_shapes, model_fmt, model_type
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 161, in create_xmodel
xmodel = cls.__create_xmodel_from_tf1(name, layers, layout, in_shapes)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 243, in __create_xmodel_from_tf1
xmodel_name, layout, layers, const_layer_dict, super_const_dict, in_shapes
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 1847, in __generate_xmodel
), f"[ERROR] TF Conv2d requires two inputs: actual: {bottom}."
AssertionError: [ERROR] TF Conv2d requires two inputs: actual: ['images_in'].
In the section https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Introduction/03-Basic/Module_2
the files downloaded by get_image_video_zcu104.sh are:
wget -O vitis_ai_library_r1.3.0_images.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.3.0_images.tar.gz
wget -O vitis_ai_library_r1.3.0_video.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.3.0_video.tar.gz
However, in the explanation the files are:
root@xilinx-zcu104-2021_1:~# wget -O vitis_ai_library_r1.4.0_images.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.4.0_images.tar.gz
root@xilinx-zcu104-2021_1:~# wget -O vitis_ai_library_r1.4.0_video.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=vitis_ai_library_r1.4.0_video.tar.gz
When I run ./quantize_and_compile.sh in Docker, I get this:
I0215 02:19:57.546900 259 layer_factory.hpp:77] Creating layer data
I0215 02:19:57.546921 259 net.cpp:94] Creating Layer data
I0215 02:19:57.546927 259 net.cpp:409] data -> data
I0215 02:19:57.546942 259 net.cpp:409] data -> label
I0215 02:19:57.547078 259 image_data_layer.cpp:41] Opening file ../calibration.txt
I0215 02:19:57.547338 259 image_data_layer.cpp:51] Shuffling data
I0215 02:19:57.547390 259 image_data_layer.cpp:56] A total of 1000 images.
E0215 02:19:57.547423 259 io.cpp:145] Could not open or find file /data2/datasets/VOCdevkit/VOC2007/JPEGImages/000186.jpg
F0215 02:19:57.547428 259 image_data_layer.cpp:70] Check failed: cv_img.data Could not load 000186.jpg
*** Check failure stack trace: ***
Compiling network: vgg16_ssd
[INFO] Namespace(batchsize=1, inputs_shape=None, layout='NCHW', model_files=['quantize/deploy.caffemodel'], model_type='caffe', named_inputs_shape=None, out_filename='/tmp/vgg16_ssd_org.xmodel', proto='quantize/deploy.prototxt')
[ERROR] Not found the file or directory: /workspace/SSD/VAI/VGG16-SSD/quantize/deploy.caffemodel
(vitis-ai-caffe) Vitis-AI /workspace/SSD/VAI/VGG16-SSD >
The file 000186.jpg is at /SSD/data/VOCdevkit/VOC2007/JPEGImages/000186.jpg, not at /data2/datasets/VOCdevkit/VOC2007/JPEGImages/000186.jpg, so the path used by the script appears to be wrong.
Another question: I want to use the KV260 arch.json, but the line arch=/opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json 2>&1 | tee ${output_dir}/compile.txt in quantize_and_compile.sh points to a path that cannot be found; there is no /opt/vitis_ai directory at all on my Ubuntu 20.04 system.
Could you please help me? Thank you in advance!
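Before re-running quantization, it can help to check which calibration images actually resolve on disk. The sketch below assumes the image path is formed from a root folder plus the first field of each line in ../calibration.txt (an inference from the "Could not load 000186.jpg" message, not something the log states outright). The helper name and the sample label values are hypothetical.

```python
import os


def missing_images(root_folder, list_lines):
    """Return the full paths of listed images that do not exist.

    Each line is assumed to look like '<file name> <label>'; file names
    containing spaces are not handled by this sketch.
    """
    missing = []
    for line in list_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        path = os.path.join(root_folder, fields[0])
        if not os.path.isfile(path):
            missing.append(path)
    return missing


# Example with the two locations mentioned above (labels are made up).
sample_lines = ["000186.jpg 0"]
for root in ("/data2/datasets/VOCdevkit/VOC2007/JPEGImages",
             "/SSD/data/VOCdevkit/VOC2007/JPEGImages"):
    print(root, "->", len(missing_images(root, sample_lines)), "missing")
```

If the root folder is the part that is wrong, it is more likely set in the network prototxt or the shell script than in calibration.txt itself.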
I'm following this Xilinx tutorial about implementing a U-Net on the ZCU104 Evaluation Board, and I have run into an error during the compilation step.
I've trained a U-Net in Matlab 2020b and exported to Keras via onnx2keras and followed the steps of the tutorial without any errors:
The full error message is:
Traceback (most recent call last):
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/xnnc-run", line 33, in <module>
sys.exit(load_entry_point('xnnc==1.4.0', 'console_scripts', 'xnnc-run')())
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/main.py", line 49, in main
runner.normal_run(args)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/runner.py", line 123, in normal_run
target=target,
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/xconverter.py", line 145, in run
model_files, model_type, _layout, in_shapes, batchsize
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/core.py", line 123, in make_xmodel
model_type=model_t,
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 107, in to_xmodel
model_type,
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 173, in create_xmodel
name, layers, layout, in_shapes, batchsize
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 289, in __create_xmodel_from_tf1
batchsize,
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/xnnc/translator/tensorflow_translator.py", line 3192, in __generate_xmodel
), f"[ERROR] Not found op in super_const_dict: name: {weights_id}"
AssertionError: [ERROR] Not found op in super_const_dict: name: Decoder_Section_1_UpConv_1/kernel
At first, I thought the compiler might not support certain layers such as Conv2DTransposed (a way of upsampling images). However, although the documentation says that the TensorFlow version needs to be higher than 2.0 and I am using 1.15.2, the tutorial includes a U-Net built from those layers and I compiled it without any problem, so I don't think that is the cause.
Then I decided to compare both neural networks after freezing and after quantization, to look for information that is present in the tutorial U-Net but missing from mine.
Inspection results after freezing. Op types used (my U-Net --> tutorial U-Net):
There are differences between the two frozen graphs, as the two U-Nets are different modified versions of the original one. However, I don't think that LeakyRelu, Pad, AddV2 or Sub (the ops that appear in my model but not in the tutorial model) are related to the error.
Similarly, after quantization these are the differences. Op types used (my U-Net --> tutorial U-Net):
I don't know exactly where the error comes from, so any help would be highly appreciated.
Thanks in advance,
Jon.
Hello, I started with the Vitis-AI tutorials and found a problem at the training stage.
My environment is:
Docker : vitis-ai-gpu:2.0.0.1103
GPU : RTX2080
CPU : Ryzen7 2700x
OS : Ubuntu 18.04[WSL2]
Following 01-caffe_cats_vs_dogs, in section 6 (Python and shell scripts), I am testing the script for this step, but the .caffemodel file is not there.
So I trained the model myself with the provided script, but the training process gets stuck when I run it.
The log is provided in logfile_caffe_alexnetBNnoLRN.txt.
From further debugging of the tutorial, I think it is stuck in the training process launched from the Caffe script (link).
Can you advise me on how to continue debugging, or on a solution for training the model from this state?
Hi,
I am running the training procedure for the Caffe model, i.e. 01-caffe_cats_vs_dogs, and I am facing the issue below during training.
I0210 09:24:31.278432 2794 caffe.cpp:247] Starting Optimization
I0210 09:24:31.278439 2794 solver.cpp:341] Solving alexnetBNnoLRN m2 (as m3 but less DROP and less BN)
I0210 09:24:31.278442 2794 solver.cpp:342] Learning Rate Policy: step
I0210 09:24:31.279312 2794 solver.cpp:424] Iteration 0, Testing net (#0)
I0210 09:24:32.102056 2794 solver.cpp:523] Test net output #0: accuracy = 0.5
I0210 09:24:32.102087 2794 solver.cpp:523] Test net output #1: loss = 0.693147 (* 1 = 0.693147 loss)
I0210 09:24:32.102092 2794 solver.cpp:523] Test net output #2: top-1 = 0.5
F0210 09:24:32.151126 2794 math_functions.cu:27] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED
*** Check failure stack trace: ***
@ 0x7f60598814dd google::LogMessage::Fail()
@ 0x7f6059889071 google::LogMessage::SendToLog()
@ 0x7f6059880ecd google::LogMessage::Flush()
@ 0x7f605988276a google::LogMessageFatal::~LogMessageFatal()
@ 0x7f605863c24a caffe::caffe_gpu_gemm<>()
@ 0x7f60585e248c caffe::InnerProductLayer<>::Backward_gpu()
@ 0x7f6058458be3 caffe::Net<>::BackwardFromTo()
@ 0x7f6058458d3f caffe::Net<>::Backward()
@ 0x7f60584bdc4c caffe::Solver<>::Step()
@ 0x7f60584be791 caffe::Solver<>::Solve()
@ 0x55d5cd3a35ce train()
@ 0x55d5cd39ca59 main
@ 0x7f6056c29bf7 __libc_start_main
@ 0x55d5cd39d6a8 (unknown)
Aborted (core dumped)
Elapsed time for Caffe training (s): 1077.31017
How can I solve this issue?
Hello, I wanted to do these tutorials, but I get this error when I execute the line import caffe. What should I do?
Dear all,
I am trying to follow the instructions in Design_Tutorials/07-yolov4-tutorial to run the network on my board (DPUCZDX8G_ISA0_B4096_MAX_BG2). Up to step 2.4 (Model Deployment) everything seems to work fine, and I am able to evaluate the network using the tf_eval_yolov4_coco_2017.py script. The results there are not good, but I get no errors.
The quantization and compilation processes finish correctly. The problem appears when I try to run the network on the board. Specifically, in the program test_jpeg_yolov4, the execution seems to stall when the image is passed to the network. I have read the code that executes the network; here is the code of the function:
// Entrance of jpeg demo
template <typename FactoryMethod, typename ProcessResult>
int main_for_jpeg_demo(int argc, char *argv[],
const FactoryMethod &factory_method,
const ProcessResult &process_result, int start_pos = 1) {
if (argc <= 1) {
usage_jpeg(argv[0]);
exit(1);
}
auto model = factory_method();
for (int i = start_pos; i < argc; ++i) {
auto image_file_name = std::string{argv[i]};
auto image = cv::imread(image_file_name);
if (image.empty()) {
LOG(FATAL) << "cannot load " << image_file_name << std::endl;
abort();
}
auto result = model->run(image);
image = process_result(image, result, true);
auto out_file =
image_file_name.substr(0, image_file_name.size() - 4) + "_result.jpg";
cv::imwrite(out_file, image);
LOG_IF(INFO, ENV_PARAM(DEBUG_DEMO)) << "result image write to " << out_file;
}
LOG_IF(INFO, ENV_PARAM(DEBUG_DEMO)) << "BYEBYE";
return 0;
}
When it enters result = model->run(image); the program seems to enter an infinite loop. I waited more than 24 hours to see whether the execution would produce the results, but the program never reaches the next instruction (image = process_result(image, result, true);).
What can cause this problem? Has anyone already experienced similar problems?
Many thanks
When running the app_mt.py file to run the xmodel file on the ZCU104, I receive a "Bus error" message when executing the line all_dpu_runners.append(vart.Runner.create_runner(subgraphs[0], "run")). What can I do?
Hey, I'd like to ask something.
I checked out this tutorial and found something strange in compile_target.sh:
it copies a file onto itself. Is that right? Please check.
CNN=miniResNet
cp ./src/top5_tf_main.cc ./tf_main.cc
cp ./model/dpu_${CNN}0.elf ./model/dpu${CNN}0.elf
make clean
make
mv ./${CNN} ./top5${CNN}
cp ./src/fps_tf_main.cc ./tf_main.cc
cp ./model/dpu_${CNN}0.elf ./model/dpu${CNN}0.elf
make clean
make
mv ./${CNN} ./fps${CNN}
Dear all.
I'm following the tutorial for object detection with a VOC-based YOLOv4 Darknet model (https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Design_Tutorials/07-yolov4-tutorial#31-darknet-model-training-on-voc) and trying to train the net, this time on the GTSDB dataset (German traffic signs), with the command
./darknet detector train cfg/voc.data cfg/yolov4.cfg /yolov4.weights -map -dont_show -show_imgs
Of course I edited "voc.data" so that it points to the right GTSDB files; I just forgot to rename the file.
I edited the cfg files as requested as well.
I'm working on an Ubuntu VM (god..) and need some hints and answers about the training process:
1) After running the train command, should I stop it manually (Ctrl-C) once I see that training has converged properly, or not?
2) Does training convergence in this case mean that the loss (or mAP?) stops decreasing? I used the -map parameter, but I honestly don't understand where that information is shown. This is a piece of the output:
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 133 Avg (IOU: 0.419995), count: 34, class_loss = 3764.068115, iou_loss = 28.210449, total_loss = 3792.278564
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 144 Avg (IOU: 0.253230), count: 5, class_loss = 1026.847412, iou_loss = 0.271484, total_loss = 1027.118896
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 155 Avg (IOU: 0.000000), count: 1, class_loss = 268.681885, iou_loss = 0.000000, total_loss = 268.681885
total_bbox = 39, rewritten_bbox = 0.000000 %
500504: 1169.513062, 1602.209351 avg loss, 0.000013 rate, 15818.413367 seconds, 32032256 images, 2120438.711756 hours left
3) MY MAIN PROBLEM: how can I save the weights during training?
3.1) And where can I find those .weights files? In the voc.data file I specified "backup = ./backup".
I let the process run all night, but I still can't see any weights file saved during training. Is it just a matter of time?
4) In the output, which number is the current iteration?
4.1) Is 1 iteration == 1 epoch?
Thank you for your time
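On questions 4 and 4.1, the status line in the output above can be read mechanically. The sketch below assumes the usual Darknet layout, in which the number before the colon is the iteration count and "images" is the cumulative number of images processed. The regular expression and field names are my own, written against the single line quoted above.

```python
import re

# Status line copied from the training output above.
STATUS_LINE = ("500504: 1169.513062, 1602.209351 avg loss, 0.000013 rate, "
               "15818.413367 seconds, 32032256 images, "
               "2120438.711756 hours left")

STATUS_PATTERN = re.compile(
    r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+) avg loss,\s*([\d.]+) rate,"
    r"\s*([\d.]+) seconds,\s*(\d+) images"
)


def parse_status(line):
    """Parse one Darknet status line; return None if it does not match."""
    match = STATUS_PATTERN.match(line)
    if match is None:
        return None
    return {
        "iteration": int(match.group(1)),
        "loss": float(match.group(2)),
        "avg_loss": float(match.group(3)),
        "rate": float(match.group(4)),
        "seconds": float(match.group(5)),
        "images": int(match.group(6)),
    }


status = parse_status(STATUS_LINE)
images_per_iteration = status["images"] // status["iteration"]
print("iteration:", status["iteration"])
print("images per iteration:", images_per_iteration)
```

For this line the ratio of images to iterations is exactly 64, which suggests that one iteration is one batch of 64 images rather than a full pass over the dataset, so an iteration is not an epoch unless the dataset happens to contain exactly one batch.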
Hi,
I am working on using the Vitis AI library with a custom Yolov4 model.
I have followed the steps of this tutorial (convert Darknet to TensorFlow, freeze, quantize, compile) : https://github.com/Xilinx/Vitis-Tutorials/tree/master/Machine_Learning/Design_Tutorials/07-yolov4-tutorial
I am using an Alveo U280 card, the Vitis AI Docker Image for CPU, and the TensorFlow 1 framework.
To deploy the model, I copied the folder obtained from the compilation step to the path " /usr/share/vitis_ai_library/models" (let "yolov4" be the name of the custom model and output folder) in order to be read by the Vitis AI library.
Here is the content of the folder :
And here is the content of a standard model from Model Zoo (https://github.com/Xilinx/Vitis-AI/blob/master/models/AI-Model-Zoo/model-list/dk_yolov3_bdd_288_512_53.7G_1.3/model.yaml) :
It seems that the meta.json "replaces" the model.prototxt.
I then ran the example code from https://github.com/Xilinx/Vitis-AI/tree/master/demo/Vitis-AI-Library/samples/yolov4
cd /usr/share/vitis_ai_library/samples/yolov4
./test_video_yolov4 yolov4
Here is the error message when I try to run the application.
The model name is the parameter of the following line of code :
vitis::ai::YOLOv3::create(model);
Maybe I am missing an argument when I run the vai_c_tensorflow command when compiling the model.
vai_c_tensorflow \
--frozen_pb ${QUANT}/quantize_eval_model.pb \
--arch ${ARCH} \
--output_dir ${COMPILE} \
--net_name ${MODEL_NAME} \
--options "{'mode':'normal','save_kernel':'', 'input_shape':'1,416,416,3'}"
I would greatly appreciate your help.
Best regards,
Luc
Hi all,
I followed the example in https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Design_Tutorials/09-mnist_pyt and successfully quantized my model and then compiled it to be CNN_zcu102.xmodel.
However, when I load the xmodel into the PYNQ DPU overlay (KV260), it shows the following error. Any advice on the problem?
It is noteworthy that I can successfully load the xmodel compiled from the example model.
The problem comes only when I change to my model, and there are no warnings during quantization and compiling.
I attach the quantized model for your reference.
import torch
import pytorch_nndct as py_nndct
class PoseResNet(torch.nn.Module):
def __init__(self):
super(PoseResNet, self).__init__()
self.module_0 = py_nndct.nn.Input() #PoseResNet::input_0
self.module_1 = py_nndct.nn.Conv2d(in_channels=3, out_channels=64, kernel_size=[7, 7], stride=[2, 2], padding=[3, 3], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Conv2d[conv1]/input.2
self.module_3 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/ReLU[relu]/3604
self.module_4 = py_nndct.nn.MaxPool2d(kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], ceil_mode=False) #PoseResNet::PoseResNet/MaxPool2d[maxpool]/input.4
self.module_5 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv1]/input.5
self.module_7 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.7
self.module_8 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv2]/input.8
self.module_10 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/input.9
self.module_11 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.10
self.module_12 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv1]/input.11
self.module_14 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.13
self.module_15 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv2]/input.14
self.module_17 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/input.15
self.module_18 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.16
self.module_19 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv1]/input.17
self.module_21 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.19
self.module_22 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv2]/input.20
self.module_24 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.21
self.module_26 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/input.22
self.module_27 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.23
self.module_28 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv1]/input.24
self.module_30 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.26
self.module_31 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv2]/input.27
self.module_33 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/input.28
self.module_34 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.29
self.module_35 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv1]/input.30
self.module_37 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.32
self.module_38 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv2]/input.33
self.module_40 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.34
self.module_42 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/input.35
self.module_43 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.36
self.module_44 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv1]/input.37
self.module_46 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.39
self.module_47 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv2]/input.40
self.module_49 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/input.41
self.module_50 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.42
self.module_51 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv1]/input.43
self.module_53 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.45
self.module_54 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv2]/input.46
self.module_56 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.47
self.module_58 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/input.48
self.module_59 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.49
self.module_60 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv1]/input.50
self.module_62 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/input.52
self.module_63 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv2]/input.53
self.module_65 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/input.54
self.module_66 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/4125
self.module_67 = py_nndct.nn.ConvTranspose2d(in_channels=512, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[0]/input.55
self.module_69 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[2]/4151
self.module_70 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[3]/input.57
self.module_72 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[5]/4177
self.module_73 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[6]/input.59
self.module_75 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[8]/input.61
self.module_76 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[0]/input.62
self.module_77 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/ReLU[1]/input.63
self.module_78 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[2]/4242
self.module_79 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[0]/input.64
self.module_80 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/ReLU[1]/input.65
self.module_81 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[2]/4281
self.module_82 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[0]/input.66
self.module_83 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[direction]/ReLU[1]/input.67
self.module_84 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[2]/4320
self.module_85 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[0]/input.68
self.module_86 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[z_coor]/ReLU[1]/input.69
self.module_87 = py_nndct.nn.Conv2d(in_channels=64, out_channels=1, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[2]/4359
self.module_88 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[0]/input.70
self.module_89 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[dim]/ReLU[1]/input
self.module_90 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[2]/4398
def forward(self, *args):
output_module_0 = self.module_0(input=args[0])
output_module_0 = self.module_1(output_module_0)
output_module_0 = self.module_3(output_module_0)
output_module_0 = self.module_4(output_module_0)
output_module_5 = self.module_5(output_module_0)
output_module_5 = self.module_7(output_module_5)
output_module_5 = self.module_8(output_module_5)
output_module_5 = self.module_10(input=output_module_5, other=output_module_0, alpha=1)
output_module_5 = self.module_11(output_module_5)
output_module_12 = self.module_12(output_module_5)
output_module_12 = self.module_14(output_module_12)
output_module_12 = self.module_15(output_module_12)
output_module_12 = self.module_17(input=output_module_12, other=output_module_5, alpha=1)
output_module_12 = self.module_18(output_module_12)
output_module_19 = self.module_19(output_module_12)
output_module_19 = self.module_21(output_module_19)
output_module_19 = self.module_22(output_module_19)
output_module_24 = self.module_24(output_module_12)
output_module_19 = self.module_26(input=output_module_19, other=output_module_24, alpha=1)
output_module_19 = self.module_27(output_module_19)
output_module_28 = self.module_28(output_module_19)
output_module_28 = self.module_30(output_module_28)
output_module_28 = self.module_31(output_module_28)
output_module_28 = self.module_33(input=output_module_28, other=output_module_19, alpha=1)
output_module_28 = self.module_34(output_module_28)
output_module_35 = self.module_35(output_module_28)
output_module_35 = self.module_37(output_module_35)
output_module_35 = self.module_38(output_module_35)
output_module_40 = self.module_40(output_module_28)
output_module_35 = self.module_42(input=output_module_35, other=output_module_40, alpha=1)
output_module_35 = self.module_43(output_module_35)
output_module_44 = self.module_44(output_module_35)
output_module_44 = self.module_46(output_module_44)
output_module_44 = self.module_47(output_module_44)
output_module_44 = self.module_49(input=output_module_44, other=output_module_35, alpha=1)
output_module_44 = self.module_50(output_module_44)
output_module_51 = self.module_51(output_module_44)
output_module_51 = self.module_53(output_module_51)
output_module_51 = self.module_54(output_module_51)
output_module_56 = self.module_56(output_module_44)
output_module_51 = self.module_58(input=output_module_51, other=output_module_56, alpha=1)
output_module_51 = self.module_59(output_module_51)
output_module_60 = self.module_60(output_module_51)
output_module_60 = self.module_62(output_module_60)
output_module_60 = self.module_63(output_module_60)
output_module_60 = self.module_65(input=output_module_60, other=output_module_51, alpha=1)
output_module_60 = self.module_66(output_module_60)
output_module_60 = self.module_67(output_module_60)
output_module_60 = self.module_69(output_module_60)
output_module_60 = self.module_70(output_module_60)
output_module_60 = self.module_72(output_module_60)
output_module_60 = self.module_73(output_module_60)
output_module_60 = self.module_75(output_module_60)
output_module_76 = self.module_76(output_module_60)
output_module_76 = self.module_77(output_module_76)
output_module_76 = self.module_78(output_module_76)
output_module_79 = self.module_79(output_module_60)
output_module_79 = self.module_80(output_module_79)
output_module_79 = self.module_81(output_module_79)
output_module_82 = self.module_82(output_module_60)
output_module_82 = self.module_83(output_module_82)
output_module_82 = self.module_84(output_module_82)
output_module_85 = self.module_85(output_module_60)
output_module_85 = self.module_86(output_module_85)
output_module_85 = self.module_87(output_module_85)
output_module_88 = self.module_88(output_module_60)
output_module_88 = self.module_89(output_module_88)
output_module_88 = self.module_90(output_module_88)
return output_module_76,output_module_79,output_module_82,output_module_85,output_module_88
`
I'm trying to run the MNIST-Classification-TensorFlow tutorial on the ZCU102.
I went through all the steps and generated the .elf file for the DPU. I've loaded a pre-compiled DPU-TRD image from https://www.xilinx.com/member/forms/download/design-license-xef.html?filename=zcu102-dpu-trd-2019-1-190809.zip which boots on my ZCU102 without issues.
The problem is that it appears that the above DPU-TRD image doesn't contain python packages needed by the script in Vitis-AI-Tutorials/files/build/target_zcu102/app_mt.py. Should I use a different precompiled DPU-TRD image?
root@zcu102-dpu-trd-2019:~# python3 app_mt.py -m model_dir/dpu_customcnn.elf
Traceback (most recent call last):
File "app_mt.py", line 17, in <module>
import runner
ImportError: No module named 'runner'
The resnet50 example, which is part of the DPU-TRD image, works fine.
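A quick way to see which Python runtime modules the board image actually provides is to probe for them before running app_mt.py. This is only a minimal sketch: `runner` is the module named in the traceback above, `vart` and `xir` are the modules imported by another script further down this page, and the helper name `probe_modules` is hypothetical. It avoids f-strings so that it should also run on an older Python 3 interpreter.

```python
import importlib.util


def probe_modules(names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names taken from the traceback and from scripts on this page.
    for name, found in sorted(probe_modules(["runner", "vart", "xir"]).items()):
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

If `runner` is reported as missing, the image does not ship that package, whatever the reason turns out to be.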
It sounds like the Ultra96-V2 does not support TensorFlow 2. Am I right?
Hi all,
I followed the example in https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Design_Tutorials/09-mnist_pyt and successfully quantized my model and then compiled it to CNN_zcu102.xmodel.
However, when I load the xmodel into the PYNQ DPU overlay (KV260), it shows the following error. Any advice on the problem?
It is noteworthy that I can successfully load the xmodel compiled from the example model.
The problem comes only when I change to my model, and there are no warnings during quantization and compiling.
In addition, I am using the same arch as in the example.
I attach the quantized model for your reference.
`
import torch
import pytorch_nndct as py_nndct
class PoseResNet(torch.nn.Module):
def __init__(self):
super(PoseResNet, self).__init__()
self.module_0 = py_nndct.nn.Input() #PoseResNet::input_0
self.module_1 = py_nndct.nn.Conv2d(in_channels=3, out_channels=64, kernel_size=[7, 7], stride=[2, 2], padding=[3, 3], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Conv2d[conv1]/input.2
self.module_3 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/ReLU[relu]/3604
self.module_4 = py_nndct.nn.MaxPool2d(kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], ceil_mode=False) #PoseResNet::PoseResNet/MaxPool2d[maxpool]/input.4
self.module_5 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv1]/input.5
self.module_7 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.7
self.module_8 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv2]/input.8
self.module_10 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/input.9
self.module_11 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.10
self.module_12 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv1]/input.11
self.module_14 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.13
self.module_15 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv2]/input.14
self.module_17 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/input.15
self.module_18 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.16
self.module_19 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv1]/input.17
self.module_21 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.19
self.module_22 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv2]/input.20
self.module_24 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.21
self.module_26 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/input.22
self.module_27 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.23
self.module_28 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv1]/input.24
self.module_30 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.26
self.module_31 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv2]/input.27
self.module_33 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/input.28
self.module_34 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.29
self.module_35 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv1]/input.30
self.module_37 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.32
self.module_38 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv2]/input.33
self.module_40 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.34
self.module_42 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/input.35
self.module_43 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.36
self.module_44 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv1]/input.37
self.module_46 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.39
self.module_47 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv2]/input.40
self.module_49 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/input.41
self.module_50 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.42
self.module_51 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv1]/input.43
self.module_53 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.45
self.module_54 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv2]/input.46
self.module_56 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.47
self.module_58 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/input.48
self.module_59 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.49
self.module_60 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv1]/input.50
self.module_62 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/input.52
self.module_63 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv2]/input.53
self.module_65 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/input.54
self.module_66 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/4125
self.module_67 = py_nndct.nn.ConvTranspose2d(in_channels=512, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[0]/input.55
self.module_69 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[2]/4151
self.module_70 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[3]/input.57
self.module_72 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[5]/4177
self.module_73 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[6]/input.59
self.module_75 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[8]/input.61
self.module_76 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[0]/input.62
self.module_77 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/ReLU[1]/input.63
self.module_78 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[2]/4242
self.module_79 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[0]/input.64
self.module_80 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/ReLU[1]/input.65
self.module_81 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[2]/4281
self.module_82 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[0]/input.66
self.module_83 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[direction]/ReLU[1]/input.67
self.module_84 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[2]/4320
self.module_85 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[0]/input.68
self.module_86 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[z_coor]/ReLU[1]/input.69
self.module_87 = py_nndct.nn.Conv2d(in_channels=64, out_channels=1, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[2]/4359
self.module_88 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[0]/input.70
self.module_89 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[dim]/ReLU[1]/input
self.module_90 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[2]/4398
def forward(self, *args):
output_module_0 = self.module_0(input=args[0])
output_module_0 = self.module_1(output_module_0)
output_module_0 = self.module_3(output_module_0)
output_module_0 = self.module_4(output_module_0)
output_module_5 = self.module_5(output_module_0)
output_module_5 = self.module_7(output_module_5)
output_module_5 = self.module_8(output_module_5)
output_module_5 = self.module_10(input=output_module_5, other=output_module_0, alpha=1)
output_module_5 = self.module_11(output_module_5)
output_module_12 = self.module_12(output_module_5)
output_module_12 = self.module_14(output_module_12)
output_module_12 = self.module_15(output_module_12)
output_module_12 = self.module_17(input=output_module_12, other=output_module_5, alpha=1)
output_module_12 = self.module_18(output_module_12)
output_module_19 = self.module_19(output_module_12)
output_module_19 = self.module_21(output_module_19)
output_module_19 = self.module_22(output_module_19)
output_module_24 = self.module_24(output_module_12)
output_module_19 = self.module_26(input=output_module_19, other=output_module_24, alpha=1)
output_module_19 = self.module_27(output_module_19)
output_module_28 = self.module_28(output_module_19)
output_module_28 = self.module_30(output_module_28)
output_module_28 = self.module_31(output_module_28)
output_module_28 = self.module_33(input=output_module_28, other=output_module_19, alpha=1)
output_module_28 = self.module_34(output_module_28)
output_module_35 = self.module_35(output_module_28)
output_module_35 = self.module_37(output_module_35)
output_module_35 = self.module_38(output_module_35)
output_module_40 = self.module_40(output_module_28)
output_module_35 = self.module_42(input=output_module_35, other=output_module_40, alpha=1)
output_module_35 = self.module_43(output_module_35)
output_module_44 = self.module_44(output_module_35)
output_module_44 = self.module_46(output_module_44)
output_module_44 = self.module_47(output_module_44)
output_module_44 = self.module_49(input=output_module_44, other=output_module_35, alpha=1)
output_module_44 = self.module_50(output_module_44)
output_module_51 = self.module_51(output_module_44)
output_module_51 = self.module_53(output_module_51)
output_module_51 = self.module_54(output_module_51)
output_module_56 = self.module_56(output_module_44)
output_module_51 = self.module_58(input=output_module_51, other=output_module_56, alpha=1)
output_module_51 = self.module_59(output_module_51)
output_module_60 = self.module_60(output_module_51)
output_module_60 = self.module_62(output_module_60)
output_module_60 = self.module_63(output_module_60)
output_module_60 = self.module_65(input=output_module_60, other=output_module_51, alpha=1)
output_module_60 = self.module_66(output_module_60)
output_module_60 = self.module_67(output_module_60)
output_module_60 = self.module_69(output_module_60)
output_module_60 = self.module_70(output_module_60)
output_module_60 = self.module_72(output_module_60)
output_module_60 = self.module_73(output_module_60)
output_module_60 = self.module_75(output_module_60)
output_module_76 = self.module_76(output_module_60)
output_module_76 = self.module_77(output_module_76)
output_module_76 = self.module_78(output_module_76)
output_module_79 = self.module_79(output_module_60)
output_module_79 = self.module_80(output_module_79)
output_module_79 = self.module_81(output_module_79)
output_module_82 = self.module_82(output_module_60)
output_module_82 = self.module_83(output_module_82)
output_module_82 = self.module_84(output_module_82)
output_module_85 = self.module_85(output_module_60)
output_module_85 = self.module_86(output_module_85)
output_module_85 = self.module_87(output_module_85)
output_module_88 = self.module_88(output_module_60)
output_module_88 = self.module_89(output_module_88)
output_module_88 = self.module_90(output_module_88)
return output_module_76,output_module_79,output_module_82,output_module_85,output_module_88
`
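For readers following the layer list above, the spatial size of the feature map can be traced with the standard output-size formulas for convolution and transposed convolution. This is a minimal sketch and not a diagnosis of the load error: the input size of 224 is an assumption (the post does not state one), and the helper names are hypothetical. The stride-1 3x3 convolutions, the Add and ReLU layers, and the five heads keep the spatial size, so they are omitted.

```python
def conv_out(size, kernel, stride, padding, dilation=1):
    """Output size of Conv2d / MaxPool2d (ceil_mode=False) along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def deconv_out(size, kernel, stride, padding, output_padding=0, dilation=1):
    """Output size of ConvTranspose2d along one dimension."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


def trace_backbone(size):
    """Follow only the layers of the listing that change the spatial size."""
    size = conv_out(size, kernel=7, stride=2, padding=3)      # module_1
    size = conv_out(size, kernel=3, stride=2, padding=1)      # module_4 (max pool)
    for _ in range(3):                                        # module_19, module_35, module_51
        size = conv_out(size, kernel=3, stride=2, padding=1)
    for _ in range(3):                                        # module_67, module_70, module_73
        size = deconv_out(size, kernel=4, stride=2, padding=1)
    return size


if __name__ == "__main__":
    # Assumed input height/width; replace with the real one.
    print(trace_backbone(224))
```

With these parameters the backbone reduces the input by a factor of 32 and the three transposed convolutions bring it back up by 8, so each head emits a map at one quarter of the input resolution.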
Typed:
root@xilinx-zcu102-2021_2:~/Vitis-AI/demo/Vitis-AI-Library/samples/yolov4# ./test_video_yolov4 dpu_yolov4 0 -t 6
[ WARN:0] global /usr/src/debug/opencv/4.4.0-r0/git/modules/videoio/src/cap_gstreamer.cpp (935) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
It just locks up and does nothing that I can see.
The monitor is connected via DisplayPort.
I noticed that the TensorFlow yolov4 example has been removed from the repo. The README says to use the pre-trained weights yolov4-leaky_best.weights.7z.001, but I could not find them in this repo. Could you also provide the pre-trained weights for the yolov4.cfg model? And why was this example removed from the Vitis AI design tutorials?
Thanks.
Bitstream Generation fails to complete with the error:
VPL-4: design did not meet timing - Design failed to meet timing.
Error from the Vivado (v2019.2.1) build log:
ERROR: [runtcl-1] design did not meet timing - Design failed to meet timing.
Failed timing checks (paths):
{ultra96v2_mipi_i/dpu_xrt_top_1/inst/u_631818d4/m_43dd20ae/u_b2263e3b/s_189e67da_reg[0]/C --> ultra96v2_mipi_i/axi_intc_0/U0/INTC_CORE_I/INTR_DETECT_GEN[0].LVL_DETECT_GEN.hw_intr_reg[0]/D}
Hello,
could you please confirm whether running
./docker_run.sh xilinx/vitis-ai-cpu:1.2.82
rather than
:latest
which is 1.3.x, is the proper way to work with Vitis AI 1.2?
It seems that "latest" is hard-coded in some of the scripts.
Thx Gerd
Hi
I downloaded Vitis IDE and docker. Is there any tutorial that can guide us to deploy our own model from scratch using the IDE or docker?
Thanks.
Hi,
After testing this design example, I noticed that any image not containing a cat or a dog is misclassified as a cat. Is there a way to fix this in the model? For instance:
xilinx-k26-starterkit-2021_2:~/target_kv260$ python3 app_single.py -i car001.jpg
Command line options:
--image_dir : images
--image : car001.jpg
--threads : 1
--model : customcnn.xmodel
Starting 1 threads...
image classified as : cat
xilinx-k26-starterkit-2021_2:~/target_kv260$ python3 app_single.py -i snowby.jpg
Command line options:
--image_dir : images
--image : snowby.jpg
--threads : 1
--model : customcnn.xmodel
Starting 1 threads...
image classified as : cat
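A two-class classifier that takes the argmax of its outputs always returns one of the two labels, whatever the input is. One possible workaround on the application side is to rescale the int8 outputs, apply softmax, and report "unknown" when the top probability is below a threshold. This is only a sketch under assumptions: the function name, the class order ("cat", "dog") and the 0.9 threshold are hypothetical, and `output_scale` stands for `1 / (2**fix_point)` of the output tensor. A network can still be confident on images it was never trained on, so this does not replace retraining with an additional "other" class.

```python
import numpy as np


def classify_with_reject(raw_output, output_scale, classes=("cat", "dog"), threshold=0.9):
    """Return (label, confidence); label is 'unknown' when confidence < threshold."""
    # Convert the fixed-point DPU output back to floating-point logits.
    logits = np.asarray(raw_output, dtype=np.float32) * output_scale
    # Numerically stable softmax.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    best = int(np.argmax(probs))
    label = classes[best] if probs[best] >= threshold else "unknown"
    return label, float(probs[best])


if __name__ == "__main__":
    # Made-up int8 outputs, with an assumed fix_point of 4 (scale 1/16).
    print(classify_with_reject([40, -40], 1 / 16))
    print(classify_with_reject([2, 1], 1 / 16))
```

The threshold would have to be tuned on held-out images, including ones that contain neither a cat nor a dog.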
Here is my modified script to classify a single image:
'''
Copyright 2020 Xilinx Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''

from ctypes import *
from typing import List
import cv2
import numpy as np
import vart
import os
import pathlib
import xir
import threading
import time
import sys
import argparse

divider = '------------------------------------'


def preprocess_fn(image_path, fix_scale):
    '''
    Image pre-processing.
    Rearranges from BGR to RGB then normalizes to range 0:1
    and then scales by input quantization scaling factor
    input arg: path of image file
    return: numpy array
    '''
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = image * (1/255.0) * fix_scale
    image = image.astype(np.int8)
    return image


def get_child_subgraph_dpu(graph: "Graph") -> List["Subgraph"]:
    assert graph is not None, "'graph' should not be None."
    root_subgraph = graph.get_root_subgraph()
    assert (root_subgraph is not None), "Failed to get root subgraph of input Graph object."
    if root_subgraph.is_leaf:
        return []
    child_subgraphs = root_subgraph.toposort_child_subgraph()
    assert child_subgraphs is not None and len(child_subgraphs) > 0
    return [
        cs
        for cs in child_subgraphs
        if cs.has_attr("device") and cs.get_attr("device").upper() == "DPU"
    ]


def runDPU(id,start,dpu,img):
    '''get tensor'''
    inputTensors = dpu.get_input_tensors()
    outputTensors = dpu.get_output_tensors()
    input_ndim = tuple(inputTensors[0].dims)
    output_ndim = tuple(outputTensors[0].dims)
    # we can avoid output scaling if use argmax instead of softmax
    #output_fixpos = outputTensors[0].get_attr("fix_point")
    #output_scale = 1 / (2**output_fixpos)
    batchSize = input_ndim[0]
    n_of_images = len(img)
    count = 0
    write_index = start
    ids=[]
    ids_max = 50
    outputData = []
    for i in range(ids_max):
        outputData.append([np.empty(output_ndim, dtype=np.int8, order="C")])
    while count < n_of_images:
        if (count+batchSize<=n_of_images):
            runSize = batchSize
        else:
            runSize=n_of_images-count
        '''prepare batch input/output '''
        inputData = []
        inputData = [np.empty(input_ndim, dtype=np.int8, order="C")]
        '''init input image to input buffer '''
        for j in range(runSize):
            imageRun = inputData[0]
            imageRun[j, ...] = img[(count + j) % n_of_images].reshape(input_ndim[1:])
        '''run with batch '''
        job_id = dpu.execute_async(inputData,outputData[len(ids)])
        ids.append((job_id,runSize,start+count))
        count = count + runSize
        if count<n_of_images:
            if len(ids) < ids_max-1:
                continue
        for index in range(len(ids)):
            dpu.wait(ids[index][0])
            write_index = ids[index][2]
            '''store output vectors '''
            for j in range(ids[index][1]):
                # we can avoid output scaling if use argmax instead of softmax
                # out_q[write_index] = np.argmax(outputData[0][j] * output_scale)
                out_q[write_index] = np.argmax(outputData[index][0][j])
                write_index += 1
        ids=[]


def app(image_dir, image, threads,model):
    global out_q
    out_q = [None]
    g = xir.Graph.deserialize(model)
    subgraphs = get_child_subgraph_dpu(g)
    all_dpu_runners = []
    for i in range(threads):
        all_dpu_runners.append(vart.Runner.create_runner(subgraphs[0], "run"))
    # input scaling
    input_fixpos = all_dpu_runners[0].get_input_tensors()[0].get_attr("fix_point")
    input_scale = 2**input_fixpos
    ''' preprocess images '''
    img = []
    path = os.path.join(image_dir,image)
    img.append(preprocess_fn(path, input_scale))
    '''run threads '''
    print('Starting',threads,'threads...')
    threadAll = []
    start=0
    for i in range(threads):
        if (i==threads-1):
            end = len(img)
        else:
            end = start+(len(img)//threads)
        in_q = img[start:end]
        t1 = threading.Thread(target=runDPU, args=(i,start,all_dpu_runners[i], in_q))
        threadAll.append(t1)
        start=end
    for x in threadAll:
        x.start()
    for x in threadAll:
        x.join()
    classes = ['dog','cat']
    prediction = classes[out_q[0]]
    print("image classified as : %s" % prediction)
    return


# only used if script is run as 'main' from command line
def main():
    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument('-d', '--image_dir', type=str, default='images', help='Path to folder of images. Default is images')
    ap.add_argument('-i', '--image', type=str, default='cat.27.jpg', help='Path to image. Default is 001.jpg')
    ap.add_argument('-t', '--threads', type=int, default=1, help='Number of threads. Default is 1')
    ap.add_argument('-m', '--model', type=str, default='customcnn.xmodel', help='Path of xmodel. Default is customcnn.xmodel')
    args = ap.parse_args()
    print ('Command line options:')
    print (' --image_dir : ', args.image_dir)
    print (' --image : ', args.image)
    print (' --threads : ', args.threads)
    print (' --model : ', args.model)
    app(args.image_dir,args.image,args.threads,args.model)


if __name__ == '__main__':
    main()
Thanks,
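One possible direction, offered only as a sketch: the script above takes the argmax of two outputs, so it must always answer either 'dog' or 'cat'. Re-enabling the output scaling that is commented out in runDPU and applying a softmax gives a probability that can be compared against a cut-off. The helper name classify_with_reject and the 0.9 threshold are hypothetical, not part of the tutorial, and the int8 values below are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_reject(raw_int8, fix_point, classes, threshold=0.9):
    """Return (label, probability); label is 'unknown' below the threshold.

    raw_int8  : int8 values read from the DPU output buffer
    fix_point : the output tensor's "fix_point" attribute
    threshold : hypothetical cut-off, to be tuned on validation data
    """
    scale = 1.0 / (2 ** fix_point)           # same scaling as the commented-out code
    probs = softmax([v * scale for v in raw_int8])
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return classes[best], probs[best]


# Made-up output values: one confident result, one ambiguous result.
print(classify_with_reject([-20, 52], 4, ["dog", "cat"]))
print(classify_with_reject([3, 5], 4, ["dog", "cat"]))
```

A two-class network can still be confidently wrong on images it was never trained on, so a threshold alone may not be enough; retraining with an additional "other" class is the alternative to consider.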
I trained this model on VOC, following every step of the tutorial, and after a long time it seems the training did not converge (if that is the right term). After running the score.sh script on snapshot_iter_120000.caffemodel I get the following (end of log):
I0311 19:11:02.286180 515 net.cpp:284] Network initialization done.
I0311 19:11:02.610352 515 net.cpp:823] Ignoring source layer mbox_loss
I0311 19:11:02.610754 515 caffe.cpp:574] Running for 4952 iterations.
I0311 19:20:58.740268 515 caffe.cpp:438] Test net output #0: detection_eval = 0.00244108
I have an RTX 3060, nvidia-smi output
Fri Mar 11 20:59:39 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| 0% 45C P8 25W / 170W | 1150MiB / 12288MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1122 G /usr/lib/xorg/Xorg 640MiB |
| 0 N/A N/A 1465 G /usr/bin/gnome-shell 139MiB |
| 0 N/A N/A 3933 G ...AAAAAAAAA= --shared-files 107MiB |
| 0 N/A N/A 443952 C caffe 191MiB |
| 0 N/A N/A 2159972 G ...952486002011016735,131072 65MiB |
+-----------------------------------------------------------------------------+
My training stopped accidentally shortly after iteration 50000, so I used the following command to resume:
caffe train -solver solver.prototxt -snapshot /workspace/SSD/workspace/Mobilenetv2-SSD/snapshots/snapshot_iter_50000.solverstate -gpu 0 2>&1 | tee SSD_train_2.log
but when I ran the score script on snapshot 20000, the result was worse:
I0311 21:09:44.394701 591 net.cpp:284] Network initialization done.
I0311 21:09:45.296046 591 net.cpp:823] Ignoring source layer mbox_loss
I0311 21:09:45.296661 591 caffe.cpp:574] Running for 4952 iterations.
I0311 21:21:22.749675 591 caffe.cpp:438] Test net output #0: detection_eval = 0.000904203
How can I solve this?
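To compare snapshots it can help to pull the detection_eval values out of each score log rather than reading them by eye. The following is a minimal sketch that assumes the log lines look like the ones quoted above; the function name detection_evals is hypothetical.

```python
import re

# Matches the metric as it appears in the Caffe log lines quoted above.
EVAL_RE = re.compile(r"detection_eval\s*=\s*([0-9.eE+-]+)")


def detection_evals(log_text):
    """Return every detection_eval value found in a Caffe log, in order."""
    return [float(m.group(1)) for m in EVAL_RE.finditer(log_text)]


sample = """I0311 19:20:58.740268 515 caffe.cpp:438] Test net output #0: detection_eval = 0.00244108
I0311 21:21:22.749675 591 caffe.cpp:438] Test net output #0: detection_eval = 0.000904203"""

for value in detection_evals(sample):
    print(value)
```

Running this over the score log of every saved snapshot would show whether the metric ever rises during training or stays near zero throughout.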
Hello, I just want to report some issues I had when trying to follow the Vitis™ 2020.2 / Vitis-AI™ 1.3 - Machine Learning Tutorial for the ZCU104, more specifically the demo application in module 6: 3.6 Usb Camera Input and Multi-Threads base on Vitis AI Library
When I tried compiling the application by running the build_app.sh
script, I encountered the following issues:
sudo apt-get install libgoogle-glog-dev libgflags-dev
and set up the SDK again.
CXXFLAGS= (...)-std=c++17
/Module_6/CMakeLists.txt
to instead link to ~/petalinux_sdk_2021.1/sysroots/cortexa72-cortexa53-xilinx-linux/usr/lib/libvitis_ai_library-xnnpp.so
by replacing xnnpp-xnnpp
with vitis_ai_library-xnnpp
after all these fixes I managed to compile the app and run it on my ZCU102 using a webcam!
re:
If using my pretrained model, you’ll need to extract it by right clicking “dpu_yolov4.elf.7z.001” and selecting ‘extract’.
Is there a link to a copy of this file?
Many thanks
Working now - I expect it was an issue with different versions.
I found that the cpp files from which we generate the binary are outdated and are using Vitis1.0,
for example from test_video_yolov3.cpp
#include <xilinx/ai/demo.hpp>
#include <xilinx/ai/yolov3.hpp>
#include <xilinx/ai/nnpp/yolov3.hpp>
Also, the build.sh script refers to old libraries such as ldpyolov3. Could the owner update the project to the new Vitis 1.2 base?
Hello,
I downloaded Vitis-AI-ssd/SSD and retrained Mobilenetv2-SSD in Vitis-AI-ssd/SSD/workspace/Mobilenetv2-SSD. The log is as follows:
I0521 00:17:13.147367 1523 solver.cpp:772] Iteration 6000, Testing net (#0)
I0521 00:17:13.148303 1523 net.cpp:743] Ignoring source layer mbox_loss
I0521 00:19:13.689599 1523 solver.cpp:885] Test net output #0: detection_eval = 0
I0521 00:19:19.519392 1523 solver.cpp:270] Iteration 6000 (0.149189 iter/s, 670.289s/100 iter), loss = 3.38898, remaining 212 hours and 14 minutes
I0521 00:19:19.519454 1523 solver.cpp:291] Train net output #0: mbox_loss = 3.49844 (* 1 = 3.49844 loss)
I0521 00:19:19.519470 1523 sgd_solver.cpp:106] Iteration 6000, lr = 0.001
...........................................
Why is "Test net output #0: detection_eval = 0"? This has confused me for almost the whole day, and I can't find any useful solution on Google.
(I use caffe-xilinx 1.1.)
train.sh:
/workspace/caffe-xilinx/build/tools/caffe train -solver="/workspace/Vitis-AI-ssd/SSD/workspace/Mobilenetv2-SSD/solver.prototxt"
-weights="/workspace/Vitis-AI-ssd/SSD/workspace/Mobilenetv2-SSD/pretrained.caffemodel" -gpu 0,1,2,3 2>&1 | tee train.log:q
/opt/petalinux/2021.1/sysroots/x86_64-petalinux-linux/usr/libexec/aarch64-xilinx-linux/gcc/aarch64-xilinx-linux/10.2.0/real-ld: cannot find -lxnnpp-xnnpp
collect2: error: ld returned 1 exit status
Tried to git clone the ssd folder but didn't work...
Hello, the following error occurred when I was quantizing the tf.pb file with 1.2.1.
Note that I converted the PyTorch model to TF, and after the conversion I tested the output of the TF model, which matched the PyTorch output, so there is no problem with the tf.pb file. Only the quantize_eval_model.pb file is generated under the vai_q_output folder.
Dear all,
I'm not sure which prototxt I should use when converting the darknet model to a caffemodel. Having cloned the repo, I have two prototxt files in two directories:
dpu_yolov4/dpu_yolov4.prototxt
dpu_yolov4_voc/dpu_yolov4_voc.prototxt
Should I use one of these two files, or should I generate my prototxt from my weights file and cfg?
In the second case, how do I generate it?
Many thanks
I got the following error message at step 6, "source 6_compile.sh".
Traceback (most recent call last):
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/vai_c_tensorflow", line 186, in <module>
compiler = VAI_TensorFlow_Frontend(args)
File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/vai_c_tensorflow", line 76, in __init__
with open(args.arch) as json_data:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/vitis_ai/compiler/arch/DPUCAHX8H/U50/arch.json'
It seems the file "arch.json" was not generated when I started this docker.
What should I do to fix that? Thanks.
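The traceback shows the compiler failing while opening the arch.json path it was given. A quick way to see which arch.json files the container actually ships is to list them, as in this minimal sketch. The default root is taken from the path in the traceback; the helper name find_arch_files is hypothetical, and whether a differently named target directory is the cause here is only an assumption.

```python
from pathlib import Path


def find_arch_files(root="/opt/vitis_ai/compiler/arch"):
    """Return sorted paths of every arch.json below root (empty if root is missing)."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(str(p) for p in base.rglob("arch.json"))


# Inside the Vitis AI docker this prints one line per available target.
for path in find_arch_files():
    print(path)
```

If the list contains a path for the same board under a different directory name, pointing the compile script at that path would be the first thing to try.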
I would like to run the simple inference application on my board using the pre-trained model available for the VGG-16 network. However, the pre-trained models seem to be corrupted, or at least I am unable to evaluate them on the host, even when using the corresponding 'deploy.prototxt' file. Can you provide a link or any other resource for the pre-trained model? Thanks in advance.
Hello,
I am wondering what the time frame is to move the MNIST tutorial to TensorFlow 2 and Python 3?
Thank you
Hello,
I was wondering whether it is possible to use the DPU as a feature extractor. How can I retrieve and copy the final feature vectors (the last feature map)?
I tried this in the main cc file:
feature = dpuGetOutputTensorInHWCFP32(taskResnet50, OUTPUT_NODE, FCResult, channel);
printf("features = %f\n\r", feature);
But I only obtained one value.
Any help please?
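A note on the single value, hedged because it rests on my reading of the DNNDK API: dpuGetOutputTensorInHWCFP32 appears to write the converted floats into the buffer argument (FCResult above) and return only a status code, so the features would be read from the buffer element by element. The conversion itself is a fixed-point scaling, sketched below in Python with made-up values; the name dequantize_features and the fix_point parameter (borrowed from the VART script earlier in this thread) are illustrative assumptions.

```python
def dequantize_features(raw_int8, fix_point):
    """Convert int8 DPU outputs to floats: value * 2**-fix_point."""
    scale = 2.0 ** -fix_point
    return [v * scale for v in raw_int8]


raw = [12, -3, 0, 127]      # made-up int8 values standing in for a feature map
features = dequantize_features(raw, fix_point=4)

# Every element of the buffer is one feature, so print them all.
for i, f in enumerate(features):
    print("feature[%d] = %f" % (i, f))
```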
I was doing the quantization part of the PyTorch flow, with torch==1.7 (required by my project) and Vitis AI 1.4, and came across the following error.
error info here:
[VAIQ_ERROR]: /opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/pytorch_nndct/nn/_kernels.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIN3c108BFloat16EEEPKNS_6detail12TypeMetaDataEv
I have tried importing pytorch_nndct before torch, but it did not work.
Has anyone had the same problem, and could someone help solve it? Thanks!
Can we simulate boards in the Vitis AI tool?
I have built everything successfully from step 0 to 7 of the tutorial ML/02-MNIST_classification_tf.
However, when I try to run the xmodel on an AWS F1 f1.2xlarge, I get the error below. Can you please confirm whether the U50 target can run on f1.2xlarge, and if not, which target should be used?
(vitis-ai-tensorflow) Vitis-AI /workspace/build/target_u50 > /usr/bin/python3 app_mt.py -m model_dir/customcnn.xmodel
Command line options:
--image_dir : images
--threads : 1
--model : model_dir/customcnn.xmodel
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0401 08:21:48.498520 440 dpu_controller.cpp:44] Check failed: !the_factory_methods.empty()
*** Check failure stack trace: ***
Aborted (core dumped)
We are referring to https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Design_Tutorials/07-yolov4-tutorial.
"When Num_classes is changed to 16 in dpu_yolov4.prototxt, the model does not work properly."
In our case we are doing transfer learning on a model pre-trained on the COCO dataset.
We do not get any errors or crashes, but when we deploy our model it does not work: it does not detect any classes at runtime. The model worked fine on CPU before conversion.
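One thing worth checking when the class count changes, offered as a sketch rather than a diagnosis: in darknet-style YOLO networks the convolution feeding each detection layer has anchors × (classes + 5) output channels (4 box coordinates, 1 objectness score, and one score per class). The cfg, the prototxt, and the post-processing code all need to agree on that number. The helper name yolo_head_filters is hypothetical, and three anchors per scale is the usual YOLOv3/YOLOv4 setting assumed here.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Channels of the conv layer feeding each YOLO layer: anchors * (classes + 5)."""
    return anchors_per_scale * (num_classes + 5)


print(yolo_head_filters(80))  # COCO, 80 classes
print(yolo_head_filters(16))  # the 16-class case described above
```

If the deployed prototxt or the application still assumes the 80-class layout while the weights were trained for 16 classes, the decoded boxes and scores would be misaligned without any crash, which would match the symptom described. Whether that is the actual cause here is an assumption.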