
HEVC-Complexity-Reduction's Introduction

Programs for our deep learning based complexity reduction approach for HEVC, at both intra- and inter-modes.

Deep Learning Based HM Encoder (Test at Intra-mode)

Related folder: HM-16.5_Test_AI/

This encoder is used for evaluating the performance of our deep ETH-CNN based approach [1] (improved from the conference version [2] published at IEEE ICME) under the All-Intra configuration. The main part is modified from the standard reference software HM 16.5 [3], coded in C++. The proposed ETH-CNN is implemented in TensorFlow, coded in Python 3.5. To evaluate the performance of our deep learning based approach, the Python program is invoked from inside the HM encoder. To encode a YUV file, the probability of CU partition for all the frames is predicted in advance, before the encoding process in HM actually starts. HM then reads the probabilities and compares them with the upper and lower thresholds at three levels to determine the final CU partition. In this way, most redundant RD cost checks can be skipped, which significantly reduces the overall encoding time.

Process

  1. Before encoding the first frame, HM invokes the Python program video_to_cu_depth.py via a command line with some essential parameters, including the YUV file name, frame width, frame height and QP.

  2. The Python program reads the YUV file and the other essential parameters, predicts the probability of CU partition for all the frames, and saves it in the file cu_depth.dat.

  3. HM encodes all the frames according to the predicted CU partition probabilities from cu_depth.dat, thus simplifying the RDO process by skipping redundant RD cost checks.
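
The entire interface between HM and the Python program thus consists of one command line and the file cu_depth.dat. Below is a minimal sketch of that contract, not the actual video_to_cu_depth.py: the 8-bit 4:2:0 input format and a cu_depth.dat layout of raw float32 probabilities (21 per CTU: 1 at the 64$\times$64 level, 4 at 32$\times$32 and 16 at 16$\times$16, following the hierarchical CU partition map in [1]) are assumptions here, and edge CTUs and the real network are omitted.

```python
# Sketch of the HM <-> Python contract at intra-mode (illustrative only).
import os
import sys

import numpy as np

def main():
    # HM calls: python video_to_cu_depth.py <yuv_file> <width> <height> <QP>
    assert len(sys.argv) == 5
    yuv_file = sys.argv[1]
    width, height, qp = int(sys.argv[2]), int(sys.argv[3]), int(sys.argv[4])

    frame_bytes = width * height * 3 // 2                    # 8-bit 4:2:0 assumed
    n_frames = os.path.getsize(yuv_file) // frame_bytes
    probs_per_frame = (height // 64) * (width // 64) * 21    # assumed layout

    probs = []
    with open(yuv_file, 'rb') as f:
        for _ in range(n_frames):
            frame = np.frombuffer(f.read(frame_bytes), dtype=np.uint8)
            luma = frame[:width * height].reshape(height, width)
            # The real program feeds each 64x64 CTU of `luma` (plus the QP)
            # into ETH-CNN; a constant stands in for the network output here.
            probs.append(np.full(probs_per_frame, 0.5, dtype=np.float32))

    np.concatenate(probs).tofile('cu_depth.dat')   # all frames, in advance

if __name__ == '__main__':
    main()
```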

Source

In HM 16.5, four C++ files have been modified, as listed below.

  • source/App/TAppEncoder/TAppEncCfg.cpp

  • source/Lib/TLibCommon/TComPic.h

  • source/Lib/TLibEncoder/TEncGOP.cpp

  • source/Lib/TLibEncoder/TEncCu.cpp

Among them, TAppEncCfg.cpp directly invokes the Python program.

Note: it is assumed that Python 3 is the default Python in the system (Windows or Linux). If Python 3 is not the default, please edit line 2319 of TAppEncCfg.cpp, changing the string python to python3, and then rebuild HM.

Also, two Python files are included for predicting the CU partition with the proposed deep network. The top-level file is video_to_cu_depth.py, which manages the overall procedure of applying ETH-CNN, together with necessary steps such as file reading/writing, data transfer and network feeding. The specific network architecture is defined in net_CNN.py.

For more details, please refer to the comments in these source files.


Running Instructions

  1. Install TensorFlow. Versions $\ge$ 1.8.0 are recommended.

  2. Change into HM-16.5_Test_AI/Release.

    Set the upper/lower thresholds for the 3-level CU partition in the file Thr_info.txt (a sketch of the resulting decision rule follows these steps).

    Format: [$\bar{\alpha}_1$ $\underline{\alpha}_1$ $\bar{\alpha}_2$ $\underline{\alpha}_2$ $\bar{\alpha}_3$ $\underline{\alpha}_3$]

    Example: [0.5 0.5 0.5 0.5 0.5 0.5]

  3. Run TAppEncoderStatic on Linux or TAppEncoder.exe on Windows.

    Examples: RUN_AI.sh and RUN_AI.bat
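
As a reading aid, here is a schematic of the bi-threshold decision these six numbers drive, following the description in [1]; this is a sketch, not the code in TEncCu.cpp. At each level, a CU whose split probability exceeds the upper threshold is split without checking the RD cost of the unsplit CU, one below the lower threshold is terminated without checking sub-CUs, and anything in between falls back to the normal RDO search.

```python
# Bi-threshold decision at one CU partition level (schematic sketch).
# bar_alpha[l] is the upper threshold, under_alpha[l] the lower one.
def cu_decision(prob, level, bar_alpha, under_alpha):
    """Return 'split', 'not_split', or 'rdo' (full RD-cost check)."""
    if prob >= bar_alpha[level]:
        return 'split'        # skip the RD check of the unsplit CU
    if prob <= under_alpha[level]:
        return 'not_split'    # skip the RD checks of all sub-CUs
    return 'rdo'              # uncertain region: let HM decide as usual

# With the example [0.5 0.5 0.5 0.5 0.5 0.5], both thresholds coincide,
# so the network decides every CU and no RDO fallback remains.
print(cu_decision(0.73, 0, [0.5] * 3, [0.5] * 3))   # -> split
```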

Note: It is highly recommended to run on a 64-bit Linux platform, which supports encoding high-resolution video sequences properly. To run on another platform, you need to rebuild the project and regenerate the executable files. Also, please make sure that the path of the YUV file is not too long (shorter than 900 characters), because the file path is part of the command line that invokes the Python program, and the maximum length of that command line is 1000 characters.

Deep Learning Based HM Encoder (Test at Inter-mode)

Related folder: HM-16.5_Test_LDP/

This encoder is used for evaluating the performance of our deep ETH-CNN + ETH-LSTM based approach [1] under the Low-Delay-P configuration. The main part is modified from the standard reference software HM 16.5, coded in C++. The proposed ETH-CNN and ETH-LSTM are implemented in TensorFlow, coded in Python 3.5. To evaluate the performance of our deep learning based approach, HM and the Python program run side by side and communicate via shared intermediate files. When encoding a YUV file, the CU partition is predicted frame by frame, in step with the encoding process in HM. For each frame, a simplified setting is adopted for quick pre-encoding, to obtain the residue of the frame in HM. Next, the residue is fed into ETH-CNN + ETH-LSTM in the Python program, which predicts the probability of CU partition for this frame. HM then reads the probabilities and compares them with the upper and lower thresholds at three levels to determine the final CU partition. In this way, most redundant RD cost checks can be skipped, which significantly reduces the overall encoding time.

Process

  1. While encoding one frame, HM adopts a simplified setting (both CU and PU sizes are fixed to the maximum, 64$\times$64, except at the right/bottom edges where full-size CTUs do not fit) to quickly pre-encode the frame and obtain its residue. The residue frame is then saved into the file resi.yuv. Note that this quick pre-encoding for the residue is not part of the standard HM; it is introduced by our approach to provide the input for ETH-CNN at inter-mode.

  2. In HM, some essential parameters are written into the file command.dat.

    Format: Frame_index Frame_width Frame_height QP [end]

    Example: 19 416 240 22 [end]

  3. HM generates a signal file pred_start.sig, indicating that the residue file and the command file are ready and the Python program can start to run. The Python program only checks whether the signal file exists, not its content, so the signal file can be empty, for simplicity.

  4. Once pred_start.sig is detected by the Python program, the essential files are read: the residue resi.yuv, the command file command.dat and the previous ETH-LSTM state state.dat (if it exists). (A sketch of this handshake follows these steps.)

  5. The residue frame and the LSTM state are fed into ETH-CNN + ETH-LSTM. ETH-LSTM then advances by one time step, and the file state.dat is updated. Also, the predicted CU partition probability for the whole frame is saved in the file cu_depth.dat.

  6. After updating the state and generating cu_depth.dat, the Python program generates an end signal pred_end.sig, indicating that the prediction for the current frame is finished.

  7. Once pred_end.sig is detected by HM, the probabilities for all possible CUs are read from cu_depth.dat, based on which the CU partition is determined.

  8. HM encodes the current frame according to the predicted CU partition, thus simplifying the RDO process by skipping redundant RD cost checks.
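
The following is a minimal sketch of the Python side of this handshake (steps 3 to 6). The signal-file protocol and the command.dat format follow the text above, while the cu_depth.dat/state.dat layouts and who deletes the signal files are assumptions, so treat it as a schematic rather than the code in resi_to_cu_depth_LDP.py.

```python
# Sketch of the per-frame HM <-> Python handshake at inter-mode.
import os
import time

import numpy as np

def serve_one_frame():
    while not os.path.exists('pred_start.sig'):      # step 3: wait for HM
        time.sleep(0.01)                             # file content is irrelevant
    os.remove('pred_start.sig')                      # assumed: Python consumes it

    # Step 4: command.dat holds "Frame_index Frame_width Frame_height QP [end]"
    idx, width, height, qp = open('command.dat').read().split()[:4]
    width, height = int(width), int(height)

    # Step 4: residue of the pre-encoded frame (8-bit 4:2:0 assumed)
    resi = np.fromfile('resi.yuv', dtype=np.uint8,
                       count=width * height * 3 // 2)

    # Step 5: the real program feeds `resi` and the previous state.dat into
    # ETH-CNN + ETH-LSTM; constants stand in for the network outputs here.
    probs = np.full((height // 64) * (width // 64) * 21, 0.5, dtype=np.float32)
    probs.tofile('cu_depth.dat')                     # CU partition probabilities
    np.zeros(1, dtype=np.float32).tofile('state.dat')   # updated LSTM state

    open('pred_end.sig', 'w').close()                # step 6: signal HM
```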

Source

In HM 16.5, five C++ files have been modified, as listed below.

  • source/Lib/TLibCommon/TComPic.h

  • source/Lib/TLibEncoder/TEncGOP.cpp

  • source/Lib/TLibEncoder/TEncCu.cpp

  • source/Lib/TLibEncoder/TEncSearch.cpp

  • source/Lib/TLibEncoder/TEncSlice.cpp

Among them, TEncGOP.cpp directly invokes ETH-CNN + ETH-LSTM.

Also, three Python files are included for predicting the CU partition with the proposed deep networks. The top-level file is resi_to_cu_depth_LDP.py, which manages the overall procedure of applying ETH-CNN + ETH-LSTM, together with necessary steps such as file reading/writing, data transfer and network feeding. The specific network architecture is defined in net_CNN_LSTM_one_step.py, and some constants are stored in config.py.

For more details, please refer to the comments in these source files.


Running Instructions

  1. Install TensorFlow. Versions $\ge$ 1.8.0 are recommended.

  2. Change into HM-16.5_Test_LDP/Release.

  3. Set the upper/lower thresholds for the 3-level CU partition in the file Thr_info.txt.

    Format: [$\bar{\alpha}_1$ $\underline{\alpha}_1$ $\bar{\alpha}_2$ $\underline{\alpha}_2$ $\bar{\alpha}_3$ $\underline{\alpha}_3$]

    Example: [0.4 0.6 0.3 0.7 0.2 0.8]

  4. Ensure that no signal file (pred_start.sig or pred_end.sig) exists, to avoid improper communication between the HM encoder and the Python program resi_to_cu_depth_LDP.py.

  5. Run resi_to_cu_depth_LDP.py with Python 3.5; the message Initializing. Please wait... is shown.

  6. Wait for about 1~10 s, until the message Python: Tensorflow initialized. is shown.

  7. Run TAppEncoderStatic on Linux or TAppEncoder.exe on Windows.

    Examples: RUN_LDP.sh and RUN_LDP.bat

Note: It is highly recommended to run on a 64-bit Linux platform, which supports encoding high-resolution video sequences properly. To run on another platform, you need to rebuild the project and regenerate the executable files.

Training at Intra-mode

Related folders: HM-16.5_Extract_Data/, AI_Info/, Extract_Data/ and ETH-CNN_Training_AI/

These programs are used for training the deep ETH-CNN under the All-Intra configuration.

Requires: 12 YUV files (and 96 Info_XX.dat files, optional)

  1. Build the database: compress the 12 YUV files with the encoder HM-16.5_Extract_Data/bin/TAppEncoderStatic at 4 QPs, to extract the str_XX.bin, Info_XX.dat and log_XX.txt files.

    (This step can be skipped, because all Info_XX.dat files are already provided in the folder AI_Info/.)

  2. Extract data:

    Configure the variables YUV_PATH_ORI and INFO_PATH in Extract_Data/extract_data_AI.py.

    Run Extract_Data/extract_data_AI.py to get the training, validation and test data. Each data file is shuffled while the program runs. Sample size: 4992 bytes (a loading sketch follows these steps).

  3. Train:

    Run ETH-CNN_Training_AI/train_CNN_CTU64.py following the instructions in ETH-CNN_Training_AI/readme.txt.
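
Since every extracted sample is a fixed 4992-byte record, the data files can be loaded as a flat byte matrix. The sketch below assumes, consistent with the training-code line quoted in the issues further down, that the first 4096 bytes of a record hold the 64$\times$64 luma CTU; the layout of the remaining 896 bytes (labels and side information) is left unparsed here.

```python
# Sketch: loading fixed-size samples produced by extract_data_AI.py.
import numpy as np

SAMPLE_BYTES = 4992   # from the text; the 4096 + 896 split is an assumption

def load_samples(path, max_samples=None):
    data = np.fromfile(path, dtype=np.uint8).reshape(-1, SAMPLE_BYTES)
    if max_samples is not None:
        data = data[:max_samples]      # cap memory use on the full dataset
    images = data[:, :4096].astype(np.float32).reshape(-1, 64, 64)
    rest = data[:, 4096:]              # labels/side info, layout not parsed here
    return images, rest
```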

Training at Inter-mode

Related folders: HM-16.5_Extract_Data/, HM-16.5_Resi_Pre/, LDP_Info/, Extract_Data/, ETH-CNN_Training_LDP/ and ETH-LSTM_Training_LDP/

These programs are used for training the deep ETH-CNN and ETH-LSTM under the Low-Delay-P configuration.

Requires: 111 YUV files (and 888 Info_XX.dat files, optional)

  1. Build the databases: (1) compress the 111 YUV files with the encoder HM-16.5_Extract_Data/bin/TAppEncoderStatic at 4 QPs, to extract the str_XX.bin, Info_XX.dat and log_XX.txt files.

    (This step can be skipped, because all Info_XX.dat files are already provided in the folder LDP_Info/.)

    (2) Compress the 111 YUV files with the encoder HM-16.5_Resi_Pre/bin/TAppEncoderStatic at 4 QPs, to obtain 444 resi_XX.yuv files.

  2. Extract data for ETH-CNN:

    Configure the variables CONFIG, YUV_PATH_RESI and INFO_PATH in Extract_Data/extract_data_LDP_LDB_RA.py.

    Run Extract_Data/extract_data_LDP_LDB_RA.py to get the training, validation and test data. For each data file, both shuffled and unshuffled versions are generated. Sample size: 16516 bytes.

  3. Train ETH-CNN: run ETH-CNN_Training_LDP/train_resi_CNN_CTU64.py for all 4 QPs at once.

  4. Extract data for ETH-LSTM:

    Configure the variables MODEL_FILE, INPUT_PATH and OUTPUT_PATH in ETH-LSTM_Training_LDP/get_LSTM_input.py.

    Run ETH-LSTM_Training_LDP/get_LSTM_input.py, using the unshuffled training/validation/test files from step 2 to generate the data for ETH-LSTM. Sample size: 37264 bytes (see the sketch after these steps).

  5. Train ETH-LSTM:

    Run ETH-LSTM_Training_LDP/train_LSTM_CTU64.py for the 4 QPs separately.
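
A note on step 4: ETH-LSTM is trained on sequences of the same CTU across consecutive frames, which is why it must consume the unshuffled files from step 2. The sketch below is schematic: the storage order (frame-major, CTU-minor) and the sequence length are assumptions, not taken from the repo.

```python
# Sketch: grouping unshuffled 16516-byte records into per-CTU sequences
# for ETH-LSTM. Storage order and seq_len are assumptions.
import numpy as np

SAMPLE_BYTES = 16516

def ctu_sequences(path, ctus_per_frame, seq_len=20):
    data = np.fromfile(path, dtype=np.uint8).reshape(-1, SAMPLE_BYTES)
    n_frames = len(data) // ctus_per_frame
    data = data[:n_frames * ctus_per_frame]
    data = data.reshape(n_frames, ctus_per_frame, SAMPLE_BYTES)
    for ctu in range(ctus_per_frame):               # same CTU position...
        for start in range(0, n_frames - seq_len + 1, seq_len):
            yield data[start:start + seq_len, ctu]  # ...across seq_len frames
```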

References

[1] M. Xu, T. Li, Z. Wang, X. Deng, R. Yang and Z. Guan, "Reducing Complexity of HEVC: A Deep Learning Approach", in IEEE Transactions on Image Processing (TIP), vol. 27, no. 10, pp. 5044-5059, Oct. 2018.

[2] T. Li, M. Xu and X. Deng, "A deep convolutional neural network approach for complexity reduction on intra-mode HEVC," 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, Hong Kong, 2017, pp. 1255-1260.

[3] JCT-VC, "HM Software," [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/, 2014 [Accessed 5-Nov-2016].


HEVC-Complexity-Reduction's Issues

Intra-mode test error

Hello,

When running RUN_AI.sh, the following error log is printed; it appears to be a buffer overflow. Could you tell me what might be causing this? Thanks.
PS: Running RUN_LDP.sh works fine.

The error log is as follows:

HM software: Encoder Version [16.5] (including RExt)[Linux][GCC 5.4.0][64 bit]


** WARNING: --SEIDecodedPictureHash is now disabled by default. **
** Automatic verification of decoded pictures by a **
** decoder requires this option to be enabled. **


*** buffer overflow detected ***: ./TAppEncoderStatic terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7f13edbac29f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f13edc4787c]
/lib/x86_64-linux-gnu/libc.so.6(+0x10d750)[0x7f13edc46750]
/lib/x86_64-linux-gnu/libc.so.6(+0x10cc59)[0x7f13edc45c59]
/lib/x86_64-linux-gnu/libc.so.6(_IO_default_xsputn+0xbc)[0x7f13edbb461c]
/lib/x86_64-linux-gnu/libc.so.6(_IO_vfprintf+0x1cc5)[0x7f13edb84905]
/lib/x86_64-linux-gnu/libc.so.6(__vsprintf_chk+0x84)[0x7f13edc45ce4]
/lib/x86_64-linux-gnu/libc.so.6(__sprintf_chk+0x7d)[0x7f13edc45c3d]
./TAppEncoderStatic[0x406a56]
./TAppEncoderStatic[0x41c6fe]
./TAppEncoderStatic[0x404ad9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f13edb5af45]
./TAppEncoderStatic[0x405c49]
======= Memory map: ========
00400000-00528000 r-xp 00000000 08:03 77991474 /home/yangliu/HEVC-Complexity-Reduction/HM-16.5_AI/bin/vc10/x64/Release/TAppEncoderStatic
00727000-0072a000 r--p 00127000 08:03 77991474 /home/yangliu/HEVC-Complexity-Reduction/HM-16.5_AI/bin/vc10/x64/Release/TAppEncoderStatic
0072a000-0072b000 rw-p 0012a000 08:03 77991474 /home/yangliu/HEVC-Complexity-Reduction/HM-16.5_AI/bin/vc10/x64/Release/TAppEncoderStatic
0072b000-0072d000 rw-p 00000000 00:00 0
00c78000-00d92000 rw-p 00000000 00:00 0 [heap]
7f13edb39000-7f13edcf7000 r-xp 00000000 08:01 1847035 /lib/x86_64-linux-gnu/libc-2.19.so
7f13edcf7000-7f13edef7000 ---p 001be000 08:01 1847035 /lib/x86_64-linux-gnu/libc-2.19.so
7f13edef7000-7f13edefb000 r--p 001be000 08:01 1847035 /lib/x86_64-linux-gnu/libc-2.19.so
7f13edefb000-7f13edefd000 rw-p 001c2000 08:01 1847035 /lib/x86_64-linux-gnu/libc-2.19.so
7f13edefd000-7f13edf02000 rw-p 00000000 00:00 0
7f13edf02000-7f13edf13000 r-xp 00000000 08:03 4462279 /home/yangliu/work/anaconda2/lib/libgcc_s.so.1
7f13edf13000-7f13ee112000 ---p 00011000 08:03 4462279 /home/yangliu/work/anaconda2/lib/libgcc_s.so.1
7f13ee112000-7f13ee113000 r--p 00010000 08:03 4462279 /home/yangliu/work/anaconda2/lib/libgcc_s.so.1
7f13ee113000-7f13ee114000 rw-p 00011000 08:03 4462279 /home/yangliu/work/anaconda2/lib/libgcc_s.so.1
7f13ee114000-7f13ee219000 r-xp 00000000 08:01 1847021 /lib/x86_64-linux-gnu/libm-2.19.so
7f13ee219000-7f13ee418000 ---p 00105000 08:01 1847021 /lib/x86_64-linux-gnu/libm-2.19.so
7f13ee418000-7f13ee419000 r--p 00104000 08:01 1847021 /lib/x86_64-linux-gnu/libm-2.19.so
7f13ee419000-7f13ee41a000 rw-p 00105000 08:01 1847021 /lib/x86_64-linux-gnu/libm-2.19.so
7f13ee41a000-7f13ee544000 r-xp 00000000 08:03 51643047 /home/yangliu/work/anaconda2/lib/libstdc++.so.6.0.24
7f13ee544000-7f13ee743000 ---p 0012a000 08:03 51643047 /home/yangliu/work/anaconda2/lib/libstdc++.so.6.0.24
7f13ee743000-7f13ee74d000 r--p 00129000 08:03 51643047 /home/yangliu/work/anaconda2/lib/libstdc++.so.6.0.24
7f13ee74d000-7f13ee751000 rw-p 00133000 08:03 51643047 /home/yangliu/work/anaconda2/lib/libstdc++.so.6.0.24
7f13ee751000-7f13ee754000 rw-p 00000000 00:00 0
7f13ee754000-7f13ee777000 r-xp 00000000 08:01 1847024 /lib/x86_64-linux-gnu/ld-2.19.so
7f13ee94c000-7f13ee951000 rw-p 00000000 00:00 0
7f13ee973000-7f13ee976000 rw-p 00000000 00:00 0
7f13ee976000-7f13ee977000 r--p 00022000 08:01 1847024 /lib/x86_64-linux-gnu/ld-2.19.so
7f13ee977000-7f13ee978000 rw-p 00023000 08:01 1847024 /lib/x86_64-linux-gnu/ld-2.19.so
7f13ee978000-7f13ee979000 rw-p 00000000 00:00 0
7ffdb476c000-7ffdb478e000 rw-p 00000000 00:00 0 [stack]
7ffdb47f5000-7ffdb47f7000 r--p 00000000 00:00 0 [vvar]
7ffdb47f7000-7ffdb47f9000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
./RUN_AI.sh: line 1: 4769 Aborted (core dumped) ./TAppEncoderStatic -c encoder_yuv_source.cfg -c encoder_intra_main.cfg

Question about LDP_Info

Hello, I would like to ask: when I open the files with the index suffix, I find that the QP values are all different, and only the first line is 22, 27, 32, 37. Could you tell me why? Thanks.

Question: why does my reproduction fail?

Hello, I ran into problems while reproducing your results and would like to ask for advice.
Environment: Windows 10, Python 3.5.4 and the CPU version of TensorFlow.
Running RUN_AI gives the following result:
(1) video_to_cu_depth.py is invoked successfully and the CU prediction for all frames completes;
(2) immediately afterwards: "TAppEncoder.exe has stopped working".
What might cause this? Any advice would be appreciated.

Question about LSTM

Hi Tianyi,

I am trying to reproduce your method on VVEnc. I use your training code and the same network, except that I generate the residual with IME for each 8x8 block in the 64x64 CTU, and I use the slice QP instead of the sequence QP.

Then I tried to predict with your provided models, both LDP-CNN and LDP-LSTM. For 32x32 blocks, setting the threshold to 0.5, the LDP-CNN model gets about 75% accuracy, while the LSTM gets very low accuracy. I also tried to train LDP-CNN and LDP-LSTM with my own data; similarly, LDP-CNN gets about 75% accuracy, but the LSTM gets about 60%.

I use the same training and validation dataset for LDP-CNN and LDP-LSTM, so I just don't know what the problem with the LSTM is. Is the LSTM a good method for this prediction, from your point of view?

Besides, how do you calculate accuracy? The average accuracy for all the samples, or the average of negative sample accuracy and positive sample accuracy?

Looking forward to your reply. Best regards!

Model training question

When switching from the small sample set to training on all samples, what needs to be changed in input_data.py and train_CNN_CTU64.py? Is it enough to modify only the number of training samples? Training fails at images = data[:, 0:4096].astype(np.float32) with: MemoryError: Unable to allocate 1.22 GiB for an array with shape (80000, 4096) and data type float32

Problem when running the HM-16.5_Test_AI\build project

Hello,
After reading your paper, I decided to reproduce the results, but I ran into some problems along the way and could not figure out where things go wrong, despite studying it for a long time. The details are below. My setup is VS2015, TensorFlow 1.10 and Python 3.5. When I run HM-16.5_Test_AI\build, no encoding information appears. The output is as follows:

`Tensor("Conv2D:0", shape=(?, 1, 1, 1), dtype=float32)
Tensor("ResizeNearestNeighbor:0", shape=(?, 16, 16, 1), dtype=float32)
Tensor("LeakyRelu:0", shape=(?, 4, 4, 16), dtype=float32)
Tensor("LeakyRelu_1:0", shape=(?, 2, 2, 24), dtype=float32)
Tensor("LeakyRelu_2:0", shape=(?, 1, 1, 32), dtype=float32)
Tensor("Conv2D_4:0", shape=(?, 2, 2, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_1:0", shape=(?, 32, 32, 1), dtype=float32)
Tensor("LeakyRelu_3:0", shape=(?, 8, 8, 16), dtype=float32)
Tensor("LeakyRelu_4:0", shape=(?, 4, 4, 24), dtype=float32)
Tensor("LeakyRelu_5:0", shape=(?, 2, 2, 32), dtype=float32)
Tensor("Conv2D_8:0", shape=(?, 4, 4, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_2:0", shape=(?, 64, 64, 1), dtype=float32)
Tensor("LeakyRelu_6:0", shape=(?, 16, 16, 16), dtype=float32)
Tensor("LeakyRelu_7:0", shape=(?, 8, 8, 24), dtype=float32)
Tensor("LeakyRelu_8:0", shape=(?, 4, 4, 32), dtype=float32)
Tensor("concat:0", shape=(?, 2688), dtype=float32)
Tensor("cond/Merge:0", shape=(?, 64), dtype=float32)
Tensor("cond_1/Merge:0", shape=(?, 48), dtype=float32)
Tensor("cond_2/Merge:0", shape=(?, 1), dtype=float32)
Tensor("cond_3/Merge:0", shape=(?, 128), dtype=float32)
Tensor("cond_4/Merge:0", shape=(?, 96), dtype=float32)
Tensor("cond_5/Merge:0", shape=(?, 4), dtype=float32)
Tensor("cond_7/Merge:0", shape=(?, 256), dtype=float32)
Tensor("cond_8/Merge:0", shape=(?, 192), dtype=float32)
Tensor("cond_9/Merge:0", shape=(?, 16), dtype=float32)

D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv frame 501/501 416x240
Predicting Time : 7.986 sec.

HM software: Encoder Version [16.5] (including RExt)[Windows][VS 1900][64 bit]

python video_to_cu_depth.py D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv 416 240 32

Input File : D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv
Bitstream File : str.bin
Reconstruction File : rec.yuv
Real Format : 416x240 50Hz
Internal Format : 416x240 50Hz
Sequence PSNR output : Linear average only
Sequence MSE output : Disabled
Frame MSE output : Disabled
Cabac-zero-word-padding : Enabled
Frame/Field : Frame based coding
Frame index : 0 - 99 (100 frames)
Profile : main
CU size / depth / total-depth : 64 / 4 / 4
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 3
Max RQT depth intra : 3
Min PCM size : 8
Motion search range : 64
Intra period : 1
Decoding refresh type : 0
QP : 32.00
Max dQP signaling depth : 0
Cb QP Offset : 0
Cr QP Offset : 0
QP adaptation : 0 (range=0)
GOP size : 1
Input bit depth : (Y:8, C:8)
MSB-extended bit depth : (Y:8, C:8)
Internal bit depth : (Y:8, C:8)
PCM sample bit depth : (Y:8, C:8)
Intra reference smoothing : Enabled
diff_cu_chroma_qp_offset_depth : -1
extended_precision_processing_flag : Disabled
implicit_rdpcm_enabled_flag : Disabled
explicit_rdpcm_enabled_flag : Disabled
transform_skip_rotation_enabled_flag : Disabled
transform_skip_context_enabled_flag : Disabled
cross_component_prediction_enabled_flag: Disabled
high_precision_offsets_enabled_flag : Disabled
persistent_rice_adaptation_enabled_flag: Disabled
cabac_bypass_alignment_enabled_flag : Disabled
log2_sao_offset_scale_luma : 0
log2_sao_offset_scale_chroma : 0
Cost function: : Lossy coding (default)
RateControl : 0
Max Num Merge Candidates : 5

TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 SQP:0 ASR:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2 WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0

Non-environment-variable-controlled macros set as follows:

                        RExt__DECODER_DEBUG_BIT_STATISTICS =   0
                              RExt__HIGH_BIT_DEPTH_SUPPORT =   0
                    RExt__HIGH_PRECISION_FORWARD_TRANSFORM =   0
                                O0043_BEST_EFFORT_DECODING =   0

           Input ChromaFormatIDC =   4:2:0

Output (internal) ChromaFormatIDC = 4:2:0
Press any key to continue . . .
Then I ran video_to_cu_depth.py under D:\HM\HM-16.5_Test_AI\bin on its own, and got:
Tensor("Conv2D:0", shape=(?, 1, 1, 1), dtype=float32)
Tensor("ResizeNearestNeighbor:0", shape=(?, 16, 16, 1), dtype=float32)
Tensor("LeakyRelu:0", shape=(?, 4, 4, 16), dtype=float32)
Tensor("LeakyRelu_1:0", shape=(?, 2, 2, 24), dtype=float32)
Tensor("LeakyRelu_2:0", shape=(?, 1, 1, 32), dtype=float32)
Tensor("Conv2D_4:0", shape=(?, 2, 2, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_1:0", shape=(?, 32, 32, 1), dtype=float32)
Tensor("LeakyRelu_3:0", shape=(?, 8, 8, 16), dtype=float32)
Tensor("LeakyRelu_4:0", shape=(?, 4, 4, 24), dtype=float32)
Tensor("LeakyRelu_5:0", shape=(?, 2, 2, 32), dtype=float32)
Tensor("Conv2D_8:0", shape=(?, 4, 4, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_2:0", shape=(?, 64, 64, 1), dtype=float32)
Tensor("LeakyRelu_6:0", shape=(?, 16, 16, 16), dtype=float32)
Tensor("LeakyRelu_7:0", shape=(?, 8, 8, 24), dtype=float32)
Tensor("LeakyRelu_8:0", shape=(?, 4, 4, 32), dtype=float32)
Tensor("concat:0", shape=(?, 2688), dtype=float32)
Tensor("cond/Merge:0", shape=(?, 64), dtype=float32)
Tensor("cond_1/Merge:0", shape=(?, 48), dtype=float32)
Tensor("cond_2/Merge:0", shape=(?, 1), dtype=float32)
Tensor("cond_3/Merge:0", shape=(?, 128), dtype=float32)
Tensor("cond_4/Merge:0", shape=(?, 96), dtype=float32)
Tensor("cond_5/Merge:0", shape=(?, 4), dtype=float32)
Tensor("cond_7/Merge:0", shape=(?, 256), dtype=float32)
Tensor("cond_8/Merge:0", shape=(?, 192), dtype=float32)
Tensor("cond_9/Merge:0", shape=(?, 16), dtype=float32)
File"video_to_cu_depth.py",line 120,in
assert len(sys.argv)==5
AssertionError
Why does this happen??? If you can answer, many thanks!

Could you tell me about the training method?

Hi,

I'm interested in your HEVC complexity reduction work and want to study it in depth.
Could you possibly tell me how to run the training in TensorFlow?
I would be grateful if you could send the training framework to me.
This is my email address.
[email protected]

Best Regards

Why does the intra-mode training fail to reproduce? Extremely large valid loss

Hello! I downloaded your code and directly ran train_CNN_CTU64.py in HEVC-Complexity-Reduction-master\ETH-CNN_Training_AI. Since the readme says it can be run directly, I did not modify anything.
However, the result plot I get is very different from the one originally provided in the models folder, as shown below:
[image]
One of the results you provided (originally in the models folder) is as follows:
[image]
As you can see, my valid loss and related values are very abnormal (extremely large), but having only just started studying the code, I am not yet sure what causes the problem. Could you explain it? Thanks!

(My setup is Windows 10, GTX 1060 Max-Q, Python 3.7, TensorFlow 1.13.1, running in PyCharm.)
Also, my email is [email protected]

Issues with tensorflow and python versions

Which versions of TensorFlow (with GPU), Python and Visual Studio are suitable for running this code?

My system specifications are:

Windows 10
64-bit
GPU: NVIDIA GeForce GTX 1650

No issue

Thanks for your great and amazing work!!!

NaN loss

Hello. When I train the model, the value of the loss function is NaN. Where is the problem?

step 11000: loss=[[nan nan nan] [nan nan nan]], accu=[[0.231 0.479 0.462] [0.210 0.494 0.441]],

Question on connecting the code pieces

Hello, I am very glad to have found your code; it is a real treasure!

If I understand correctly, you use deep learning to predict the CU partition, and then encode with the predicted CU partition.
My question is: through which lines or which parts of the code is the predicted CU partition imported into HM for encoding?

Many thanks!
My email is [email protected]

Network optimization: accuracy = 1.000000, 0.802571, 0.768633

Hello, I set the number of iterations to 1,000,000 as in the code, but the validation and test accuracies are only around 80% and 70-something percent, while the training accuracy has reached 100%. The training loss and accuracy curves do not show an overfitting trend either. Do you have any suggestions for further optimization in this situation?

Could you tell me about the inter-mode training method?

Hi,

Recently I have been trying to reproduce the training program for inter-mode as described in your paper "Reducing Complexity of HEVC: A Deep Learning Approach", but got stuck. I would be grateful if you could send me the inter-mode training program. My email address is: [email protected].

Best regards.

About the whole dataset for training/validation/testing

Hi,
I have been following your ETH-CNN architecture and studying it for a good while, and I am glad to see you have further updated your work. Thanks for your contribution!
However, during my training process, the biggest issue is the dataset. I tried to reproduce the training step but could not get a good result due to the small training set.
Could you possibly tell me how to generate the whole dataset?
This is my email address:
[email protected]
I would be grateful if you could send the code for this part to me.

Sincerely,
Hong

Is there a good way for HM to call a model trained with Python?

Training networks in Python is convenient, since TensorFlow, PyTorch and so on can all be used, but importing the trained model into HM for prediction does not seem very convenient, and it is slow. Have you used any effective method for this?

Thanks, and best wishes!

About Thr_info.txt

Hello. For HM-16.5_AI (intra-mode), the content of Thr_info.txt is 8 0.5 0.5 0.5 0.5 0.5 0.5. As I understand it, the thresholds should be six numbers. Should the 8 be removed? Does the 8 serve any special purpose?

Question: about building the database

Hello, I read the part about building the database, but I did not understand how it is constructed. Since I need training, validation and test data for screen content video coding, could you explain how you extract the depths of the encoded CUs and the CU luma pixel values to build the database?
I hope you can answer; many thanks!

Question about the dataset

Hello,

I have recently been trying to reproduce the intra-mode CTU partition part of your paper "Reducing Complexity of HEVC: A Deep Learning Approach". Could you send me a copy of the complete dataset? My email is [email protected]. My current reproduction results are poor and I have only just started, so I hope to get your help. Thank you!!!
The original YUV data you provide is split into 7 compressed folders, and I do not know in what order multiple YUV files of the same resolution should be merged into one; if the order is wrong, the corresponding labels will be wrong too.

About sample size = 4992

Hello, why is the sample size set to 4992? The first 4096 bytes are the pixels of one CTU; what data do the remaining 64 + 16×52 bytes contain? Looking forward to your reply, thanks.

The purpose of y_image_valid_32 and y_image_valid_16

Hi,

According to the code in net_CTU64.py, it is necessary to get the ground truth for 64×64, 32×32 and 16×16 with the HCPM map as input, so you first reshape the input y into y_image and then get 3 maps: y_image_16, y_image_32 and y_image_64.
My question is: why do we still need y_image_valid_32 and y_image_valid_16? What is the meaning of these two terms, since they are both used in the cross-entropy loss and accuracy?
Thanks!

Best Regards.
