Comments (13)
The upper-layer demo software adds a lot of extra overhead. Run
echo 0x100 > /sys/module/rk_vcodec/parameters/mpp_dev_debug
and look at the hardware time instead.
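With that debug flag set, the kernel log lines in this thread carry a "time: N us" field per decoded frame. A minimal sketch for pulling that value out of a captured log line follows; the helper name parse_hw_time_us is hypothetical and not part of MPP, and the line format is assumed to match the rk_vcodec lines quoted in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the "time: N us" value from one rk_vcodec kernel log line.
 * Returns the time in microseconds, or -1 if the line has no such field.
 * Hypothetical helper for log post-processing; not an MPP API.
 */
long parse_hw_time_us(const char *line)
{
    static const char key[] = "time: ";
    const char *p;
    char *end;
    long us;

    assert(line != NULL);

    p = strstr(line, key);
    if (p == NULL)
        return -1;
    p += sizeof(key) - 1;           /* skip past "time: " */

    us = strtol(p, &end, 10);
    if (end == p || strncmp(end, " us", 3) != 0)
        return -1;                  /* no digits, or unit is not "us" */
    return us;
}
```

The mpi_dec_test summary line ("decode 480 frames time 952 ms ...") has no colon after "time", so it is not matched and only the hardware timings are collected.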
from mpp.
With the -o option the test writes the output to a file. Remove it.
from mpp.
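Performance is far better with file writing removed, which seems a bit unreasonable: internally it only gets the memory address via mpp_buffer_get_ptr and then copies the data into the file.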
rk_vcodec: fdf80200.rkvdec: pid: 18375, session: 00000000cb19fe7b, time: 3322 us
rk_vcodec: fdf80200.rkvdec: pid: 18375, session: 00000000cb19fe7b, time: 2526 us
mpi_dec_test: decode 480 frames time 952 ms delay 22 ms fps 503.99
Performance with file writing enabled:
rk_vcodec: fdf80200.rkvdec: pid: 17549, session: 000000006a21feb5, time: 3597 us
mpi_dec_test: decode 480 frames time 22411 ms delay 9 ms fps 21.42
from mpp.
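After further analysis, copying data out of the pointer returned by mpp_buffer_get_ptr is very slow.
Here is the log. dump_mpp_frame_to_file takes longer than one frame time, in the range of 58 to 68 ms per call: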
11-14 02:51:25.978 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:68433
11-14 02:51:25.978 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 472 fps:19.22 max_usage:9216000
11-14 02:51:26.036 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:57518
11-14 02:51:26.036 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 473 fps:19.22 max_usage:9216000
11-14 02:51:26.094 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:57636
11-14 02:51:26.094 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 474 fps:19.21 max_usage:9216000
11-14 02:51:26.152 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:57991
11-14 02:51:26.153 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 475 fps:19.21 max_usage:9216000
11-14 02:51:26.211 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:57649
11-14 02:51:26.215 5817 5822 I mpi_dec_test: 0xe84c0280 found last packet
11-14 02:51:26.223 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 3290 us
11-14 02:51:26.218 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 476 fps:19.20 max_usage:9216000
11-14 02:51:26.228 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 7940 us
11-14 02:51:26.228 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 6589 us
11-14 02:51:26.232 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 6481 us
11-14 02:51:26.232 0 0 I mpp_rkvdec2 fdf80200.rkvdec: resetting...
11-14 02:51:26.232 0 0 I mpp_rkvdec2 fdf80200.rkvdec: reset done
11-14 02:51:26.285 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:65800
11-14 02:51:26.286 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 477 fps:19.19 max_usage:9216000
11-14 02:51:26.345 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:58320
11-14 02:51:26.347 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 478 fps:19.18 max_usage:9216000
11-14 02:51:26.406 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:58768
11-14 02:51:26.407 5817 5822 I mpi_dec_test: 0xe84c0280 decode get frame 479 fps:19.17 max_usage:9216000 err 1 discard 0
11-14 02:51:26.407 5817 5822 I mpi_dec_test: dump_mpp_frame_to_file:2
11-14 02:51:26.408 5817 5822 I mpi_dec_test: 0xe84c0280 found last packet
11-14 02:51:26.408 5817 5822 I mpi_dec_test: decode 480 frames time 25001 ms delay 19 ms fps 19.20
11-14 02:51:26.423 5817 5817 I mpi_dec_test: test success max memory 8.79 MB
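The dump_mpp_frame_to_file:N values in the log above are consistent with elapsed microseconds (about 58000 per call, matching the roughly 58 ms spacing of the timestamps). A minimal sketch of how such a per-call time could be measured is below. It uses POSIX clock_gettime rather than whatever timer the poster actually used, which the thread does not show, and elapsed_us is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Elapsed microseconds between two CLOCK_MONOTONIC samples.
 * Works on total nanoseconds so a tv_nsec wrap-around is handled.
 */
int64_t elapsed_us(const struct timespec *start, const struct timespec *end)
{
    int64_t ns;

    assert(start != NULL && end != NULL);

    ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
       + (int64_t)(end->tv_nsec - start->tv_nsec);
    return ns / 1000;
}

/*
 * Usage sketch around the call being measured (frame and fp are assumed
 * to be the caller's MppFrame and FILE pointer):
 *
 *   struct timespec t0, t1;
 *   clock_gettime(CLOCK_MONOTONIC, &t0);
 *   dump_mpp_frame_to_file(frame, fp);
 *   clock_gettime(CLOCK_MONOTONIC, &t1);
 *   printf("dump_mpp_frame_to_file:%lld\n", (long long)elapsed_us(&t0, &t1));
 */
```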
The corresponding file modification:
case MPP_FMT_YUV420SP_VU :
case MPP_FMT_YUV420SP : {
    RK_U32 i;
    RK_U8 *base_y = base;
    RK_U8 *base_c = base + h_stride * v_stride;
    RK_U8 *tmp = mpp_malloc(RK_U8, h_stride * height * 3);
    memcpy(tmp, base, h_stride * height * 3/2);
    mpp_free(tmp);
    // for (i = 0; i < height; i++, base_y += h_stride) {
    //     fwrite(base_y, 1, width, fp);
    // }
    // for (i = 0; i < height / 2; i++, base_c += h_stride) {
    //     fwrite(base_c, 1, width, fp);
    // }
} break;
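For reference, the plane layout that the snippet above relies on (luma at base, interleaved chroma at base + h_stride * v_stride, each row h_stride bytes apart) can be written as a standalone helper. This is only a sketch of that layout using plain C types. The name copy_yuv420sp_packed is hypothetical, and it is not the optimization that was eventually applied, which the thread does not describe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the visible width x height region of a semi-planar YUV420 frame
 * (Y plane followed by an interleaved chroma plane) into a tightly
 * packed destination. dst must hold width * height * 3 / 2 bytes.
 * Returns the number of bytes written.
 */
size_t copy_yuv420sp_packed(uint8_t *dst, const uint8_t *base,
                            size_t width, size_t height,
                            size_t h_stride, size_t v_stride)
{
    const uint8_t *src_y = base;
    const uint8_t *src_c = base + h_stride * v_stride; /* chroma follows padded luma */
    uint8_t *out = dst;
    size_t i;

    assert(dst != NULL && base != NULL);
    assert(width <= h_stride && height <= v_stride);

    /* Luma: height rows, skipping the stride padding of each row. */
    for (i = 0; i < height; i++, src_y += h_stride, out += width)
        memcpy(out, src_y, width);

    /* Chroma: height / 2 rows of interleaved samples, same row pitch. */
    for (i = 0; i < height / 2; i++, src_c += h_stride, out += width)
        memcpy(out, src_c, width);

    return (size_t)(out - dst);
}
```

Each source byte is read exactly once and in address order, which matters when the source mapping is slow to read.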
from mpp.
Without the copy, the speed is very good:
11-14 02:54:01.634 7392 7397 I mpi_dec_test: 0xee2c0580 decode get frame 476 fps:467.07 max_usage:9216000
11-14 02:54:01.635 7392 7397 I mpi_dec_test: dump_mpp_frame_to_file:161
11-14 02:54:01.636 7392 7397 I mpi_dec_test: 0xee2c0580 decode get frame 477 fps:467.33 max_usage:9216000
11-14 02:54:01.636 7392 7397 I mpi_dec_test: dump_mpp_frame_to_file:142
11-14 02:54:01.642 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3871 us
11-14 02:54:01.643 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3643 us
11-14 02:54:01.638 7392 7397 I mpi_dec_test: 0xee2c0580 decode get frame 478 fps:467.16 max_usage:9216000
11-14 02:54:01.639 7392 7397 I mpi_dec_test: dump_mpp_frame_to_file:298
11-14 02:54:01.640 7392 7397 I mpi_dec_test: 0xee2c0580 decode get frame 479 fps:467.41 max_usage:9216000 err 1 discard 0
11-14 02:54:01.640 7392 7397 I mpi_dec_test: dump_mpp_frame_to_file:1
11-14 02:54:01.645 0 0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3714 us
11-14 02:54:01.640 7392 7397 I mpi_dec_test: 0xee2c0580 found last packet
11-14 02:54:01.640 7392 7397 I mpi_dec_test: decode 480 frames time 1039 ms delay 14 ms fps 461.83
from mpp.
Why is this copy so slow? Please help analyze it.
from mpp.
This is quite normal. Hardware buffers are non-cacheable by default. If you don't write to a file and don't map the buffer, it is of course fast.
from mpp.
Test results on the RK3288 platform are considerably better. Is there any way to optimize this? The A55 cores on the 3566 should not be this weak.
11-14 16:17:51.053 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:5705
11-14 16:17:51.053 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1790 fps:148.58 max_usage:9953280
11-14 16:17:51.058 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:5187
11-14 16:17:51.058 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1791 fps:148.59 max_usage:9953280
11-14 16:17:51.064 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:5500
11-14 16:17:51.064 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1792 fps:148.61 max_usage:9953280
11-14 16:17:51.069 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:4800
11-14 16:17:51.069 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1793 fps:148.63 max_usage:9953280
11-14 16:17:51.074 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:4502
11-14 16:17:51.074 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1794 fps:148.65 max_usage:9953280
11-14 16:17:51.079 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:5274
11-14 16:17:51.079 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1795 fps:148.67 max_usage:9953280
11-14 16:17:51.084 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:4562
11-14 16:17:51.084 6227 6232 I mpi_dec_test: 0xb1393000 found last packet
11-14 16:17:51.089 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1796 fps:148.63 max_usage:9953280
11-14 16:17:51.094 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:5077
11-14 16:17:51.096 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1797 fps:148.63 max_usage:9953280
11-14 16:17:51.100 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:4358
11-14 16:17:51.101 6227 6232 I mpi_dec_test: 0xb1393000 decode get frame 1798 fps:148.65 max_usage:9953280
11-14 16:17:51.106 6227 6232 I mpi_dec_test: dump_mpp_frame_to_file:4987
11-14 16:17:51.106 6227 6232 I mpi_dec_test: 0xb1393000 found last packet
11-14 16:17:51.106 6227 6232 I mpi_dec_test: decode 1799 frames time 12119 ms delay 18 ms fps 148.44
from mpp.
The 3288's CPU is strong... the 3566's cores are more than one tier below it.
from mpp.
Look at the hardware decode time. The software time is only a reference.
from mpp.
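The hardware decode times are about the same on both.
The large gap is in the time spent moving the decoded data. The 3288 uses DDR3 and the 3566 uses DDR4; however much the CPUs differ, copying 720P data should not take almost 10 times as long.
I have compared CPU benchmark scores for the 3288 and the 3566. The 3288 is indeed better, but the 3566 is not far behind: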
3288 CPU: 26332
3566 CPU: 24593
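To put the copy times in perspective, the per-frame time can be converted into an effective copy bandwidth. A minimal sketch follows, assuming a 1280x720 semi-planar YUV420 frame with no stride padding (the thread only says 720P, so the real strides and copied size are assumptions). Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one semi-planar YUV420 frame with no stride padding. */
size_t yuv420sp_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/*
 * Effective copy bandwidth in MB/s (1 MB = 10^6 bytes).
 * Bytes per microsecond is numerically equal to MB per second.
 */
double copy_bandwidth_mb_s(size_t bytes, unsigned long elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)bytes / (double)elapsed_us;
}
```

Under these assumptions a frame is 1382400 bytes, so the 57518 us call from the 3566 log works out to about 24 MB/s. If the same frame size applies to the 5187 us call in the other log, that is about 267 MB/s, roughly an 11x gap.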
from mpp.
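Got it optimized. The 64-bit build actually performs worse than the 32-bit build: 64-bit went from 45 fps to 90+ fps, while 32-bit went from 33 fps to 150+ fps, a huge improvement.
32-bit data: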
11-16 10:05:15.645 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1796 fps:136.03 max_usage:16588800
11-16 10:05:15.648 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2495
11-16 10:05:15.651 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1797 fps:136.04 max_usage:16588800
11-16 10:05:15.654 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2435
11-16 10:05:15.658 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1798 fps:136.05 max_usage:16588800
11-16 10:05:15.660 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2382
11-16 10:05:15.660 15677 15687 I mpi_dec_test: 0xeaf00460 found last packet
11-16 10:05:15.661 15677 15687 I mpi_dec_test: decode 1799 frames time 13243 ms delay 19 ms fps 135.84
11-16 10:05:15.681 15677 15677 I mpi_dec_test: test success max memory 15.82 MB
64-bit data:
11-16 10:06:41.503 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:6375
11-16 10:06:41.507 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 474 fps:85.83 max_usage:9216000
11-16 10:06:41.511 16503 16507 I utils : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118
11-16 10:06:41.511 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:4837
11-16 10:06:41.515 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 475 fps:85.88 max_usage:9216000
11-16 10:06:41.524 16503 16507 I utils : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118
11-16 10:06:41.524 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:8669
11-16 10:06:41.527 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 476 fps:85.87 max_usage:9216000
11-16 10:06:41.532 16503 16507 I utils : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118
from mpp.
The solution works. Closing this for now.
from mpp.