Comments (13)

HermanChen commented on May 30, 2024

The upper-layer demo software adds a lot of extra overhead.
echo 0x100 > /sys/module/rk_vcodec/parameters/mpp_dev_debug
Then look at the hardware time.

from mpp.

HermanChen commented on May 30, 2024

You passed the -o option, so the test writes the output to a file. Remove it.

from mpp.

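Justa-Cai commented on May 30, 2024

Performance is far better without the file write, which seems odd: all the dump code does is get the memory address with mpp_buffer_get_ptr and copy the data to the file.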

rk_vcodec: fdf80200.rkvdec: pid: 18375, session: 00000000cb19fe7b, time: 3322 us
rk_vcodec: fdf80200.rkvdec: pid: 18375, session: 00000000cb19fe7b, time: 2526 us
mpi_dec_test: decode 480 frames time 952 ms delay  22 ms fps 503.99

Performance with the file write enabled:

rk_vcodec: fdf80200.rkvdec: pid: 17549, session: 000000006a21feb5, time: 3597 us
mpi_dec_test: decode 480 frames time 22411 ms delay   9 ms fps 21.42
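
As a rough cross-check of the two summary lines above, the average wall time per frame follows from the frame count and the total time. The sketch below is not part of mpp; `frame_time_ms` and `dump_overhead_ms` are hypothetical helper names, and it assumes both runs decoded the same 480-frame stream.

```c
#include <assert.h>

/* Hypothetical helper (not an mpp API): average wall time per frame, in ms. */
double frame_time_ms(double total_ms, unsigned frames)
{
    assert(frames > 0);
    return total_ms / (double)frames;
}

/* Extra per-frame cost of the run that writes the file,
 * relative to the run that does not. */
double dump_overhead_ms(double with_dump_ms, double without_dump_ms,
                        unsigned frames)
{
    return frame_time_ms(with_dump_ms, frames) -
           frame_time_ms(without_dump_ms, frames);
}
```

With the totals logged above, `dump_overhead_ms(22411, 952, 480)` comes to about 44.7 ms per frame, against roughly 2.5 to 3.6 ms of hardware time reported by rk_vcodec.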

from mpp.

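Justa-Cai commented on May 30, 2024

After further analysis, copying data out of the pointer returned by mpp_buffer_get_ptr is very slow.

Here is the log. dump_mpp_frame_to_file takes longer than one frame time, around 58 to 68 ms per call (the logged values are in microseconds).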

11-14 02:51:25.978  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:68433
11-14 02:51:25.978  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 472 fps:19.22 max_usage:9216000
11-14 02:51:26.036  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:57518
11-14 02:51:26.036  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 473 fps:19.22 max_usage:9216000
11-14 02:51:26.094  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:57636
11-14 02:51:26.094  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 474 fps:19.21 max_usage:9216000
11-14 02:51:26.152  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:57991
11-14 02:51:26.153  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 475 fps:19.21 max_usage:9216000
11-14 02:51:26.211  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:57649
11-14 02:51:26.215  5817  5822 I mpi_dec_test: 0xe84c0280 found last packet
11-14 02:51:26.223     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 3290 us
11-14 02:51:26.218  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 476 fps:19.20 max_usage:9216000
11-14 02:51:26.228     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 7940 us
11-14 02:51:26.228     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 6589 us
11-14 02:51:26.232     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 5817, session: 0000000050327563, time: 6481 us
11-14 02:51:26.232     0     0 I mpp_rkvdec2 fdf80200.rkvdec: resetting...
11-14 02:51:26.232     0     0 I mpp_rkvdec2 fdf80200.rkvdec: reset done
11-14 02:51:26.285  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:65800
11-14 02:51:26.286  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 477 fps:19.19 max_usage:9216000
11-14 02:51:26.345  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:58320
11-14 02:51:26.347  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 478 fps:19.18 max_usage:9216000
11-14 02:51:26.406  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:58768
11-14 02:51:26.407  5817  5822 I mpi_dec_test: 0xe84c0280 decode get frame 479 fps:19.17 max_usage:9216000 err 1 discard 0
11-14 02:51:26.407  5817  5822 I mpi_dec_test: dump_mpp_frame_to_file:2
11-14 02:51:26.408  5817  5822 I mpi_dec_test: 0xe84c0280 found last packet
11-14 02:51:26.408  5817  5822 I mpi_dec_test: decode 480 frames time 25001 ms delay  19 ms fps 19.20
11-14 02:51:26.423  5817  5817 I mpi_dec_test: test success max memory 8.79 MB

The corresponding code change:

    case MPP_FMT_YUV420SP_VU :
    case MPP_FMT_YUV420SP : {
        RK_U32 i;
        RK_U8 *base_y = base;
        RK_U8 *base_c = base + h_stride * v_stride;
        /* Test only: scratch buffer in ordinary heap memory (larger than needed). */
        RK_U8 *tmp = mpp_malloc(RK_U8, h_stride * height * 3);

        /* Time a plain read of one frame's worth of bytes out of the mapped buffer. */
        memcpy(tmp, base, h_stride * height * 3/2);

        mpp_free(tmp);

        /* Original row-by-row file dump, disabled for this test. */
        // for (i = 0; i < height; i++, base_y += h_stride) {
        //     fwrite(base_y, 1, width, fp);
        // }
        // for (i = 0; i < height / 2; i++, base_c += h_stride) {
        //     fwrite(base_c, 1, width, fp);
        // }
    } break;
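
For reference, the disabled fwrite loops above walk the frame row by row so that the stride padding is skipped. A minimal sketch of the same traversal into a memory buffer is shown below. `pack_yuv420sp` is a hypothetical helper, not an mpp function, and it assumes a semi-planar layout in which the chroma plane starts at `h_stride * v_stride`, as in the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an mpp API): copy a semi-planar YUV420 frame
 * with row padding into a tightly packed destination buffer.
 *   src      : start of the luma plane
 *   h_stride : bytes per row in src (>= width)
 *   v_stride : rows allocated for the luma plane in src (>= height)
 *   dst      : at least width * height * 3 / 2 bytes
 * Returns the number of bytes written to dst.
 */
size_t pack_yuv420sp(uint8_t *dst, const uint8_t *src,
                     size_t width, size_t height,
                     size_t h_stride, size_t v_stride)
{
    const uint8_t *src_y = src;
    /* The chroma plane follows the padded luma plane. */
    const uint8_t *src_c = src + h_stride * v_stride;
    uint8_t *out = dst;
    size_t i;

    assert(dst && src && h_stride >= width && v_stride >= height);

    /* Luma: copy `width` bytes from each row, skipping the padding. */
    for (i = 0; i < height; i++, src_y += h_stride, out += width)
        memcpy(out, src_y, width);

    /* Chroma: height / 2 rows of interleaved samples, `width` bytes each. */
    for (i = 0; i < height / 2; i++, src_c += h_stride, out += width)
        memcpy(out, src_c, width);

    return (size_t)(out - dst);
}
```

The packed result could then be written with a single `fwrite(dst, 1, n, fp)`. This sketch only illustrates the layout; it does not by itself change how fast the source buffer can be read.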

from mpp.

Justa-Cai commented on May 30, 2024

Without the copy, the speed is very good:

11-14 02:54:01.634  7392  7397 I mpi_dec_test: 0xee2c0580 decode get frame 476 fps:467.07 max_usage:9216000
11-14 02:54:01.635  7392  7397 I mpi_dec_test: dump_mpp_frame_to_file:161
11-14 02:54:01.636  7392  7397 I mpi_dec_test: 0xee2c0580 decode get frame 477 fps:467.33 max_usage:9216000
11-14 02:54:01.636  7392  7397 I mpi_dec_test: dump_mpp_frame_to_file:142
11-14 02:54:01.642     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3871 us
11-14 02:54:01.643     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3643 us
11-14 02:54:01.638  7392  7397 I mpi_dec_test: 0xee2c0580 decode get frame 478 fps:467.16 max_usage:9216000
11-14 02:54:01.639  7392  7397 I mpi_dec_test: dump_mpp_frame_to_file:298
11-14 02:54:01.640  7392  7397 I mpi_dec_test: 0xee2c0580 decode get frame 479 fps:467.41 max_usage:9216000 err 1 discard 0
11-14 02:54:01.640  7392  7397 I mpi_dec_test: dump_mpp_frame_to_file:1
11-14 02:54:01.645     0     0 I rk_vcodec: fdf80200.rkvdec: pid: 7392, session: 00000000953ba828, time: 3714 us
11-14 02:54:01.640  7392  7397 I mpi_dec_test: 0xee2c0580 found last packet
11-14 02:54:01.640  7392  7397 I mpi_dec_test: decode 480 frames time 1039 ms delay  14 ms fps 461.83

from mpp.

Justa-Cai commented on May 30, 2024

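Why is the copy out of the buffer returned by mpp_buffer_get_ptr (the memcpy in the change above) so slow? Please help analyze it.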

from mpp.

HermanChen commented on May 30, 2024

That's normal. Hardware buffers are non-cacheable by default. If you don't write the file and don't map the buffer, it is fast.
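
One way to observe the effect described here is to time the same memcpy twice, once reading from the pointer returned by mpp_buffer_get_ptr and once reading from an ordinary malloc'd buffer of the same size. The helpers below are a minimal sketch under that assumption. `now_us`, `time_copy_us`, and `throughput_mb_s` are hypothetical names, and a POSIX `clock_gettime` is assumed to be available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in microseconds. */
int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Copy n bytes from src to dst and return the elapsed time in microseconds. */
int64_t time_copy_us(void *dst, const void *src, size_t n)
{
    int64_t start = now_us();
    memcpy(dst, src, n);
    return now_us() - start;
}

/* Bytes per microsecond is numerically equal to MB/s (1 MB = 10^6 bytes). */
double throughput_mb_s(size_t bytes, int64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)bytes / (double)elapsed_us;
}
```

As an illustration, assuming a 1280x720 NV12 frame with no stride padding (1,382,400 bytes, an assumption since the thread only says 720P), a copy that takes 58,000 us corresponds to roughly 23.8 MB/s.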

from mpp.

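Justa-Cai commented on May 30, 2024

HermanChen wrote: "That's normal. Hardware buffers are non-cacheable by default. If you don't write the file and don't map the buffer, it is fast."

The numbers measured on the RK3288 platform are much better. Is there any way to optimize this? The A55 cores on the 3566 should not be underperforming this badly.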

11-14 16:17:51.053  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:5705
11-14 16:17:51.053  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1790 fps:148.58 max_usage:9953280
11-14 16:17:51.058  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:5187
11-14 16:17:51.058  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1791 fps:148.59 max_usage:9953280
11-14 16:17:51.064  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:5500
11-14 16:17:51.064  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1792 fps:148.61 max_usage:9953280
11-14 16:17:51.069  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:4800
11-14 16:17:51.069  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1793 fps:148.63 max_usage:9953280
11-14 16:17:51.074  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:4502
11-14 16:17:51.074  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1794 fps:148.65 max_usage:9953280
11-14 16:17:51.079  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:5274
11-14 16:17:51.079  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1795 fps:148.67 max_usage:9953280
11-14 16:17:51.084  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:4562
11-14 16:17:51.084  6227  6232 I mpi_dec_test: 0xb1393000 found last packet
11-14 16:17:51.089  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1796 fps:148.63 max_usage:9953280
11-14 16:17:51.094  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:5077
11-14 16:17:51.096  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1797 fps:148.63 max_usage:9953280
11-14 16:17:51.100  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:4358
11-14 16:17:51.101  6227  6232 I mpi_dec_test: 0xb1393000 decode get frame 1798 fps:148.65 max_usage:9953280
11-14 16:17:51.106  6227  6232 I mpi_dec_test: dump_mpp_frame_to_file:4987
11-14 16:17:51.106  6227  6232 I mpi_dec_test: 0xb1393000 found last packet
11-14 16:17:51.106  6227  6232 I mpi_dec_test: decode 1799 frames time 12119 ms delay  18 ms fps 148.44

from mpp.

HermanChen commented on May 30, 2024

The 3288 has a strong CPU... the 3566's cores are more than one tier below it.

from mpp.

HermanChen commented on May 30, 2024

Look at the hardware decode time. The software time is only a reference.

from mpp.

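Justa-Cai commented on May 30, 2024

The hardware decode times are about the same on both platforms.
The large gap is in the time spent moving the decoded data. The 3288 uses DDR3 and the 3566 uses DDR4; however large the CPU gap is, copying 720P data should not take nearly 10 times longer.

I have compared CPU benchmark scores for the 3288 and the 3566. The 3288 is indeed better, but the 3566 is not far behind.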

3288 CPU: 26332
3566 CPU: 24593

from mpp.

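Justa-Cai commented on May 30, 2024

The optimization is done. Surprisingly, the 64-bit build performs worse than the 32-bit one: 64-bit went from 45 fps to 90+ fps, while 32-bit went from 33 fps to 150+ fps, a huge improvement.

32-bit data: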

11-16 10:05:15.645 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1796 fps:136.03 max_usage:16588800
11-16 10:05:15.648 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2495
11-16 10:05:15.651 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1797 fps:136.04 max_usage:16588800
11-16 10:05:15.654 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2435
11-16 10:05:15.658 15677 15687 I mpi_dec_test: 0xeaf00460 decode get frame 1798 fps:136.05 max_usage:16588800
11-16 10:05:15.660 15677 15687 I mpi_dec_test: dump_mpp_frame_to_file:2382
11-16 10:05:15.660 15677 15687 I mpi_dec_test: 0xeaf00460 found last packet
11-16 10:05:15.661 15677 15687 I mpi_dec_test: decode 1799 frames time 13243 ms delay  19 ms fps 135.84
11-16 10:05:15.681 15677 15677 I mpi_dec_test: test success max memory 15.82 MB

64-bit data:

11-16 10:06:41.503 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:6375
11-16 10:06:41.507 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 474 fps:85.83 max_usage:9216000
11-16 10:06:41.511 16503 16507 I utils   : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118
11-16 10:06:41.511 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:4837
11-16 10:06:41.515 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 475 fps:85.88 max_usage:9216000
11-16 10:06:41.524 16503 16507 I utils   : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118
11-16 10:06:41.524 16503 16507 I mpi_dec_test: dump_mpp_frame_to_file:8669
11-16 10:06:41.527 16503 16507 I mpi_dec_test: 0xb400007698a30a60 decode get frame 476 fps:85.87 max_usage:9216000
11-16 10:06:41.532 16503 16507 I utils   : =======Justa /home/justa/work/opensource/mpp/utils/utils.c:118

from mpp.

Justa-Cai commented on May 30, 2024

The solution works, so I am closing this for now.

from mpp.
