
benchmarking_video_reading_python's People

Contributors

bml1g12

benchmarking_video_reading_python's Issues

ModuleNotFoundError: No module named 'video_reading_benchmarks.camgear'

Traceback (most recent call last):
  File "main.py", line 11, in <module>
    from video_reading_benchmarks.benchmarks import baseline_benchmark, imutils_benchmark,
  File "/usr/local/lib/python3.8/dist-packages/video_reading_benchmarks-X-py3.8.egg/video_reading_benchmarks/benchmarks.py", line 16, in <module>
    from video_reading_benchmarks.camgear.camgear import CamGear
ModuleNotFoundError: No module named 'video_reading_benchmarks.camgear'

I've DOUBLED the FFmpeg Speed!

Say! Remember me?

Once upon a time we exchanged code on a thread about how OpenCV doesn't... ya know, work... for video.

Then, over a year later, I started dipping my toes back into Python video processing / software rendering, and pretty quickly stumbled onto this article: Lightning Fast Video Reading in Python, written by the guy I traded code with back in the day. Which is YOU! So I had a peek into the GitHub repo linked in the article, and sure enough! That's my little FFmpeg script lurking behind the polished graphs of completed tests!

So I immediately file suit for copyright infringement in the 11th circuit court of... No, j/k.

So I immediately reached out to tell you that, after revisiting my methodology (removing head from betwixt buttocks), I've just released a new script that is currently ingesting raw 1080p frames, unblocked, at about 220 fps on my system (which is 50+ frames faster than I've gotten reproducing your OpenCV read() baseline test), crunching through 720x480 frames at 1734 fps (~400 frames faster), and that I'm currently playing back full-frame 1080p video to screen at 76.5 fps through PyGame using it.

My new script is here: FFmpeg VideoStream

Given how disgustingly my old solution stacked up against the others, and the chance that this method potentially bests all competitors, (I genuinely don't know) I hope you'll have a look. I haven't pushed it through your complete benchmarking mill myself, and would love to know what figures you see out of it. And maybe there's fodder for another article here, if you're still writing.

Why was it so slow?

The old script I'd written was more task-specific than I'd even recognized when I shared it with you. I was forcing the output format of the video to a BGR 24-bit pixel format that was fast and convenient to interact with once the video's frames were in my hands, but ill-advised for getting it INTO my hands. It was reshaping the bytestream into a numpy array in-line to hand back a ready formatted frame -- as though it shouldn't be up to the user to do what they want to the raw output. And the call I was placing to FFmpeg was invoking the slowest 'seek' method available.

In other words: I wrote it to do what I was doing at the time. And didn't really know what I was doing with FFmpeg while I was doing it! When I saw that the speeds I got were better than the multiprocessed OpenCV method I'd written, (and that the FFmpeg script handed out the correct frame, unlike OpenCV) I was happy and left it there. Then you came along and I happily passed my little script along, to save you a headache I'd already suffered, trying to figure out what the ffmpeg-python library wanted of me, just to hand back some frames.

Why is it faster now?

My FFmpeg VideoStream script now defaults to the YUV420p pixel format. This is the format of virtually ALL video circulating in the modern world. MP4s, WebMs, DVDs, Blu-ray discs... these all use a YUV 4:2:0 pixel format of one variant or another. The reason is that this format packs full-color pixel data into a space of just 12 bits per pixel. AKA: 1.5 bytes.
AKA: This many 1s and 0s: '010101010101'.

RGB, BGR, and many other YUV formats package that same pixel's worth of data into 24 bits. AKA: 3 bytes.
AKA: This many 1s and 0s: '010101010101010101010101'.

It's literally double the data. And while YUV420p technically loses color information by packing the pixels up so tightly, the loss is nothing that 99.9% of the world notices, as evidenced by the mass proliferation of the format across all media.
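The per-frame savings follow directly from those bit depths. A quick sketch of the arithmetic for a 1080p frame:

```python
def frame_bytes(width, height, bits_per_pixel):
    """Raw frame size in bytes for a pixel format with the given bit depth."""
    return width * height * bits_per_pixel // 8

# 1080p frame: YUV420p is 12 bits/pixel, BGR24 is 24 bits/pixel
yuv = frame_bytes(1920, 1080, 12)   # 3,110,400 bytes (~3 MB)
bgr = frame_bytes(1920, 1080, 24)   # 6,220,800 bytes (~6 MB)
print(yuv, bgr, bgr // yuv)         # exactly half the bytes per frame
```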

So, by having FFmpeg read in YUV420p data, two major gains are achieved. First, odds are that the video being accessed is already stored as YUV420p, so there's no conversion from one pixel space to another to eat clock cycles. Second, our bytestream is literally half the size it was, so moving the raw binary data into Python through a 'stdout' pipe is theoretically twice as fast.
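For concreteness, here is a minimal sketch of what lands on the Python side of that pipe. The FFmpeg flags in the comment are an illustrative invocation (e.g. `ffmpeg -i in.mp4 -f rawvideo -pix_fmt yuv420p pipe:1` read via a subprocess's stdout), not the script's actual code; the synthetic bytes below stand in for one frame read off the pipe:

```python
import numpy as np

# A YUV420p (I420) frame is a planar bytestream: a full-resolution Y plane
# followed by quarter-resolution U and V planes -- 1.5 bytes per pixel total.
# Each frame read off the pipe is exactly width * height * 3 // 2 bytes.

def split_i420(raw, width, height):
    """Split one raw I420 frame into its Y, U, V planes (numpy views, no copies)."""
    y_size = width * height
    uv_size = y_size // 4
    buf = np.frombuffer(raw, dtype=np.uint8)
    y = buf[:y_size].reshape(height, width)
    u = buf[y_size:y_size + uv_size].reshape(height // 2, width // 2)
    v = buf[y_size + uv_size:].reshape(height // 2, width // 2)
    return y, u, v

# Synthetic 4x4 frame standing in for one chunk read from FFmpeg's stdout
raw = bytes(range(4 * 4 * 3 // 2))  # 24 bytes = 16 Y + 4 U + 4 V
y, u, v = split_i420(raw, 4, 4)
print(y.shape, u.shape, v.shape)  # (4, 4) (2, 2) (2, 2)
```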

In fact, I've found that even when you require the final frame in an RGB / BGR format for processing, it is FASTER to ingest the raw YUV data and convert the frame in Python using OpenCV's .cvtColor() method. Which is a little shocking! That FFmpeg's multi-core processing engine can't unpack 12-bit YUV into 24-bit RGB fast enough to overcome the simple fact that there is twice as much binary data to push through the pipe is an unintuitive truth to land on.

What else?

I've also managed to shoe-horn the 'ss' and 'to' input options into ffmpeg-python's call constructor. This means being able to 'seek' near-instantaneously to any point in the video requested, where the old method sent FFmpeg sloshing, one frame at a time, through the entire video to find the starting position.
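The trick is that `-ss`/`-to` placed before `-i` are input options, so FFmpeg seeks by keyframe instead of decoding its way to the start point. A sketch of the equivalent raw command line (illustrative of what passing seek options on the input side produces, not the script's actual code):

```python
# -ss/-to BEFORE -i are applied to the input: FFmpeg jumps to the nearest
# keyframe rather than stepping through every frame up to the start point.
# (Output-side -ss, placed after -i, is the slow decode-and-discard seek.)
def build_cmd(path, start, end, width, height):
    return [
        "ffmpeg",
        "-ss", str(start),           # fast input-side seek
        "-to", str(end),             # stop reading at this timestamp
        "-i", path,
        "-f", "rawvideo",
        "-pix_fmt", "yuv420p",
        "-s", f"{width}x{height}",
        "pipe:1",                    # raw frames to stdout
    ]

cmd = build_cmd("input.mp4", 12.5, 20.0, 1920, 1080)
print(" ".join(cmd))
```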

'Showinfo' data is now available. With an optional configuration of showinfo=True when .open_stream() is called, FFmpeg's per-frame information is liberated. This contains the current frame number, presentation time stamp, byte position in the file, mean channel values, etc. But 'showinfo' comes with an access-speed penalty that, depending on the situation, can either be invisibly negligible (when the process is blocked by a heavy action like rendering frames to screen) or a significantly meaningful slowdown to raw frame access (when there's nothing to block it at all).
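FFmpeg's showinfo filter writes one line per frame to stderr, so getting at this data is a matter of parsing text. The field layout varies slightly between FFmpeg versions, so the sample line below is illustrative, not canonical:

```python
import re

# Example of a showinfo line as emitted on stderr (format is approximate;
# exact fields and spacing depend on the FFmpeg build).
SAMPLE = ("[Parsed_showinfo_0 @ 0x55d] n:   7 pts:   3584 pts_time:0.233333 "
          "pos:  51200 fmt:yuv420p mean:[104 128 128]")

def parse_showinfo(line):
    """Pull frame number, presentation time, and byte position from one line."""
    fields = {}
    m = re.search(r"\bn:\s*(\d+)", line)
    if m:
        fields["n"] = int(m.group(1))
    m = re.search(r"pts_time:([\d.]+)", line)
    if m:
        fields["pts_time"] = float(m.group(1))
    m = re.search(r"pos:\s*(-?\d+)", line)
    if m:
        fields["pos"] = int(m.group(1))
    return fields

info = parse_showinfo(SAMPLE)
print(info)  # {'n': 7, 'pts_time': 0.233333, 'pos': 51200}
```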

Wrap-up

There are a lot of little tweaks in the new code, brief as it is. But the main takeaway here is the access speed. And knowing that YUV420p is the fastest pixel format to ingest gets me thinking about all kinds of ways to do the kind of frame-matching and analysis that I was pulling raw video into Python for before. But now using BLAZINGLY FAST methods that operate directly on unconverted, array-reshaped YUV frames.

Incidentally, 'reshaping' with NumPy adds ZERO overhead. It's the conversion to other color spaces that eats clock cycles (a 10-20% loss, depending on the size of the frame). And knowing THAT gets me excited to locate a non-OpenCV library that crunches YUV to BGR as fast as possible. It's also got me looking into the SDL2 library for access to hardware acceleration methods that might render raw YUV 4:2:0 to the screen space without even touching the CPU.
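The zero-overhead claim is easy to verify: `np.frombuffer` plus `reshape` produce views over the pipe's bytes, never copies. A minimal demonstration:

```python
import numpy as np

# frombuffer wraps the raw bytes without copying, and reshape/slicing return
# views -- so "reshaping" a piped frame into planes is free, while color-space
# conversion has to touch every pixel.
raw = bytes(1920 * 1080 * 3 // 2)           # one 1080p YUV420p frame's worth
frame = np.frombuffer(raw, dtype=np.uint8)
y_plane = frame[:1920 * 1080].reshape(1080, 1920)

print(np.shares_memory(frame, y_plane))     # True: a view, not a copy
```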

The potential for Python to become a video-processing powerhouse by way of C-compiled execution is wild!

Oh yeah... "Wrap-up," I said. Right!

I'm glad you were able to give FFmpeg a chance in 2020 using my poorly-considered little script. That said, it leaves me feeling a bit guilty for FFmpeg performing so poorly in your tests. I hope you'll take the time to push this new version through the test bed you fashioned here, and share the results you get. And if you do test it, make sure to acquire the latest versions of FFmpeg and FFprobe. I didn't think to do that until I was completely done testing, tweaking, and developing this first release. When I did, I suddenly got 20-30+ more fps at 1080p and 200-300+ more fps at lower resolutions.

One last time, here's the new script: FFmpeg VideoStream
