
Comments (8)

nbrochu avatar nbrochu commented on May 27, 2024 2

Apologies for the late reply.

D3DShot uses the Desktop Duplication API through Direct3D. It's the fastest way to capture, but it only supports full-display capture plus cropping. It's built for the absolute best speed at the expense of utility. If you fall back to bitblt (using a library like mss, for example), you'll gain some utility back at the expense of capture speed. It's not a minor drop either: in my benchmarks, bitblt only reaches about 30% of the FPS of the Desktop Duplication API for full-display capture (I assume bitblt gets faster when capturing individual windows).

bitblt also has important problems (for my use case, at least) with some graphics APIs. For example, you can't give it the handle of a window that uses OpenGL, and you can't capture exclusive fullscreen applications; both will fail to capture.

Thanks for the question on how OBS handles window capture. I looked into it and found that they use a new Windows API (WindowsGraphicsCapture). I can start looking into it to see if there is an opportunity to hook it in Python, measure the performance and evaluate the overall quality.

I do understand the use case for window capture. I built D3DShot specifically to replace mss in SerpentAI and technically I only care about 1 window too: the game window. The only difference is that for me, cropping was sufficient since I need the window focused to send inputs to the game anyway.
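As a rough illustration of the "full display + crop" approach, here is a minimal sketch that clips a window rectangle to the display so it can serve as a crop region. The helper name `clamp_region` and the example coordinates are hypothetical; how the window rectangle is obtained (for example via `win32gui.GetWindowRect`) is left out.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def clamp_region(window: Rect, display_width: int, display_height: int) -> Rect:
    """Clip a window rectangle to the display bounds (hypothetical helper)."""
    left, top, right, bottom = window
    left = max(0, min(left, display_width))
    top = max(0, min(top, display_height))
    right = max(left, min(right, display_width))
    bottom = max(top, min(bottom, display_height))
    if right == left or bottom == top:
        raise ValueError("window is entirely off-screen")
    return (left, top, right, bottom)


# Hypothetical example: a 700x700 window at (100, 50) on a 2560x1440 display.
region = clamp_region((100, 50, 800, 750), 2560, 1440)
print(region)
```

On Windows, a tuple like this could then be passed as the `region` argument of D3DShot's `screenshot()`, which takes `(left, top, right, bottom)`.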

from d3dshot.

parsarahimi avatar parsarahimi commented on May 27, 2024

I agree with Alex. Most people intend to use this for AI work, and being able to capture just a specific window is essential; otherwise you can't do any other work while it's running. An alternative, for now, would probably be to make the window small and crop out the region that contains the game.


nbrochu avatar nbrochu commented on May 27, 2024

#22


alex-ong avatar alex-ong commented on May 27, 2024

"It's the fastest way to capture, but it's only full display + cropping. It's built for the absolute best speed at the expense of utility."

I'm using it for a little program that captures a 700x700 window. My monitor resolution is 2560x1440, and I get about 20 fps with D3DShot vs. >60 with targeted BitBlt, using the default example code. mss was even worse, of course: around 5 fps IIRC, since it copies the full screen and then crops.

Hopefully, if you integrate WindowsGraphicsCapture, we can have our pie and eat it too. That said, I use a sub-section of a 1280x1440 window, and I'm not sure what WGC's performance would be, since I don't know whether it grabs the entire window and then crops or captures just the sub-section.


nbrochu avatar nbrochu commented on May 27, 2024

I'm getting 58fps at the same resolution for fullscreen captures. You have to use the "numpy" capture output to get good speed as shown in the Performance section of the README.

PIL is the default capture output because it's a lighter dependency and easier to use for casual users, but it's about 3 times slower (it's still adequate for everything except capture()). The README is massive, so maybe people aren't reading it thoroughly and are drawing the wrong conclusions about the speeds that can be achieved. I should probably raise a warning when someone tries to use capture() with PIL.

Are you benchmarking your bitblt FPS as time-to-numpy-array? In my tests, the only scenario where I've seen bitblt be faster is when you provide the hwnd of a smaller window. That's still not my main issue with bitblt. It can't do OpenGL or DirectX client areas, and things are worse in Windows 10 with the updated DWM. In OBS Studio, if I set a window capture to use bitblt, it only works correctly with native win32 applications. Anything else (Electron, Qt, WPF, etc.) is a black screen. The only way to get them back with bitblt is fullscreen + crop, and then you are back in sub-20 FPS land again.
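To compare capture methods on equal terms, the timing needs to cover everything up to the final array, not just the raw grab. Below is a minimal sketch of such a benchmark loop. `measure_fps` and `fake_capture` are hypothetical names; the fake capture just sleeps and stands in for a real call that returns the finished frame.

```python
import time
from typing import Callable


def measure_fps(capture: Callable[[], object], frames: int = 100) -> float:
    """Call capture() `frames` times and return the average frames per second.

    `capture` should return the final frame object (for example the numpy
    array), so that any conversion cost is included in the measurement.
    """
    start = time.perf_counter()
    for _ in range(frames):
        capture()
    elapsed = time.perf_counter() - start
    return frames / elapsed


def fake_capture() -> bytes:
    # Stand-in for a real capture call: sleeps ~5 ms, returns a dummy buffer.
    time.sleep(0.005)
    return b"\x00" * 16


print(f"{measure_fps(fake_capture, frames=20):.1f} fps")
```

Swapping `fake_capture` for a D3DShot or bitblt call that ends in the same output type should give numbers that can be compared directly.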

That being said, if it's faster and it captures correctly for your use case, I don't see why you wouldn't use that. I would. Just make sure that all your potential capture targets work with bitblt.


alex-ong avatar alex-ong commented on May 27, 2024

The test was a while ago (I had to install Python x64) and IIRC it was numpy. I remember reading the docs saying so. IIRC I didn't test the threaded workload; I believe I just called d.screenshot() in a loop and printed out the timing.

The bitblt result was a PIL Image, but that actually slows it down because the RGBX vs. BGRX channel order forces a full memcpy. In my app I'm planning on removing PIL at some point and making it pure numpy, keeping the BGRX data format.
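To show why a channel-order mismatch costs a full pass over the frame, here is a minimal pure-Python sketch. It assumes the capture buffer is BGRX (blue, green, red, padding), the usual layout of 32-bit GDI bitmaps; the function name `bgrx_to_rgbx` is hypothetical.

```python
def bgrx_to_rgbx(pixels: bytes) -> bytes:
    """Return a copy of a BGRX pixel buffer with channels reordered to RGBX."""
    if len(pixels) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    out = bytearray(pixels)    # full copy of the frame
    out[0::4] = pixels[2::4]   # red comes from byte 2 of each BGRX pixel
    out[2::4] = pixels[0::4]   # blue comes from byte 0 of each BGRX pixel
    return bytes(out)


# Two pixels: (B=10, G=20, R=30) and (B=1, G=2, R=3).
print(list(bgrx_to_rgbx(bytes([10, 20, 30, 0, 1, 2, 3, 255]))))
```

Every byte of the frame is touched at least once, which is the cost that keeping the data in its native order avoids.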

i'll install it now and do a retest 🍡


alex-ong avatar alex-ong commented on May 27, 2024

I'm getting 55-58fps on both numpy and PIL. No idea why I was getting poor performance before. I went through all my Discord logs and I was getting 20ms/frame == 50fps, not 20fps as I stated earlier.

Now I'm beginning to remember why I decided against using D3DShot: the plan is to run this on peasant CPUs, in a single-process setup. When running it in a thread (using d.capture()), it's still in the same process as everything else, because of the GIL.

I was wondering how much of the time in D3DShot's screenshot() is spent waiting for vblank. If 99% of the time is just waiting for vblank, and the actual capture -> numpy array step takes 1ms, that leaves 15ms for the rest of the "threads" to do image processing.

I think I assumed (perhaps incorrectly?) that there was no non-blocking code and that copying from memory took the full 16ms, so even using threading.Thread would cripple the image processing thread unless I started using multiprocessing.Process. It'd be interesting if you had stats on how much of the time is spent waiting for vblank. I might do some further testing now that it's getting 58fps, as I was immediately unsatisfied with 50fps.
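One rough way to estimate how much of a call is waiting rather than working is to compare wall-clock time with the CPU time used by the calling thread. The sketch below does this with the standard library only. `wall_and_cpu` and `blocking_call` are hypothetical names, and the sleep stands in for a capture call that mostly waits.

```python
import time
from typing import Callable, Tuple


def wall_and_cpu(call: Callable[[], object], repeats: int = 20) -> Tuple[float, float]:
    """Return (wall seconds, CPU seconds of this thread), averaged per call."""
    wall_start = time.perf_counter()
    cpu_start = time.thread_time()
    for _ in range(repeats):
        call()
    wall = (time.perf_counter() - wall_start) / repeats
    cpu = (time.thread_time() - cpu_start) / repeats
    return wall, cpu


def blocking_call() -> None:
    # Stand-in for a capture call that mostly waits (for example on vblank).
    time.sleep(0.01)


wall, cpu = wall_and_cpu(blocking_call)
waiting = max(0.0, 1.0 - cpu / wall)
print(f"wall {wall * 1000:.2f} ms, cpu {cpu * 1000:.2f} ms, ~{waiting:.0%} waiting")
```

A large gap between the two numbers suggests the call is mostly blocked. It does not by itself show that the GIL is released during the wait, so other Python threads may or may not be able to run in that time.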

My BitBlt thing takes 2ms to capture the 700x700 region, leaving ~14ms for processing. Once I remove the BitBlt -> PIL conversion it will be <1ms. (Yes, it runs in a thread of course, but all threads are in the same process...)
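The frame-time arithmetic used in this thread can be written down in a few lines. The function names are hypothetical; the numbers are the ones quoted above (20ms/frame, and a 2ms capture at a 60fps target).

```python
def fps_from_frame_time(ms_per_frame: float) -> float:
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


def frame_budget_ms(target_fps: float, capture_ms: float) -> float:
    """Milliseconds left per frame for processing after the capture itself."""
    return 1000.0 / target_fps - capture_ms


print(fps_from_frame_time(20))           # 20 ms/frame is 50 fps
print(round(frame_budget_ms(60, 2), 2))  # about 14.67 ms left at 60 fps
```

This only holds if capture and processing run one after the other in the same thread; if the capture call blocks without holding the GIL, the usable budget could be larger.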

Your statement about only native win32 applications working is 100% correct; users weren't able to capture Streamlabs OBS (it just shows as blank). Fortunately, most users use it to capture an NES emulator window or OBS, and it works fine for those use cases.


nbrochu avatar nbrochu commented on May 27, 2024

I don't have solid answers to what you are asking so I created 2 new issues to address that.

#23
#24

Let's move discussion about those aspects there and keep this issue open for window capture.

