I deal with big data and some times preview rendering is here quite slow < 1 FPS.<b

Update from my side: I have profiled it with <a href="https://

It's in Qt itself. The QQUickWindow has a number of stages: <a href="https://doc.qt.io

Profiling performance of rendering about uranium HOT 17 OPEN

marcus-bigrep commented on September 22, 2024

Profiling performance of rendering

from uranium.

Comments (17)

Ghostkeeper commented on September 22, 2024 2

With Bremco's PR I observed a 4-fold performance increase in a large layer view. Whatever benchmarks you have now will be invalidated with the next release. Perhaps we can then consider this issue fixed?

from uranium.

marcus-bigrep commented on September 22, 2024 1

Update from my side:

I have profiled it with NVIDIA NSight and >90% is up to rendering the tubes. In particular, this backs up that VBO caching does not take much time.
I was wrong with that tubes were generated on the cpu in a former version. In fact, there is a compatibility mode which renders pure lines. This is fast though.

I'll hack in the tubes on the cpu now and do some benchmarking.

from uranium.

marcus-bigrep commented on September 22, 2024 1

(1)
Sorry for answering not earlier. I have added a simple file to generate out of an txt file an stl file of tubes.
Furthermore you have to insert
with open(r"c:\dev\ttt.txt", "a") as file: numpy.savetxt(file, data)
in LayerPolygon.init

tube_to_stl.zip

(2) I had a look at Simplify3d and PrusaSlicer. They are about 10x faster.
Definitely, we should look into that! IMHO, this is the way to go.

from uranium.

nallath commented on September 22, 2024

I don't really understand how you timed this. When profiling code like this, I'd recommend using something like https://github.com/pyutils/line_profiler since it shows per line how much time was spent on it.

from uranium.

marcus-bigrep commented on September 22, 2024

Thanks for the advice, I'll do it as well.
Though, I don't think it is about digging deeper in this function.
It seems that there is something in the message handling and Qt which takes a lot of time. And this is a big black hole, I don't understand.
In particular, it is not about this code. This code is fast.

Bottom line of this investigation is:
Log message is performed asynchronously and has a latency of >0.7sec.
Especially, the question is what happens in the 0.7sec? Do you have an idea?

from uranium.

nallath commented on September 22, 2024

I'm not sure what part of the messaging that could be. It has to be something with opengl since the rendering is so much faster if you only display a few lines.

But i do have the feeling that you might be on to something here that I missed the last few times that I tried profiling. Problem is; I don't know what I missed...

from uranium.

marcus-bigrep commented on September 22, 2024

Just why I stumbled about it. I hacked in some caching of VBOs.
Then the render method takes 0.03 (i.e. 6x faster) but overall duration is still >1sec.

from uranium.

nallath commented on September 22, 2024

I managed to find it what is causing it; it's the frame swap. I don't know why that would take so much longer than the actual rendering...

from uranium.

marcus-bigrep commented on September 22, 2024

Where is it in the code?

from uranium.

nallath commented on September 22, 2024

It's in Qt itself. The QQUickWindow has a number of stages: https://doc.qt.io/qt-5/qquickwindow.html#RenderStage-enum

I've been digging around in the documentation of Qt and it seems that the swap stage is when the actual openGL calls that are queued are made. So in that sense it makes sense that the swap takes a long time.

I've hooked it up to QtCreator to debug this a bit, and this is a screenshot from what the scenegraph is doing:

The bar before the swap is the actual render (which is 56.8 ms).

from uranium.

marcus-bigrep commented on September 22, 2024

What surprises me that caching the VBOs has so little impact on the performance.
I am looking now for some OpenGL profilers to get a better understanding.

from uranium.

nallath commented on September 22, 2024

The thing is that we use a shader to get the 3D effect of the layers. We could consider pre-generating the verts and send that to be rendered. It might make it faster at the expense of more pre-computation.

from uranium.

marcus-bigrep commented on September 22, 2024

Still my gut feeling is that VBO caching should have significant effect even with the shaders.

But your idea is good and Cura's old approach is exactly this. I'll try VBO caching with the old approach.
(The issue with the old approach is that pre-generating is done in Python which is damn slow.)

from uranium.

nallath commented on September 22, 2024

We could also do the pre-generating in C++ and make a python binding.

I do agree with your gut feeling though. The VBO caching should have more effect.

from uranium.

marcus-bigrep commented on September 22, 2024

What I have done now:
I have "exported" the preview to an stl file. This file is huge >3GB. Of course, you may store it more efficiently (triangle strip etc.).
Though, colors and vertex normals come up additionally and then you likely end up by even more memory consumption.
My gfx card has 4 GB of RAM and won't make it.

Long story short, I don't think that creating the tubes initially and then cache them, will work out.

In case of 1fps, one can have a simple view (i.e. lines) for navigating and then rendering the ordinary view afterwards for the "static" situation. Though, my resources are slipping away and I won't make this in the short term.
@nallath Maybe it's something for the ordinary cura development? ;-)

from uranium.

nallath commented on September 22, 2024

Could you share the project file to create the tubes file (and the tubes file itself?). It would help me to convince people that this is worth the effort.

from uranium.

Profiling performance of rendering about uranium HOT 17 OPEN

Comments (17)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent