
Comments (130)

kisvegabor avatar kisvegabor commented on June 1, 2024 1

I also saw the freeze, but I thought it was a bug in my hacky display driver. I've already debugged it far enough to see that it stops on a while(vdb->flushing); loop. I still don't know whether lv_disp_flush_ready is not called at all, or it is called but, because it is called from an interrupt (for all of us), it messes things up somehow. I mean some assembly-level, you-never-find-it bug.

I suspect the latter because while(vdb->flushing); reads a bit field, which can be quite complicated at the assembly level.

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024 1

@C47D

I do have debug symbols available, but it's a pain to set up the ESP32 tools on Windows. I was using WSL but can't run openocd there, so I have to set up the toolchain in Windows directly :/.

actually you can or use https://visualgdb.com/


kisvegabor avatar kisvegabor commented on June 1, 2024 1

sorry if i say something silly but, aren't we limited by the spi clock and not the cpu clock?

From the quick calculation above, we can see that it takes only 30 ms to send a full-screen image, so the SPI limits the FPS to 33. If data is sent to the display in the background (using DMA), the MCU is free to deal with rendering. If rendering takes < 30 ms we are still limited by the SPI clock speed, but the limit is still 33 FPS. So if we see 16 FPS, it means rendering took 60 ms, which is way too much for such a small display and a 160 (or 240?) MHz MCU.

Could you try the widgets demo again with -O2? The stress demo could be misleading because it's not a realistic load.


kisvegabor avatar kisvegabor commented on June 1, 2024 1

In general, it shouldn't matter too much, however it's better to keep below 5 ms for higher precision.

In this case, when lvgl needs to measure 10-100 ms (the rendering time), the 10 ms tick resolution can cause some measurement error. E.g. if the real rendering time is 45 ms (22.2 FPS), lvgl can measure it as 40 ms (25 FPS) or 50 ms (20 FPS).
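
The effect can be reproduced with a small sketch. It assumes the elapsed time is obtained by reading a counter that only advances once per tick, so the result depends on where the interval falls relative to the tick boundaries. `measured_ms` and its parameters are hypothetical names for illustration, not lvgl API.

```c
#include <stdint.h>

/* Hypothetical helper (not part of lvgl): elapsed time as seen through a
 * counter that only advances every `tick_ms` milliseconds.
 * `start_ms` and `end_ms` are the real timestamps of the interval. */
uint32_t measured_ms(uint32_t start_ms, uint32_t end_ms, uint32_t tick_ms)
{
    uint32_t start_ticks = start_ms / tick_ms;  /* counter value read at start */
    uint32_t end_ticks   = end_ms / tick_ms;    /* counter value read at end   */
    return (end_ticks - start_ticks) * tick_ms;
}
```

With a 10 ms tick, a 45 ms render starting at t = 0 is reported as 40 ms, and the same render starting at t = 7 ms is reported as 50 ms. With a 2 ms tick the same two cases give 44 ms and 46 ms.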


kisvegabor avatar kisvegabor commented on June 1, 2024 1

@embeddedt

I forgot to answer this: "Image RGB renders at a surprising 173 FPS. Is this possible?"

Yes, it's possible because the benchmark measures the rendering time and calculates the FPS from it. So if something was rendered in 10 ms, it will be displayed as 100 FPS. However, the real FPS is limited by LV_DISP_DEF_REFR_PERIOD. The performance monitor (bottom right corner) takes this limit into account, but the benchmark doesn't.
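
The difference between the two numbers can be sketched as follows. This is a simplified model, assuming a new frame starts at most once per refresh period; both function names are made up for the illustration and are not the benchmark's actual code.

```c
#include <stdint.h>

/* FPS as a render-time-only benchmark would report it. */
uint32_t benchmark_fps(uint32_t render_ms)
{
    return render_ms ? 1000u / render_ms : 0u;
}

/* FPS reachable on screen in this model: the slower of the render time
 * and the refresh period determines the frame interval. */
uint32_t displayed_fps(uint32_t render_ms, uint32_t refr_period_ms)
{
    uint32_t frame_ms = render_ms > refr_period_ms ? render_ms : refr_period_ms;
    return frame_ms ? 1000u / frame_ms : 0u;
}
```

For a 10 ms render, `benchmark_fps(10)` gives 100, while `displayed_fps(10, 30)` gives 33 for an assumed 30 ms refresh period.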


barbiani avatar barbiani commented on June 1, 2024 1

I was also updating the esp32 port to dev-7.0 and today I added an adc touch driver to my wrover kit here

Just bought a cheap touch panel and hooked up to some gpios. Works quite well and gives stable readings.

image

image


C47D avatar C47D commented on June 1, 2024 1

Sure, I will do that change.

And yes I'm using the Wrover with SD card. I had to remove a couple of resistors to make it work, see #100

Somebody at the forum asked for help to make lvgl and the sd card share the same spi bus, that's why I had to refactor some of the code, and somebody else asked in the forum to use the lvgl file system abstraction (lv_fs) on the esp32, so I'm planning to work on that too after finishing this port.


kisvegabor avatar kisvegabor commented on June 1, 2024 1

Then all seems good, at least as far as I can judge. We are limited to 30 FPS by the SPI and we saw 25-30 FPS. I think it was usually 25 because of the 10 ms resolution of lv_tick. Let's see if it works like this for @C47D too.

If you have time and interest one more thing can be tried out. Set monitor_cb in the driver, printf the rendering time in it, and make flush_cb empty (but call the lv_disp_flush_ready). This way we could see the pure rendering time. A 2-3 ms resolution for tick would be also great for the measurement. I saw that it can be easily set in menuconfig.
It's really not an important thing, just out of curiosity. 🙂


C47D avatar C47D commented on June 1, 2024 1

I tried to replicate your configuration:

All maxed out.

240 MHz core clock
80MHz qio spi flash
-O2 compiler flag
Spi master functions in IRAM

But I can't get the same results, and I get some artifacts/glitches on the display.

I won't be able to continue the investigation until tonight (8 hours from now). What I saw earlier is that if the demo doesn't switch tabs I get 6% CPU usage and 66 FPS.

There are some other tasks I would like to work on:

  • Indev, I haven't tested any touch drivers.
  • Multiple devices (touch and display controllers) on the same bus, I2C should be easier than SPI.
  • Clean up and comment the code in the lvgl_esp32_drivers directory, so it is easier for users to add support for their needs.
  • Whatever fix we find for the SPI data transfers.


barbiani avatar barbiani commented on June 1, 2024 1

@kisvegabor It is in my fork


kisvegabor avatar kisvegabor commented on June 1, 2024

Update lvgl submodule.

It's in dev-7.0 branch

Update lvgl example submodule.

It's in rework-7 branch

Update lvgl configuration file.

It'd be best to create a new one based on lv_conf_templ.h in dev-7.0.

Could you make a build test to see if there are any serious issues?


C47D avatar C47D commented on June 1, 2024

I will do the tests here: https://github.com/C47D/lv_port_esp32_v7
Once we get it working I will update this repo


C47D avatar C47D commented on June 1, 2024

First teaser, still some work to do i think

lvgl_v7


kisvegabor avatar kisvegabor commented on June 1, 2024

What is the size and resolution of the display and LV_DPI?


C47D avatar C47D commented on June 1, 2024

Size of the screen is 320*240, LV_DPI is 100, and it now has the proper configuration (of display size and orientation).

lvgl_v7_2


kisvegabor avatar kisvegabor commented on June 1, 2024

The demo recognizes if it's a small, medium, large or extra large display and sets the layouts accordingly. This display should have been recognized as small.
What is the size of the display (in inches)? We can calculate the real DPI from it.

Besides, please modify LV_DISP_SMALL_LIMIT to 25 in lv_conf.h.


C47D avatar C47D commented on June 1, 2024

The LCD size is 3.2" (inches), ok I will modify LV_DISP_SMALL_LIMIT to 25 and upload a pic.


kisvegabor avatar kisvegabor commented on June 1, 2024

So the actual DPI is: sqrt(320^2 + 240^2) / 3.2 = 125.
And width of the display is 320/125 = 2.56".

If you set LV_DISP_SMALL_LIMIT to 25, lvgl will consider any display wider than 2.5" as medium-sized and create this ugly layout. So please set it to 30 (3.0") instead of 25.
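
The same arithmetic as a minimal sketch in integer math. The helper names are hypothetical (not lvgl functions); the diagonal is passed in tenths of an inch and the width is returned in hundredths of an inch to avoid floating point.

```c
#include <stdint.h>

/* Integer square root by simple search; fine for display-sized inputs. */
uint32_t isqrt_u32(uint32_t n)
{
    uint32_t r = 0;
    while ((r + 1) * (r + 1) <= n) r++;
    return r;
}

/* DPI from the resolution and the diagonal in tenths of an inch. */
uint32_t display_dpi(uint32_t hor_res, uint32_t ver_res, uint32_t diag_tenth_inch)
{
    uint32_t diag_px = isqrt_u32(hor_res * hor_res + ver_res * ver_res);
    return diag_px * 10u / diag_tenth_inch;
}

/* Physical width in hundredths of an inch. */
uint32_t display_width_centi_inch(uint32_t hor_res, uint32_t dpi)
{
    return hor_res * 100u / dpi;
}
```

For the 3.2" 320x240 panel this gives 125 DPI and a width of 2.56", which is above a 2.5" limit but below a 3.0" one.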


C47D avatar C47D commented on June 1, 2024

Much better

lvgl_v7_3_landscape


kisvegabor avatar kisvegabor commented on June 1, 2024

Awesome, thank you!
I'll recalculate the default size limits.

You can enable LV_USE_PERF_MONITOR in lv_conf.h to see the current FPS and CPU usage.


C47D avatar C47D commented on June 1, 2024

I'm getting 33FPS, with CPU from 18% to 28% and peaks of 34%.


embeddedt avatar embeddedt commented on June 1, 2024

@C47D I'm curious; what FPS do you get at the moment when you scroll past the gauges in the Values tab? I'd like to compare with what I see on STM32F7 (I get 14-18 FPS while scrolling; it increases after that).


C47D avatar C47D commented on June 1, 2024

Hi @embeddedt, this particular display doesn't have a touch controller, I will try to setup another display to measure the FPS values you've requested.


kisvegabor avatar kisvegabor commented on June 1, 2024

@C47D I've modified the demo in test/no-tp branch. It changes between the tabs automatically and shows the gauges on the second tab.


C47D avatar C47D commented on June 1, 2024

Hi, I took a video of the demo using the test/no-tp branch.

https://youtu.be/BJv-vr03RsM


kisvegabor avatar kisvegabor commented on June 1, 2024

Thank you!

There is still a huge FPS drop when the gauge is fully redrawn. At least refreshing only the needle is not that bad.

I have a few questions:

  • What is the size of the display buffer?
  • Do you use 2 buffers with DMA in flush_cb?
  • Have you enabled -Os or -O3 optimization?
  • What is the speed of the SPI?


C47D avatar C47D commented on June 1, 2024

Hi, I'm not at home right now I will update my reply with the data you requested.

I had an issue with the demo, the demo stopped working after some minutes and the screen is stuck, I have tried to replicate the issue but it stopped at different times. I will try to add the log functions to lvgl so I can know when it stops.


embeddedt avatar embeddedt commented on June 1, 2024

the demo stopped working after some minutes and the screen is stuck

I also saw this happen once on my STM32F7, but I didn't have time to investigate further. It was a few days ago.


C47D avatar C47D commented on June 1, 2024

Added some printf debugging points to see where the display is getting stuck and I can't replicate the issue, the demo has been running for almost an hour, but noticed that the lv_tick_task is still running.

There's also a new issue on the lvgl repo, "Dev-7.0 performance experiments", whose author uses the same dev board as me and doesn't report the freeze.


kisvegabor avatar kisvegabor commented on June 1, 2024

I suggest running lv_demo_stress(), which does much more drawing.


embeddedt avatar embeddedt commented on June 1, 2024

I call lv_disp_flush_ready directly from the flush_cb function, so it can't be entirely an interrupt-related issue.


kisvegabor avatar kisvegabor commented on June 1, 2024

I call lv_disp_flush_ready directly from the flush_cb function, so it can't be entirely an interrupt-related issue.

Ah, good to know. I thought you were using the DMA-based flush_cb.


C47D avatar C47D commented on June 1, 2024

I call lv_disp_flush_ready from a callback when the SPI finished transferring the data via DMA.


C47D avatar C47D commented on June 1, 2024

@kisvegabor Do you know if it's possible to get a backtrace when the application is stuck? I'm searching for some tutorials on debugging the esp32 chip with gdb.


embeddedt avatar embeddedt commented on June 1, 2024

If you have debug symbols available and can attach to the device with GDB after it's already running, you should be able to interrupt execution, type bt, and get a decently usable backtrace.


kisvegabor avatar kisvegabor commented on June 1, 2024

It seems @embeddedt knows it much better than me 🙁


I added some debug code and will run the stress demo all night, and hopefully it'll freeze.


C47D avatar C47D commented on June 1, 2024

@embeddedt Thanks, I do have debug symbols available, but it's a pain to set up the ESP32 tools on Windows. I was using WSL but can't run openocd there, so I have to set up the toolchain in Windows directly :/.


kisvegabor avatar kisvegabor commented on June 1, 2024

I think I found the issue but the picture is not perfect yet.

These are the interesting parts of lv_disp_buf_t. So there are 4 bits next to each other. When going to the next part to refresh lvgl might write the last 3 fields. So what I suspect is:

  1. When e.g. last_area is set, first the whole bitfield is read into a register
  2. An interrupt comes and sets flushing = 0
  3. In the register copy read in 1), last_area gets set, but flushing is still 1 because it was cleared only in the "real" variable.
  4. The register is written back and it overwrites the flushing bit, setting it to 1 again.

So it's a typical Read-Modify-Write issue.
It all makes sense but @embeddedt said he doesn't use interrupt in the flush_cb.
So @embeddedt, are you sure you weren't using an interrupt based driver when it froze for you?

I've pushed a trivial fix. Let's see if it helps.
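
The four steps above can be replayed deterministically on a host machine. This is only an illustration: the bit positions and names are assumptions, the real layout of lv_disp_buf_t is not reproduced here, and the second half shows one possible way to avoid the race (separate variables per flag), not necessarily what the pushed fix does.

```c
#include <stdint.h>

#define FLUSHING_BIT   0x01u  /* assumed position: flush in progress      */
#define LAST_AREA_BIT  0x04u  /* assumed position: last area of the frame */

/* Replays the race and returns the flag byte left in memory. */
uint8_t replay_rmw_race(void)
{
    uint8_t flags = FLUSHING_BIT;      /* a flush is in progress             */

    uint8_t reg = flags;               /* 1. main code reads the whole byte  */
    flags &= (uint8_t)~FLUSHING_BIT;   /* 2. interrupt clears flushing       */
    reg |= LAST_AREA_BIT;              /* 3. stale register copy is modified */
    flags = reg;                       /* 4. write-back sets flushing again  */

    return flags;
}

/* Same sequence with each flag in its own variable: the write from the
 * main code no longer touches the storage the interrupt writes to. */
typedef struct {
    volatile int flushing;
    volatile int last_area;
} separate_flags_t;

int replay_with_separate_flags(void)
{
    separate_flags_t f = { .flushing = 1, .last_area = 0 };

    f.flushing = 0;     /* interrupt clears its own variable   */
    f.last_area = 1;    /* main code writes only last_area     */

    return f.flushing;  /* stays 0, so a wait loop would exit  */
}
```

In the first function the byte ends up as 0x05, i.e. flushing is 1 again even though the interrupt cleared it, which is exactly the state in which while(vdb->flushing); never returns.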


C47D avatar C47D commented on June 1, 2024

Thanks for the explanation @kisvegabor, I've set the tools to be able to debug the demo-application (I'm still using the demo-widgets app), I'm running it with gdb so I can backtrace when the application gets stuck.

The "problem" I see is that we don't know how much time it will take for the application to stall, so how much time should we run the application with the fix to be sure it worked as expected?


embeddedt avatar embeddedt commented on June 1, 2024

are you sure you weren't using an interrupt based driver when it froze for you?

100% sure. I don't have access to interrupts in the context that LittlevGL runs in. My setup is a bit unique so it's quite possible that the issue isn't within LittlevGL (although I didn't experience this with 6.1 or earlier snapshots of 7.0).

how much time should we run the application with the fix to be sure it worked as expected?

We have several weeks before release, so I'm pretty sure one of us will run into the issue if it's still present. I'd say let it run for a few hours and if nothing goes wrong, we can consider it's fixed for now.


C47D avatar C47D commented on June 1, 2024

@kisvegabor I've fixed an error on the lv_examples repo that I found when trying to run the stress demo, I sent a pull request to that repo.

Is this the expected behavior of the demo? https://youtu.be/7goD6lRqTLc


embeddedt avatar embeddedt commented on June 1, 2024

I think that's what it's supposed to look like - the idea is that it randomly creates and moves a bunch of objects around to try and trigger weird bugs like this one.

I've had lv_demo_widgets running for a few hours now - it hasn't crashed. I think I can safely say the freeze is gone for me.


C47D avatar C47D commented on June 1, 2024

I also haven't hit the bug again, I will leave the demo running overnight.

You guys know lvgl better than I do, is there any particular reason why flushing and flushing_last are ints instead of uint32_t?


kisvegabor avatar kisvegabor commented on June 1, 2024

@kisvegabor I've fixed an error on the lv_examples repo that I found when trying to run the stress demo, I sent a pull request to that repo.
Is this the expected behavior of the demo? https://youtu.be/7goD6lRqTLc

Thank you for the PR. Yes, it should look something like this. You can increase TIME_STEP in lv_demo_stress.c (e.g. to 200) to see more.

It was running for me for more than 8 hours. So it really seems to be solved. 🎉

Okay, let's turn back to the original topic: v7 on ESP.
@C47D seemingly it's running well on ESP. It could be faster though... On STM32F7, in general, I measured higher FPS with v7 compared to v6, but on ESP for a 320x240 TFT, I'd expect higher FPS.
Could you answer these questions, please, to exclude some potential issues?


C47D avatar C47D commented on June 1, 2024

Hi @kisvegabor,

I hope this information can help, if you want me to do some tests please let me know.

But we configure the display buffer with 2 buffers:
https://github.com/littlevgl/lv_port_esp32/blob/d124fe22a99580c47b7e45f449faf028390d4353/main/main.c#L71-L74

  • Have you enabled -Os or -O3 optimization?

No, the default project is compiled without optimizations and with debug symbol generation enabled. Here are the possible optimization levels available in esp-idf:
image

  • What is the speed of the SPI?

The speed of the SPI is 40MHz for this particular driver.


embeddedt avatar embeddedt commented on June 1, 2024

Can you try it with -Og and see what happens? That should already be an improvement over no optimization.


C47D avatar C47D commented on June 1, 2024

@embeddedt the project is being compiled with -Og, I can try the other options tho if you want me to.


kisvegabor avatar kisvegabor commented on June 1, 2024

@C47D
Just a minor correction: one buffer has 320*40*2 bytes = 25 kB (because there are 2 bytes/pixel)
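
As a quick check of that figure, a one-line helper (a hypothetical name, for illustration only) computes the size of one draw buffer that holds a number of full display rows.

```c
#include <stdint.h>

/* Bytes needed for one draw buffer holding `lines` full display rows. */
uint32_t draw_buf_bytes(uint32_t hor_res, uint32_t lines, uint32_t bytes_per_px)
{
    return hor_res * lines * bytes_per_px;
}
```

For 320 pixels x 40 lines x 2 bytes this returns 25,600 bytes, which is 25 kB with 1 kB = 1024 bytes. Two such buffers take 50 kB together.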

Could you try -O2 and -Os too?


kisvegabor avatar kisvegabor commented on June 1, 2024

I usually don't use -Og so tested how it compares to others on my STM32F7 dev board:

O0: 7 FPS
Og: 18 FPS
O2: 25 FPS
Os: 24 FPS

So hopefully you also will see ~50% performance boost.


barbiani avatar barbiani commented on June 1, 2024

@C47D

Did you do anything to add the -O2 setting? Mine shows only debug and release.


C47D avatar C47D commented on June 1, 2024

@barbiani I'm using the master branch of the esp-idf, maybe it's because of that.


C47D avatar C47D commented on June 1, 2024

@kisvegabor this is the demo stress compiled with -O2

https://youtu.be/lTgXsJjYHk8


C47D avatar C47D commented on June 1, 2024

actually you can or use https://visualgdb.com/

Thanks for the information @barbiani, I ended up running openocd on the Windows prompt and gdb on WSL, but I had to change the drivers for the FTDI chip on windows to be able to connect openocd to it.
I debugged the application and it ran for about 5 hours with no issues, so i think the patch @kisvegabor did worked.


barbiani avatar barbiani commented on June 1, 2024


kisvegabor avatar kisvegabor commented on June 1, 2024

I debugged the application and it ran for about 5 hours with no issues, so i think the patch @kisvegabor did worked.

Awesome! 🙂

While writing a touch driver I have found that lvgl crashes with touches
outside of the screen. Meaning coordinates bigger than x and y resolutions.

I'll check it!


kisvegabor avatar kisvegabor commented on June 1, 2024

While writing a touch driver I have found that lvgl crashes with touches
outside of the screen. Meaning coordinates bigger than x and y resolutions.

I'll check it!

Should work now.


kisvegabor avatar kisvegabor commented on June 1, 2024

@C47D

this is the demo stress compiled with -O2

It seems to me it's quite the same FPS as -Og.

Sending 320 * 240 * 16 bit at 40 MHz takes 30 ms (33 FPS), so it should not be a limiting factor (as it happens in parallel with rendering). However, a 200 MHz MCU should be much faster with a 320x240 TFT. It'd be awesome to see on an oscilloscope what sending an empty screen to the driver looks like (CLK would be enough). I know it takes more effort, so I'm just mentioning it in case you or @barbiani have the time, the interest, and an oscilloscope to measure it. 🙂
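
The 30 ms figure can be checked with a short sketch. It assumes one bit per SPI clock cycle and no gaps between transfers, so it is an upper bound rather than a measurement; the function names are hypothetical.

```c
#include <stdint.h>

/* Time in microseconds to clock out one full frame over SPI, assuming
 * one bit per clock cycle and no gaps between transfers. */
uint32_t spi_frame_time_us(uint32_t hor_res, uint32_t ver_res,
                           uint32_t bits_per_px, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)hor_res * ver_res * bits_per_px;
    return (uint32_t)(bits * 1000000u / spi_hz);
}

/* Upper bound on frames per second, scaled by 10 (325 means 32.5 FPS). */
uint32_t spi_max_fps_x10(uint32_t frame_time_us)
{
    return frame_time_us ? 10000000u / frame_time_us : 0u;
}
```

At 40 MHz a 320x240, 16-bit frame takes 30,720 µs, which caps the rate at about 32.5 FPS.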


C47D avatar C47D commented on June 1, 2024

@kisvegabor, sorry if i say something silly but, aren't we limited by the spi clock and not the cpu clock?

I don't have an oscilloscope at home to test it :/, and I'm not going to the office until next Wednesday, so I can't test it until then. Seems like I need a better logic analyzer :)

Regards


C47D avatar C47D commented on June 1, 2024

Thanks for the explanation @kisvegabor, I'm compiling the widgets demo with -O2, will upload the video asap.


C47D avatar C47D commented on June 1, 2024

Here's the video for demo widgets compiled with -O2 https://youtu.be/6FWhe9Ovd6k


kisvegabor avatar kisvegabor commented on June 1, 2024

@C47D Thank you for the video. It still looks quite the same FPS as with -Og. :(

I'm looking forward to seeing the oscilloscope measurement on Wednesday! :)


kisvegabor avatar kisvegabor commented on June 1, 2024

@C47D
I've added some minor fixes to lv_demo_benchmark.
Could you try it and share a photo/video of the summary screen at the end?


embeddedt avatar embeddedt commented on June 1, 2024

I ran lv_demo_benchmark on STM32F746 (without the GPU). This looks like a really great tool. Here's the results I'm seeing. For brevity I'm only including the ones that either stuttered visually or have poor FPS numbers.

  • Small/medium text - 18 FPS
  • Shadows - ~15 FPS when small, ~11 FPS when an offset is added, <7 FPS when they're large. They seem to be very taxing.
  • Image (A)RGB rotate/zoom anti-aliased - ~11 FPS (regardless of opacity or alpha)
  • Medium/large compressed text - 8/4 FPS, respectively
  • Substr. shadow - 9 FPS
  • Substr. text - 15 FPS

I'll retry on the GPU-enabled version now.

EDIT: Oops, I had my touchscreen disabled, so there was no way to scroll in the results. I'll have to run the test again. 😆


embeddedt avatar embeddedt commented on June 1, 2024

I should have mentioned before that all of my tests thus far without GPU are done with a custom driver that does direct copies to the framebuffer in flush_cb, and not with the official TFT driver, which uses standard DMA in flush_cb. I only use the official TFT driver to test with the GPU.

Results with the official TFT driver (GPU enabled):

  • "Opa speed" increased from 79% to 91%.
  • All uncompressed text now renders at ~11 FPS regardless of size.
  • Circle borders now render at 16 FPS (they weren't on the "Slow but common" list before).
  • Circles without opacity render at 10 FPS, but with opacity they drop to 13 FPS.
  • Shadows now render at 8 FPS when small, 5 FPS when an offset is added, and 2 FPS when they're large. 😕 It seems that the GPU does a worse job with this (2-3x the speed without GPU).
  • Image RGB renders at a surprising 173 FPS. Is this possible?
  • Rotated images render at 15-20 FPS, adding anti-aliasing cuts that in half.
  • Compressed text renders at 8 FPS when small, 6 FPS when medium, and 2 FPS when large.
  • Substr. shadow - 4 FPS


kisvegabor avatar kisvegabor commented on June 1, 2024

@embeddedt
It'd be interesting to measure with speed/gpu branch and enable LV_USE_GPU_STM32_DMA2D to use the builtin GPU support.

I've added LV_SHADOW_CACHE_SIZE to lv_conf.h to speed up shadow drawing. You should set it to 10000.

In general, it's strange that I see 2-3 times better FPS on STM32F769 which is quite the same MCU.

  • Still disabled caching? (It'd explain the slowness)
  • What is the optimization level? (-O3 for me)
  • What is the display buffer size?
  • Display buffers are in internal RAM?
  • Are you using the latest lvgl and lv_examples versions?

@C47D

  • On the ESP dev board there is no external RAM, right?
  • I saw that there is some kind of flash caching in ESP but still don't know what it is really good for. Do you know anything about it?
  • I read this doc and it seems we could get some speed-up if we placed the performance-critical functions in IRAM.


embeddedt avatar embeddedt commented on June 1, 2024

It'd be interesting to measure with speed/gpu branch and enable LV_USE_GPU_STM32_DMA2D to use the builtin GPU support.

Will do.

Still disabled caching? (It'd explain the slowness)

Yes; that's probably why it's so slow (although I would hope that it can perform reasonably even without the cache).

What is the optimization level?

I'm using -Og at the moment.

Are you using the latest lvgl and lv_examples versions?

Always. 🙂

What is the display buffer size? Display buffers are in internal RAM?

Thanks for checking that. I had forgotten to increase the size in the demo project, so it's currently at about 1/27th of the display size. We're probably losing a lot of performance to all the style calculations.

I'll increase the buffer size and optimization level, change LV_SHADOW_CACHE_SIZE, and report back with the results from speed/gpu.


C47D avatar C47D commented on June 1, 2024

@kisvegabor @embeddedt how often should I call lv_tick_inc? Right now it's called every 10ms.


C47D avatar C47D commented on June 1, 2024

@barbiani I didn't know that was possible, it's nice to know it worked quite well.


barbiani avatar barbiani commented on June 1, 2024

Here are two measurements of the MOSI line.

Visuals tab of lv_demo_widgets with LV_INDEV_DEF_READ_PERIOD and LV_DISP_DEF_REFR_PERIOD set to 15. So 66FPS.
Compiler flag -Os
TFT SPI at 60 MHz
Cores at 240 MHz
Flash SPI at 80 MHz

Gauge animating in the middle of the page:
20200428_172612

Gauge animating and scrolling:
20200428_172628


C47D avatar C47D commented on June 1, 2024

Is every horizontal division 20 ms?

@kisvegabor The SPI should only be active when flushing, shouldn't it?


barbiani avatar barbiani commented on June 1, 2024

Is every horizontal division 20 ms?

Correct.


kisvegabor avatar kisvegabor commented on June 1, 2024

Thank you @barbiani!

I suppose there were ~10 FPS (100 ms/frame) while scrolling.
There is one thing I don't understand. One "burst" on the trace is ~ 10 ms wide. And there is 20 ms between the start of the bursts. If you also have a 320x240 display with 40 lines of buffer (1/6 display) it makes sense because 20 ms x 5 = 100 ms (x5 instead of x6 because the top is not refreshed). So one burst should be a sending of an 1/6 part.

But why does it take so long to send 1/6 display?
320x40 = 12,800 pixels -> 204,800 bits. To send it on 60MHz should be only 3.4 ms.
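
The gap between the expected and the observed burst width can be put into numbers with a small sketch. It assumes one bit per SPI clock and takes the ~10 ms burst width read off the trace as input; both helper names are hypothetical.

```c
#include <stdint.h>

/* Ideal time in microseconds to send `px` pixels of `bits_per_px` bits
 * over an SPI clock of `spi_hz`, one bit per clock. */
uint32_t ideal_burst_us(uint32_t px, uint32_t bits_per_px, uint32_t spi_hz)
{
    return (uint32_t)((uint64_t)px * bits_per_px * 1000000u / spi_hz);
}

/* Effective bit rate in kbit/s given the burst width seen on the trace. */
uint32_t effective_kbps(uint32_t px, uint32_t bits_per_px, uint32_t observed_us)
{
    return (uint32_t)((uint64_t)px * bits_per_px * 1000u / observed_us);
}
```

For 320x40 pixels at 16 bit the ideal time at 60 MHz is about 3.4 ms, while a 10 ms burst corresponds to roughly 20.5 Mbit/s, about a third of the clock rate.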

I've just found this post which says that code placed in SPI flash can be read with 16 MB/s throughput (4 M instructions/sec, I guess). If I understand it correctly, that's very, very slow. I know there is caching, but realistically every new drawing function needs to be reloaded when called. It seems it'd help to move critical functions into IRAM. If you agree, I can add a define to lv_conf.h to prefix these functions.


barbiani avatar barbiani commented on June 1, 2024

Thank you @barbiani!

I suppose there were ~10 FPS (100 ms/frame) while scrolling.

It shows 16 to 20 fps scrolling the animated gauge.

There is one thing I don't understand. One "burst" on the trace is ~ 10 ms wide. And there is 20 ms between the start of the bursts. If you also have a 320x240 display with 40 lines of buffer (1/6 display) it makes sense because 20 ms x 5 = 100 ms (x5 instead of x6 because the top is not refreshed). So one burst should be a sending of an 1/6 part.

But why does it take so long to send 1/6 display?
320x40 = 12,800 pixels -> 204,800 bits. To send it on 60MHz should be only 3.4 ms.

Here is a better view of the clock line. Shows 2 buffer updates taking roughly 10ms.

image

The other tab (selectors) has the list with the quite long text to scroll. It takes less than 2ms to update. Lvgl is brilliant.

I've just found this post which says that code placed in SPI flash can be read with 16 MB/s throughput (4 M instructions/sec, I guess). If I understand it correctly, that's very, very slow. I know there is caching, but realistically every new drawing function needs to be reloaded when called. It seems it'd help to move critical functions into IRAM. If you agree, I can add a define to lv_conf.h to prefix these functions.

I have two SPI options set in the sdk:

[*] Place transmitting functions of SPI master into IRAM
-*- Place SPI master ISR function into IRAM

There is this area good for optimization:
5 really small spi transfers just before the full buffer taking >100us alone. @C47D Can we combine them or use polled mode?

image


kisvegabor avatar kisvegabor commented on June 1, 2024

@barbiani
Wow, that's really impressive!

The 10 ms update time is measured on the scrolling gauge?

I'm really curious about the performance impact of placing critical functions in IRAM, so I added LV_ATTRIBUTE_FAST_MEM to lv_conf.h. Could you try the speed with #define LV_ATTRIBUTE_FAST_MEM IRAM_ATTR?
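
A minimal sketch of how such a prefix takes effect. On ESP-IDF, IRAM_ATTR comes from esp_attr.h; the fallback definition and the `fill_row` function below are assumptions made so the example builds on a host compiler, and `fill_row` is not an lvgl function. Which lvgl functions actually carry the prefix is not shown here.

```c
#include <stdint.h>

/* On ESP-IDF, IRAM_ATTR is provided by esp_attr.h. The fallback below
 * lets the sketch build elsewhere, where it expands to nothing. */
#ifndef IRAM_ATTR
#define IRAM_ATTR
#endif

/* The define suggested above, as it would appear in lv_conf.h. */
#define LV_ATTRIBUTE_FAST_MEM IRAM_ATTR

/* Hypothetical hot-path function: fills a row of RGB565 pixels.
 * The prefix decides which memory the function's code is placed in. */
LV_ATTRIBUTE_FAST_MEM uint32_t fill_row(uint16_t *dst, uint32_t len, uint16_t color)
{
    for (uint32_t i = 0; i < len; i++) dst[i] = color;
    return len;
}
```

Leaving LV_ATTRIBUTE_FAST_MEM empty keeps the code in its default location, so the same source can be built with and without the IRAM placement for a before/after comparison.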


C47D avatar C47D commented on June 1, 2024

@barbiani Can you try these changes? Polling disp_spi_is_busy is already done inside disp_spi_send_data, so I think it's not necessary to poll for it in the ili9341_send_cmd function.

I don't know if it's possible to send everything in one transaction (setting /CS low, sending all the commands and data, and setting /CS high again); we would need to check the driver datasheet for it.

Another way to improve it is to acquire the SPI bus at the beginning of the ili9341_flush function and release it when we are done sending the colors, see Bus Acquiring

diff --git a/components/lvgl_esp32_drivers/lvgl_tft/ili9341.c b/components/lvgl_esp32_drivers/lvgl_tft/ili9341.c
index 10847d0..7a8db1e 100644
--- a/components/lvgl_esp32_drivers/lvgl_tft/ili9341.c
+++ b/components/lvgl_esp32_drivers/lvgl_tft/ili9341.c
@@ -128,15 +128,18 @@ void ili9341_init(void)

 void ili9341_flush(lv_disp_drv_t * drv, const lv_area_t * area, lv_color_t * color_map)
 {
-       uint8_t data[4];
+       uint8_t data[] = {
+               (area->x1 >> 8) & 0xFF,
+               area->x1 & 0xFF,
+               (area->x2 >> 8) & 0xFF,
+               area->x2 & 0xFF
+       };
+
+       uint32_t size = lv_area_get_width(area) * lv_area_get_height(area);

        /*Column addresses*/
        ili9341_send_cmd(0x2A);
-       data[0] = (area->x1 >> 8) & 0xFF;
-       data[1] = area->x1 & 0xFF;
-       data[2] = (area->x2 >> 8) & 0xFF;
-       data[3] = area->x2 & 0xFF;
-       ili9341_send_data(data, 4);
+       ili9341_send_data(data, sizeof data);

        /*Page addresses*/
        ili9341_send_cmd(0x2B);
@@ -144,14 +147,11 @@ void ili9341_flush(lv_disp_drv_t * drv, const lv_area_t * area, lv_color_t * col
        data[1] = area->y1 & 0xFF;
        data[2] = (area->y2 >> 8) & 0xFF;
        data[3] = area->y2 & 0xFF;
-       ili9341_send_data(data, 4);
+       ili9341_send_data(data, sizeof data);

        /*Memory write*/
        ili9341_send_cmd(0x2C);

-
-       uint32_t size = lv_area_get_width(area) * lv_area_get_height(area);
-
        ili9341_send_color((void*)color_map, size * 2);
 }

@@ -178,21 +178,18 @@ void ili9341_enable_backlight(bool backlight)

 static void ili9341_send_cmd(uint8_t cmd)
 {
-         while(disp_spi_is_busy()) {}
          gpio_set_level(ILI9341_DC, 0);         /*Command mode*/
          disp_spi_send_data(&cmd, 1);
 }

 static void ili9341_send_data(void * data, uint16_t length)
 {
-         while(disp_spi_is_busy()) {}
          gpio_set_level(ILI9341_DC, 1);         /*Data mode*/
          disp_spi_send_data(data, length);
 }

 static void ili9341_send_color(void * data, uint16_t length)
 {
-               while(disp_spi_is_busy()) {}
     gpio_set_level(ILI9341_DC, 1);   /*Data mode*/
     disp_spi_send_colors(data, length);
 }


barbiani avatar barbiani commented on June 1, 2024

@barbiani
Wow, that's really impressive!

The 10 ms update time is measured on the scrolling gauge?

I'm really curious about the performance impact of placing critical functions in IRAM, so I added LV_ATTRIBUTE_FAST_MEM to lv_conf.h. Could you try the speed with #define LV_ATTRIBUTE_FAST_MEM IRAM_ATTR?

I can not tell anymore. Need better measurement setup as a possible improvement is too small to see.
Getting 33 fps scrolling the gauge (~15ms to send the two buffers).
Average under 1 ms to update the needle.

The new widgets slide show demo averages at 25fps. I think that there is a problem with the animations if the cpu usage is 100%. If you leave the demo running, the pages will eventually get stuck at the bottom.

I can see that the page is not moving up/down anymore. Screen looks static, but the spi bus is really busy. Very different than what I see with no page animation.

Defining MASK_AREA_DEBUG makes it very clear. The tab page is not moving but its contents are being redrawn.

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

@C47D

@barbiani Can you try with these changes? Polling disp_spi_is_busy is already done inside disp_spi_send_data, so I think it's not necessary to poll for it in the ili9341_send_cmd function.

I was thinking that we could avoid going through the queues during the buffer setup code. 100us (for just a few bytes) per buffer transfer is lost in these sections.

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

Yep, we can avoid them. Can you add the code for that in disp_spi_send_data?

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

I'm not currently at home, will try to replicate the issue later.

The new widgets slide show demo averages at 25fps. I think that there is a problem with the animations if the cpu usage is 100%. If you leave the demo running, the pages will eventually get stuck at the bottom.

I can see that the page is not moving up/down anymore. Screen looks static, but the spi bus is really busy. Very different than what I see with no page animation.

Defining MASK_AREA_DEBUG makes it very clear. The tab page is not moving but its contents are being redrawn.

Can you take a video of it?

Is the time it takes to get stuck consistent?
Does this also happen when there's no automatic sliding of the tabs?

This issue happened before, and there was a fix for it (13dd42fd0bb61fc65a6d317e093436e76be59967) but you need to make sure both lvgl and lv_examples submodules are updated.

If the submodules are not updated you can try:
Go to components/lv_examples/lv_examples, then do git fetch and then git checkout --track origin/rework.
Go to components/lvgl/lvgl, then do git fetch and then git checkout --track origin/dev-7.0

from lv_port_esp32.

embeddedt avatar embeddedt commented on June 1, 2024

I believe it should be origin/rework-7 not origin/rework.

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

I think the last cherry-picked commit on lv_examples was merged into rework.

from lv_port_esp32.

kisvegabor avatar kisvegabor commented on June 1, 2024

@barbiani

I can't tell anymore. I need a better measurement setup, as any improvement is too small to see.
Getting 33 fps scrolling the gauge (~15 ms to send the two buffers).
Average under 1 ms to update the needle.

A video would be really great. If you set LV_DISP_DEF_REFR_PERIOD to 10, the max FPS will be limited to 100, so we will see the changes better.

FreeRTOS's tick should also be set to a lower value (2-3 ms?), or lv_tick_inc() should be moved to a 1 ms timer. That will give more precision on the FPS measurement.
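
To make the effect of the refresh period and the tick resolution concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical helpers for illustration and are not part of lvgl. It assumes the elapsed time is read from a counter that only advances in whole ticks, so a measurement can land on either neighbouring multiple of the tick depending on where in the tick the measurement starts.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on FPS when the display is refreshed at most once
 * every refr_period_ms milliseconds. */
uint32_t max_fps_for_period(uint32_t refr_period_ms)
{
    assert(refr_period_ms > 0);
    return 1000u / refr_period_ms;
}

/* Smallest value a tick counter with tick_ms resolution can report
 * for an interval that really lasted real_ms. */
uint32_t measured_ms_low(uint32_t real_ms, uint32_t tick_ms)
{
    assert(tick_ms > 0);
    return (real_ms / tick_ms) * tick_ms;
}

/* Largest value the same counter can report for that interval. */
uint32_t measured_ms_high(uint32_t real_ms, uint32_t tick_ms)
{
    assert(tick_ms > 0);
    return ((real_ms + tick_ms - 1u) / tick_ms) * tick_ms;
}
```

With a 10 ms refresh period the cap is 100 FPS. A real 45 ms render measured with a 10 ms tick is reported as 40 ms or 50 ms, while a 1 ms tick reports 45 ms in both cases.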

Anyway, improving FPS from 10 to >33 when the gauge is scrolled is a great achievement so far 👏

@C47D

I think the last cherry-picked commit on lv_examples was merged into rework.

It seems it's in rework-7: https://github.com/littlevgl/lv_examples/commits/rework-7

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

Yep, it is rework-7, sorry about the confusion. I'm at work and have to finish a prototype tomorrow 😄

from lv_port_esp32.

embeddedt avatar embeddedt commented on June 1, 2024

No problem. Take your time!

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

It seems that the animation tries to go past the page bottom... it can not, but still wants to refresh it.

Videos 1 and 2

from lv_port_esp32.

kisvegabor avatar kisvegabor commented on June 1, 2024

Thanks, @barbiani.
I've fixed it in lv_examples

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

@barbiani I just found this: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/spi_master.html#transactions-with-data-not-exceeding-32-bits

Transactions with Data Not Exceeding 32 Bits

When the transaction data size is equal to or less than 32 bits, it will be sub-optimal to allocate a buffer for the data. The data can be directly stored in the transaction struct instead. For transmitted data, it can be achieved by using the tx_data member and setting the SPI_TRANS_USE_TXDATA flag on the transmission. For received data, use rx_data and set SPI_TRANS_USE_RXDATA. In both cases, do not touch the tx_buffer or rx_buffer members, because they use the same memory locations as tx_data and rx_data.

I made that optimization and put it, together with the changes from your PR and some refactoring I need in order to use the SD card on my board, into a new branch named dev-7.0. I'm testing it on the Wrover Kit v4.1 board, and also on an ESP32 board with the SSD1306 to cover monochrome displays.
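
The idea from the quoted documentation can be sketched on the host as follows. This is not the driver code from the branch: `demo_spi_transaction_t`, `DEMO_TRANS_USE_TXDATA` and `demo_prepare_tx` are hypothetical stand-ins that only mirror the relevant part of ESP-IDF's `spi_transaction_t` (the union of `tx_buffer` and `tx_data`) so the logic can be compiled without the SDK. On the target you would use the real struct and the `SPI_TRANS_USE_TXDATA` flag instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value for illustration only;
 * ESP-IDF provides SPI_TRANS_USE_TXDATA for this purpose. */
#define DEMO_TRANS_USE_TXDATA (1u << 0)

/* Host-side stand-in for the part of spi_transaction_t used here. */
typedef struct {
    uint32_t flags;
    size_t length;              /* payload length in bits */
    union {
        const void *tx_buffer;  /* pointer to an external buffer */
        uint8_t tx_data[4];     /* inline storage, up to 32 bits */
    };
} demo_spi_transaction_t;

/* Fill a transaction: payloads of up to 4 bytes are copied into
 * tx_data, larger ones are referenced through tx_buffer. Only one
 * member of the union is written, since they share memory. */
void demo_prepare_tx(demo_spi_transaction_t *t,
                     const uint8_t *data, size_t len_bytes)
{
    assert(t != NULL && data != NULL);
    memset(t, 0, sizeof *t);
    t->length = len_bytes * 8;
    if (len_bytes <= sizeof t->tx_data) {
        t->flags |= DEMO_TRANS_USE_TXDATA;
        memcpy(t->tx_data, data, len_bytes);
    } else {
        t->tx_buffer = data;
    }
}
```

A one-byte command such as the 0x2C memory write ends up stored inline, while a pixel buffer still goes through the pointer path.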

Can you test it?

Here's a video of the current state of that branch:

https://youtu.be/xqa2q6WpmF8

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

@C47D You can change
https://github.com/littlevgl/lv_port_esp32/blob/5d07cda76dbe5e6136c97b8fc65e70ee56e42eb4/components/lvgl_esp32_drivers/lvgl_tft/disp_spi.c#L119 to spi_device_polling_transmit(spi, &t); to halve the flush_cb setup time.

I am seeing a bit more fps and lower cpu usage. https://youtu.be/TQ67PM-G-zc

The spi bus usage is very high.

Are you using the sd card with the wrover kit? It will be my next adventure.

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

Just pushed the commit, but I'm getting the same performance as before. How are you configuring the ESP32?

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

from lv_port_esp32.

kisvegabor avatar kisvegabor commented on June 1, 2024

Do you also use the LV_ATTRIBUTE_FAST_MEM?

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

@C47D
Use .flags = SPI_DEVICE_NO_DUMMY|SPI_DEVICE_HALFDUPLEX

from lv_port_esp32.

kisvegabor avatar kisvegabor commented on June 1, 2024

@barbiani Would it be possible to upload your project somewhere?

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

I am not sure why yet, but I gained 18 fps overall (from 25 to 43) by moving guiTask to CPU0.

From https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/general-notes.html ...The duty of enabling cache for APP CPU is passed on to the application...

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

Thanks for the heads up, will update #110 with this, I've been fixing some bugs on the master branch, will add those as well.

from lv_port_esp32.

kisvegabor avatar kisvegabor commented on June 1, 2024

From https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/general-notes.html ...The duty of enabling cache for APP CPU is passed on to the application...

Does it mean we were running uncached so far?

from lv_port_esp32.

barbiani avatar barbiani commented on June 1, 2024

from lv_port_esp32.

C47D avatar C47D commented on June 1, 2024

@barbiani Is your repo fork updated? I'm updating #110 rn

from lv_port_esp32.
