Comments (12)
I'm actually seeing the same issue on macOS with the Metal backend enabled: after about 4k draw calls it crashes, so some limit is definitely being hit. However, this happens ONLY in debug mode; in release mode there is no problem:
_platform_memmove 0x00000001978966e8
bx::memCopy(void *, const void *, unsigned long) bx.cpp:52
bgfx::mtl::RendererContextMtl::setShaderUniform(unsigned char, unsigned int, const void *, unsigned int) renderer_mtl.mm:1547
bgfx::mtl::RendererContextMtl::setShaderUniform4x4f(unsigned char, unsigned int, const void *, unsigned int) renderer_mtl.mm:1557
bgfx::ViewState::setPredefined<…>(bgfx::mtl::RendererContextMtl *, unsigned short, const bgfx::mtl::PipelineStateMtl &, const bgfx::Frame *, const bgfx::RenderDraw &) renderer.h:194
bgfx::mtl::RendererContextMtl::submit(bgfx::Frame *, bgfx::ClearQuad &, bgfx::TextVideoMemBlitter &) renderer_mtl.mm:4728
bgfx::Context::renderFrame(int) bgfx.cpp:2470
bgfx::renderFrame(int) bgfx.cpp:1491
bgfx::Context::renderThread(bx::Thread *, void *) bgfx_p.h:3150
bx::Thread::entry() thread.cpp:328
bx::ThreadInternal::threadFunc(void *) thread.cpp:95
_pthread_start 0x0000000197867fa8
------------ BGFX Stats ------------
CPU Frame Time: 9193
CPU Begin Time: 1692529311994101
CPU End Time: 1692529312003243
CPU Timer Frequency: 1000000
GPU Begin Time: 1692529311976854
GPU End Time: 1692529311977518
GPU Timer Frequency: 1000000
Wait Render: 2096
Wait Submit: 22
Draw Calls: 4720
Compute Calls: 0
Blit Calls: 0
Max GPU Latency: 0
GPU Frame Number: 0
Texture Memory Used: 53248
Render Target Memory Used: 0
Transient VB Used: 0
GPU Memory Max: -9223372036854775807
GPU Memory Used: -9223372036854775807
Width: 450
Height: 800
Text Width: 100
Text Height: 28
Number of view stats: 0
Number of encoders used during frame: 1
Primitives Rendered [0]: 9440
Primitives Rendered [1]: 0
Primitives Rendered [2]: 0
Primitives Rendered [3]: 0
Primitives Rendered [4]: 0
------------ End of BGFX Stats ------------
from bgfx.
Make debug build and see debug output.
from bgfx.
I already have a debug build, that's how I was able to do the analysis in the issue description.
Do you mean to share the log for the debug build? If so then here it is: log.txt
I'm actually getting slightly different behaviour now: it crashes pretty much immediately. I'm not sure why, as I haven't really used Vulkan since I originally created the ticket. But the out-of-bounds access error, the stack trace and everything else are the same, so it's still the same issue.
from bgfx.
Update your drivers.
from bgfx.
I've isolated my problem and fixed it:
Change src/renderer_mtl.mm:1556 (line 1556 in 954c18b):
void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
{
	uint32_t offset = 0 != (_flags&kUniformFragmentBit)
		? m_uniformBufferFragmentOffset
		: m_uniformBufferVertexOffset
		;
	uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
	bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
}
to check against UNIFORM_BUFFER_SIZE before copying the memory:
void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
{
	uint32_t offset = 0 != (_flags&kUniformFragmentBit)
		? m_uniformBufferFragmentOffset
		: m_uniformBufferVertexOffset
		;
	uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
	if (offset + _loc > UNIFORM_BUFFER_SIZE) {
		return;
	}
	bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
}
Alternatively, I can just increase the buffer size at src/renderer_mtl.mm:19, from:
#define UNIFORM_BUFFER_SIZE (8*1024*1024)
To:
#define UNIFORM_BUFFER_SIZE (24*1024*1024)
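Note that the guard in the patched function above tests only where the copy starts (offset + _loc), not where it ends, so a write that begins inside the buffer can still run past it. A minimal sketch of a check that covers the whole destination range might look like the following. The helper name copyUniformChecked and its _capacity parameter are hypothetical and not part of bgfx; std::memcpy stands in for bx::memCopy so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of bgfx: copies _numRegs 16-byte registers
// into _dst at byte position _offset + _loc, but only if the whole
// destination range [start, start + size) fits inside _capacity bytes.
// Returns false (and copies nothing) when the write would overrun.
inline bool copyUniformChecked(
	  uint8_t* _dst
	, size_t _capacity
	, uint32_t _offset
	, uint32_t _loc
	, const void* _val
	, uint32_t _numRegs
	)
{
	assert(_dst != nullptr && _val != nullptr);

	// Use 64-bit arithmetic so the sums cannot wrap around.
	const uint64_t start = uint64_t(_offset) + _loc;
	const uint64_t size  = uint64_t(_numRegs) * 16;

	if (start + size > _capacity)
	{
		return false;
	}

	std::memcpy(_dst + start, _val, size_t(size));
	return true;
}
```

Silently skipping the copy hides the overflow rather than fixing it, so in a debug build it is probably more useful to log or assert when the helper returns false.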
from bgfx.
@joseph-montanez I'm not entirely sure we are seeing the same issue. I'm not testing this with my code, this is happening with example 17.
The source line where the crash happens is in the issue description: it's trying to write more into the vk scratch memory than is available. These values are the same regardless of release/debug as well. Plus there's the incorrect assert in ScratchBufferVK::write that only checks the start address and not the start address plus the length of the copy, although this would still result in a crash via the assert anyway, so it doesn't really matter.
@bkaradzic If you mean my nvidia drivers then they are up to date. Is there any other Vulkan specific driver that I should have and I'm not aware of?
from bgfx.
@magester1 That limit of 3971 is EXACTLY the number of quads I could draw on screen. If I went to 3972 nothing else would render, and going beyond 4000 would crash it. That means there is a limit somewhere causing this, and we are both hitting the exact same limit before crashing. Since I am using Metal, my fix will do nothing to help you, but it should help narrow the problem down to the data that is being passed to the shader. For me it was the unified memory. The VK implementation doesn't have this, and there are several places that could tell you exactly what's wrong, but you need to debug the application to get a stack trace with line numbers. The stack trace you originally provided doesn't have line numbers, so you most likely do not have BGFX compiled/linked as a debug build, which you need to get the line information and track the issue down further.
from bgfx.
But that's what I mean, this is happening because of vk's scratch memory, which I believe has nothing to do with Metal (please correct me if that's wrong). The number being the same seems like a happy coincidence to me, or maybe because bgfx is using this magic "128" for both of them?
Oh, I feel like an idiot, I forgot to add the line numbers to the stack trace!! Thank you for pointing that out. Just to clarify, I do have this running in debug mode, and I know exactly which lines are causing the issue (linked in the original description). But I don't know enough about Vulkan to understand the design decision behind the size of the scratch memory, which is why I created this ticket.
Here's the trace with the line numbers. Sorry about that, I didn't realize they were missing:
example-17-drawstress.exe!bx::memCopy(void * _dst, const void * _src, unsigned __int64 _numBytes) (...\bgfx\bx\src\bx.cpp:44)
example-17-drawstress.exe!bgfx::vk::ScratchBufferVK::write(const void * _data, unsigned int _size) (...\bgfx\bgfx\src\renderer_vk.cpp:4644)
example-17-drawstress.exe!bgfx::vk::RendererContextVK::submit(bgfx::Frame * _render, bgfx::ClearQuad & _clearQuad, bgfx::TextVideoMemBlitter & _textVideoMemBlitter) (...\bgfx\bgfx\src\renderer_vk.cpp:8680)
example-17-drawstress.exe!bgfx::Context::renderFrame(int _msecs) (...\bgfx\bgfx\src\bgfx.cpp:2455)
example-17-drawstress.exe!bgfx::renderFrame(int _msecs) (...\bgfx\bgfx\src\bgfx.cpp:1489)
example-17-drawstress.exe!entry::Context::run(int _argc, const char * const * _argv) (...\bgfx\bgfx\examples\common\entry\entry_windows.cpp:521)
example-17-drawstress.exe!main(int _argc, const char * const * _argv) (...\bgfx\bgfx\examples\common\entry\entry_windows.cpp:1185)
example-17-drawstress.exe!invoke_main()
example-17-drawstress.exe!__scrt_common_main_seh()
example-17-drawstress.exe!__scrt_common_main()
example-17-drawstress.exe!mainCRTStartup(void * __formal)
from bgfx.
So here is the issue:
uint8_t m_fsScratch[64<<10];
uint8_t m_vsScratch[64<<10];
Take anything that increments by 16 and you get the 3971 limit. BTW, it's also used for...
void setShaderUniform(uint8_t _flags, uint32_t _regIndex, const void* _val, uint32_t _numRegs)
{
	if (_flags & kUniformFragmentBit)
	{
		bx::memCopy(&m_fsScratch[_regIndex], _val, _numRegs*16);
	}
	else
	{
		bx::memCopy(&m_vsScratch[_regIndex], _val, _numRegs*16);
	}
}
Why the limit... no idea. In my case, macOS running on Arm64 doesn't have VRAM since it's all shared memory. I am not sure why this needs to be limited to 64KB for Vulkan.
from bgfx.
In my case the main culprit was the m_scratchBuffer scratch buffer, which is created here.
What you highlighted looks like an issue as well, though, and it's a bit odd that it's hardcoded instead of using the BGFX_CONFIG_MAX_DRAW_CALLS macro. I'm not sure what the relationship between the m_scratchBuffer and m_vs/fsScratch buffers is.
But yeah, like you, I don't know why this limit exists or how it was determined. Especially considering that what goes here depends on the shader size (is it the size in number of uniforms?), since with the original example shader it works fine up to the max draw calls.
from bgfx.
64k / 16 is 4096. If you're running out of fs/vsScratch that means you're setting over 4k uniforms.
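The arithmetic can be spelled out as a small compile-time sketch. The sizes come from the m_fsScratch / m_vsScratch declarations and the _numRegs*16 copy quoted earlier in the thread; the constant names are hypothetical, and the 4x4 matrix line assumes 32-bit float components.

```cpp
#include <cstdint>

// Sizes taken from the scratch array declarations quoted in this thread.
constexpr uint32_t kScratchSize  = 64 << 10; // 65536 bytes per scratch array
constexpr uint32_t kRegisterSize = 16;       // one register: 4 floats * 4 bytes
constexpr uint32_t kRegsPerMat4  = 4;        // a 4x4 float matrix spans 4 registers

// How many registers, and how many 4x4 matrices, fit in one scratch array.
constexpr uint32_t kMaxRegisters = kScratchSize / kRegisterSize;
constexpr uint32_t kMaxMat4      = kMaxRegisters / kRegsPerMat4;

static_assert(kMaxRegisters == 4096, "64k / 16 is 4096 registers");
static_assert(kMaxMat4 == 1024, "4096 registers hold 1024 4x4 matrices");
```

Under these assumptions the scratch arrays overflow only if a single register index reaches 4096, which matches the point being made here.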
from bgfx.
I don't think example-17 is setting any uniforms besides the default ones (you know, view transformations, etc.), so I don't think that's the issue.
from bgfx.