Comments (6)

Moguri commented on May 15, 2024

Can you please attach a simple example that reproduces and demonstrates the issue?

from panda3d.

drewc5131 commented on May 15, 2024

MultiThreading_and_Async_Texture.zip

Moguri commented on May 15, 2024

Thanks for the example! I did not get a crash on Linux, but I do believe I get a deadlock:

#0  0x00007ffff7bca70c in __lll_lock_wait () from /usr/lib/libpthread.so.0
#1  0x00007ffff7bc3a96 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#2  0x00007ffff4b82de0 in Pipeline::cycle() () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#3  0x00007ffff4b4b81f in GraphicsEngine::render_frame() () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#4  0x00007ffff5eab5f1 in ?? () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/core.cpython-36m-x86_64-linux-gnu.so
#5  0x00007ffff7418bfa in _PyCFunction_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#6  0x00007ffff73c808b in ?? () from /usr/lib/libpython3.6m.so.1.0
#7  0x00007ffff73adb9a in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#8  0x00007ffff73c752b in _PyFunction_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#9  0x00007ffff7406f5f in _PyObject_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#10 0x00007ffff7407cd3 in _PyObject_Call_Prepend () from /usr/lib/libpython3.6m.so.1.0
#11 0x00007ffff7407dbb in PyObject_Call () from /usr/lib/libpython3.6m.so.1.0
#12 0x00007ffff61b9685 in ?? () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/core.cpython-36m-x86_64-linux-gnu.so
#13 0x00007ffff61bdac0 in ?? () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/core.cpython-36m-x86_64-linux-gnu.so
#14 0x00007ffff61bfbb9 in ?? () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/core.cpython-36m-x86_64-linux-gnu.so
#15 0x00007ffff4b86bea in AsyncTask::unlock_and_do_task() () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#16 0x00007ffff4b9184d in AsyncTaskChain::service_one_task(AsyncTaskChain::AsyncTaskChainThread*) () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#17 0x00007ffff4b92588 in AsyncTaskChain::do_poll() () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#18 0x00007ffff4b927d4 in AsyncTaskManager::poll() () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/libpanda.so.1.10
#19 0x00007ffff5edb658 in ?? () from /home/mitchell/panda3d_venv/lib/python3.6/site-packages/panda3d/core.cpython-36m-x86_64-linux-gnu.so
#20 0x00007ffff7418bfa in _PyCFunction_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#21 0x00007ffff73c808b in ?? () from /usr/lib/libpython3.6m.so.1.0
#22 0x00007ffff73adb9a in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#23 0x00007ffff73c7a1b in ?? () from /usr/lib/libpython3.6m.so.1.0
#24 0x00007ffff73c814e in ?? () from /usr/lib/libpython3.6m.so.1.0
#25 0x00007ffff73adb9a in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#26 0x00007ffff73ab682 in ?? () from /usr/lib/libpython3.6m.so.1.0
#27 0x00007ffff73c7c4f in ?? () from /usr/lib/libpython3.6m.so.1.0
#28 0x00007ffff73c814e in ?? () from /usr/lib/libpython3.6m.so.1.0
#29 0x00007ffff73adb9a in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#30 0x00007ffff73c7a1b in ?? () from /usr/lib/libpython3.6m.so.1.0
#31 0x00007ffff73c814e in ?? () from /usr/lib/libpython3.6m.so.1.0
#32 0x00007ffff73adb9a in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#33 0x00007ffff73ac1d8 in PyEval_EvalCodeEx () from /usr/lib/libpython3.6m.so.1.0
#34 0x00007ffff73ad06c in PyEval_EvalCode () from /usr/lib/libpython3.6m.so.1.0
#35 0x00007ffff748e2d4 in ?? () from /usr/lib/libpython3.6m.so.1.0
#36 0x00007ffff7490cc1 in PyRun_FileExFlags () from /usr/lib/libpython3.6m.so.1.0
#37 0x00007ffff7490ec4 in PyRun_SimpleFileExFlags () from /usr/lib/libpython3.6m.so.1.0
#38 0x00007ffff748d160 in Py_Main () from /usr/lib/libpython3.6m.so.1.0
#39 0x0000000000400a5d in main ()

I was able to reproduce the issue with the following, simpler PRC file:

# Window settings:
window-title MultiThreading + Async Texture Loading
win-origin -2 -2

# Models:
preload-textures 0

# The Problematic Configs

# Setting this to 0 works fine
allow-incomplete-render 1

threading-model Cull/Draw

The application does not deadlock when the preload-textures 0 line is commented out.

@rdb is this one of the known threading bugs?

@drewc5131 do you know if there is a version of Panda where this worked? In other words, is this a regression?

rdb commented on May 15, 2024

@Moguri: yes, this looks the same as LP#1202448. The deadlock occurs when the Texture's cycler is destructing while the main thread is attempting to cycle it. That bug is not yet tracked on GitHub, though.

rdb commented on May 15, 2024

Just some information about why the deadlock happens, for the record (and my future self's benefit):

Pipeline keeps a list of objects to cycle between the different pipeline stages. Every frame, it iterates through the list of objects that have modifications in order to copy those to the next pipeline stage, so that changes made in the main thread are propagated to the cull thread, and so on.

When an object destructs, it needs to make sure it is removed from the Pipeline so that it will no longer be subject to this cycle process, so it calls Pipeline::remove_cycler(). This needs to grab the Pipeline lock to remove the object from the list of objects to be cycled. However, if this happens on another thread that is holding a cycler's lock while the main thread is inside Pipeline::cycle() (and therefore holds the Pipeline lock) and is trying to lock that same cycler, each thread ends up waiting for the lock the other one holds, and they deadlock.

Here's a crude schematic of the deadlock:

  Main Thread:                |    Loader thread:
                              |
Pipeline.cycle():             |  for some random cycler:
  Pipeline.lock.acquire() [1] |    cycler.lock.acquire() [2]
  for each cycler:            |    Pipeline.remove_cycler(any random object):
    cycler.lock.acquire() [2] |      Pipeline.lock.acquire() [1]
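
The schematic above is a classic lock-order inversion. A minimal Python sketch of it follows, using plain threading.Lock objects as stand-ins for the Pipeline lock [1] and one cycler's lock [2]. The function and variable names are made up for illustration and are not Panda3D's actual classes. The second acquire in each thread uses a timeout so the script reports the stall instead of hanging:

```python
import threading


def demonstrate_lock_inversion(timeout=0.5):
    """Return which threads failed to get their second lock."""
    pipeline_lock = threading.Lock()  # stand-in for the Pipeline lock [1]
    cycler_lock = threading.Lock()    # stand-in for one cycler's lock [2]
    barrier = threading.Barrier(2)    # keeps the two threads in lock-step
    stuck = {}

    def second_acquire(name, lock):
        # Try to take the second lock; a real deadlock would block forever here.
        got = lock.acquire(timeout=timeout)
        stuck[name] = not got
        if got:
            lock.release()

    def main_thread():
        # Pipeline.cycle(): takes [1] first, then wants [2].
        with pipeline_lock:
            barrier.wait()  # both threads now hold their first lock
            second_acquire("main", cycler_lock)
            barrier.wait()  # do not release [1] until both attempts are done

    def loader_thread():
        # Destructing cycler: holds [2], then remove_cycler() wants [1].
        with cycler_lock:
            barrier.wait()
            second_acquire("loader", pipeline_lock)
            barrier.wait()

    threads = [
        threading.Thread(target=main_thread),
        threading.Thread(target=loader_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stuck


if __name__ == "__main__":
    print(demonstrate_lock_inversion())
```

Both entries in the returned dict come back True: each thread holds the lock the other one needs, which is the situation the backtrace shows at Pipeline::cycle().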

It feels like the solution should be for Pipeline::cycle() to better cooperate with Pipeline::remove_cycler(), but it's not clear to me how. The difficulty is that remove_cycler() must not return until the object is truly removed (since it is about to destruct), but cycle() can't reasonably let go of the lock while it's iterating over the list.
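
For illustration only, one generic way to break this kind of inversion is for the iterating side to use a non-blocking try-lock and defer any cycler it cannot lock. The sketch below assumes hypothetical ToyPipeline and ToyCycler classes. It is not Panda3D's code, and it is not necessarily the approach a real fix would take, since deferring a cycler delays propagation of its changes:

```python
import threading


class ToyCycler:
    """Hypothetical stand-in for a pipeline cycler."""

    def __init__(self):
        self.lock = threading.Lock()
        self.cycled = 0  # counts how many times cycle() processed this object


class ToyPipeline:
    """Hypothetical stand-in for Pipeline; illustrates try-lock back-off only."""

    def __init__(self):
        self.lock = threading.Lock()
        self.cyclers = []

    def add_cycler(self, cycler):
        with self.lock:
            self.cyclers.append(cycler)

    def remove_cycler(self, cycler):
        # Blocks until the cycler is really gone from the list, so the
        # caller can safely destruct it afterwards.
        with self.lock:
            self.cyclers.remove(cycler)

    def cycle(self):
        """Cycle every cycler whose lock is free; return the ones skipped."""
        skipped = []
        with self.lock:
            for cycler in self.cyclers:
                # Never block on a cycler lock while holding the pipeline lock.
                if cycler.lock.acquire(blocking=False):
                    try:
                        cycler.cycled += 1
                    finally:
                        cycler.lock.release()
                else:
                    skipped.append(cycler)
        return skipped
```

Because cycle() never waits on a cycler lock while it holds the pipeline lock, a thread that holds a cycler lock and calls remove_cycler() only has to wait for the current cycle() pass to finish. The cost is that a skipped cycler is cycled late or not at all, which may or may not be acceptable for the real Pipeline.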

rdb commented on May 15, 2024

I spent the day looking into it again and I think I have a solution that should hold up. More info about the fix is in e04ddbe.

The solution is kind of convoluted, so it's possible I missed some corner case. Please test it and let me know if you find any issues. (Compiling with --override DEBUG_THREADS=1 helps by enabling debug checks, at the cost of performance.)
