
Comments (2)

ZwFink commented on August 22, 2024

Hello,
Thank you for reaching out. Did you install Charm4Py from pip, or did you build Charm++ manually for use with Charm4Py? Are you distributing the entire 3 GB of data to each pool worker PE? And are you using charm.pool.map or some other functionality of the pool?


Patol75 commented on August 22, 2024

Thanks for having a look @ZwFink.

The error message I provided came from a pip installation of Charm4Py. I have just tried with an MPI build of Charm++ (./build charm4py mpi-linux-x86_64 -j8 --with-production), and it yields a very similar error message:

Running on 17 processors:  /usr/bin/python3.8 script.py HUGE.vtu --charm 
charmrun>  /usr/bin/setarch x86_64 -R  mpirun -np 17  /usr/bin/python3.8 script.py HUGE.vtu --charm 
Charm++> Running on MPI version: 3.1
Charm++> level of thread support used: -1 (desired: 0)
Charm++> Running in non-SMP mode: 17 processes (PEs)
Converse/Charm++ Commit ID: v6.11.0-beta1-29-gd35885331
Isomalloc> Synchronized global address space.
CharmLB> Load balancer assumes all CPUs are same.
Charm4py> Running Charm4py version 1.0 on Python 3.8.0 (CPython). Using 'cython' interface to access Charm++
Charm++> Running on 1 hosts (1 sockets x 10 cores x 2 PUs = 20-way SMP)
Charm++> cpu topology info is gathered in 0.009 seconds.
Initializing charm.pool with 16 worker PEs. Warning: charm.pool is experimental (API and performance is subject to change)
----------------- Python Stack Traceback PE 1 -----------------
  File "charm4py/charmlib/charmlib_cython.pyx", line 863, in charm4py.charmlib.charmlib_cython.recvGroupMsg
  File "/home/thomas/.local/lib/python3.8/site-packages/charm4py/charm.py", line 253, in recvGroupMsg
    header, args = self.unpackMsg(msg, dcopy_start, obj)
  File "charm4py/charmlib/charmlib_cython.pyx", line 739, in charm4py.charmlib.charmlib_cython.CharmLib.unpackMsg
------------- Processor 1 Exiting: Called CmiAbort ------------
Reason: UnpicklingError: pickle data was truncated
[1] Stack Traceback:
  [1:0] libcharm.so 0x7fff576fb6ec CmiAbortHelper(char const*, char const*, char const*, int, int)
  [1:1] libcharm.so 0x7fff576fb801 
  [1:2] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a25dfe 
  [1:3] python3.8 0x5fff6f _PyObject_MakeTpCall
  [1:4] python3.8 0x4ffbbf 
  [1:5] python3.8 0x57dbb0 _PyEval_EvalFrameDefault
  [1:6] python3.8 0x602b2c _PyFunction_Vectorcall
  [1:7] python3.8 0x57904d _PyEval_EvalFrameDefault
  [1:8] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a1ecc0 
  [1:9] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a20464 
  [1:10] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a23847 
  [1:11] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a4d5f1 
  [1:12] libcharm.so 0x7fff576ba77b GroupExt::__entryMethod(void*, void*)
  [1:13] libcharm.so 0x7fff576256d4 CkDeliverMessageFree
  [1:14] libcharm.so 0x7fff5762e6d6 _processHandler(void*, CkCoreState*)
  [1:15] libcharm.so 0x7fff576bf491 CsdScheduleForever
  [1:16] libcharm.so 0x7fff576bf6fd CsdScheduler
  [1:17] libcharm.so 0x7fff576fdfaa ConverseInit
  [1:18] libcharm.so 0x7fff5762b91c StartCharmExt
  [1:19] charmlib_cython.cpython-38-x86_64-linux-gnu.so 0x7fff57a4a7ae 
  [1:20] python3.8 0x5fff6f _PyObject_MakeTpCall
  [1:21] python3.8 0x4ffbbf 
  [1:22] python3.8 0x57dbb0 _PyEval_EvalFrameDefault
  [1:23] python3.8 0x5765ec _PyEval_EvalCodeWithName
  [1:24] python3.8 0x602dd2 _PyFunction_Vectorcall
  [1:25] python3.8 0x57904d _PyEval_EvalFrameDefault
  [1:26] python3.8 0x5765ec _PyEval_EvalCodeWithName
  [1:27] python3.8 0x662c2e 
  [1:28] python3.8 0x662d07 PyRun_FileExFlags
  [1:29] python3.8 0x663a1f PyRun_SimpleFileExFlags
  [1:30] python3.8 0x687dbe Py_RunMain
  [1:31] python3.8 0x688149 Py_BytesMain
  [1:32] libc.so.6 0x7ffff7a03bf7 __libc_start_main
  [1:33] python3.8 0x607daa _start
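
For context on the "pickle data was truncated" reason in the log above: this is the kind of error the standard pickle module raises when the byte stream it is asked to load is shorter than the one that was written. The following is a minimal sketch using only the standard library; it does not involve Charm4Py, and the `payload` dictionary is a made-up stand-in rather than the data from my script.

```python
import pickle

# Made-up stand-in for a large object sent to a worker.
payload = {"values": bytes(100_000)}
data = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# The complete byte stream round-trips without trouble.
assert pickle.loads(data) == payload

# Cutting the stream short, as if part of a message never arrived,
# makes unpickling fail instead of returning a partial object.
truncated = data[: len(data) // 2]
try:
    pickle.loads(truncated)
    outcome = "loaded"
except (pickle.UnpicklingError, EOFError) as exc:
    outcome = type(exc).__name__

print(outcome)
```

This only shows what the exception means on the Python side; it says nothing about where in the Charm++ message path the bytes are lost.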

Regarding data distribution, I am not entirely sure, and it is quite possible that I am not doing this in the ideal way. Data from the VTU file is read in the main function (the one given to charm.start()). It is then used to create 3-D SciPy interpolator objects using, for example, NearestNDInterpolator. These objects are then passed to each function execution through charm.pool.

I am using the multi_future argument of map_async, making sure that at most 60,000 futures are created at a time, and I wrap the function foo in a partial to pass a dictionary holding the variables that each process accesses. These variables are never modified; they are only read or, in the case of the interpolator objects mentioned above, called. The relevant snippet is pasted below.

    if inputArgs.charm:  # Charm4Py
        # Number of batches needed so that no batch exceeds 60,000 futures
        nBatch = np.ceil(np.sum(varDict['indArr'] == 0) / 6e4).astype(int)
        for batch in range(nBatch):
            # Indices of the nodes still flagged as pending (indArr == 0)
            nodes2do = np.asarray(varDict['indArr'] == 0).nonzero()
            futures = charm.pool.map_async(
                partial(foo, dictGlobals=dictGlobals),
                # One task per node, limited to the first 60,000 pending nodes
                list(zip(*[nodes[:60_000] for nodes in nodes2do])),
                multi_future=True)
            # Handle results as they complete
            for future in charm.iwait(futures):
                output = future.get()
                for i, var in enumerate(outVar):
                    varDict[var][output[0]] = output[i + 1]
                # Flag the node as processed
                varDict['indArr'][output[0][::-1]] = 1
                varDict['nodesComplete'] += 1
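
To get a rough idea of how much data travels with the partial, its pickled size can be measured with the standard library alone. This is a sketch under assumptions: `dict_globals` is a made-up stand-in for dictGlobals (the real one holds the interpolators), the built-in `dict` stands in for foo so that the example runs anywhere, and I am only assuming that the pool pickles the partial together with its bound arguments.

```python
import pickle
from functools import partial


def pickled_size(obj):
    """Return the number of bytes pickle needs to serialize obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# Made-up stand-in for dictGlobals: about 1 MB of read-only data.
dict_globals = {"table": bytes(1_000_000)}

# The built-in `dict` stands in for foo; what matters here is the
# keyword argument that the partial keeps a reference to.
task = partial(dict, dictGlobals=dict_globals)

bare = pickled_size(dict)   # the callable alone, pickled by reference
bound = pickled_size(task)  # the callable plus the bound dictionary

print(f"callable alone: {bare} bytes")
print(f"partial with bound data: {bound} bytes")
```

Running the same measurement on the real partial(foo, dictGlobals=dictGlobals) in the main function would show whether the serialized object is anywhere near the size of the VTU data.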
