Comments (10)

manopapad avatar manopapad commented on August 30, 2024

So this looks like two separate problems.

(1) You are trying to reserve 8000 MiB of framebuffer memory, but your device only has ~7850 MiB available. Try reducing to --fbmem 7500.

(2) The cuSolver dependency is missing. This should have automatically been included in the environment that you built using generate-conda-envs.py, but possibly something is wrong with that. Can you provide the output from these commands, to help debug the issue?

objdump -p /home/emeitz/.conda/envs/legate/lib/libcunumeric.so | grep PATH
ldd /home/emeitz/.conda/envs/legate/lib/libcunumeric.so | grep solv
conda list

from cunumeric.

ejmeitz avatar ejmeitz commented on August 30, 2024

(1) Reserving less framebuffer memory fixed the first problem; now it just gives the same error as the legate interactive console.

(2)
objdump -p /home/emeitz/.conda/envs/legate/lib/libcunumeric.so | grep PATH:

  • RPATH /home/emeitz/.conda/envs/legate/lib:$ORIGIN

ldd /home/emeitz/.conda/envs/legate/lib/libcunumeric.so | grep solv

  • Gives nothing, so yeah, cuSolver is likely missing. I have it installed with CUDA outside of the conda env, so maybe something got confused there.

Conda list:
cl.txt

manopapad avatar manopapad commented on August 30, 2024

> ldd [...] gives nothing

That is surprising; I would expect to see something like

libcusolver.so.11 => not found

Can you also try objdump -p libcunumeric.so | grep solv?

For reference, here is what I see on my machine:

(noneditable) iblis:~/noneditable/env> ldd lib/libcunumeric.so  | grep solv
	libcusolver.so.11 => /home/mpapadakis/noneditable/env/lib/libcusolver.so.11 (0x00007f163f000000)
(noneditable) iblis:~/noneditable/env> objdump -p lib/libcunumeric.so  | grep solv
  NEEDED               libcusolver.so.11
  required from libcusolver.so.11:
    0x09a2e521 0x00 11 libcusolver.so.11
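
The ldd check above can also be scripted. The following is a minimal sketch, not part of Legate or cuNumeric: the helper names `parse_ldd` and `find_dependency` are made up for illustration, and it assumes the usual `name => path (address)` and `name => not found` line shapes of ldd output on Linux. It distinguishes the three cases discussed here: the library resolves to a path, it is recorded but not found, or (as in this thread) it is not recorded at all.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path (None if not found)."""
    deps = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip entries such as the dynamic loader line
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            deps[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. " (0x00007f163f000000)"
            deps[name.strip()] = target.split(" (")[0]
    return deps


def find_dependency(lib_path, needle):
    """Run ldd on lib_path and keep only dependencies whose name contains needle."""
    out = subprocess.run(
        ["ldd", lib_path], capture_output=True, text=True, check=True
    ).stdout
    return {name: path for name, path in parse_ldd(out).items() if needle in name}
```

Calling `find_dependency("lib/libcunumeric.so", "solv")` returns an empty dict when no cuSolver dependency was recorded at link time, which corresponds to grep printing nothing.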

Just to confirm, did you build in an environment created using generate-conda-envs.py --ctk 12.0? That script should have included a bunch of packages that I don't see in your conda list; see https://github.com/nv-legate/legate.core/blob/branch-24.01/scripts/generate-conda-envs.py#L80-L83.

ejmeitz avatar ejmeitz commented on August 30, 2024

Well with that second command I get:
objdump: 'libcunumeric.so': No such file
This file does exist at /home/emeitz/software/cunumeric/build/lib, but that directory isn't on my LD_LIBRARY_PATH, and I guess the anaconda env isn't picking it up either, even though cunumeric is in conda list.

If I run objdump without the grep from inside the lib folder I get the file below. With the pipe to grep nothing pops up.
objdump.txt

Yes I used an environment file, here's the actual file and I'm pretty sure the command was:
./scripts/generate-conda-envs.py --python 3.11 --ctk 12.0 --os linux --no-compilers --no-openmpi --ucx
environment-test-linux-py3.11-cuda12.0-ucx.zip

manopapad avatar manopapad commented on August 30, 2024

OK, I suspect what happened is that you have cusolver somewhere on your system, and nvcc was able to find it and link to it at build time, but no link to libcusolver.so was even recorded.

I think the best thing to do is just add cusolver to your conda environment and rebuild. I think a top-of-tree pull of legate.core will give you a scripts/generate-conda-envs.py that works correctly under 12.0, and should include the package libcusolver-dev.

ejmeitz avatar ejmeitz commented on August 30, 2024

Ok I'll nuke everything and try that. Edit: Things compiled, legate-issue runs and so does my program.

While I'm here, is there a way to make a cunumeric program an executable or linked library so I can call cunumeric routines from other code bases? My long-term use case will require many, many calls to the same function (literally just one numpy func), and I'd like to avoid all the JIT or whatever is happening inside cunumeric every time I call the function.

manopapad avatar manopapad commented on August 30, 2024

We haven't looked at ways to "ahead-of-time compile" a cuNumeric program. You could try an accelerated Python interpreter like pypy, but we haven't tried that, so no guarantees that it will work. :-)

Depending on your application, there may be ways to "keep alive" a Python interpreter, to send cuNumeric commands to. The easiest being to just drive the whole application through Python.
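
One way to picture the "keep alive" idea is a long-lived worker that reads requests line by line and applies the same function to each, so interpreter and library startup are paid only once. This is a minimal sketch under assumptions: the JSON-lines protocol and the names `serve` and `dot` are invented for illustration, and `dot` is a pure-Python stand-in where a real worker would presumably call a cuNumeric function.

```python
import io
import json


def serve(fn, requests, responses):
    """Apply fn to each JSON-encoded argument list until the input stream ends."""
    for line in requests:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between requests
        args = json.loads(line)
        responses.write(json.dumps(fn(*args)) + "\n")
        responses.flush()  # let the caller see each result immediately


def dot(a, b):
    # Stand-in for the single array routine the worker would expose
    return sum(x * y for x, y in zip(a, b))


# In-memory demonstration; a real worker would pass sys.stdin and sys.stdout
demo_in = io.StringIO("[[1, 2], [3, 4]]\n")
demo_out = io.StringIO()
serve(dot, demo_in, demo_out)
```

The calling program (in any language) would start the worker once, then write one request per line and read one result per line.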

There is some work going on in the direction of reducing cuNumeric's overheads, e.g. reusing Python data structures between operations instead of allocating them fresh, caching the dependence analysis for pieces of the code that get executed repeatedly, and moving more functionality to C++. However, these are still concerned with speeding up the invocation of cuNumeric as part of a Python program.

ejmeitz avatar ejmeitz commented on August 30, 2024

Ok good to know.

I am using Julia which has the ability to start a Python interpreter alongside the Julia instance and pass Julia objects to that Python instance. I am unsure how that would play with legate though. Is the legate command just a custom python interpreter? If so it might just work.

This part is mostly me dreaming, but it would be cool to have bindings to cunumeric functions in Julia. I have literally no clue how legion/cunumeric works under the hood, but it's extremely easy to wrap c/c++/fortran/python libraries in Julia.

manopapad avatar manopapad commented on August 30, 2024

> Is the legate command just a custom python interpreter?

Yes, you can run cuNumeric programs using a vanilla python interpreter, and (fundamentally) any compatible interpreter. Most options that you pass through the legate wrapper (e.g. --gpus and --fbmem) need to be passed through the LEGATE_CONFIG environment variable (e.g. LEGATE_CONFIG='--gpus 2 --fbmem 4000' python prog.py).

Do note that distributed launching becomes harder in that case (simply passing --nodes 2 and --launcher mpirun to LEGATE_CONFIG won't work), and we will likely need more details on your workflow to assist with that.
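
For a single-node run, the LEGATE_CONFIG approach can be driven from another Python process. This is a minimal sketch: the helper names `legate_env` and `run_with_plain_python` are hypothetical, and only the `--gpus` and `--fbmem` options and the LEGATE_CONFIG variable come from the comment above. It builds the environment for a child process rather than modifying the current one.

```python
import os
import subprocess
import sys


def legate_env(gpus, fbmem_mib, base_env=None):
    """Return a copy of the environment with LEGATE_CONFIG set for a single-node run."""
    env = dict(os.environ if base_env is None else base_env)
    env["LEGATE_CONFIG"] = f"--gpus {gpus} --fbmem {fbmem_mib}"
    return env


def run_with_plain_python(script, gpus=1, fbmem_mib=4000):
    """Launch script with the current (vanilla) interpreter instead of the legate wrapper."""
    return subprocess.run(
        [sys.executable, script],
        env=legate_env(gpus, fbmem_mib),
        check=True,
    )
```

For example, `run_with_plain_python("prog.py", gpus=2, fbmem_mib=4000)` is meant to mirror `LEGATE_CONFIG='--gpus 2 --fbmem 4000' python prog.py`.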

> it would be cool to have bindings to cunumeric functions in Julia

Non-Python bindings have been discussed, but are not near the top of our priority list at the moment.

lightsighter avatar lightsighter commented on August 30, 2024

> This part is mostly me dreaming, but it would be cool to have bindings to cunumeric functions in Julia.

I too think doing a Legate Julia would be interesting and it's something we've discussed in the past, but we have limited resources at the moment. I know that folks at LANL have considered this as well. @pmccormick for visibility.
