Comments (10)

ahmedshingaly commented on August 24, 2024

Hi.

I am facing a similar issue as well. I am trying to run a pre-trained StyleGAN2 model (https://github.com/NVlabs/stylegan2) in JupyterLab, in a TensorFlow 1.14 GPU environment.

When I run the command given in the link,

python run_generator.py generate-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl --seeds=6600-6625 --truncation-psi=0.5

I get the following error:

tensorflow.python.framework.errors_impl.NotFoundError: /trainman-mount/trainman-storage-d2b580e4-067b-44d3-9be3-be48cc5f0d71/stylegan2/dnnlib/tflib/_cudacache/fused_bias_act_1ac15fee5b354fc0d3aa1e7f98502e64.so: undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs

I have no idea what _ZN10tensorflow12OpDefBuilder6OutputESs means, but it seems similar to the one raised in this thread. I also searched for solutions to this error, but all of them revolve around modifying a Makefile, and no Makefile seems to be involved in my case since I am just running Python code.

Any help will be much appreciated :)

In file stylegan2/dnnlib/tflib/custom_ops.py, line 127,
change from
compile_opts += ' --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=0\''
to
compile_opts += ' --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=1\''
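
Before flipping the flag, it may help to check which ABI value the installed TensorFlow was itself compiled with, since the custom op has to match it. The following is a minimal sketch, assuming tf.sysconfig.get_compile_flags() is available in your TensorFlow version; the helper name abi_flag_from_compile_flags is hypothetical and not part of StyleGAN2 or TensorFlow.

```python
import re


def abi_flag_from_compile_flags(flags):
    """Return the _GLIBCXX_USE_CXX11_ABI value (0 or 1) found in a list of
    compiler flags, or None if the flag is not present."""
    for flag in flags:
        match = re.fullmatch(r"-D_GLIBCXX_USE_CXX11_ABI=([01])", flag)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    try:
        import tensorflow as tf  # only needed for the live check
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # Assumption: get_compile_flags() reports the ABI define for this build.
        flags = tf.sysconfig.get_compile_flags()
        print("TensorFlow was built with ABI =", abi_flag_from_compile_flags(flags))
```

If the reported value is 1, the edit to custom_ops.py shown above should make the compiled op match; if it is 0, the original line is probably already correct and the problem likely lies elsewhere.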

from flownet2-tf.

Vedant2311 commented on August 24, 2024

Hi.

I am facing a similar issue as well. I am trying to run a pre-trained StyleGAN2 model (https://github.com/NVlabs/stylegan2) in JupyterLab, in a TensorFlow 1.14 GPU environment.

When I run the command given in the link,

python run_generator.py generate-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl --seeds=6600-6625 --truncation-psi=0.5

I get the following error:

tensorflow.python.framework.errors_impl.NotFoundError: /trainman-mount/trainman-storage-d2b580e4-067b-44d3-9be3-be48cc5f0d71/stylegan2/dnnlib/tflib/_cudacache/fused_bias_act_1ac15fee5b354fc0d3aa1e7f98502e64.so: undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs

I have no idea what _ZN10tensorflow12OpDefBuilder6OutputESs means, but it seems similar to the one raised in this thread. I also searched for solutions to this error, but all of them revolve around modifying a Makefile, and no Makefile seems to be involved in my case since I am just running Python code.

Any help will be much appreciated :)
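
On what the symbol means: it is a mangled C++ name, and c++filt demangles it to tensorflow::OpDefBuilder::Output(std::string). The trailing "Ss" is the mangling abbreviation for the pre-C++11 libstdc++ std::string, whereas the C++11 ABI places the string type in the std::__cxx11 namespace, which shows up as "7__cxx11" in mangled names. A rough sketch of that distinction follows; the function name guess_libstdcxx_abi is hypothetical, and the check is a string heuristic, not a real demangler.

```python
def guess_libstdcxx_abi(mangled_symbol):
    """Heuristically guess which libstdc++ string ABI a mangled symbol uses.

    Returns 1 for the C++11 ABI, 0 for the old ABI, None if undetermined.
    """
    # The C++11 ABI puts std::string in the inline namespace std::__cxx11,
    # which appears in mangled names as the length-prefixed "7__cxx11".
    if "7__cxx11" in mangled_symbol:
        return 1
    # "Ss" abbreviates the old std::string. Only look at it right after the
    # end of a nested name ("E") or at the very end, to limit false positives.
    if "ESs" in mangled_symbol or mangled_symbol.endswith("Ss"):
        return 0
    return None


print(guess_libstdcxx_abi("_ZN10tensorflow12OpDefBuilder6OutputESs"))
```

A result of 0 for the symbol in the error suggests the .so was compiled with -D_GLIBCXX_USE_CXX11_ABI=0 and is asking for a function signature that the installed TensorFlow library does not export.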

fperezgamonal commented on August 24, 2024

Final update: after fighting with it for quite a few days, and with help from my university's IT staff, I got it solved. A soft link for cuda.h was the solution (and keep the Makefile as shown above, if I am not mistaken).

I will close this issue now. Feel free to reopen it if you encounter a similar problem and I'll try to help you as much as possible.

Cheers.

seni04 commented on August 24, 2024

Hello sir, what do you mean by "A soft link for cuda.h was the solution"?

How do you do it?

fperezgamonal commented on August 24, 2024

Hello @seni04, the technical staff told me they had fixed it by creating a soft link between the actual CUDA version on the PC and the "standard" path where it is normally installed.

I assume they did something like:

ln -s /usr/bin/cuda-10.0 /usr/bin/cuda
But using the actual path where you installed CUDA as the first argument.
I'm sorry I cannot give you more details but I've just checked my IT tickets and found no extra details.
I hope this helps you,
PS: here is the actual (last) Makefile I used in any case (rename it back to Makefile)
Makefile.txt

Cheers,

Ferran.
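
For anyone who prefers to script the soft link rather than type ln -s by hand, here is a minimal Python sketch of the same idea. The function name link_cuda is hypothetical, and the paths you pass in are assumptions about your own machine: the first argument is wherever CUDA is actually installed, the second is the "standard" path the build expects.

```python
import os


def link_cuda(actual_dir, standard_path):
    """Create standard_path as a symlink to actual_dir (like `ln -s`).

    Returns standard_path. Does nothing if the link already points at
    actual_dir, and refuses to overwrite anything else.
    """
    if os.path.islink(standard_path):
        if os.path.realpath(standard_path) == os.path.realpath(actual_dir):
            return standard_path  # already linked correctly
        raise FileExistsError(f"{standard_path} already links elsewhere")
    if os.path.exists(standard_path):
        raise FileExistsError(f"{standard_path} exists and is not a symlink")
    os.symlink(actual_dir, standard_path, target_is_directory=True)
    return standard_path
```

Creating a link under a system directory usually needs root permissions, which is probably why IT staff had to do it in the case described above.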

seni04 commented on August 24, 2024

nvcc -c --expt-relaxed-constexpr -g -std=c++11 -DNDEBUG -I/usr/local/lib/python2.7/dist-packages/tensorflow/include -I"/usr/local/cuda-9.0/include" -DGOOGLE_CUDA=1 -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -D__STRICT_ANSI__ -D_GLIBCXX_USE_CXX11_ABI=0 src/ops/preprocessing/kernels/data_augmentation.cu.cc -x cu -Xcompiler -fPIC -o src/ops/build/data_augmentation.o
In file included from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:21:0,
from src/ops/preprocessing/kernels/data_augmentation.cu.cc:7:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/util/cuda_device_functions.h:32:31: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.
Makefile:68: recipe for target 'preprocessing' failed
make: *** [preprocessing] Error 1

I am still getting this error, even though I am already using the same Makefile as yours.
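
A note on the log above: the TensorFlow header includes "cuda/include/cuda.h", so the compiler needs some include directory that contains cuda/include/cuda.h beneath it. Passing -I/usr/local/cuda-9.0/include only makes plain cuda.h visible, which would explain why a soft link named cuda helps. Below is a small sketch to test which directories would satisfy that include; the function name find_header is hypothetical and the directories you try are assumptions about your system.

```python
import os


def find_header(include_dirs, header="cuda/include/cuda.h"):
    """Return the first directory in include_dirs under which `header`
    exists as a file, or None if no directory provides it."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None
```

For example, find_header(["/usr/local/cuda-9.0/include", "/usr/local"]) returns "/usr/local" only if /usr/local/cuda/include/cuda.h exists, for instance through a symlink from /usr/local/cuda to the versioned install; that directory would then also have to be passed to nvcc with -I.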

fperezgamonal commented on August 24, 2024

Hello again,

I am very sorry to see you are still facing the same issues. I totally understand your frustration, since I was unable to compile the ops on another computer when I tried to run more experiments in parallel (and I had the same configuration and Makefile!).

The only thing I can think of is to search for this error, since it comes up very often, and try some of the proposed solutions to see if they work.
By the way, if you happen to solve this issue and then run into a missing library (libcupti), here is how I solved that: I added the path to the library to the LD_LIBRARY_PATH environment variable, as follows:
export LD_LIBRARY_PATH=/soft/easybuild/debian/8.8/Broadwell/software/CUDA/9.0.176/extras/CUPTI/lib64:$LD_LIBRARY_PATH

If I can find any more information on how to solve your error, I will post it here.
I wish you luck!

PS: I'll leave this open so more people can see this issue and hopefully provide a solution.
Cheers,
Ferran.
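
To confirm that the LD_LIBRARY_PATH change mentioned above actually exposes libcupti, a quick check along these lines may help. It is a sketch only: the function name find_library is hypothetical, and it looks only at LD_LIBRARY_PATH, whereas the dynamic loader also searches its cache and default directories, so an empty result does not prove the library is missing.

```python
import os


def find_library(name_prefix, ld_library_path=None):
    """Return paths of files whose names start with name_prefix in the
    directories listed in LD_LIBRARY_PATH (or in the given path string)."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    matches = []
    for directory in ld_library_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if entry.startswith(name_prefix):
                matches.append(os.path.join(directory, entry))
    return matches


print(find_library("libcupti.so"))
```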

stefanuddenberg commented on August 24, 2024

I am facing the same issue while trying to get this to work on my university's cluster. I was able to get it working fine on my Windows machine, and my group has been able to get it to work on an EC2 instance, so I have no idea what exactly the issue is. From what I can tell, all the correct dependencies are installed... @Vedant2311 did you come up with a solution?

AliRashidnejad commented on August 24, 2024

Thanks ahmedshingaly, this solved a similar issue for me.

justusgraham commented on August 24, 2024

Also solved the issue for me. Would've been impossible to debug; thank you!
