Comments (15)

Yangqing commented on April 27, 2024

This looks like a CUDA problem related to the install path; I found some related reports:

http://code.opencv.org/issues/2843
https://code.google.com/p/thrust/issues/detail?id=359#c5

Following those solutions might help.

from caffe.

tiger10guy commented on April 27, 2024

I ran into the same problem on my Arch machine. Switching to a different version of gcc helped (I then hit a different issue having to do with MKL). On an Ubuntu box (gcc 4.6.something) I was able to install caffe without issue, so I haven't put much more effort into the Arch box.

The first compiler was gcc 4.8.1; switching to 4.6.4 (gcc46 on AUR) gave the MKL error. After switching, the build got further than the error in this issue, so it seems that it really did get past the CUDA errors.

leezl commented on April 27, 2024

That seems like a good direction to go in.

If CUDA decides what counts as a system file based on something gcc is doing (as in the thrust issue, where gcc treats thrust's location as an include location), then gcc could be my problem. I did check for the environment variables mentioned in the thrust issue, but they don't exist in my environment.

Anyway, thanks, I'll look into gcc some more, maybe try changing versions and let you know if I make progress.

kloudkl commented on April 27, 2024

In fact, the same problem exists on Ubuntu too if the environment variable CPLUS_INCLUDE_PATH is not empty. It might be better to add a script make.sh that includes "export CPLUS_INCLUDE_PATH=" to avoid it for all users.
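
A minimal sketch of such a wrapper, assuming a POSIX shell. The helper name `run_with_clean_includes` is hypothetical and not part of Caffe; it empties CPLUS_INCLUDE_PATH only for the command it runs, so the calling shell keeps its own setting.

```shell
#!/bin/sh
# run_with_clean_includes: hypothetical helper, not part of Caffe.
# Runs the given command in a subshell with CPLUS_INCLUDE_PATH emptied,
# so the change does not leak into the calling shell.
run_with_clean_includes() {
    (
        CPLUS_INCLUDE_PATH=
        export CPLUS_INCLUDE_PATH
        "$@"
    )
}

# Usage (from the Caffe source directory):
#   run_with_clean_includes make all
```

Using a subshell rather than a bare `export` keeps the rest of the session unchanged, which matters if other builds rely on that variable.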

leezl commented on April 27, 2024

Sadly, my env did not contain any of the gcc path variables: CPATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, OBJC_INCLUDE_PATH

And yet it still had problems, which was why the thrust solution didn't help.
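
For anyone repeating this check, here is a small sketch that prints whichever of those four variables are non-empty in the current shell. The function name `list_gcc_include_vars` is made up for illustration; note that it also reports variables that are set but not exported, which gcc itself would not see.

```shell
#!/bin/sh
# list_gcc_include_vars: hypothetical helper that prints each of gcc's
# include-path variables that currently has a non-empty value.
list_gcc_include_vars() {
    for name in CPATH C_INCLUDE_PATH CPLUS_INCLUDE_PATH OBJC_INCLUDE_PATH; do
        # Indirect lookup of the variable named by $name (POSIX-compatible).
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}

# Prints nothing when none of the variables are set.
list_gcc_include_vars
```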

However, installing gcc 4.4 did help. (The 4.6 package is out of date right now.) I got past the CUDA errors, and am now having the MKL problems mentioned by @tiger10guy, or my own totally different MKL problems. I'm not sure whether to count this as a separate issue or as part of the Arch install issues:

/usr/bin/g++-4.4 examples/train_net.o libcaffe.a -o examples/train_net.bin -L/usr/lib -L/usr/local/lib -L/opt/cuda/lib64 -L/opt/cuda/lib -L/opt/intel/mkl/lib -L/opt/intel/mkl/lib/intel64 -lcudart -lcublas -lcurand -lprotobuf -lopencv_core -lopencv_highgui -lglog -lmkl_rt -lmkl_intel_thread -lleveldb -lsnappy -lpthread -lboost_system -lopencv_imgproc -Wall
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `mkl_spblas_lp64_zzeros'
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `mkl_lapack_xzpttrs'
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `mkl_spblas_ccsr1ttlnf__smout_par'
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `mkl_spblas_dcsr0ttluc__mmout_par'
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `mkl_blas_zdotu'
/opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to ...

And it goes on with more errors for several pages. So having a license for intel-parallel-studio-xe, and installing it according to Arch's wiki, does not take care of everything. I may be missing a component or a license for a component. I'm looking through the MKL user guide and the Arch forums for now.

Thank you for your help.

leezl commented on April 27, 2024

Looks like I might have made it; I just trained the MNIST example.

I added mkl_core and iomp5 as dependencies.

So in the Makefile:

LIBRARIES := cudart cublas curand protobuf opencv_core opencv_highgui \
    glog mkl_core mkl_rt mkl_intel_thread leveldb snappy pthread \
    iomp5 boost_system opencv_imgproc

Without this there were many undefined references.
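
As a rough illustration of what this change does: judging by the g++ command earlier in this thread, each name in LIBRARIES ends up as a -l flag on the link line, so the two new entries add -lmkl_core and -liomp5. The sketch below mimics that expansion in plain shell; `libs_to_flags` is a hypothetical helper, not something in the Caffe build.

```shell
#!/bin/sh
# libs_to_flags: hypothetical helper that turns library names into the
# -l flags a linker command would receive, in the same order.
libs_to_flags() {
    flags=
    for lib in "$@"; do
        flags="$flags -l$lib"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

libs_to_flags glog mkl_core mkl_rt mkl_intel_thread iomp5
```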

My Trials and Tribulations

So, Arch Linux: painful but possible. Thank you all for your help.

escorciav commented on April 27, 2024

Hi @leezl, I have the same error using Fedora 20 with gcc 4.8.2. I had to compile cuda-5.5 with the flag "-override", but all the examples of the CUDA toolkit ran perfectly.
a) Do you believe there is no solution with gcc 4.8.2?
b) If there is no solution, could you give me an idea of how complex the MKL issues you mentioned are?
c) I installed all dependencies of caffe with yum. If I change to gcc-4.7.1 or 4.6, must I recompile only nvidia-driver and nvidia-cuda, or do you think I need to compile boost and other dependencies too?

I'm googling the error now, but so far I have found only old posts about it. It seems a really silly message, but I don't know how to get more information out of nvcc.

Error

/home/escorciav/cuda-5.5/bin/nvcc -ccbin=/usr/bin/g++ -Xcompiler -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/include/numpy -I/usr/local/include -I./src -I./include -I/home/escorciav/cuda-5.5/include -I/opt/intel/mkl/include -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=sm_21 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -c src/caffe/layers/dropout_layer.cu -o build/src/caffe/layers/dropout_layer.cuo
src/caffe/layers/dropout_layer.cu(37): error: kernel launches from templates are not allowed in system files

src/caffe/layers/dropout_layer.cu(67): error: kernel launches from templates are not allowed in system files

2 errors detected in the compilation of "/tmp/tmpxft_00002480_00000000-10_dropout_layer.cpp4.ii".
make: *** [build/src/caffe/layers/dropout_layer.cuo] Error 2

System details

$ uname -m && cat /etc/*release
x86_64
Fedora release 20 (Heisenbug)
NAME=Fedora
VERSION="20 (Heisenbug)"
ID=fedora
VERSION_ID=20
PRETTY_NAME="Fedora 20 (Heisenbug)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:20"
HOME_URL="https://fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=20
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=20
Fedora release 20 (Heisenbug)
Fedora release 20 (Heisenbug)
$ gcc --version
gcc (GCC) 4.8.2 20131212 (Red Hat 4.8.2-7)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

shelhamer commented on April 27, 2024

Note MKL is no longer needed if you build the dev branch. CBLAS and ATLAS
can be used instead. Documentation coming soon.

(In reply to Victor Escorcia Castillo's comment of Sunday, March 30, 2014.)

Evan Shelhamer

leezl commented on April 27, 2024

First, CUDA depends on an old DWARF version, or so I've been told, so either we don't use CUDA, or CUDA would have to be rewritten to work with more recent gcc versions. I used gcc 4.6, which you should be able to install alongside your current version (I'm not sure how on Red Hat), if you haven't already.

I don't think you should have to build anything but caffe with the older compiler. If there were no errors when you built nvidia-driver and cuda, then they should be fine, however you built them. If you continue to get errors, try building them with an earlier version too, but that may not be the problem. So, in Makefile.config, change the CXX path to point to your earlier version of g++ (something like: CXX=/usr/bin/g++-4.6).
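
If you are unsure which older g++ is installed, a sketch like the following can find one and print the line to put in Makefile.config. The helper `first_available` and the candidate names are assumptions; distributions name versioned compilers differently, so adjust the list for your system.

```shell
#!/bin/sh
# first_available: hypothetical helper that prints the path of the first
# candidate command found on PATH, or returns 1 if none is installed.
first_available() {
    for candidate in "$@"; do
        if path=$(command -v "$candidate" 2>/dev/null); then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Candidate names below are guesses, not a complete list.
if cxx=$(first_available g++-4.6 g++-4.4 g++46 g++44); then
    echo "CXX=$cxx"
else
    echo "no older g++ found on PATH" >&2
fi
```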

Second, as said above, you may want to do without MKL and look into getting the dev branch working with CBLAS and ATLAS. I know nothing about this, so someone else would have to help, and as he said, docs are coming soon.

If you do want to get the MKL version going, you'll have to find out how MKL works on Red Hat. You may be able to just download it for non-commercial use from their site, license included: intel software
You may or may not find anything useful in the wiki on my caffe fork: Arch installation

Basically, you need to find a version of the Intel software that contains MKL, such as Composer, and get it along with its other dependencies, which may include the Fortran one. Then you need the license, which may come with your install if you download from Intel. Then you need to make sure the path variables are set so your compiler finds MKL. Finally, you may need to add some extra libraries to the link list in your Makefile; I mentioned that part above.

escorciav commented on April 27, 2024

I tried gcc-4.7.3 and it seems to overcome the error, but now I see linker messages. Could you give me an idea of how to solve this, or whether it needs to be solved at all?
I don't want the gcc-4.7.3 libraries to take priority over those of the OS's gcc. By the way, make finished without errors, and I could continue without problems with
$ make pycaffe; make matcaffe
and I ran the MNIST example without problems too :)

Message/Error

/usr/bin/ld: skipping incompatible /usr/local/lib/libstdc++.so when searching for -lstdc++
/usr/bin/ld: skipping incompatible /usr/local/lib/libstdc++.a when searching for -lstdc++
/usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s

leezl commented on April 27, 2024

If it finds a compatible version after rejecting those, then it should be fine; that message means the linker found a 32-bit library when it was looking for a 64-bit one, or vice versa.
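
The 32-bit versus 64-bit explanation can be checked directly. The sketch below reads the first five bytes of a file: an ELF object starts with the magic bytes 7f 'E' 'L' 'F', and the fifth byte is 01 for 32-bit or 02 for 64-bit. The function name `elf_class` is hypothetical. It is meant for shared objects (.so); static archives (.a) and text linker scripts are reported as "not ELF".

```shell
#!/bin/sh
# elf_class: hypothetical helper. Prints "32-bit" or "64-bit" for an ELF
# file, or "not ELF" for anything else.
elf_class() {
    # Dump the first five bytes as hex with spaces and newlines removed.
    header=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$header" in
        7f454c4601) echo "32-bit" ;;
        7f454c4602) echo "64-bit" ;;
        *)          echo "not ELF" ;;
    esac
}

# Usage, with a path taken from the linker message above:
#   elf_class /usr/local/lib/libstdc++.so
```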

escorciav commented on April 27, 2024

Thank you @leezl. I don't want to open another issue because yours ranks well on Google. Below I summarize my attempts and the solution to help other people (I am on Fedora 20 with the default gcc-4.8.2):

  1. Moving the cuda folder to $HOME or any other folder doesn't work.
  2. Installing cuda-6.0 doesn't help.
  3. Clearing the environment variable CPLUS_INCLUDE_PATH doesn't work. (Surprisingly, this is the solution most cited on Google, but it doesn't work in my case. Possibly the CUDA developers solved this issue in cuda-5.5.)
  4. The Udacity solution doesn't work either. https://www.udacity.com/wiki/cs344/troubleshoot_gcc47

SOLUTION: INSTALL ANOTHER VERSION OF GCC
In my case, I installed gcc-4.7.3 because I first downloaded version 4.7.2 (which is tested by cuda-dev) and it had many compile errors. So be patient and try another gcc version if your first one doesn't compile.
My other fear (or laziness) was having to compile other software besides gcc. As @leezl said, it is not necessary. In my case, I only had to compile gcc and everything worked. As for MKL, I used Intel Composer update 2 (student version) and didn't have any trouble with it.

shelhamer commented on April 27, 2024

To follow up on #27 (comment) about non-MKL installation, here's a barebones outline for the adventurous, to tide you over until the docs are finished:

  1. Install ATLAS BLAS
  2. Check out the dev branch of Caffe.
  3. Copy Makefile.config.example to Makefile.config then edit to configure your paths. Note the USE_MKL flag defaults to 0 for no MKL.
  4. Build and follow the rest of the installation instructions as usual.
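
Steps 2 to 4 of the outline might look like the sketch below when run from a Caffe checkout. This is an illustration under assumptions, not official instructions: `prepare_config` is a hypothetical helper, the ATLAS install from step 1 is left out because it is distribution-specific, and the full command sequence is commented out because it switches branches and starts a build.

```shell
#!/bin/sh
# prepare_config: hypothetical helper for step 3. Copies the example
# config into place unless a Makefile.config already exists, so earlier
# edits are never overwritten.
prepare_config() {
    dir=${1:-.}
    if [ ! -f "$dir/Makefile.config" ]; then
        cp "$dir/Makefile.config.example" "$dir/Makefile.config"
    fi
}

# The outline as commands:
#   git checkout dev      # step 2
#   prepare_config .      # step 3, then edit paths in Makefile.config
#   make all              # step 4
```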

Further questions on the non-MKL build will have to wait for the new docs.

magicknight commented on April 27, 2024

I got the same error; I switched to gcc 4.6 and then it worked.

PhoenixDai commented on April 27, 2024

I got this working by adding lib64 before lib
