
caffe2_cpp_tutorial's People

Contributors

beardo01, breadbread1984, leovandriel, mambawong, shunchengwu, teyenliu

caffe2_cpp_tutorial's Issues

distributed training example for c++

Hi,
There is a distributed training example for Python (resnet50_trainer.py), but no C++ version.
Is there a similar example in C++, or could someone give a quick hint on how to set up distributed training in C++?

Thanks!

Fast Retrain Error

Hi, I tried to retrain GoogleNet and tested it with the default images in res/images. When I execute "./bin/train --model googlenet --folder res/images --layer pool5/7x7_s1" I get the following error:

CNN Training Example

E0125 17:12:00.842572 18837 common_gpu.cc:70] Found an unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. I will set the available devices to be zero.
optimizer: adam
device: cudnn
using cuda: true
dump-model: false
model: googlenet
layer: pool5/7x7_s1
image-dir: res/images
db-type: leveldb
size: 224
iters: 1000
test-runs: 50
batch: 64
lr: 0.0001
display: false
reshape: false
matrix: false

2 labels found:
0: cat #2
1: dog #2
4 files found
split model.. (at pool5/7x7_s1)
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at common_gpu.cc:132] error == cudaSuccess. 30 vs 0. Error at: /home/daniel/caffe2/caffe2/core/common_gpu.cc:132: unknown error
*** Aborted at 1516896720 (unix time) try "date -d @1516896720" if you are using GNU date ***
PC: @ 0x7ff6526ad428 gsignal
*** SIGABRT (@0x3e800004995) received by PID 18837 (TID 0x7ff6626eafc0) from PID 18837; stack trace: ***
@ 0x7ff65bef5390 (unknown)
@ 0x7ff6526ad428 gsignal
@ 0x7ff6526af02a abort
@ 0x7ff652ff084d __gnu_cxx::__verbose_terminate_handler()
@ 0x7ff652fee6b6 (unknown)
@ 0x7ff652fee701 std::terminate()
@ 0x7ff652fee969 __cxa_rethrow
@ 0x7ff661c0a835 caffe2::CreateOperator()
@ 0x7ff661c5f080 caffe2::SimpleNet::SimpleNet()
@ 0x7ff661c434d6 caffe2::CreateNet()
@ 0x7ff661c43c9d caffe2::CreateNet()
@ 0x7ff661be11e2 caffe2::Workspace::RunNetOnce()
@ 0x5550bf caffe2::preprocess()
@ 0x557d8f caffe2::run()
@ 0x559f66 main
@ 0x7ff652698830 __libc_start_main
@ 0x551229 _start
@ 0x0 (unknown)
Aborted (core dumped)

Could someone help me with this error please?

CMake Error: The following variables are used in this project

Hi,
I get an error when running make in the project folder.

-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so  
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
NCCL_LIB
    linked by target "intro" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "imagenet" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "train" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "dream" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "mnist" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "toy" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "pretrained" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "diff" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "inspect" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "retrain" in directory /home/innnk/caffe2_cpp_tutorial
    linked by target "rnn" in directory /home/innnk/caffe2_cpp_tutorial

-- Configuring incomplete, errors occurred!
See also "/home/innnk/caffe2_cpp_tutorial/build/CMakeFiles/CMakeOutput.log".
See also "/home/innnk/caffe2_cpp_tutorial/build/CMakeFiles/CMakeError.log".
make[1]: Entering directory '/home/innnk/caffe2_cpp_tutorial/build'
make[1]: *** No targets specified and no makefile found.  Stop.
make[1]: Leaving directory '/home/innnk/caffe2_cpp_tutorial/build'
Makefile:4: recipe for target 'all' failed
make: *** [all] Error 2

CMakeError.log contains these error messages:

  1. In function 'main': CheckSymbolExists.c:(.text+0x16): undefined reference to 'pthread_create'
  2. CMakeFiles/cmTC_735cf.dir/build.make:97: recipe for target 'cmTC_735cf' failed
  3. Makefile:126: recipe for target 'cmTC_735cf/fast' failed
  4. /usr/bin/ld: can't find -lpthreads

I've already installed Caffe2 with CUDA 8.0 and it is working well.
Any suggestions would be appreciated. Thanks a lot!

Cmake compile error

I can't get the code to compile. I get the following error:

david@david-W54-55SU1-SUW:~/Documents/Poker/caffe2_cpp_tutorial$ make
make[1]: Entering directory '/home/david/Documents/Poker/caffe2_cpp_tutorial/build'
CMake Error at CMakeLists.txt:5 (find_package):
By not providing "Findcaffe2.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "caffe2", but
CMake did not find one.

Could not find a package configuration file provided by "caffe2" with any
of the following names:

caffe2Config.cmake
caffe2-config.cmake

Add the installation prefix of "caffe2" to CMAKE_PREFIX_PATH or set
"caffe2_DIR" to a directory containing one of the above files. If "caffe2"
provides a separate development package or SDK, be sure it has been
installed.

I do have the file "caffe2Config.cmake", located in "/home/david/caffe2/build". Any ideas how I can get this working? I'm new to CMake.

Example for loading data without using third party database?

I have been playing with your (awesome) run_mnist example. I want to do something similar with a set of images/labels I already have. However, I am having trouble generating a LevelDB database that works. If I use the existing LevelDB databases from https://download.caffe2.ai/datasets/mnist/mnist.zip, it works fine, but if I use make_mnist_db.exe from the Caffe2 project, I get this exception in run_mnist:

Run MNIST Exception: [enforce fail at conv_op_impl.h:46] C == filter.dim32(1) * group_. Convolution op: input channels does not match: # of input channels 28 is not equal to kernel channels * group:1*1 Error from operator:
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 1 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 5 } arg { name: "order" s: "NCHW" }

Is there an example somewhere of how to load this data directly without having to use AddTensorProtosDbInputOp?
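
The enforce message reports 28 input channels for a Conv with order "NCHW". That would be consistent with the database storing each image in a shape that the operator reads channels-first, for example N x 28 x 28 or N x 28 x 28 x 1, but this is only a guess from the message. Under that assumption, a minimal sketch of reordering one image buffer from height-width-channel to channel-height-width could look like the following. It is plain C++ with no Caffe2 calls, and `hwc_to_chw` is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Caffe2: reorder one image from
// height-width-channel (HWC) layout to channel-height-width (CHW) layout.
std::vector<float> hwc_to_chw(const std::vector<float>& src,
                              std::size_t height, std::size_t width,
                              std::size_t channels) {
  assert(src.size() == height * width * channels);
  std::vector<float> dst(src.size());
  for (std::size_t h = 0; h < height; ++h) {
    for (std::size_t w = 0; w < width; ++w) {
      for (std::size_t c = 0; c < channels; ++c) {
        // Source index walks channels fastest; destination walks width fastest.
        dst[(c * height + h) * width + w] = src[(h * width + w) * channels + c];
      }
    }
  }
  return dst;
}
```

For a single-channel 28 x 28 MNIST image the reorder is a no-op on the values, so in that case the fix would be in the declared shape (N x 1 x 28 x 28) rather than in the data itself.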

make issue

Hello, I get the following error when running make:

Scanning dependencies of target train
[ 55%] Building CXX object CMakeFiles/train.dir/src/caffe2/binaries/train.cc.o
[ 57%] Linking CXX executable bin/train
/usr/bin/ld: warning: libicui18n.so.56, needed by //home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicuuc.so.56, needed by //home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicudata.so.56, needed by //home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5, not found (try using -rpath or -rpath-link)
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `u_strToLower_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_getStandardName_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_getAlias_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `uenum_next_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `u_strToUpper_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_setSubstChars_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_getTimeZoneDisplayName_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_fromUnicode_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `u_errorName_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `uenum_close_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_getDSTSavings_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_openTimeZoneIDEnumeration_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_setMillis_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucol_close_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucol_getSortKey_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_get_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucol_open_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_compareNames_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_clone_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_open_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucol_setAttribute_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_openCountryTimeZones_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_open_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_openTimeZones_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_countAliases_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_inDaylightTime_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_close_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_getAvailableName_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_getDefaultName_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucal_getDefaultTimeZone_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_toUnicode_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucol_strcoll_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_close_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_getMaxCharSize_56'
//home/installzoo/Programs/Qt5.7.0/5.7/gcc_64/lib/libQt5Core.so.5: undefined reference to `ucnv_countAvailable_56'
collect2: error: ld returned 1 exit status
CMakeFiles/train.dir/build.make:134: recipe for target 'bin/train' failed
make[2]: *** [bin/train] Error 1
CMakeFiles/Makefile2:68: recipe for target 'CMakeFiles/train.dir/all' failed
make[1]: *** [CMakeFiles/train.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Saved model from "mnist" example cannot be read by "pretrained" example

Hi @leonardvandriel,
I managed to load the saved model built by the MNIST example. However, it did not work at first because an output was missing. Inserting op->add_output(param); at line 317 solved that, but then another issue showed up:
op Conv: Source for input data is unknown for net mnist_model_predict, operator input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 1 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 5 }
I compared the pb file from the model zoo with the one created by the mnist example. It seems the data param is missing from the one created by the mnist example.

Maybe a tutorial on how to use dumped proto to train models from c++?

First, thank you for the awesome work!

I wonder whether it would be possible to ask (beg) for a tutorial on how to train models from the dumped proto texts (from Python)? According to the intro, a main feature is that a dumped configuration can be read into C++, but I haven't found any example.

Thank you soooooo much in advance.

the example /bin/pretrained segfaults

Hi,

This is a fresh build using cmake . -DCMAKE_BUILD_TYPE=Debug, and I keep getting the following:

./bin/pretrained 

## Caffe2 Loading Pre-Trained Models Tutorial ##
https://caffe2.ai/docs/zoo.html
https://caffe2.ai/docs/tutorial-loading-pre-trained-models.html

init_net: res/squeezenet_init_net.pb
predict_net: res/squeezenet_predict_net.pb
image_file: res/image_file.jpg
size_to_fit: 227

image size: [1280 x 751]
X server found. dri2 connection failed! 
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed! 
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
(If you have multiple ICDs installed and OpenCL works, you can ignore this message)
scaled size: [386 x 227]
cropped size: [227 x 227]
value range: (-128, 127)

output: 
  100% 'shower cap' (793)
*** Aborted at 1501089994 (unix time) try "date -d @1501089994" if you are using GNU date ***
PC: @           0x4b7c85 std::_Rb_tree<>::_S_right()
*** SIGSEGV (@0xffffffff00000018) received by PID 26001 (TID 0x7eff0a2a1e80) from PID 24; stack trace: ***
    @     0x7eff06fd9390 (unknown)
    @           0x4b7c85 std::_Rb_tree<>::_S_right()
    @           0x4b5c41 std::_Rb_tree<>::_M_erase()
    @           0x4b5c53 std::_Rb_tree<>::_M_erase()
    @           0x4b5c53 std::_Rb_tree<>::_M_erase()
    @           0x4b3b4c std::_Rb_tree<>::~_Rb_tree()
    @           0x4b25b2 std::set<>::~set()
    @     0x7efefcece36a __cxa_finalize
    @     0x7eff09968b83 (unknown)
Segmentation fault (core dumped)

I ran it under gdb, and the crash seems to come from:

Thread 1 "pretrained" received signal SIGSEGV, Segmentation fault.
0x00000000004b7c85 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_S_right (__x=0xffffffff00000000) at /usr/include/c++/5/bits/stl_tree.h:687
687           { return static_cast<_Link_type>(__x->_M_right); }
(gdb) bt
#0  0x00000000004b7c85 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_S_right (__x=0xffffffff00000000) at /usr/include/c++/5/bits/stl_tree.h:687
#1  0x00000000004b5c41 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_erase (this=0x7ffff7dcdb00 <caffe2::gRegisteredTypeNames[abi:cxx11]()::g_registered_type_names>, __x=0xffffffff00000000)
    at /usr/include/c++/5/bits/stl_tree.h:1612
#2  0x00000000004b5c53 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_erase (this=0x7ffff7dcdb00 <caffe2::gRegisteredTypeNames[abi:cxx11]()::g_registered_type_names>, __x=0x9b0550)
    at /usr/include/c++/5/bits/stl_tree.h:1612
#3  0x00000000004b5c53 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_erase (this=0x7ffff7dcdb00 <caffe2::gRegisteredTypeNames[abi:cxx11]()::g_registered_type_names>, __x=0x9acac0)
    at /usr/include/c++/5/bits/stl_tree.h:1612
#4  0x00000000004b3b4c in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::_Identity<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::~_Rb_tree (this=0x7ffff7dcdb00 <caffe2::gRegisteredTypeNames[abi:cxx11]()::g_registered_type_names>, __in_chrg=<optimised out>)
    at /usr/include/c++/5/bits/stl_tree.h:858
#5  0x00000000004b25b2 in std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::~set (
    this=0x7ffff7dcdb00 <caffe2::gRegisteredTypeNames[abi:cxx11]()::g_registered_type_names>, __in_chrg=<optimised out>)
    at /usr/include/c++/5/bits/stl_set.h:90
#6  0x00007fffeab3e36a in __cxa_finalize (d=0x7ffff7dcc780) at cxa_finalize.c:56
#7  0x00007ffff75d8b83 in __do_global_dtors_aux () from /usr/local/lib/libCaffe2_CPU.so
#8  0x00007fffffffd5e0 in ?? ()
#9  0x00007ffff7de7de7 in _dl_fini () at dl-fini.c:235

So it looks like it is coming from libCaffe2_CPU directly. The std::_Rb_tree here is the red-black tree behind a std::set of strings: frame #5 shows the destructor of the std::set caffe2::gRegisteredTypeNames.
Also, why does it fall back to the CPU instead of the GPU? CUDA 8.0 is installed and libCaffe2_GPU is present.

Batch normalization

I'm reading the example provided here, trying to adapt what I read to another net:
https://github.com/leonardvandriel/caffe2_cpp_tutorial/blob/master/include/caffe2/zoo/resnet.h

I just wanted to know: when doing a batch normalization, I see that you add some inputs:
predict.AddInput(p + "_s");
predict.AddInput(p + "_b");
predict.AddInput(p + "_rm");
predict.AddInput(p + "_riv");

I guess those are the scale, bias, mean and variance values of the BN layer.
My question is: are those placeholders linked/computed by Caffe2, or do I need to link or compute them myself? I'm not sure why we need to add inputs.
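
For context, in Caffe2's Python brew helpers the suffixes "_s", "_b", "_rm" and "_riv" name the scale, bias, running mean and running inverse variance blobs. Declaring them with AddInput presumably marks them as external inputs of the predict net, meaning their values are expected to already exist in the workspace (for example filled by the init net or by loaded weights) rather than being produced by an operator in this net. A minimal sketch of what those four per-channel values are used for at inference time is shown below. It is plain C++, `batch_norm_inference` is a hypothetical helper, and treating the fourth blob as a plain variance is an assumption of this sketch:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not a Caffe2 function: inference-time batch
// normalization for the values of one channel.
// scale, bias, mean and variance play the roles assumed for the
// "_s", "_b", "_rm" and "_riv" blobs.
std::vector<float> batch_norm_inference(const std::vector<float>& x,
                                        float scale, float bias,
                                        float mean, float variance,
                                        float epsilon = 1e-5f) {
  assert(variance + epsilon > 0.0f);
  const float inv_std = 1.0f / std::sqrt(variance + epsilon);
  std::vector<float> y;
  y.reserve(x.size());
  for (float v : x) {
    // Normalize with the stored statistics, then apply the learned affine map.
    y.push_back(scale * (v - mean) * inv_std + bias);
  }
  return y;
}
```

None of the four values is computed from the current input at inference time, which is why the net needs them supplied from outside.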

CMake error Eigen3

By following your instructions, I get the following error:

CMake Error at CMakeLists.txt:6 (find_package):
  By not providing "FindEigen3.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Eigen3", but
  CMake did not find one.

  Could not find a package configuration file provided by "Eigen3" with any
  of the following names:

    Eigen3Config.cmake
    eigen3-config.cmake

  Add the installation prefix of "Eigen3" to CMAKE_PREFIX_PATH or set
  "Eigen3_DIR" to a directory containing one of the above files.  If "Eigen3"
  provides a separate development package or SDK, be sure it has been
  installed.

I'm running Ubuntu 16.04 LTS.

How to add gradient ops for multiple losses?

@leonardvandriel Thanks for this great tutorial, I learned a lot from it. However, I ran into a problem that I don't know how to solve. I want to add multiple loss operators to the mnist.cc example: SoftmaxWithLoss and CenterLoss (a new operator I wrote myself, a straightforward port of caffe-face's center loss). I add the gradient operators with the following code:

predict.AddCenterLossOp("fc3", "label", "loss_reg", 10, 0, 0.2);
predict.AddLabelCrossEntropyOp("softmax", "label", "xent");
predict.AddAveragedLossOp("xent", "loss");
AddAccuracy(init, predict);
predict.AddConstantFillWithOp(1.f, "loss", "loss_grad");
predict.AddConstantFillWithOp(1.f, "loss_reg", "loss_reg_grad");
predict.AddGradientOps();

Then I print both losses, and they don't look right. I think the backward gradients are wrong; they should come from two parts, the center loss and the softmax loss.

So how can I add gradient ops for a multi-loss network? Could you give me some examples? I'd really like to solve this. Many thanks!
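
As a reference point for checking the printed values: when two losses share an input, the gradient with respect to that input is the sum of the per-loss gradients, each scaled by its loss weight. The sketch below illustrates only that rule with two toy losses in plain C++. The loss formulas and function names are made up for illustration and say nothing about how AddGradientOps accumulates gradients internally:

```cpp
#include <cassert>

// Two toy losses that share the same input x. They stand in for the
// softmax loss and the center loss; the formulas are illustrative only.
double loss_a(double x) { return (x - 1.0) * (x - 1.0); }
double loss_b(double x) { return 0.5 * (x + 2.0) * (x + 2.0); }

// Analytic gradients of each loss with respect to x.
double grad_a(double x) { return 2.0 * (x - 1.0); }
double grad_b(double x) { return x + 2.0; }

// Gradient of total = loss_a + weight * loss_b with respect to x:
// the per-loss gradients are summed, with the second one scaled by weight.
double total_grad(double x, double weight) {
  return grad_a(x) + weight * grad_b(x);
}
```

If the gradient reaching the shared blob matches only one of the two terms, the other loss is probably not connected to the backward pass.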

Can't build with anaconda installed caffe2

-- The C compiler identification is AppleClang 9.0.0.9000039
-- The CXX compiler identification is AppleClang 9.0.0.9000039
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found Protobuf: /usr/local/anaconda3/lib/libprotobuf.dylib (found version "3.4.0")
-- Found OpenCV: /usr/local/anaconda3 (found version "3.3.1")
-- Found CURL: /usr/local/anaconda3/lib/libcurl.dylib (found version "7.58.0")
../caffe2: warning: directory does not exist.
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/chikinip/Projects/caffe2_cpp_tutorial/build
downloading test image (2)
######################################################################################################################################################### 100.0%
######################################################################################################################################################### 100.0%
downloading Squeezenet model (2)
######################################################################################################################################################### 100.0%
######################################################################################################################################################### 100.0%
downloading MNIST train data (2)
######################################################################################################################################################### 100.0%
######################################################################################################################################################### 100.0%
E0316 17:00:10.233049 2354848576 init_intrinsics_check.cc:59] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0316 17:00:10.234446 2354848576 init_intrinsics_check.cc:59] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0316 17:00:10.234477 2354848576 init_intrinsics_check.cc:59] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
downloading MNIST test data (2)
######################################################################################################################################################### 100.0%
######################################################################################################################################################### 100.0%
E0316 17:00:24.738350 2354848576 init_intrinsics_check.cc:59] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0316 17:00:24.740079 2354848576 init_intrinsics_check.cc:59] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0316 17:00:24.740113 2354848576 init_intrinsics_check.cc:59] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
downloading RNN and LSTM test data (1)
#=O=# # #
downloading CNN image test data (3)
-=O=# # # #
-=#=- # # #
-=#=- # # #
Scanning dependencies of target caffe2_cpp
[ 1%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/blob.cc.o
[ 3%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/external.pb.cc.o
[ 5%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/model.cc.o
[ 7%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/model.pb.cc.o
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/model.pb.cc:144:3: error: no member named 'protobuf_external_2eproto' in namespace 'caffe2'; did
you mean simply 'protobuf_external_2eproto'?
::caffe2::protobuf_external_2eproto::InitDefaults();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
protobuf_external_2eproto
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/external.pb.h:8:11: note: 'protobuf_external_2eproto' declared here
namespace protobuf_external_2eproto {
^
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/model.pb.cc:144:13: error: no member named 'InitDefaults' in namespace 'protobuf_external_2eproto';
did you mean simply 'InitDefaults'?
::caffe2::protobuf_external_2eproto::InitDefaults();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
InitDefaults
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/model.pb.h:63:6: note: 'InitDefaults' declared here
void InitDefaults();
^
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/model.pb.cc:186:3: error: no member named 'protobuf_external_2eproto' in namespace 'caffe2'; did
you mean simply 'protobuf_external_2eproto'?
::caffe2::protobuf_external_2eproto::AddDescriptors();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
protobuf_external_2eproto
/Users/chikinip/Projects/caffe2_cpp_tutorial/src/caffe2/util/external.pb.h:8:11: note: 'protobuf_external_2eproto' declared here
namespace protobuf_external_2eproto {
^
3 errors generated.
make[3]: *** [CMakeFiles/caffe2_cpp.dir/src/caffe2/util/model.pb.cc.o] Error 1
make[2]: *** [CMakeFiles/caffe2_cpp.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [all] Error 2

Make error

I got the following error when running the makefile.

[ 1%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/net_serial.cc.o
In file included from /home/luyang/Documents/caffe2_cpp_tutorial-master/src/caffe2/util/net_serial.cc:1:0:
/home/luyang/Documents/caffe2_cpp_tutorial-master/include/caffe2/util/net.h:4:34: fatal error: caffe2/core/operator.h: No such file or directory
compilation terminated.
CMakeFiles/caffe2_cpp.dir/build.make:62: recipe for target 'CMakeFiles/caffe2_cpp.dir/src/caffe2/util/net_serial.cc.o' failed
make[2]: *** [CMakeFiles/caffe2_cpp.dir/src/caffe2/util/net_serial.cc.o] Error 1
CMakeFiles/Makefile2:332: recipe for target 'CMakeFiles/caffe2_cpp.dir/all' failed
make[1]: *** [CMakeFiles/caffe2_cpp.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

It seems the build cannot resolve the #include directives, such as #include <caffe2/core/operator.h>.
Do I need to change the makefile so that it can find these headers?

To clarify, I built Caffe2 using Anaconda for Python 2.7. Thanks.

Using your own data on mnist - "unknown backprop operator type"

Hi, thanks for writing these tutorials. C++ help is very sparse for Caffe2.

I'm looking to move on to using my own data for the mnist example, but because AddInput uses a DB, it obscures how exactly the "data" and "label" elements are created (where "data" and "label" are my input and output training files).

I have the data blobs correct, I think.
auto tensor = workspace.CreateBlob("data")->GetMutable();
auto tensor = workspace.CreateBlob("label")->GetMutable();
but these relate to the workspace.

If I define "data" as an input to the network

model.predict.AddInput("data");
model.AddFcOps("data", "fc1", 156, 100, test);
model.predict.AddDataOp("data", "fc1");
model.AddFcOps("fc1", "relu", 100, 50, test);
model.predict.AddReluOp("relu", "relu");
model.AddFcOps("fc1", "relu", 50, 100, test);
model.predict.AddReluOp("relu", "relu");
model.AddFcOps("fc1", "relu", 100, 50, test);
model.predict.AddReluOp("relu", "relu");
model.AddFcOps("relu", "pred", 50, 8, test);
model.predict.AddSoftmaxOp("pred", "softmax");

I get a runtime error:
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at net_gradient.cc:211] . unknown backprop operator type: data

Any help would be greatly appreciated :-)

Conv operator

I have two questions:
When I read this:
// >>> conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
model.AddConvOps("data", "conv1", 1, 20, 1, 0, 5, test);

What is dim_out? How do I compute it?

Is this operation a 1d convolution? Is there a 2D convolution example?
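For readers with the same question: in brew.conv, dim_in and dim_out are the input and output channel counts, so dim_out is a design choice (the number of filters) rather than something computed. With a 4D NCHW input such as MNIST's (N, 1, 28, 28), the convolution is 2D. What is computed is the spatial size of the output. The sketch below uses the standard output-size formula; it assumes a square kernel and symmetric padding, and it assumes the trailing arguments in the AddConvOps call above are stride 1, padding 0, and kernel 5 (the issue does not confirm that order). The names ConvShape and conv2d_output_shape are illustrative and not part of the tutorial.

```cpp
#include <cassert>
#include <cstddef>

// Shape of one sample of a 2D convolution output in NCHW order (without N).
struct ConvShape {
  std::size_t channels;
  std::size_t height;
  std::size_t width;
};

// Standard output-size arithmetic for a square kernel with symmetric padding:
//   out = (in + 2 * pad - kernel) / stride + 1
// dim_out is passed through unchanged: it is chosen, not derived.
ConvShape conv2d_output_shape(std::size_t in_height, std::size_t in_width,
                              std::size_t dim_out, std::size_t kernel,
                              std::size_t stride, std::size_t pad) {
  assert(stride > 0);
  assert(in_height + 2 * pad >= kernel && in_width + 2 * pad >= kernel);
  return ConvShape{
      dim_out,
      (in_height + 2 * pad - kernel) / stride + 1,
      (in_width + 2 * pad - kernel) / stride + 1,
  };
}
```

Under those assumptions, a 28x28 MNIST image with dim_out=20 and kernel=5 gives 20 feature maps of 24x24 each.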

Segmentation fault in malloc_consolidate() when using Intel MKL

Hi,

I encountered a segmentation fault in malloc_consolidate() when running the MNIST training with Intel MKL. I tried it with OpenBLAS and it worked fine. However, I would like to use Intel MKL.

The segmentation fault happened after training had finished, while the program was exiting.

Here is the backtrace from GDB.

Program received signal SIGSEGV, Segmentation fault.
0x00002aaab3be55d3 in malloc_consolidate () from /usr/lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libgcc-4.8.5-16.el7_4.1.x86_64 libselinux-2.5-11.el7.x86_64 libstdc++-4.8.5-16.el7_4.1.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00002aaab3be55d3 in malloc_consolidate () from /usr/lib64/libc.so.6
#1  0x00002aaab3be64fe in _int_free () from /usr/lib64/libc.so.6
#2  0x00002aaaba1cc22d in mkl_serv_free_buffers ()
   from /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/intel-17.0.4/intel-mkl-2017.3.196-v7uuj6zmthzln35n2hb7i5u5ybncv5ev/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_core.so
#3  0x00002aaaba1dcb26 in mkl_core_fini ()
   from /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/intel-17.0.4/intel-mkl-2017.3.196-v7uuj6zmthzln35n2hb7i5u5ybncv5ev/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_core.so
#4  0x00002aaaaaabab3a in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#5  0x00002aaab3ba2a69 in __run_exit_handlers () from /usr/lib64/libc.so.6
#6  0x00002aaab3ba2ab5 in exit () from /usr/lib64/libc.so.6
#7  0x00002aaab3b8bc0c in __libc_start_main () from /usr/lib64/libc.so.6
#8  0x000000000060e5a3 in _start ()

Please let me know if you have any thoughts on how to solve this problem.

Thanks!

Lack of Windows tutorial

I believe this tutorial is not completely OS-agnostic, as it subtly assumes a Unix-like OS. Windows has no convenient build tools like make by default, so a separate CMake or VS tutorial for Windows is required.

about usage of ModelUtil

I read mnist.cc in your tutorial and found that operators are added in different ways in AddTrainingOperators. I know that an operator with initializable parameters is actually two operators, one in initNet and one in predictNet. But for operators that have no initializable parameters, can I add them through the model.Add....() functions rather than the model.predict.Add...() functions?

Problems with "application" of the code

Dear Leonard

First of all thank you for creating & sharing this tutorial and please accept my apologies for contacting you via "issues" (as this is not really issue with your tutorial, but with an attempt to apply it - and code from here - on my "hello mnist"). I'm not really interested in creating nets in C++ but in loading and using.

As I've said, I've tried to come up with a simple C++ app (here) that loads a net and uses it for predictions, but it fails to replicate the results that I'm getting on the Python side.

If I run mnist.py I'll get two ".pb" files with net definition and initialization. If I load this net on python side and feed it with some image from DB then I'll get correct predictions:

timg = np.fromfile('test_img.dat', dtype=np.uint8).reshape([28,28])
workspace.FeedBlob('data', (timg/256.).reshape([1,1,28,28]).astype(np.float32))
workspace.RunNet(net_def.name)
workspace.FetchBlob('softmax')
array([[  1.23242417e-05,   6.76146897e-07,   9.01260137e-06,
          1.60285403e-04,   9.54966026e-07,   6.82772861e-06,
          2.20508967e-09,   9.99059498e-01,   2.71651220e-06,
          7.47664250e-04]], dtype=float32)

So it is pretty sure that it is a '7', but if I load the same net in C++ and run it on the same file, I get completely different results. My C++ net initialization looks like:

    QByteArray img_bytes;
    caffe2::NetDef init_net, predict_net;
    caffe2::TensorCPU input;
    // predictor and its input/output vectors
    std::unique_ptr<caffe2::Predictor> predictor;
    caffe2::Predictor::TensorVector input_vec;
    caffe2::Predictor::TensorVector output_vec;
...
    QFile f("mnist_init_net.pb");
...
    auto barr = f.readAll();
    if (! init_net.ParseFromArray(barr.data(), barr.size())) {
...
    f.setFileName("mnist_predict_net.pb");
...
    barr = f.readAll();
    if (! predict_net.ParseFromArray(barr.data(), barr.size())) {
...
    predictor.reset(new caffe2::Predictor(init_net, predict_net));
    input.Resize(std::vector<int>{{1, 1, IMG_H, IMG_W}});
    input_vec.resize(1, &input);

and running the net looks like this (img_bytes is an array holding the image data):

    float* data = input.mutable_data<float>();
    for (int i = 0; i < img_bytes.size(); ++i)
        *data++ = float(img_bytes[i])/256.f;

    if (! predictor->run(input_vec, &output_vec) || output_vec.size() < 1
                                                 || output_vec[0]->size() != 10)
...

The result on the same file is that '7' is at 17% (not 99.9%) and the remaining categories are around 5-10%. Right now I'm stuck and I don't know where the problem is, so I'd like to kindly ask you to guide me a bit - if you don't mind and have some spare time :).

Best regards
Andrzej

PS. BTW everything is run on CPU (no CUDA) and on the very same computer if that matters.
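One possible cause worth ruling out here, offered as a guess rather than a confirmed diagnosis: QByteArray stores plain char, which is signed on most x86 toolchains, so float(img_bytes[i]) turns every pixel value of 128 or more into a negative number, whereas the Python side reads the same file as np.uint8. The sketch below converts through unsigned char first. The helper name bytes_to_unit_floats is hypothetical and uses a raw char pointer so that it does not depend on Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts raw 8-bit pixels to floats in [0, 1), mirroring `timg / 256.`
// applied to an np.uint8 array. The cast to unsigned char is the key step:
// without it, bytes >= 128 are read as negative values where char is signed.
std::vector<float> bytes_to_unit_floats(const char* bytes, std::size_t count) {
  assert(bytes != nullptr || count == 0);
  std::vector<float> out(count);
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = static_cast<float>(static_cast<unsigned char>(bytes[i])) / 256.f;
  }
  return out;
}
```

With Qt this could be called as bytes_to_unit_floats(img_bytes.constData(), img_bytes.size()) and the result copied into the tensor returned by input.mutable_data<float>().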

Linking problem

What is the reason for this linking problem? Thanks.
make all
[ 40%] Built target caffe2_cpp
[ 43%] Linking CXX executable ../bin/train
Undefined symbols for architecture x86_64:
"caffe2::OpSchema::Arg(char const*, char const*, bool)", referenced from:
___cxx_global_var_init.3 in libcaffe2_cpp.a(affine_scale_op.cc.o)
___cxx_global_var_init.3 in libcaffe2_cpp.a(back_mean_op.cc.o)
___cxx_global_var_init.3 in libcaffe2_cpp.a(diagonal_op.cc.o)
___cxx_global_var_init.1 in libcaffe2_cpp.a(time_plot_op.cc.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[3]: *** [../bin/train] Error 1
make[2]: *** [CMakeFiles/train.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [all] Error 2

pre-processing images

Hi,

I'm trying to understand how things have changed from Caffe to Caffe2. I've opened an issue there, but I have a few separate questions for this repository,
specifically about the pretrained example, especially lines 55 to 87.

  • I see you calculate and normalise each image regardless of the network (which used to be the case?)
  • You split the data into three separate channels (B, G, R?) but then append it all to one vector?
  • Are those tensors single-dimensional because of the network, or am I misunderstanding it?
  • How can I upload the data to a GPU tensor, since I'm doing all the OpenCV pre-processing using CUDA?

Many thanks (and have a nice weekend!),
Alex
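On the second and third bullets, a likely reading (an interpretation, not confirmed by the author in this thread) is that splitting into channels and appending them to one vector converts OpenCV's interleaved HWC pixels into the planar CHW layout that an NCHW tensor expects. The vector is flat only as storage; the tensor's dimensions give it shape. The sketch below shows that reordering without OpenCV; the function name hwc_to_chw is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reorders interleaved pixels (H x W x C, e.g. B,G,R,B,G,R,...) into planar
// order (C x H x W, e.g. all B values, then all G, then all R).
std::vector<float> hwc_to_chw(const std::vector<float>& hwc,
                              std::size_t height, std::size_t width,
                              std::size_t channels) {
  assert(hwc.size() == height * width * channels);
  std::vector<float> chw(hwc.size());
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t y = 0; y < height; ++y) {
      for (std::size_t x = 0; x < width; ++x) {
        chw[(c * height + y) * width + x] = hwc[(y * width + x) * channels + c];
      }
    }
  }
  return chw;
}
```

The resulting buffer can then back a tensor with dimensions {1, channels, height, width}.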

Loading our own model for Deep Dream

I wanted to feed my own model into Deep Dream to see what kinds of images it would produce. I modified the code so that I could specify the init_net and predict_net to be loaded from a local folder (similar to pretrained.cc), as opposed to specifying the model name, which would load it from the zoo keeper.

It worked, and I was able to load in and run a locally saved squeezenet model, and also a retrained googlenet model. However, when I tried loading in a model that I had trained through different code, I ran into this issue:

[libprotobuf FATAL /usr/include/google/protobuf/repeated_field.h:886] CHECK failed: (index) < (size()):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (size()):

As I'm investigating the issue further, I wanted to ask and clarify if the dream code is set up in such a way that it only supports networks that have an identical architecture to the ones in the model zoo? If not, I'm assuming that there is something at fault with the model I am loading in. Are there any guidelines that would help ensure that a model that I train would be accepted by the dream code? Thanks!

SpatialBN AddInput during training

https://github.com/leonardvandriel/caffe2_cpp_tutorial/blob/3ca1b6702f3ec73df7a5d0cf3fbb4cd58281baaa/src/caffe2/util/model.cc#L176

During training, it seems that

predict.AddInput(output + "_mean");
predict.AddInput(output + "_var");

are still required.


The other thing is:
https://github.com/leonardvandriel/caffe2_cpp_tutorial/blob/3ca1b6702f3ec73df7a5d0cf3fbb4cd58281baaa/src/caffe2/util/net_gradient.cc#L129
Why not add all of meta.ops_ instead of only using meta.ops_[0]? With only the first op, some operations like BatchMatMul, MatMul, and Mul will fail because the gradient for the second input is missing.

about rnn tutorial

I can't figure out the role of the prepare network. It seems that the prepare network copies the last hidden state and cell state to the input of the LSTM after every training sample. But the next training sample does not follow the previous sequence in the text, so it can't make use of the hidden and cell states. Is its purpose just to make the initial hidden and cell states random? Could I replace the prepare network with a random tensor filler?

resnet50 training error

Hi, I get the following error when I try to train ResNet50. Any thoughts?

/caffe2_cpp_tutorial-master$ sudo ./bin/train -model resnet50 -folder /home/ubuntu/data/data/ -epochs 200 -batch 94 -device cpu -skip_preprocess true

## CNN Training Example ##

optimizer: adam
device: cpu
using cuda: false
dump-model: false
model: resnet50
layer:
image-dir: /home/ubuntu/data/data/
db-type: leveldb
size: 224
epochs: 200
test-runs: 50
batch: 94
lr: 0.0001
skip_preprocess: true
zero-one: false
display: false
reshape: false

2 labels found:
  0: Dogs #65
  1: Cats #50
115 files found
230 images cached
no gradient for operator Sum
E1109 21:03:51.395666  1922 operator_schema.cc:57] Input index 0 and output idx 0 (res5c_branch2c) are set to be in-place but this is actually not supported by op SpatialBN
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
  what():  [enforce fail at operator_gradient.h:70] schema->Verify(def_). (GradientMaker) Operator def did not pass schema checking: input: "res5c_branch2c" input: "res5c_branch2c_scale" input: "res5c_branch2c_bias" input: "res5c_branch2c_mean" input: "res5c_branch2c_var" output: "res5c_branch2c" type: "SpatialBN" arg { name: "is_test" i: 1 } arg { name: "epsilon" f: 0.001 } arg { name: "momentum" f: 0.1 } arg { name: "order" s: "NCHW" }
*** Aborted at 1510261431 (unix time) try "date -d @1510261431" if you are using GNU date ***
PC: @     0x7f2dbb609428 gsignal
*** SIGABRT (@0x782) received by PID 1922 (TID 0x7f2dc5a38e00) from PID 1922; stack trace: ***
    @     0x7f2dc3166390 (unknown)
    @     0x7f2dbb609428 gsignal
    @     0x7f2dbb60b02a abort
    @     0x7f2dbbf4c84d __gnu_cxx::__verbose_terminate_handler()
    @     0x7f2dbbf4a6b6 (unknown)
    @     0x7f2dbbf4a701 std::terminate()
    @     0x7f2dbbf4a919 __cxa_throw
    @           0x5b0283 caffe2::GradientMakerBase::VerifyOp()
    @           0x5b0311 caffe2::GradientMakerBase::Get()
    @     0x7f2dc50cb173 caffe2::GetGradientForOp()
    @           0x5842cd caffe2::NetUtil::AddGradientOp()
    @           0x584cd0 caffe2::NetUtil::AddAllGradientOp()
    @           0x5985b8 caffe2::ModelUtil::AddGradientOps()
    @           0x598628 caffe2::ModelUtil::AddTrainOps()
    @           0x52e90c caffe2::run()
    @           0x52ff57 main
    @     0x7f2dbb5f4830 __libc_start_main
    @           0x528d09 _start
    @                0x0 (unknown)
Aborted (core dumped)

Detectron

Hello,

Facebook Research recently released the Detectron module, based on Caffe2, for object detection.
This module includes Python code.
I tried to load pretrained nets with C++ code inspired by your repo, but it failed.

Have you tried using your code to load such nets?

Thanks for your work :)

Issue building project as CMake external project

Hello again,

I've looked through the changes/helper functions you've made since the initial project and really like them (more than mine). I am going to try to use them in my own project by linking to your repo via cmake. However, I am running into some issues.

Project (if you wish to reproduce)
https://github.com/charlesrwest/CRWMachineLearning

I'm including your project by the following:

include(ExternalProject)

ExternalProject_Add(
  caffe2_cpp_tutorial

  GIT_REPOSITORY "https://github.com/leonardvandriel/caffe2_cpp_tutorial.git"
  GIT_TAG "master"
  
  UPDATE_COMMAND ""
  PATCH_COMMAND ""
  
  SOURCE_DIR "${CMAKE_SOURCE_DIR}/3rdparty/caffe2_cpp_tutorial"
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${GLOBAL_OUTPUT_PATH}/caffe2_cpp_tutorial
 
  TEST_COMMAND ""
)

Building from /build inside of your project works on my machine. However, building as a sub-project only manages to build the libraries and then crashes (linker error) when it tries to build the binaries (I've yet to determine a way to get it only to build the libraries as an external project).

charlesrwest@Telos:/cpp/projects/caffe2/cpp/CRWMachineLearning/build$ cmake ../
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/charlesrwest/cpp/projects/caffe2/cpp/CRWMachineLearning/build
charlesrwest@Telos:/cpp/projects/caffe2/cpp/CRWMachineLearning/build$ make
Scanning dependencies of target caffe2_cpp_tutorial
[ 7%] Creating directories for 'caffe2_cpp_tutorial'
[ 15%] Performing download step (git clone) for 'caffe2_cpp_tutorial'
Cloning into 'caffe2_cpp_tutorial'...
Already on 'master'
Your branch is up-to-date with 'origin/master'.
[ 23%] No patch step for 'caffe2_cpp_tutorial'
[ 30%] No update step for 'caffe2_cpp_tutorial'
[ 38%] Performing configure step for 'caffe2_cpp_tutorial'
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/charlesrwest/cpp/projects/caffe2/cpp/CRWMachineLearning/build/caffe2_cpp_tutorial-prefix/src/caffe2_cpp_tutorial-build
[ 46%] Performing build step for 'caffe2_cpp_tutorial'
Scanning dependencies of target caffe2_cpp
[ 2%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/tensor.cc.o
[ 5%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/model.cc.o
[ 7%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/net.cc.o
[ 10%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/util/blob.cc.o
[ 12%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/affine_scale_op.cc.o
[ 15%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/mean_stdev_op.cc.o
[ 17%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/back_mean_op.cc.o
[ 20%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/diagonal_op.cc.o
[ 23%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/show_worst_op.cc.o
[ 25%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/cout_op.cc.o
[ 28%] Building CXX object CMakeFiles/caffe2_cpp.dir/src/caffe2/operator/zero_one_op.cc.o
[ 30%] Linking CXX static library libcaffe2_cpp.a
[ 30%] Built target caffe2_cpp
[ 33%] Building NVCC (Device) object CMakeFiles/caffe2_cpp_gpu.dir/src/caffe2/operator/caffe2_cpp_gpu_generated_mean_stdev_op.cu.o
[ 35%] Building NVCC (Device) object CMakeFiles/caffe2_cpp_gpu.dir/src/caffe2/operator/caffe2_cpp_gpu_generated_back_mean_op.cu.o
[ 38%] Building NVCC (Device) object CMakeFiles/caffe2_cpp_gpu.dir/src/caffe2/operator/caffe2_cpp_gpu_generated_diagonal_op.cu.o
[ 41%] Building NVCC (Device) object CMakeFiles/caffe2_cpp_gpu.dir/src/caffe2/operator/caffe2_cpp_gpu_generated_affine_scale_op.cu.o
Scanning dependencies of target caffe2_cpp_gpu
[ 43%] Linking CXX static library libcaffe2_cpp_gpu.a
[ 43%] Built target caffe2_cpp_gpu
Scanning dependencies of target intro
[ 46%] Building CXX object CMakeFiles/intro.dir/src/caffe2/binaries/intro.cc.o
[ 48%] Linking CXX executable /home/charlesrwest/cpp/projects/caffe2/cpp/CRWMachineLearning/3rdparty/caffe2_cpp_tutorial/bin/intro
/usr/bin/ld: libcaffe2_cpp.a(blob.cc.o): undefined reference to symbol 'curandDestroyGenerator'
/usr/local/cuda-8.0/lib64/libcurand.so.8.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
CMakeFiles/intro.dir/build.make:122: recipe for target '/home/charlesrwest/cpp/projects/caffe2/cpp/CRWMachineLearning/3rdparty/caffe2_cpp_tutorial/bin/intro' failed
make[5]: *** [/home/charlesrwest/cpp/projects/caffe2/cpp/CRWMachineLearning/3rdparty/caffe2_cpp_tutorial/bin/intro] Error 1
CMakeFiles/Makefile2:68: recipe for target 'CMakeFiles/intro.dir/all' failed
make[4]: *** [CMakeFiles/intro.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make[3]: *** [all] Error 2
CMakeFiles/caffe2_cpp_tutorial.dir/build.make:112: recipe for target 'caffe2_cpp_tutorial-prefix/src/caffe2_cpp_tutorial-stamp/caffe2_cpp_tutorial-build' failed
make[2]: *** [caffe2_cpp_tutorial-prefix/src/caffe2_cpp_tutorial-stamp/caffe2_cpp_tutorial-build] Error 2
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/caffe2_cpp_tutorial.dir/all' failed
make[1]: *** [CMakeFiles/caffe2_cpp_tutorial.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Error compiling cpp

I tried to compile the .cpp file and got errors like
"intro.cpp:(.text+0x1c23): undefined reference to `caffe2::NetDef::~NetDef()'" and a lot more similar to this

How do I compile your .cpp files without getting these errors?

Composable network definitions and exchange emails?

Hey Leonard,

Thank you for publishing this tutorial code. I've been pondering the best way to go about constructing caffe2 networks in C++ and have recently completed some work that while not complete, might have some potential.

If you have time, please take a look and tell me if you think this might be the way to go or if you have been going in a different direction:
https://github.com/charlesrwest/Caffe2ComputeModules/blob/master/src/executables/trainSineWave/main.cpp

Thanks,
Charlie West

P.S.
I would be interested in corresponding about this. My email is [email protected]. Please drop me a line if you would like to talk.

Can you support reading directly from LMDB, and can you share a small part of your training dataset?

Your work is very good! However, the train and retrain tools cannot read directly from LMDB datasets, and many datasets are in LMDB format. When I train GoogleNet on my custom dataset, I get some errors, so I suspect my dataset is at fault. I would like to know what your dataset looks like. Where can I get the standard "ILSVRC 2014 > GoogleNet/Inception" dataset? Could you share a small part of your training dataset, or explain how to build a dataset from my own images?

Can't find ModelDef or ModelMeta

I've built caffe2 and successfully "borrowed" your tutorial and toy tests, but when I try to use your MNIST test, it can't find ModelDef or ModelMeta (build errors). Where are these defined?

Issue with training googlenet from scratch

Hi, I got this error when I tried to train GoogleNet from scratch. Any thoughts?

/caffe2_cpp_tutorial-master$ sudo ./bin/train --model googlenet --folder /home/ubuntu/data/SmallTrain

CNN Training Example

optimizer: adam
device: cudnn
using cuda: true
dump-model: false
model: googlenet
layer:
image-dir: /home/ubuntu/data/SmallTrain
db-type: leveldb
size: 224
epochs: 1000
test-runs: 50
batch: 64
lr: 0.0001
skip_preprocess: false
zero-one: false
display: false
reshape: false

2 labels found:
0: Dogs #65
1: Cats #50
115 files found
115 images cached

training..
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at blob.h:76] IsType(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: _conv2/norm2_scale.
Error from operator:
input: "conv2/3x3" input: "conv2/norm2" input: "_conv2/norm2_scale" input: "conv2/norm2_grad" output: "conv2/3x3_grad" name: "" type: "LRNGradient" arg { name: "size" i: 5 } arg { name: "alpha" f: 0.0001 } arg { name: "beta" f: 0.75 } arg { name: "bias" f: 1 } arg { name: "order" s: "NCHW" } device_option { device_type: 1 } engine: "CUDNN" is_gradient_op: true
*** Aborted at 1509939039 (unix time) try "date -d @1509939039" if you are using GNU date ***
PC: @ 0x7eff29eb5428 gsignal
*** SIGABRT (@0x6a2) received by PID 1698 (TID 0x7eff342e4e00) from PID 1698; stack trace: ***
@ 0x7eff31a12390 (unknown)
@ 0x7eff29eb5428 gsignal
@ 0x7eff29eb702a abort
@ 0x7eff2a7f884d __gnu_cxx::__verbose_terminate_handler()
@ 0x7eff2a7f66b6 (unknown)
@ 0x7eff2a7f6701 std::terminate()
@ 0x7eff2a7f6969 __cxa_rethrow
@ 0x5be8fd caffe2::Operator<>::Run()
@ 0x7eff33963a98 caffe2::SimpleNet::Run()
@ 0x7eff3398b060 caffe2::Workspace::RunNet()
@ 0x5294ae caffe2::run_trainer()
@ 0x52f1a4 caffe2::run()
@ 0x52ff57 main
@ 0x7eff29ea0830 __libc_start_main
@ 0x528d09 _start
@ 0x0 (unknown)
Aborted (core dumped)

about AddInput

I am wondering what model.predict.AddInput() does in mnist.cc in your tutorial. I see that blobs are created automatically when the related operators are created with the help of the model helper, so I am not sure what model.predict.AddInput() does when all the blobs have already been created. I can't find its counterpart in the Python tutorial, so I can only rely on your help. Thanks.

Issues trying to train SqueezeNet

When I try to retrain the final layer of SqueezeNet, I get this error:

terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at cross_entropy_op.cu:34] X.ndim() == 2. 4 vs 2 Error from operator:
input: "softmaxout" input: "label" output: "xent" type: "LabelCrossEntropy" device_option { device_type: 1 }

A similar error also arises when I try to train SqueezeNet from scratch. However, if I change the device option to cpu, the error changes slightly (not sure if it provides any insight, but here it is for reference):

terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at accuracy_op.cc:10] X.ndim() == 2. 4 vs 2 Error from operator:
input: "softmax" input: "label" output: "accuracy" type: "Accuracy"

GoogleNet, on the other hand, gives no issues while training or retraining, so it seems to be something specific to the architecture of SqueezeNet. Any ideas?
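The message "X.ndim() == 2. 4 vs 2" says that LabelCrossEntropy and Accuracy received a 4D prediction where they require a 2D one. One plausible explanation, stated here as an assumption, is that SqueezeNet ends in a convolution followed by global pooling, so its softmax output has shape (N, C, 1, 1), while GoogleNet ends in a fully connected layer that already yields (N, C). Caffe2 has a Flatten operator that collapses a tensor to 2D at a given axis; whether the tutorial's helpers expose it is not established in this thread. The sketch below only shows the shape arithmetic such a flatten performs, and flatten_shape is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Collapses an N-dimensional shape into 2D at `axis`: dimensions before the
// axis multiply into the outer size, dimensions from the axis onward multiply
// into the inner size.
std::pair<std::size_t, std::size_t> flatten_shape(
    const std::vector<std::size_t>& dims, std::size_t axis) {
  assert(axis <= dims.size());
  std::size_t outer = 1;
  std::size_t inner = 1;
  for (std::size_t i = 0; i < axis; ++i) outer *= dims[i];
  for (std::size_t i = axis; i < dims.size(); ++i) inner *= dims[i];
  return {outer, inner};
}
```

For a batch of 64 images and 2 labels, a (64, 2, 1, 1) output flattened at axis 1 becomes (64, 2), which is the 2D shape the loss and accuracy operators check for.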

nullptr in dream

I can get dream to run on the first layer of GoogleNet, but layers beyond that (for example, as in the command you have in the readme) result in the error below:

loading model..
split model..
running model..
start size: 73
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at blob.h:76] IsType(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: _norm2_scale.
Error from operator:
input: "conv2" input: "norm2" input: "_norm2_scale" input: "norm2_grad" output: "conv2_grad" name: "" type: "LRNGradient" arg { name: "size" i: 5 } arg { name: "alpha" f: 0.0001 } arg { name: "beta" f: 0.75 } arg { name: "bias" f: 1 } arg { name: "order" s: "NCHW" } device_option { device_type: 1 } engine: "CUDNN" is_gradient_op: true
*** Aborted at 1507278978 (unix time) try "date -d @1507278978" if you are using GNU date ***
PC: @ 0x7f26d0972428 gsignal
*** SIGABRT (@0x3e800001da2) received by PID 7586 (TID 0x7f26d98d8e80) from PID 7586; stack trace: ***
@ 0x7f26d6439390 (unknown)
@ 0x7f26d0972428 gsignal
@ 0x7f26d097402a abort
@ 0x7f26d12b584d __gnu_cxx::__verbose_terminate_handler()
@ 0x7f26d12b36b6 (unknown)
@ 0x7f26d12b3701 std::terminate()
@ 0x7f26d12b3969 __cxa_rethrow
@ 0x5bd9ab caffe2::Operator<>::Run()
@ 0x7f26d8eec888 caffe2::SimpleNet::Run()
@ 0x52d677 caffe2::run()
@ 0x52e3de main
@ 0x7f26d095d830 __libc_start_main
@ 0x526739 _start
@ 0x0 (unknown)
Aborted (core dumped)

A similar thing happens if I run it with AlexNet: the first layer runs fine, but later layers run into norm1_scale or norm2_scale being uninitialized. Any ideas on how to resolve this?

CMake Error: variables not found (GTEST_LIB)

Hi, I get an error when executing cmake. How can I solve that issue?

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
GTEST_LIB
linked by target "net_test" in directory /home/daniel/cvhci/caffe2/caffe2_cpp_tutorial
linked by target "net_gradient_test" in directory /home/daniel/cvhci/caffe2/caffe2_cpp_tutorial

AddOptimizerOps creates optimizers for blobs before StopGradient

I use AddOptimizerOps to create optimizer operators automatically. The optimizer operators are created even for blobs that come before the StopGradient operator. StopGradient does prevent gradients from being computed for blobs before it, but optimizers are still created for these blobs, so Caffe2 complains that xxx_grad (which is not created because of StopGradient) is unknown.
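One way to express the behaviour requested here is to create optimizer operators only for parameters whose gradient blob actually exists. The sketch below is a standalone illustration of that filter, not the tutorial's implementation: params_with_gradients is a hypothetical helper, and it assumes gradient blobs follow the "<name>_grad" naming seen in the error message.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns only the parameters that have a matching "<param>_grad" blob.
// Parameters upstream of a StopGradient have no such blob and are skipped,
// so no optimizer operator would be generated for them.
std::vector<std::string> params_with_gradients(
    const std::vector<std::string>& params,
    const std::set<std::string>& known_blobs) {
  std::vector<std::string> result;
  for (const std::string& param : params) {
    if (known_blobs.count(param + "_grad") > 0) {
      result.push_back(param);
    }
  }
  return result;
}
```

The filtered list keeps the original parameter order, so it could feed whatever loop emits the optimizer operators.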

Classifying with GPU

Hi, I have retrained a network and use the "pretrained" example to classify images, but this is very slow. How can I use the GPU for classification to make it faster?

Tensor::ShareData compilation error.

When compiling with make, I get the following error:

/usr/bin/c++  -DWITH_CUDA -DWITH_GPU -DWITH_OPENCV -I/home/brett/Programming/github/caffe2_cpp_tutorial/include -I/usr/include/eigen3 -I/usr/include/opencv  -std=c++11 -fPIC    -std=gnu++11 -o CMakeFiles/caffe2_cpp.dir/src/caffe2/util/blob.cc.o -c /home/brett/Programming/github/caffe2_cpp_tutorial/src/caffe2/util/blob.cc
/home/brett/Programming/github/caffe2_cpp_tutorial/src/caffe2/util/blob.cc: In member function ‘void caffe2::BlobUtil::Set(const TensorCPU&, bool)’:
/home/brett/Programming/github/caffe2_cpp_tutorial/src/caffe2/util/blob.cc:24:28: error: no matching function for call to ‘caffe2::Tensor<caffe2::CUDAContext>::ShareData(const TensorCPU&)’
     tensor->ShareData(value);
                            ^
In file included from /home/brett/Programming/github/caffe2_cpp_tutorial/include/caffe2/util/blob.h:5:0,
                 from /home/brett/Programming/github/caffe2_cpp_tutorial/src/caffe2/util/blob.cc:1:
/usr/local/include/caffe2/core/tensor.h:416:8: note: candidate: void caffe2::Tensor<Context>::ShareData(const caffe2::Tensor<Context>&) [with Context = caffe2::CUDAContext]
   void ShareData(const Tensor& src) {
        ^~~~~~~~~
/usr/local/include/caffe2/core/tensor.h:416:8: note:   no known conversion for argument 1 from ‘const TensorCPU {aka const caffe2::Tensor<caffe2::CPUContext>}’ to ‘const caffe2::Tensor<caffe2::CUDAContext>&’
CMakeFiles/caffe2_cpp.dir/build.make:62: recipe for target 'CMakeFiles/caffe2_cpp.dir/src/caffe2/util/blob.cc.o' failed

I'm using the current caffe2 from github and here is my CMakeCache. My system is Debian/unstable.

What version of caffe2 are you using?

Your transcription of MNIST to C++ includes caffe/util/net.h and uses the NetUtil class.
Neither of them is available in the current Caffe2 trunk on GitHub?

Make issue

Hello,

I have an issue when running make in the project folder: some functions from the operator folder cannot be found.
Can you suggest how to set up the library path or whatever else is needed?
(I have already installed Caffe2 with CUDA 8.0 and it is working well.)

jaeseok@jaeseok-To-be-filled-by-O-E-M:~/workspace/caffe2_cpp_tutorial-master$ make
make[1]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jaeseok/workspace/caffe2_cpp_tutorial-master/build
make[2]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
make[3]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
make[3]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
[ 61%] Built target caffe2_cpp
make[3]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
make[3]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
[ 88%] Built target caffe2_cpp_gpu
make[3]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
make[3]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
make[3]: Entering directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
[ 94%] Linking CXX executable ../bin/intro
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_mean_stdev_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDAMeanStdev()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/mean_stdev_op.cu:44: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_MeanStdev()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_affine_scale_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDAAffineScale()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/affine_scale_op.cu:85: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_AffineScale()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_affine_scale_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDAAffineScaleGradient()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/affine_scale_op.cu:86: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_AffineScaleGradient()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_diagonal_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDADiagonal()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/diagonal_op.cu:85: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_Diagonal()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_diagonal_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDADiagonalGradient()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/diagonal_op.cu:86: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_DiagonalGradient()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_back_mean_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDABackMean()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/back_mean_op.cu:72: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_BackMean()'
libcaffe2_cpp_gpu.a(caffe2_cpp_gpu_generated_back_mean_op.cu.o): In function caffe2::(anonymous namespace)::CAFFE_ANONYMOUS_VARIABLE_CUDABackMeanGradient()': /home/jaeseok/workspace/caffe2_cpp_tutorial-master/src/caffe2/operator/back_mean_op.cu:73: undefined reference to caffe2::(anonymous namespace)::CAFFE2_PLEASE_ADD_OPERATOR_SCHEMA_FOR_BackMeanGradient()'
collect2: error: ld returned 1 exit status
CMakeFiles/intro.dir/build.make:148: recipe for target '../bin/intro' failed
make[3]: *** [../bin/intro] Error 1
make[3]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
CMakeFiles/Makefile2:142: recipe for target 'CMakeFiles/intro.dir/all' failed
make[2]: *** [CMakeFiles/intro.dir/all] Error 2
make[2]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
Makefile:83: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/home/jaeseok/workspace/caffe2_cpp_tutorial-master/build'
Makefile:4: recipe for target 'all' failed
make: *** [all] Error 2

how to decide whether an operator is trainable

I am confused about whether I should put an operator into trainable_ops or non_trainable_ops. How do you decide? I saw that the Scale operator was put in non_trainable_ops. Should I put Pow in non_trainable_ops as well?

Cannot create operator of type '*****' on the device 'CUDA'.

I got an error in my own code. It seems that the 'AffineChannel' op is not implemented for CUDA in Caffe2 C++. Has anyone seen a similar error? Are there any tricks to solve the problem?

WARNING: Logging before InitGoogleLogging() is written to STDERR
E0127 10:30:38.059101 13294 operator.cc:130] Cannot find operator schema for AffineChannel. Will skip schema checking.
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at operator.cc:190] op. Cannot create operator of type 'AffineChannel' on the device 'CUDA'.
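
One possible reading of this log, offered as a sketch rather than a confirmed diagnosis: Caffe2 registers operator implementations separately per device type, so an operator type with no CUDA registration in the linked libraries cannot be created under a CUDA device option. The first log line (no schema found) hints that the operator may be missing from the linked build altogether. The toy registry below is not Caffe2 code; `ToyRegistry`, `Register`, `Create`, and `DemoMissingCuda` are hypothetical names used only to illustrate how a per-device lookup fails.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy model only: NOT the Caffe2 API. All names here are hypothetical.
enum class Device { CPU, CUDA };

class ToyRegistry {
 public:
  using Creator = std::function<std::string()>;

  // Register an implementation for one (operator type, device) pair.
  void Register(const std::string& type, Device device, Creator creator) {
    assert(creator && "creator must be callable");
    creators_[{type, device}] = std::move(creator);
  }

  // Look up the implementation; throw if this device has none registered.
  std::string Create(const std::string& type, Device device) const {
    auto it = creators_.find({type, device});
    if (it == creators_.end()) {
      throw std::runtime_error(
          "Cannot create operator of type '" + type + "' on the device '" +
          (device == Device::CUDA ? "CUDA" : "CPU") + "'.");
    }
    return it->second();
  }

 private:
  std::map<std::pair<std::string, Device>, Creator> creators_;
};

// Registers a CPU-only operator, then requests it on CUDA.
// Returns the resulting error text, or an empty string if creation succeeded.
std::string DemoMissingCuda() {
  ToyRegistry registry;
  registry.Register("AffineChannel", Device::CPU,
                    [] { return std::string("cpu kernel"); });
  try {
    registry.Create("AffineChannel", Device::CUDA);
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

Under this reading, the usual options are to link a build that provides a CUDA implementation of the operator, or to run that one operator with a CPU device option. Whether either applies here depends on the Caffe2 version in use.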

how can I create all gradient operators in a separate network?

Sometimes I need to use different kinds of loss functions at the end of one network structure. Currently, I have to create several networks that differ only in their last operators. Is there a way to create the gradient operators in a separate network with your model helper, so that I can execute the forward network, the loss network, and the backward network sequentially? Thanks.

Make Issue

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CURAND_LIB

Can you explain what is causing this error? Note that originally two variables were not found; the NCCL one was resolved by following the additional hint to export a new path in .bashrc.

CMakeLists.txt update for CURAND

Hi,

The CUDA cmake module includes a variable ${CUDA_curand_LIBRARY},
so you don't need to manually search for find_library(CURAND_LIB curand).

The NCCL appears to be REQUIRED if using CUDA? In which case,
the CMakeLists.txt could maybe benefit from a message(STATUS "NCCL lib in: ${NCCL}") ?
Also, the cmake .. and make in the root directory is a wee bit confusing.

Another thing, I kept getting segfaults, because I hadn't run ./script/download_resource.sh so maybe this should be in the README.md for silly people such as myself? 😀

By the way, thanks a ton for putting this repo together, it is really useful!!

Best,
Alex

diff-patch:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 558a85b..b7c2bb0 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -23,7 +23,6 @@ find_library(CAFFE2_GPU_LIB Caffe2_GPU)
 find_library(GLOG_LIB glog)
 find_library(GFLAGS_LIB gflags)
 find_library(NCCL_LIB nccl)
-find_library(CURAND_LIB curand)
 
 if(NOT CAFFE2_LIB)
   message(FATAL_ERROR "Caffe2 lib not found")
@@ -75,7 +74,7 @@ if(OpenCV_LIBS)
 endif()
 
 if(CUDA_LIBRARIES)
-  list(APPEND ALL_LIBRARIES ${CUDA_LIBRARIES} ${CUDA_CUDART_LIBRARY} ${NCCL_LIB} ${CURAND_LIB})
+    list(APPEND ALL_LIBRARIES ${CUDA_LIBRARIES} ${CUDA_CUDART_LIBRARY} ${NCCL_LIB} ${CUDA_curand_LIBRARY})
   add_definitions(-DWITH_CUDA)
 endif()
