torontodeeplearning / convnet
A GPU implementation of Convolutional Neural Nets in C++
License: BSD 2-Clause "Simplified" License
I am happily using this awesome, beautifully designed CNN tool.
If I publish a paper based on it, how should I cite it?
I'm trying to run the code on the GPU of my MacBook Pro. I got it working over a month ago, but now I'm getting this error:
Extracting features for dataset of size 10 # batches 1 # left overs 0
Writing to output.h5
Adding Dataspace output of size 10 1000
Adding Dataspace hidden7 of size 10 4096
Batch 1
Assertion failed: (mat->tex_obj != 0), function getTextureObject, file cudamat_conv_util.cu, line 38.
The GPU is a GeForce GT 650M, and the docs say it supports textures (CUDA compute capability 3.0).
Any ideas what the issue could be?
It happens when I make the CPU feature tools.
I downloaded the exact versions of protobuf and hdf5 listed in INSTALL, but make gave errors like this:
convnet_config.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
convnet_config.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
convnet_config.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
Please help.
I'm using g++ 4.6 and CUDA 5.5. I successfully build the libcudamat* files, but the main build fails to find several of the references within them:
g++46 -L/usr/gapps/brain/installs/opencv3beta/lib -L/usr/gapps/brain/installs/generic/lib -L/opt/cudatoolkit-5.5/lib64 -L/g/g13/boakye1/workspace/convnet/cudamat -I/usr/gapps/brain/installs/opencv3beta/include -I/usr/gapps/brain/installs/generic/include -I/opt/cudatoolkit-5.5/include -Isrc -Ideps -I/g/g13/uname/workspace/convnet/cudamat -DUSE_GEMM obj/convnet_config.pb.o obj/util.o obj/matrix.o obj/loss_functions.o obj/layer.o obj/image_iterators.o obj/video_iterators.o obj/datahandler.o obj/datawriter.o obj/optimizer.o obj/edge.o obj/edge_with_weight.o obj/avgpool_edge.o obj/conv_edge.o obj/conv_onetoone_edge.o obj/downsample_edge.o obj/fc_edge.o obj/local_edge.o obj/maxpool_edge.o obj/response_norm_edge.o obj/rgb_to_yuv_edge.o obj/upsample_edge.o obj/convnet.o obj/train_convnet.o obj/multigpu_convnet.o -o bin/train_convnet -lopencv_core -lopencv_imgcodecs -lopencv_imgproc -lopencv_videoio -lhdf5 -ljpeg -lX11 -lpthread -lprotobuf -lcublas -ldl -lgomp -lcudamat -lcudart -Wl,-rpath=/g/g13/uname/workspace/convnet/cudamat -Wl,-rpath=/usr/gapps/brain/installs/opencv3beta/lib -Wl,-rpath=/usr/gapps/brain/installs/generic/lib -Wl,-rpath=/opt/cudatoolkit-5.5/lib64 -lcudamat_conv_gemm
obj/matrix.o: In function `Matrix::~Matrix()':
matrix.cc:(.text+0x9): undefined reference to `destroy_tex'
obj/matrix.o: In function `Matrix::WriteValue(int, int, float)':
matrix.cc:(.text+0x5f5): undefined reference to `write_at'
obj/matrix.o: In function `Matrix::ReadValue(int, int)':
matrix.cc:(.text+0x69a): undefined reference to `read_from'
obj/matrix.o: In function `Matrix::CopyP2PAsync(Matrix&)':
matrix.cc:(.text+0x744): undefined reference to `copy_on_device_p2p_async'
...and so on. I've tried both with and without the GEMM kernels, without success. Also, I had to change "-Wno-unused-result" to "-Wno-unused-value" in the Makefile. Any idea what the issue might be?
Hi.
How should I map {k, n} in the equation in [1] onto the variables in your net-model file?
(I understand that alpha is 'add_scale' and beta is 'pow_scale'.)
[1] Section 3.3, 'Local Response Normalization', in Krizhevsky et al.'s paper.
I have also found that the assertion error mentioned before occurs in the RESPONSE_NORM edge.
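For reference, the equation in Section 3.3 of the paper divides each activation a[i] by (k + alpha * sum of a[j]^2)^beta, where the sum runs over the n adjacent channels centered on channel i. The sketch below spells that out in plain Python so the roles of k and n are visible. It illustrates the paper's formula only, not convnet's kernels: the function name is made up, the defaults are the values quoted in the paper (k=2, n=5, alpha=1e-4, beta=0.75), and which net-model fields correspond to k and n is not confirmed here.

```python
def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize across channels at one spatial position.

    a is a list with one activation per channel (kernel map) at a fixed
    pixel. Hypothetical helper for illustration; not part of convnet.
    """
    num_channels = len(a)
    half = n // 2  # n/2 from the paper, rounded down
    b = []
    for i in range(num_channels):
        # Window of neighbouring channels, clipped at the first and last one.
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        sum_sq = sum(a[j] ** 2 for j in range(lo, hi + 1))
        b.append(a[i] / (k + alpha * sum_sq) ** beta)
    return b
```

Here k is the additive constant inside the parentheses and n is the number of neighbouring channels that contribute to the sum.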
According to the install instructions, I should set the paths to the dependencies in convnet/cpu/Makefile.
I can't find that file in my version of convnet. Please help me.
Hi, Nitish,
How can I obtain a speedup with multi-GPU training? Some screenshots of training information are attached for your information. In my results, the one-batch training time is nearly the same in the one-GPU and two-GPU cases.
Training with one GPU board: one-step training time around 0.890.
Training with two GPU boards: one-step training time around 0.884.
When running
train_convnet --model=net.pbtxt --train=train_data.pbtxt --val=val_data.pbtxt --board 0
or
train_convnet --model=net.pbtxt --train=train_plus_val_data.pbtxt --val=val_data.pbtxt --board 0
this is what we get:
Warning : Could not set up P2P, GPU-to-GPU communication will be slow. CUDA error
Using board 0
Using board 0
Using board 0
Using board 0
Max for Used for computing average length of incoming weight vectors. 48 on gpu 0
Max for Used for computing average length of incoming weight vectors. 128 on gpu 0
Layer input: 28x28
Layer hidden1_conv: 25x25
Layer hidden1_maxpool: 11x11
Layer hidden2_conv: 8x8
Layer hidden2_maxpool: 3x3
Layer output: 1x1
Training data set size 60000
Validation data set size 10000
Max for shared bias 30000 on gpu 0
Allocating new temp memory of size 30000 on gpu 0
Initialized weight: Dense Uniform. Initial scale 0.981715
Initialized weight: Dense Uniform. Initial scale 0.999742
Initialized weight: Dense Uniform. Initial scale 0.993758
input:hidden1_conv Convolutional Kernel: 4-4-1 : 48 Layer: 28-28 : 25-25
hidden1_conv:hidden1_maxpool MaxPool Kernel: 4-4-48 : 48 Layer: 25-25 : 11-11
hidden1_maxpool:hidden2_conv Convolutional Kernel: 4-4-48 : 128 Layer: 11-11 : 8-8
hidden2_conv:hidden2_maxpool MaxPool Kernel: 4-4-128 : 128 Layer: 8-8 : 3-3
hidden2_maxpool:output Fully Connected :3-3-128:10
Checkpointing at ./checkpoint_dir/mnist_conv_20141127173240
----- SUMMARY ------
databuffer 209.618 MB
edge 0.422585 MB
layer 35.0544 MB
misc 1.07556 MB
ones 2 MB
temp 0.114441 MB
TOTAL 248.285 MB
Step 1Invalid start / end 0 0
Step 1Invalid start / end 0 0 ?
make error:
g++ -L/home/xijing/local/lib -L/usr/local/cuda/lib64 -L/home/xijing/Work/convnet/cudamat -I/home/xijing/local/include -I/usr/local/cuda/include -Isrc -Ideps -DUSE_GEMM obj/image_iterators.o obj/image2hdf5.o obj/util.o -o bin/image2hdf5 -lopencv_core -lopencv_imgcodecs -lopencv_imgproc -lopencv_videoio -lhdf5 -ljpeg -lX11 -lpthread -lprotobuf -lcublas -ldl -lgomp -lcudamat -lcudart -Wl,-rpath=/home/xijing/Work/convnet/cudamat -Wl,-rpath=/home/xijing/local/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudamat_conv_gemm
make: *** No rule to make target `obj/video2hdf5.o', needed by `bin/video2hdf5'. Stop.
Hi,
How can we configure convnet to display localization?
Thanks
Atiqur
In Caffe and cuda-convnet, there is "size" parameter in the local response norm layer.
In this code, there is "frac_of_filters_response_norm" in the RESPONSE_NORM Edge.
How are the two parameters, "size" and "frac_of_filters_response_norm", related? I think
size = number of total neurons * frac_of_filters_response_norm
Is that correct?
Excuse me, I have a question about convnet.
I trained the convnet on multiple GPUs using "boards=01"; however, training ran on only one board rather than two. I checked with "nvidia-smi" and found that there are two processes, but only one of the boards is used.
So I would like to know how to train the model with data parallelism and model parallelism on multiple GPUs. Do I have to modify something?
My mailbox is [email protected], and I look forward to your response.
Thank you very much.
Hi Nitish,
Could you describe in the README how to create a model file from raw JPEG images for ImageNet (Alex's net)?
Despite my efforts, I ran into some problems and could not get it to work.
/usr/local/CImg-1.5.9/CImg.h:269:21: error: jpeglib.h: No such file or directory
It seems CImg depends on jpeglib. Where can I download it? It isn't mentioned in your INSTALL file.
Been playing around with this a little bit and wanted to open an issue in case anyone else is doing the same. This assumes you're using homebrew.
Current progress:
brew uninstall opencv && brew install --devel opencv
added the defines -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE
added #include <stdlib.h> to the top of CPUMatrix.h
ln -s /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers/X11/ /usr/local/include/X11
Now, however, I'm getting a bunch of errors in CImg:
../..//deps/CImg/CImg.h: In constructor 'cimg_library::cimg::X11_info::X11_info()':
../..//deps/CImg/CImg.h:2527:22: error: 'XInitThreads' was not declared in this scope
 XInitThreads();
 ^
../..//deps/CImg/CImg.h: In static member function 'static int cimg_library::CImgDisplay::screen_width()':
../..//deps/CImg/CImg.h:7308:45: error: 'XOpenDisplay' was not declared in this scope
 Display *const _dpy = XOpenDisplay(0);
 ^
../..//deps/CImg/CImg.h:7312:27: error: 'XCloseDisplay' was not declared in this scope
 XCloseDisplay(_dpy);
I'm going to step away for a bit, but if anyone else has gotten this working I'd be happy to hear from you.
Hi,
I want to train using my own dataset. Can you please point me to a walkthrough on how to train and (later) validate the CNN?
When I run:
python run_convnet.py
I got the following error:
Traceback (most recent call last):
  File "run_convnet.py", line 57, in <module>
    main()
  File "run_convnet.py", line 28, in main
    board = cn.LockGPU()
  File "/home/fs/ylu/Code/convnet/py/util.py", line 19, in LockGPU
    board = gpu_lock.obtain_lock_id()
  File "/home/fs/ylu/Code/convnet/cudamat/gpu_lock2.py", line 83, in obtain_lock_id
    id = obtain_lock_id_to_hog()
  File "/home/fs/ylu/Code/convnet/cudamat/gpu_lock2.py", line 100, in obtain_lock_id_to_hog
    for id in board_ids():
  File "/home/fs/ylu/Code/convnet/cudamat/gpu_lock2.py", line 30, in board_ids
    p = Popen(['/u/tang/bin/get_num_gpu_boards'], stdout=PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
I fixed it by modifying lines 27-32 of convnet/cudamat/gpu_lock2.py, from:
    p = Popen(['/u/tang/bin/get_num_gpu_boards'], stdout=PIPE)
    nBoards = int(p.stdout.read())
    return range(nBoards)
to:
    from glob import glob
    board_devs = glob(_dev_prefix + '[0-9]*')
    return range(len(board_devs))
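As a standalone illustration of that change, the sketch below counts device nodes with glob instead of calling the site-specific get_num_gpu_boards binary. It assumes _dev_prefix is '/dev/nvidia', which is a guess at the value used in gpu_lock2.py; the prefix is exposed as a parameter here only so the function can be tried on any directory.

```python
from glob import glob


def board_ids(dev_prefix='/dev/nvidia'):
    """Return candidate board ids by counting device nodes.

    The default prefix is an assumption about gpu_lock2.py's _dev_prefix.
    The pattern requires a digit right after the prefix, so it matches
    names such as /dev/nvidia0 and /dev/nvidia1 but not /dev/nvidiactl.
    """
    board_devs = glob(dev_prefix + '[0-9]*')
    return list(range(len(board_devs)))
```

On a machine with no matching device nodes this returns an empty list, so the caller still has to handle the case where no board is available.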
Hi,
Does it run on OS X with an NVIDIA GeForce GT 750M with 2GB of GDDR5 memory?
Thanks,
Gerald
Since we only have one GPU, is it safe to ignore this warning?
Hi Nitish,
(First, I should mention that I have no knowledge of GEMM.)
I don't know the rule for setting the variable 'partial_sum' in the Edge group.
I understood that the default value '0' works in all cases, but it is very slow.
However, I also ran into the error.
I tried to work out how this value should be chosen, but I could not find the rule.
What is the rule for it?
Hi Nitish,
I ran into this error when I ran train_convnet:
"train_convnet: cudamat_conv_gemm.cu:691: void convOutpGemm(cudamat*, cudamat*, cudamat*, Shape4D, Shape4D, Shape4D, ConvDesc, float, float, bool): Assertion `num_input_channels * filterModuleMult == num_input_channels3Mult' failed."
After analyzing it, I found that the error occurs in the part below.
void ConvEdge::ComputeOuter(Matrix& input, Matrix& deriv_output) {
  ...
  if (partial_sum_locs > 1) {
    ...
  } else {
    ...
  }
}
The shape[3] element of dt_temp (:targets) looks strange...
When I changed it as follows,
'if (partial_sum_locs > 1) { --> if (0) {'
the problem no longer occurred.
In pbtxt2dot.py, the Sort method seems to be calling an undefined GetName method. I had to add the method to get it working.
def Sort(model):
  def GetName(edge):
    return '%s:%s' % (edge.source, edge.dest)
  S = []
  L = []
Please correct me if I am wrong. Thanks.
Actually, I have been using Nitish's deepnet toolbox (https://github.com/nitishsrivastava/deepnet) for almost a year, and I installed it on my Windows machine without much difficulty. Now I have noticed that Convnet is an even better toolbox for CNNs and that it is similar to deepnet, so I am wondering: is it easy to install Convnet on Windows?
There is a bug in lines 296-312:
void DataHandler::LoadChunk(DataIterator& it, Matrix& mat, vector<int>& random_rows) {
  float* data_ptr = mat.GetHostData();
  int num_dims = it.GetDims();
  int num_rand = (chunk_size_ + random_access_chunk_size_ - 1) / random_access_chunk_size_;
  int row, end;
  for (int i = 0; i < num_rand; i++) {
    row = random_rows[i];
    end = (row + random_access_chunk_size_) % dataset_size_;
    if (end < row) {
      it.Get(data_ptr, row, dataset_size_);
      it.Get(data_ptr + num_dims * (dataset_size_ - row), 0, end);
    } else {
      it.Get(data_ptr, row, end);
    }
    data_ptr += num_dims * random_access_chunk_size_;
  }
}
One possible way to fix it could be:
void DataHandler::LoadChunk(DataIterator& it, Matrix& mat, vector<int>& random_rows) {
  float* data_ptr = mat.GetHostData();
  int num_dims = it.GetDims();
  int num_rand = (chunk_size_ + random_access_chunk_size_ - 1) / random_access_chunk_size_;
  int row, end;
  for (int i = 0; i < num_rand; i++) {
    row = random_rows[i];
    end = (row + random_access_chunk_size_) % dataset_size_;
    if (end < row) {
      it.Get(data_ptr, row, dataset_size_);
      int remain_size = random_access_chunk_size_ - (dataset_size_ - row);
      if (remain_size > 0) {
        it.Get(data_ptr + num_dims * (dataset_size_ - row), 0, remain_size);
      }
    } else {
      it.Get(data_ptr, row, end);
    }
    data_ptr += num_dims * random_access_chunk_size_;
  }
}
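To see which case the extra check guards against, the Python sketch below mirrors only the index arithmetic of the proposed LoadChunk and returns the (start, end) ranges it would pass to it.Get. The name chunk_ranges and the example sizes are made up for illustration, and the sketch assumes random_access_chunk_size_ is no larger than dataset_size_. When row + random_access_chunk_size_ equals dataset_size_ exactly, end wraps to 0; the original code would then issue a second, zero-length read from 0 to 0, which the proposed version skips.

```python
def chunk_ranges(row, chunk_size, dataset_size):
    """Return the (start, end) row ranges read for one random-access chunk.

    Hypothetical helper that mirrors the proposed fix's arithmetic only.
    """
    end = (row + chunk_size) % dataset_size
    if end < row:
        # The chunk reaches the end of the dataset: read the tail first.
        ranges = [(row, dataset_size)]
        remain_size = chunk_size - (dataset_size - row)
        if remain_size > 0:
            # Wrap around and read the remaining rows from the start.
            ranges.append((0, remain_size))
        return ranges
    return [(row, end)]
```

For example, chunk_ranges(8, 4, 10) gives [(8, 10), (0, 2)], while chunk_ranges(6, 4, 10) gives only [(6, 10)]. The sketch still returns [(0, 0)] for chunk_ranges(0, 10, 10), so a chunk that starts at row 0 and spans the whole dataset would need separate handling.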
I tried to run py/run_convnet.py in the following way:
The output is as follows:
Using board 0
[u'input', u'hidden1_conv', u'hidden1_maxpool', u'hidden1_rnorm', u'hidden2_conv', u'hidden2_maxpool', u'hidden2_rnorm', u'hidden3_conv', u'hidden4_conv', u'hidden5_conv', u'hidden5_maxpool', u'hidden6', u'hidden7', u'output']
python: cudamat_conv_gemm.cu:477: void convUpGemm(cudamat*, cudamat*, cudamat*, Shape4D, Shape4D, Shape4D, ConvDesc, float, float, bool): Assertion `output_channel_end - output_channel_begin == num_output_channels3' failed.
Aborted (core dumped)
How can I fix it? Thank you!