Comments (15)
Is this the same problem as BVLC/caffe#1917?
from faster_rcnn.
I get a similar error. Matlab simply shuts down when re-running the matlab demo.
Often a reboot is required to get it to run again.
from faster_rcnn.
Yup, reboot is the only way for me to get it working again.
from faster_rcnn.
I can't reproduce this bug on Windows. Ross also hasn't reported this bug on Ubuntu.
At the top of script_faster_rcnn_demo we clear the caffe mex (mexLock() is commented out), so caffe should not throw any error on the second call.
I think we should make sure that the mex is cleared on your machine as expected.
from faster_rcnn.
I can reproduce the error in linux. It's low priority since it just affects the demo script and not training or testing. To clarify comments in the thread: a "reboot" of the computer is not required, just a restart of matlab.
from faster_rcnn.
So... It seems to me that there is somehow a GPU memory leak. The GPU memory usage grows linearly with every iteration of the main loop, until MATLAB crashes.
Is it wrong to assume that GPU memory usage is relatively constant with each forward pass, after "warm-up"?
from faster_rcnn.
So is it a problem that mex doesn't clean up after itself correctly after all?
- For me on Linux, the free GPU memory before the 2nd run (4205486080) is about 1 MB less than before the first run (4206583808). That does look like a leak.
- I also get a protobuf issue on the second run (Linux):
fast_rcnn startup done
GPU 1: free memory 4205486080
Use GPU 1
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
------------------------------------------------------------------------
std::terminate() detected at Mon Oct 12 13:07:26 2015
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Crash Mode : continue (default)
Current Graphics Driver: Unknown software
Current Visual : None
Default Encoding : UTF-8
GNU C Library : 2.19 stable
Host Name : ip-172-31-21-65
MATLAB Architecture : glnxa64
MATLAB Root : /usr/local/MATLAB/R2015a
MATLAB Version : 8.5.0.197613 (R2015a)
OpenGL : software
Operating System : Linux 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64
Processor ID : x86 Family 6 Model 45 Stepping 7, GenuineIntel
Virtual Machine : Java 1.7.0_60-b19 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
Window System : No active display
Fault Count: 1
...
Stack Trace (captured):
[ 0] 0x00007f53a6b6570e /usr/local/MATLAB/R2015a/bin/glnxa64/libmwfl.so+00988942 _ZN2fl4diag5linux6x86_6412context_base12capture_dataEv+00000030
...
[ 12] 0x00007f52bd507c12 /usr/local/MATLAB/R2015a/bin/glnxa64/libprotobuf.so.8+00433170 _ZN6google8protobuf14DescriptorPool24InternalAddGeneratedFileEPKvi+00000194
[ 13] 0x00007f52bdc6c37c /home/ubuntu/src/faster_rcnn/external/caffe/matlab/+caffe/private/caffe_.mexa64+00443260
...
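The two free-memory readings quoted in this comment can be checked with a short calculation. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the extrapolation assumes (without evidence from the thread) that every later run would lose the same amount.

```python
# Free GPU memory in bytes, as quoted in the comment above.
free_before_run1 = 4206583808
free_before_run2 = 4205486080

# Memory that was not returned between the first and second run.
lost_bytes = free_before_run1 - free_before_run2
lost_mib = lost_bytes / (1024 ** 2)
print(f"lost between runs: {lost_bytes} bytes ({lost_mib:.3f} MiB)")

# Assumption for illustration only: each further run loses the same amount.
runs_until_exhausted = free_before_run2 // lost_bytes
print(f"runs until free memory is used up at this rate: {runs_until_exhausted}")
```

At roughly 1 MiB per run, exhausting the reported free memory would take thousands of runs, so this difference alone is unlikely to explain a crash on the second run; the protobuf failure in the same log is the more direct suspect.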
from faster_rcnn.
This bug does not just affect the demo script, but also training and testing on Ubuntu.
It also happened when I re-ran 'script_faster_rcnn_VOC2007_ZF.m'.
from faster_rcnn.
When I run script_faster_rcnn_demo, I get these errors in caffe_log:
F1028 15:47:12.852134 2204 syncedmem.cpp:51] Check failed: error == cudaSuccess (4 vs. 0) unspecified launch failure
from faster_rcnn.
I can reproduce this problem. When I re-run script_faster_rcnn_demo.m, MATLAB crashes:
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1018] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
Caught "std::exception" Exception message is:
CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
from faster_rcnn.
So how can I solve this problem?
I don't really understand. Thanks.
from faster_rcnn.
How can this problem be solved?
I hit the bug on Ubuntu 14.04:
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
from faster_rcnn.
I have solved the issue described above.
THE BUG:
On Ubuntu 14.04:
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
The first time I run the training or testing phase, everything works fine, but if MATLAB is still open and I run it again, the bug occurs.
SOLUTION:
It seems to be related to clearing the mex file.
- I commented out the clear mex call in the .m file.
- In the mex file, I commented out the mexLock() function.
It seems to work OK. I would like to know why clear mex is used at all.
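One plausible reading of why skipping clear mex helps: if caffe.proto is registered in a process-wide database when the mex file loads, and that database outlives the mex file, then unloading and reloading the mex registers the same file twice. The toy model below only illustrates that idea. It is not protobuf's or caffe's actual code, and the class and function names are hypothetical.

```python
class DescriptorRegistry:
    """Toy stand-in for a process-wide database of generated .proto files."""

    def __init__(self):
        self._files = set()

    def add(self, filename):
        # A second registration of the same file name is treated as fatal.
        if filename in self._files:
            raise RuntimeError(f"File already exists in database: {filename}")
        self._files.add(filename)


def second_run_error(clear_mex_between_runs):
    """Return the error raised on the second run, or None if it succeeds."""
    registry = DescriptorRegistry()  # one MATLAB session, one registry
    registry.add("caffe.proto")      # first run: mex loads and registers
    if clear_mex_between_runs:
        # The mex was unloaded and loaded again, so it registers again,
        # but the registry still remembers the first registration.
        try:
            registry.add("caffe.proto")
        except RuntimeError as err:
            return str(err)
    # Without clear mex the mex stays loaded and nothing is re-registered.
    return None


print(second_run_error(clear_mex_between_runs=True))
print(second_run_error(clear_mex_between_runs=False))
```

Under this assumption, restarting MATLAB works because it resets the whole process, while leaving the mex loaded avoids the second registration.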
from faster_rcnn.
I have encountered the same bug, and solved it by re-compiling opencv without the dnn module. I found that caffe, protobuf, and opencv-dnn couldn't work together. It seems to be a bug in either protobuf or opencv.
There are two solutions:
- statically link to protobuf (i.e., link to protobuf.a, NOT protobuf.so)
OR
- remove opencv_contrib/modules/cnn, and re-compile opencv
from faster_rcnn.
Problem solved:
#112 (comment)
from faster_rcnn.