Comments (15)

corganhejijun commented on June 17, 2024

Is this the same problem as BVLC/caffe#1917?

from faster_rcnn.

KapSteR commented on June 17, 2024

I get a similar error. Matlab simply shuts down when re-running the matlab demo.
Often a reboot is required to get it to run again.

varun-nagaraja commented on June 17, 2024

Yup, reboot is the only way for me to get it working again.

ShaoqingRen commented on June 17, 2024

@varun-nagaraja @KapSteR

I can't reproduce this bug on Windows. Ross also hasn't reported this bug on Ubuntu.

At the top of script_faster_rcnn_demo, we clear the caffe mex (mexLock() is commented out), so caffe should not throw any error on the second call.

I think we should make sure that the mex is cleared on your machine as expected.

rbgirshick commented on June 17, 2024

I can reproduce the error on Linux. It's low priority since it only affects the demo script, not training or testing. To clarify the comments in this thread: a "reboot" of the computer is not required, just a restart of MATLAB.

KapSteR commented on June 17, 2024

So... it seems to me that there is a GPU memory leak somewhere. GPU memory usage grows linearly with every iteration of the main loop until MATLAB crashes.

Is it wrong to assume that GPU memory usage should stay roughly constant from one forward pass to the next, after "warm-up"?
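One way to check that assumption is to log GPU memory in use once per iteration and fit a line to the readings taken after warm-up. The following is a minimal sketch of that check, not code from faster_rcnn: the function names, the warm-up length, the tolerance, and the sample readings are all hypothetical.

```python
def growth_per_iteration(used_bytes, warmup=2):
    """Least-squares slope (bytes per iteration) of the post-warm-up readings."""
    samples = used_bytes[warmup:]
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two post-warm-up samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def looks_like_leak(used_bytes, warmup=2, tolerance_bytes=64 * 1024):
    """Flag a leak when usage keeps climbing by more than the tolerance per iteration."""
    return growth_per_iteration(used_bytes, warmup) > tolerance_bytes


MIB = 1024 * 1024
# Hypothetical readings: both runs warm up for two iterations.
steady = [300 * MIB, 900 * MIB, 910 * MIB, 910 * MIB, 910 * MIB, 910 * MIB]
leaking = [300 * MIB, 900 * MIB] + [910 * MIB + k * 50 * MIB for k in range(4)]

print(looks_like_leak(steady))   # flat after warm-up
print(looks_like_leak(leaking))  # grows by 50 MiB per iteration
```

A slope near zero after warm-up matches the "relatively constant" expectation; a steady positive slope matches the linear growth described above.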

kukuruza commented on June 17, 2024

So is it a problem that mex doesn't clean up after itself correctly after all?

  1. For me on Linux, the free GPU memory before the second run (4205486080) is about 1 MB less than before the first run (4206583808). That does look like a leak.
  2. I also get a protobuf issue on the second run (Linux):
fast_rcnn startup done
GPU 1: free memory 4205486080
Use GPU 1

[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size): 

------------------------------------------------------------------------
          std::terminate() detected at Mon Oct 12 13:07:26 2015
------------------------------------------------------------------------

Configuration:
  Crash Decoding      : Disabled
  Crash Mode          : continue (default)
  Current Graphics Driver: Unknown software 
  Current Visual      : None
  Default Encoding    : UTF-8
  GNU C Library       : 2.19 stable
  Host Name           : ip-172-31-21-65
  MATLAB Architecture : glnxa64
  MATLAB Root         : /usr/local/MATLAB/R2015a
  MATLAB Version      : 8.5.0.197613 (R2015a)
  OpenGL              : software
  Operating System    : Linux 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64
  Processor ID        : x86 Family 6 Model 45 Stepping 7, GenuineIntel
  Virtual Machine     : Java 1.7.0_60-b19 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
  Window System       : No active display

Fault Count: 1

...
Stack Trace (captured):
[  0] 0x00007f53a6b6570e    /usr/local/MATLAB/R2015a/bin/glnxa64/libmwfl.so+00988942 _ZN2fl4diag5linux6x86_6412context_base12capture_dataEv+00000030
...
[ 12] 0x00007f52bd507c12 /usr/local/MATLAB/R2015a/bin/glnxa64/libprotobuf.so.8+00433170 _ZN6google8protobuf14DescriptorPool24InternalAddGeneratedFileEPKvi+00000194
[ 13] 0x00007f52bdc6c37c /home/ubuntu/src/faster_rcnn/external/caffe/matlab/+caffe/private/caffe_.mexa64+00443260
...
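For reference, the two free-memory figures quoted in point 1 can be subtracted directly. This short sketch uses only those two numbers; the variable names are illustrative.

```python
free_before_first_run = 4206583808   # bytes, quoted in point 1
free_before_second_run = 4205486080  # bytes, also printed in the log ("GPU 1: free memory")

lost_bytes = free_before_first_run - free_before_second_run
lost_mib = lost_bytes / (1024 * 1024)

print(f"{lost_bytes} bytes not returned (~{lost_mib:.2f} MiB)")
# prints: 1097728 bytes not returned (~1.05 MiB)
```

A single pair of readings cannot tell whether this amount is lost on every run or only once.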

BlueCrow1991 commented on June 17, 2024

This bug affects not only the demo script but also training and testing on Ubuntu.

It also happens when I re-run 'script_faster_rcnn_VOC2007_ZF.m'.

YingjieYin commented on June 17, 2024

When I run script_faster_rcnn_demo, I get these errors in caffe_log:
F1028 15:47:12.852134 2204 syncedmem.cpp:51] Check failed: error == cudaSuccess (4 vs. 0) unspecified launch failure
F1028 15:47:12.852134 2204 syncedmem.cpp:51] Check failed: error == cudaSuccess (4 vs. 0) unspecified launch failure

fengyuxi55 commented on June 17, 2024

I can reproduce this problem. When I re-run script_faster_rcnn_demo.m, MATLAB crashes:

[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1018] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
Caught "std::exception" Exception message is:
CHECK failed: generated_database_->Add(encoded_file_descriptor, size):

roytseng-tw commented on June 17, 2024

So how can I solve this problem?
I don't really understand. Thanks.

gjyin commented on June 17, 2024

How can this problem be solved?
I hit the bug on Ubuntu 14.04:
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):

esason commented on June 17, 2024

I have solved the last issue ...
THE BUG:
On Ubuntu 14.04:
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
The first time I run training or testing, everything works fine,
but if MATLAB is still open and I try to run it again, the bug occurs.

SOLUTION
It seems to be related to the clear mex call.

  • I commented out the clear mex call in the .m file.
  • In the mex file, I commented out the mexLock() call.

It seems to work OK. I would like to know why clear mex is used at all.

ZiangYan commented on June 17, 2024

I encountered the same bug and solved it by recompiling OpenCV without the dnn module. I found that caffe, protobuf, and opencv-dnn could not work together. It seems to be a bug in either protobuf or OpenCV.

There are two solutions:

  1. statically link to protobuf (i.e., link to protobuf.a, NOT protobuf.so)

OR

  2. remove opencv_contrib/modules/cnn, and re-compile opencv
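The error text in this thread ("File already exists in database: caffe.proto") and the stack trace through a shared libprotobuf.so.8 are consistent with the same generated file being registered twice in one process-wide registry. The toy model below illustrates that reading and why static linking would avoid it. It is an assumption-laden sketch, not the protobuf API: DescriptorDatabase, load_library, and the library names are all hypothetical.

```python
class DescriptorDatabase:
    """Toy stand-in for a generated-file registry (not the real protobuf class)."""

    def __init__(self):
        self._files = {}

    def add(self, filename, descriptor):
        # Refuse a second registration under the same file name.
        if filename in self._files:
            return False
        self._files[filename] = descriptor
        return True


def load_library(name, database):
    """Mimic a library's static initialiser registering its copy of caffe.proto."""
    if not database.add("caffe.proto", f"descriptor from {name}"):
        raise RuntimeError("File already exists in database: caffe.proto")
    return name


# Shared libprotobuf (assumed): every loaded library registers into one registry.
shared = DescriptorDatabase()
load_library("caffe_mex", shared)
try:
    load_library("another_library", shared)
    shared_outcome = "ok"
except RuntimeError as err:
    shared_outcome = str(err)

# Static linking (assumed): each library carries its own private registry.
private_outcomes = [
    load_library(name, DescriptorDatabase())
    for name in ("caffe_mex", "another_library")
]

print(shared_outcome)
print(private_outcomes)
```

Under this model, either removing the second registrant (solution 2) or giving each library its own registry (solution 1) prevents the collision. A re-loaded mex registering into a registry that outlived it would fail the same way.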

hongkaiyu2012 commented on June 17, 2024

Problem solved:
#112 (comment)
