Comments (24)

warmspringwinds avatar warmspringwinds commented on July 18, 2024 1

@extragoya I also solved the problem by decreasing the learning rate.
Seemed to make it work.

from hed.

zeakey avatar zeakey commented on July 18, 2024 1

@cchenzhou In my experience the loss value matters little: it stays large even after the model has converged. I suggest testing your model on a validation set. Are you using your own dataset? Pay attention to the positive/negative ratio, because the SoftmaxLoss in HED counts positive and negative pixels and reweights by their ratio.

extragoya avatar extragoya commented on July 18, 2024

I also get this problem with my own dataset. The only way I could get the loss to stop exploding was to use learning rates of 1*10e-8, but I feel like this is getting way too small... I'd be interested to know what other people had to do to get this working on their own datasets.

zeakey avatar zeakey commented on July 18, 2024

A smaller learning rate may help.

cchenzhou avatar cchenzhou commented on July 18, 2024

Hi, I also use a learning rate of 1*10e-8, but the loss value is still very large, around twenty thousand. What is your learning rate? Thank you.

codecolony avatar codecolony commented on July 18, 2024

@cchenzhou - Don't worry about the large loss value. In my experience it means nothing, and the results should still show up as expected. Just make sure that the loss comes down slowly.

cchenzhou avatar cchenzhou commented on July 18, 2024

@codecolony Thank you very much. Training stops when the iteration count reaches 100000, but the loss value is still 22758.3. I can't understand why. Any idea how to fix this?

zeakey avatar zeakey commented on July 18, 2024

@brisker It clearly tells you that your ground-truth size doesn't match the corresponding image size, because ImageLabelMapDataLayer cannot open the image file. You may need to check whether the image exists.

extragoya avatar extragoya commented on July 18, 2024

@brisker Notice the error says: "Could not open or find file ../../data/HED-BSDS/train/aug_gt_scale_0.5/157.5_1_0/159045.png" - the problem is it can't find an image in your training list. The image_labelmap_data_layer will keep running, however, and then throw the final error when the sizes don't match. I would avoid the relative paths and make sure there are no spaces or anything else funny in your training list.

extragoya avatar extragoya commented on July 18, 2024

@brisker For one reason or the other, it's unable to open the file. I would assume that something is wrong with the path - what do you have as root_folder under image_data_param in your train_val.prototxt?

brisker avatar brisker commented on July 18, 2024

@extragoya
At the beginning it was "../../data/HED-BSDS/", and the error occurred.
Then I replaced it with "/home/jcc/code/hed/data/HED-BSDS/".
Still the same error...

brisker avatar brisker commented on July 18, 2024

@extragoya
I think the error clearly shows that the code has found train_pair.lst, so why can't it find the images?

extragoya avatar extragoya commented on July 18, 2024

@brisker The path to train_pair.lst is given in your train_val.prototxt file, whereas the paths to the images are given by train_pair.lst together with your root_folder, so the path to the list can be correct while the paths to the images are incorrect. Also, train_pair.lst is a text file, so it is opened differently than an image. Do you have opencv installed, and did you turn opencv on in your Makefile.config?

zeakey avatar zeakey commented on July 18, 2024

@brisker
Could you please describe your directory layout, including where you run solve.py (by default you should run solve.py in hed/examples/hed/), where you put your dataset, and an example line from your train_pair.lst?

Furthermore, please check the permissions on the dataset. Do you have read permission?

brisker avatar brisker commented on July 18, 2024

@extragoya
I have opencv 3 installed, but I cannot see any flag for turning opencv on in the Makefile.config here: https://github.com/s9xie/hed/blob/master/Makefile.config.example I think opencv was compiled correctly with caffe.

brisker avatar brisker commented on July 18, 2024

@zeakey
I run solve.py in hed/examples/hed/, and the data folder is in /home/jcc/code/hed, which is just the root folder.
A line of train.lst looks like train/aug_gt_scale_0.5/157.5_1_0/159045.png

extragoya avatar extragoya commented on July 18, 2024

@brisker Ok, later versions of caffe have an option to turn OpenCV off (USE_OPENCV) and also to specify which version you're using (OPENCV_VERSION): https://github.com/BVLC/caffe/blob/master/Makefile.config.example. It may not apply to HED's version of caffe, but that is why I asked.

brisker avatar brisker commented on July 18, 2024

@extragoya @zeakey
Thanks a lot for your replies!
No other advice?

zeakey avatar zeakey commented on July 18, 2024

extragoya avatar extragoya commented on July 18, 2024

@zeakey @brisker Agreed. I would add that you should try to determine whether the code cannot find the image, or can find the image but cannot load it.
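
One way to tell those two cases apart is to walk the training list outside of caffe. The following is only a rough sketch, not part of HED: `check_entry` and `check_list` are hypothetical helper names, and it assumes that each line of the list holds whitespace-separated relative paths and that the layer builds each full path by plain concatenation of root_folder and the list entry (so a missing trailing slash on root_folder would break every path). The PNG signature check is only a cheap stand-in for "can be loaded"; it does not decode the image.

```python
import os

# First eight bytes of every valid PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_entry(root_folder, rel_path):
    """Return 'missing', 'unreadable', 'not_png', or 'ok' for one list entry."""
    # Assumption: the full path is root_folder + entry, with no separator added.
    full_path = root_folder + rel_path
    if not os.path.isfile(full_path):
        return "missing"
    try:
        with open(full_path, "rb") as f:
            header = f.read(len(PNG_SIGNATURE))
    except OSError:
        # The file exists but cannot be opened, e.g. no read permission.
        return "unreadable"
    if full_path.lower().endswith(".png") and header != PNG_SIGNATURE:
        return "not_png"
    return "ok"


def check_list(root_folder, list_file):
    """Check every whitespace-separated path on every line of the list file."""
    problems = []
    with open(list_file) as f:
        for line_no, line in enumerate(f, start=1):
            for rel_path in line.split():
                status = check_entry(root_folder, rel_path)
                if status != "ok":
                    problems.append((line_no, rel_path, status))
    return problems
```

Calling `check_list` with the root_folder from train_val.prototxt and the location of the list file returns a list of `(line number, entry, status)` tuples. A "missing" status points at the path itself, while "unreadable" or "not_png" means the file was found but could not be read as an image. Because entries are split on whitespace, a path that contains a space will show up as two "missing" entries.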

cchenzhou avatar cchenzhou commented on July 18, 2024

@zeakey I have tested my model on the test set, but the result is very bad. I can't see any contours in the output image, so I think my model has not converged. I use their BSDS500 data and only changed the learning rate. Could you tell me which parameters should be changed, or whether I should adjust the positive/negative ratio? Thank you!

codecolony avatar codecolony commented on July 18, 2024

@cchenzhou - What is the learning rate you're using?

brisker avatar brisker commented on July 18, 2024

@codecolony
@zeakey
@extragoya
Hi, the modified SigmoidCrossEntropy Loss layer has a line that reads:

bottom_diff[i * dim + j] *= 1 * count_neg / (count_pos + count_neg);

why not

 bottom_diff[i * dim + j] = 1 * count_neg / (count_pos + count_neg);

?
It seems that the gradients are multiplied on every iteration of the loops, which is a little confusing to me. Why did the author write it that way?
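
A possible reading, offered as a sketch rather than the author's own explanation: if bottom_diff already holds the unweighted per-pixel gradient sigmoid(x) - target before that loop runs (as in upstream Caffe's sigmoid cross-entropy layer), then each index i * dim + j is visited once, so `*=` rescales that gradient by the class weight a single time, whereas `=` would throw the gradient away and store only the weight. The Python below illustrates this for one flattened label map. The function name `balanced_grad` and the weight used for negative pixels (count_pos / total) are assumptions made for illustration; the thread only quotes the positive-pixel line.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def balanced_grad(logits, targets):
    """Per-pixel gradient of a class-balanced sigmoid cross-entropy (sketch)."""
    count_pos = sum(1 for t in targets if t == 1)
    count_neg = sum(1 for t in targets if t == 0)
    total = count_pos + count_neg

    # Unweighted gradient with respect to each logit: sigmoid(x) - target.
    diff = [sigmoid(x) - t for x, t in zip(logits, targets)]

    # Each element is visited exactly once, so '*=' applies the weight once.
    for j, t in enumerate(targets):
        if t == 1:
            diff[j] *= count_neg / total  # mirrors the quoted line
        else:
            diff[j] *= count_pos / total  # assumed weight for negative pixels
    return diff
```

With one positive pixel out of four and all logits at zero, the positive pixel keeps a gradient of -0.5 * 0.75 while each negative pixel gets 0.5 * 0.25, so the rare class is not drowned out by the common one.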

dxytz avatar dxytz commented on July 18, 2024

@cchenzhou
Hi,
My model can't reach convergence either. How about yours? My label map is a single-channel binary image (0-255). Do you have any advice?
Thank you!
