
Comments (5)

rykov8 commented on July 29, 2024

@neil454 sorry for the late reply.
I believe you need to change the last activation to sigmoid (which you have already done), use binary_crossentropy as the loss function for conf_loss, and average it over the classes. Here is some example code (I don't guarantee that it works) for a method in class MultiboxLoss, mainly taken from the Keras binary crossentropy loss:

def _multilabel_loss(self, y_true, y_pred):
    """Compute multilabel loss.
    # Arguments
        y_true: Ground truth targets,
            tensor of shape (?, num_boxes, num_classes).
        y_pred: Predicted probabilities (sigmoid outputs),
            tensor of shape (?, num_boxes, num_classes).
    # Returns
        multilabel_loss: Multilabel loss, tensor of shape (?, num_boxes).
    """
    # clip so that neither log(0) nor a division by zero can occur;
    # 1e-7 rather than 1e-15, because in float32 1 - 1e-15 rounds to 1.0
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
    # convert probabilities back to logits
    # (tf.log is the TF 1.x name; newer releases use tf.math.log)
    logits = tf.log(y_pred / (1 - y_pred))
    # element-wise binary crossentropy
    multilabel_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true,
                                                              logits=logits)
    # average along classes (the op above returns per-element losses
    # and does not reduce them itself)
    multilabel_loss = tf.reduce_mean(multilabel_loss, axis=-1)
    return multilabel_loss
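
As a rough sanity check of the idea in this method, the pure-Python sketch below compares binary crossentropy computed directly on probabilities with the "clip, convert to logits, sigmoid crossentropy" route, then averages over classes. The function names, the epsilon value and the sample numbers are illustrative assumptions and are not taken from the repo.

```python
import math

def binary_crossentropy(y_true, p, eps=1e-7):
    """Binary crossentropy for one class, computed directly on a probability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

def sigmoid_crossentropy_from_logit(y_true, logit):
    """Numerically stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * y_true + math.log1p(math.exp(-abs(logit)))

def multilabel_loss(y_true_row, y_pred_row, eps=1e-7):
    """Mean over classes of the per-class loss for a single box."""
    losses = []
    for t, p in zip(y_true_row, y_pred_row):
        p = min(max(p, eps), 1.0 - eps)       # avoid log(0) and division by zero
        logit = math.log(p / (1.0 - p))       # probability -> logit
        losses.append(sigmoid_crossentropy_from_logit(t, logit))
    return sum(losses) / len(losses)

# Hypothetical box with three classes, two of them active at once.
y_true = [0.0, 1.0, 1.0]
y_pred = [0.1, 0.8, 0.6]

direct = sum(binary_crossentropy(t, p) for t, p in zip(y_true, y_pred)) / len(y_true)
via_logits = multilabel_loss(y_true, y_pred)
print(direct, via_logits)
```

Both routes should agree up to floating-point error, which suggests the logit conversion is only needed because the TensorFlow op expects logits rather than probabilities.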

from ssd_keras.

neil454 commented on July 29, 2024

Thanks for the reply!

I tried that code for conf_loss, but I ended up with a model that returned way too many detections. I'd expect most of these boxes to have a high confidence on the background class, but that's rarely the case (I haven't seen background confidence above 0.7 for any of the boxes).
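
One possible contributing factor, offered as an assumption rather than a diagnosis of this model: with independent sigmoid outputs the class scores no longer compete, so a box's background score and its object scores can all be moderate at the same time. The sketch below contrasts softmax and sigmoid on the same hypothetical logits, with index 0 standing in for the background class.

```python
import math

def softmax(logits):
    """Scores that compete across classes and sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Independent per-class score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical logits for one box: [background, class A, class B].
logits = [0.5, 0.2, -0.1]

softmax_scores = softmax(logits)
sigmoid_scores = [sigmoid(x) for x in logits]

print("softmax:", softmax_scores, "sum =", sum(softmax_scores))
print("sigmoid:", sigmoid_scores, "sum =", sum(sigmoid_scores))
```

Under softmax a high background score forces the other scores down, whereas under sigmoid nothing does, so filtering detections by background confidence may behave differently from the softmax setup.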

Have you or anyone else even successfully trained on regular VOC2007 or COCO using just the base VGG pre-trained weights? I just tried this and couldn't get good results (it could usually only detect persons at 0.6-0.7 confidence, and rarely any other classes). I tried to follow the SSD paper's training scheme as well.

Therefore, I don't know whether the problem lies with my particular setup (data & custom loss) or with the training code itself.


rykov8 commented on July 29, 2024

@neil454 sorry for the late reply again.
I haven't trained on regular VOC2007, but training works quite well on my dataset. I believe there is a bug in the repo's generator, in the random_sized_crop method. Unfortunately, I have no time to fix it now, so if you use random_sized_crop, you may want to try switching it off.


oarriaga commented on July 29, 2024

@neil454 Hello Neil. I have successfully trained on VOC2007, but unfortunately I didn't implement any metrics, so I couldn't explicitly compare against the original SSD paper. However, I did disable random_sized_crop. The best validation loss I got was 2.10, at iteration 09.


neil454 commented on July 29, 2024

@oarriaga Hello and thanks for the info.

I've always had do_crop=False, so that wouldn't be an issue.

When you trained on VOC2007, I'm assuming you didn't train completely from scratch, so how did you load the base VGG network weights for the first part of SSD?

Since I'm just trying to verify that this works, I just took the keras weights that @rykov8 provided and loaded just the VGG portion, instead of converting the actual VGG weights.

Also, what value did you use for neg_pos_ratio in MultiboxLoss? If I use 2.0 or 3.0, I end up with a model that rarely detects anything but person (too many negatives). In fact, I just tried lowering this to 1.0, and now I get a model that can detect most objects, but there are usually several bboxes, and many false positives, although I'm not done training this model.
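
To make the effect of neg_pos_ratio concrete, here is a minimal pure-Python sketch of hard negative mining as it is commonly described for SSD: keep all positive boxes, then keep only the highest-loss negatives, up to neg_pos_ratio times the number of positives. The function name and the sample values are hypothetical, and this is not claimed to match the repo's MultiboxLoss line for line.

```python
def select_training_boxes(is_positive, conf_loss, neg_pos_ratio=3.0):
    """Return (positive indices, mined negative indices) for one image."""
    pos_idx = [i for i, p in enumerate(is_positive) if p]
    neg_idx = [i for i, p in enumerate(is_positive) if not p]
    # Cap the number of negatives by the ratio and by how many negatives exist.
    num_neg = min(int(neg_pos_ratio * len(pos_idx)), len(neg_idx))
    # "Hard" negatives are the ones with the largest confidence loss.
    hardest = sorted(neg_idx, key=lambda i: conf_loss[i], reverse=True)[:num_neg]
    return pos_idx, sorted(hardest)

# Hypothetical image with 8 prior boxes, 2 of which match a ground-truth object.
is_positive = [True, False, False, False, False, False, True, False]
conf_loss = [0.9, 0.1, 2.3, 0.4, 1.7, 0.05, 0.6, 0.8]

pos, neg_ratio_1 = select_training_boxes(is_positive, conf_loss, neg_pos_ratio=1.0)
_, neg_ratio_2 = select_training_boxes(is_positive, conf_loss, neg_pos_ratio=2.0)
print(pos, neg_ratio_1, neg_ratio_2)
```

A lower ratio means fewer background boxes contribute to the loss, which is consistent with seeing more detections and more false positives at 1.0 than at 2.0 or 3.0.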

Did you ever encounter any of these issues when training on VOC2007? In general, how does your trained model performance compare to the provided weights (ported directly from caffe)?

