Hello, thank you so much for this Keras implementation of SSD! I hav


Modifying SSD to support multiple labels per bounding box output? (ssd_keras, open issue)

Comments (5)

rykov8 commented on July 29, 2024

@neil454 sorry for the late reply.
I believe you need to change the last activation to sigmoid (which you have already done) and use binary_crossentropy as the loss function for conf_loss, averaged over the classes. Here is some example code (I don't guarantee that it works) for a method in the MultiboxLoss class, mainly taken from the Keras binary crossentropy loss:

def _multilabel_loss(self, y_true, y_pred):
    """Compute multilabel loss.

    # Arguments
        y_true: Ground truth targets,
            tensor of shape (?, num_boxes, num_classes).
        y_pred: Predicted probabilities (sigmoid outputs),
            tensor of shape (?, num_boxes, num_classes).

    # Returns
        multilabel_loss: Multilabel loss, tensor of shape (?, num_boxes).
    """
    # clip so that we never compute log(0); 1e-7 is the Keras epsilon
    # (1 - 1e-15 rounds to exactly 1.0 in float32, so it would not help)
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
    # convert probabilities back to logits
    # (tf.log is the TF 1.x name; TF 2.x uses tf.math.log)
    logits = tf.log(y_pred / (1 - y_pred))
    # element-wise binary crossentropy, shape (?, num_boxes, num_classes)
    multilabel_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true,
                                                              logits=logits)
    # the op above does not reduce, so average over the class axis here
    multilabel_loss = tf.reduce_mean(multilabel_loss, axis=-1)
    return multilabel_loss
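
To sanity-check the shapes and values such a loss should produce, here is a minimal NumPy reference of the same idea (per-class binary cross-entropy averaged over classes). The name `multilabel_loss_np` and the toy inputs are illustrative assumptions, not part of the repository:

```python
import numpy as np

def multilabel_loss_np(y_true, y_pred, eps=1e-7):
    """NumPy reference: binary cross-entropy per class, averaged over classes.

    y_true, y_pred: arrays of shape (batch, num_boxes, num_classes),
    where y_pred holds sigmoid probabilities (not logits).
    Returns an array of shape (batch, num_boxes).
    """
    # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # element-wise binary cross-entropy
    bce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    # average over the class axis
    return bce.mean(axis=-1)

# toy batch: 1 image, 2 boxes, 3 classes (hypothetical values)
y_true = np.array([[[1.0, 0.0, 1.0], [0.0, 0.0, 0.0]]])
y_pred = np.array([[[0.9, 0.1, 0.8], [0.5, 0.5, 0.5]]])
loss = multilabel_loss_np(y_true, y_pred)
print(loss.shape)  # (1, 2)
```

The second box predicts 0.5 for every class, so its loss is log(2) regardless of the targets, which makes it a convenient value to compare against the TensorFlow version.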

neil454 commented on July 29, 2024

Thanks for the reply!

I tried that code for conf_loss, but I ended up with a model that returned way too many detections. I'd expect most of these boxes to have a high confidence on the background class, but that's rarely the case (I haven't seen background confidence above 0.7 for any of the boxes).
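
One thing that may contribute here: with independent sigmoid outputs the class scores of a box no longer sum to 1, so a low background score does not by itself imply a confident foreground prediction. As a rough sketch, assuming the confidences of one image are available as a `(num_boxes, num_classes)` array with background in column 0, detections could be read off the foreground scores alone. The function name, threshold, and values below are hypothetical:

```python
import numpy as np

def multilabel_detections(conf, threshold=0.5, background_id=0):
    """List (box_index, class_id, score) for every foreground score >= threshold.

    conf: sigmoid outputs for one image, shape (num_boxes, num_classes).
    """
    fg = conf.copy()
    fg[:, background_id] = 0.0  # ignore the background column entirely
    box_idx, class_idx = np.nonzero(fg >= threshold)
    return [(int(b), int(c), float(conf[b, c])) for b, c in zip(box_idx, class_idx)]

conf = np.array([
    [0.60, 0.10, 0.20],  # no foreground score above the threshold
    [0.30, 0.90, 0.70],  # two labels on the same box
    [0.65, 0.55, 0.05],  # background and class 1 both fairly high
])
print(multilabel_detections(conf))
# [(1, 1, 0.9), (1, 2, 0.7), (2, 1, 0.55)]
```

Raising the threshold, or additionally requiring a low background score, would reduce the number of boxes kept; which rule fits best depends on the data.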

Have you or anyone successfully even trained on regular VOC2007 or COCO using just the base VGG pre-trained weights? I just tried this, and I couldn't get good results (usually could only detect persons at 0.6-0.7 confidence, rarely any other classes). I tried to follow the SSD paper's training scheme as well.

Therefore, I don't know if the problem lies with my unique problem (data & custom loss) or the training code itself.

rykov8 commented on July 29, 2024

@neil454 sorry for the late reply again.
I haven't trained on regular VOC2007, but on my dataset training works quite well. I believe there is a bug in the repo's generator, in the random_sized_crop method. Unfortunately, I have no time to fix it now, so if you use random_sized_crop, you may want to try switching it off.

oarriaga commented on July 29, 2024

@neil454 Hello Neil. I have successfully trained on VOC2007. Unfortunately, I didn't program any metrics, so I couldn't explicitly compare with the original SSD paper. However, I did disable random_sized_crop. The best validation loss I got was 2.10 at iteration 09.

neil454 commented on July 29, 2024

@oarriaga Hello and thanks for the info.

I've always had do_crop=False, so that wouldn't be an issue.

When you trained on VOC2007, I'm assuming you didn't train completely from scratch, so how did you load the base VGG network weights for the first part of SSD?

Since I'm just trying to verify that this works, I just took the keras weights that @rykov8 provided and loaded just the VGG portion, instead of converting the actual VGG weights.

Also, what value did you use for neg_pos_ratio in MultiboxLoss? If I use 2.0 or 3.0, I end up with a model that rarely detects anything but person (too many negatives). In fact, I just tried lowering this to 1.0, and now I get a model that can detect most objects, but there are usually several bboxes, and many false positives, although I'm not done training this model.
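
For readers unfamiliar with the parameter, the role of neg_pos_ratio can be illustrated with a small NumPy sketch of hard negative mining for a single image: only the highest-loss negatives are kept, at most neg_pos_ratio per positive box. This ranks negatives by their confidence loss as described in the SSD paper; the ranking criterion used inside MultiboxLoss may differ, and `select_hard_negatives` and the toy values are assumptions for illustration:

```python
import numpy as np

def select_hard_negatives(conf_loss, is_positive, neg_pos_ratio=3.0):
    """Return a boolean mask of the negatives kept for the loss (one image).

    conf_loss:   per-box confidence loss, shape (num_boxes,)
    is_positive: boolean mask of boxes matched to ground truth, shape (num_boxes,)
    """
    num_pos = int(is_positive.sum())
    num_neg_available = int((~is_positive).sum())
    num_neg = min(int(neg_pos_ratio * num_pos), num_neg_available)
    # positives get -inf so they are never selected as negatives
    neg_loss = np.where(is_positive, -np.inf, conf_loss)
    # indices of the num_neg negatives with the largest loss
    hardest = np.argsort(-neg_loss)[:num_neg]
    keep = np.zeros_like(is_positive, dtype=bool)
    keep[hardest] = True
    return keep

conf_loss = np.array([0.2, 2.5, 0.1, 1.7, 0.9, 0.05])
is_positive = np.array([True, False, False, False, False, False])
print(select_hard_negatives(conf_loss, is_positive, neg_pos_ratio=1.0))
print(select_hard_negatives(conf_loss, is_positive, neg_pos_ratio=3.0))
```

With one positive box, a ratio of 1.0 keeps only the single hardest negative (index 1), while 3.0 keeps the three hardest (indices 1, 3 and 4). A lower ratio therefore gives background less weight in the loss.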

Did you ever encounter any of these issues when training on VOC2007? In general, how does your trained model performance compare to the provided weights (ported directly from caffe)?
