First of all, thanks for sharing. I tried to train the model on Pascal VOC 2012, an

the model could not converge about face-detection-with-mobilenet-ssd HOT 13 CLOSED

bruceyang2012 commented on July 20, 2024

the model could not converge

from face-detection-with-mobilenet-ssd.

Comments (13)

hongrui16 commented on July 20, 2024

@bruceyang2012 @zjuyang

bruceyang2012 commented on July 20, 2024

@hongrui16 Maybe there is something wrong with your data loader. There is too little information for me to point out specific problems.

hongrui16 commented on July 20, 2024

@bruceyang2012
Thank you very much. Let me explain further.

I tried to train the model in Keras.
I converted face_train.ipynb to a .py file. I made some modifications, of course, but they matter little. After these revisions, the training code runs.

More importantly, I revised the image loading part, face_generator.py, so that it loads the Pascal VOC dataset directly.
But I am sure that "yield (np.array(batch_X), y_true)" is in the right format.
https://github.com/bruceyang2012/Face-detection-with-mobilenet-ssd/blob/master/face_generator.py#L1098

I think the key parts are the model structure, the loss calculation, the box generator and the box encoder. I did not change any of them.

I also rewrote the mAP calculation code in order to evaluate the model performance.

I want to confirm that the key parts listed above are correct, Bruce.
After that, I can narrow down where the error is.

In short, generous Bruce, the key parts are correct, right?

hongrui16 commented on July 20, 2024

@bruceyang2012
This is my email address: [email protected]
Could I add you on WeChat?
If possible, could you send your WeChat ID to my email?

hongrui16 commented on July 20, 2024

@bruceyang2012

bruceyang2012 commented on July 20, 2024

@hongrui16 I think you should check the parts you have changed. I haven't done any experiments on VOC or COCO. You can first run an experiment on a subset of the data and visualize your labels on the images to make sure that your input is correct. If you haven't made any changes, the model part of the code should be fine.
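A minimal sketch of the label check suggested here, assuming the labels are pixel rows in the (class_id, xmin, xmax, ymin, ymax) order that appears later in this thread. draw_boxes is a hypothetical helper for illustration, not part of the repository.

```python
import numpy as np

def draw_boxes(image, labels, color=(255, 0, 0)):
    """Return a copy of `image` with a 1-pixel outline drawn for each label.

    image:  uint8 array of shape (height, width, 3).
    labels: rows of (class_id, xmin, xmax, ymin, ymax) in pixels
            (assumed order, taken from the batch_y printout in this thread).
    """
    out = image.copy()
    h, w = out.shape[:2]
    for class_id, xmin, xmax, ymin, ymax in labels:
        # Clamp to valid pixel indices so boxes touching the border still draw.
        x0, x1 = max(0, int(xmin)), min(w - 1, int(xmax))
        y0, y1 = max(0, int(ymin)), min(h - 1, int(ymax))
        out[y0, x0:x1 + 1] = color   # top edge
        out[y1, x0:x1 + 1] = color   # bottom edge
        out[y0:y1 + 1, x0] = color   # left edge
        out[y0:y1 + 1, x1] = color   # right edge
    return out

# Blank 300x300 stand-in image and one label in the assumed order.
image = np.zeros((300, 300, 3), dtype=np.uint8)
labels = np.array([[1, 41, 231, 90, 190]])
annotated = draw_boxes(image, labels)
```

The annotated array can then be displayed with any image viewer. If the outline does not sit on the object, the coordinate order or the resize step is the first thing to check.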

hongrui16 commented on July 20, 2024

@bruceyang2012 Hello, generous Bruce, thank you very much.
I uncommented the last several lines and added some print statements to the generator function in face_generator.py, as follows
(to see the input format clearly, I set batch_size = 4):
if train:
    if diagnostics:
        yield (np.array(batch_X), y_true, batch_y, this_filenames, original_images, original_labels)
    else:
        #print "============================", np.ndim((np.array(batch_X), y_true))
        print "Batch Shape :", np.array(batch_X).shape
        print 'this_filenames:', this_filenames
        print "batch_y:", batch_y
        print 'y_true.shape:', y_true.shape
        yield (np.array(batch_X), y_true)

The input information was printed as follows:
Batch Shape : (4, 300, 300, 3)
this_filenames: ['VOC2012/JPEGImages/2011_001514.jpg', 'VOC2012/JPEGImages/2011_001146.jpg', 'VOC2012/JPEGImages/2009_005015.jpg', 'VOC2012/JPEGImages/2011_003545.jpg']
batch_y: [array([[ 1, 41, 231, 90, 190]]), array([[ 6, 14, 282, 15, 285],
[ 6, 277, 299, 106, 154]]), array([[ 4, 97, 263, 31, 156],
[ 4, 39, 57, 117, 151]]), array([[ 15, 0, 194, 34, 300]])]
y_true.shape: (4, 2268, 33)
The images have been resized to (300, 300).

Apart from the y_true values, I have checked the values and format of batch_y, and they are all correct.
So it seems that I have loaded the data correctly.
y_true is produced by the box encoder, which you said is correct and which I did not change, so we can assume that the y_true values are correct.
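Rather than assuming y_true is correct, it can be inspected directly. The sketch below counts, per image, how many anchors were matched to a non-background class. It assumes (this is not confirmed in the thread) that the first 21 entries of the last axis are a one-hot class vector with index 0 as background. count_positive_anchors is a hypothetical helper, and the y_true array here is synthetic.

```python
import numpy as np

def count_positive_anchors(y_true, n_classes=21):
    """Count anchors per image whose one-hot class is not background.

    Assumption: y_true[..., :n_classes] is a one-hot class vector and
    class index 0 means background.
    """
    class_ids = np.argmax(y_true[..., :n_classes], axis=-1)
    return np.sum(class_ids != 0, axis=-1)

# Synthetic stand-in with the shape printed above; every anchor starts as background.
y_true = np.zeros((4, 2268, 33))
y_true[..., 0] = 1.0
y_true[0, 10, 0] = 0.0   # mark one anchor of image 0 as class 1
y_true[0, 10, 1] = 1.0
print(count_positive_anchors(y_true))  # [1 0 0 0]
```

On real data, an image that has ground-truth boxes but zero positive anchors would point to a problem in box matching or in the coordinate order.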

The following information was the output during training:

Epoch 80/100
lr remains 0.000572594814003
321/321 [==============================] - 123s 384ms/step - loss: 0.0753 - acc: 0.3788 - val_loss: 0.1655 - val_acc: 0.3933

Epoch 00080: val_loss did not improve from 0.14949
Epoch 81/100
lr changed to 0.000526787228882
321/321 [==============================] - 127s 397ms/step - loss: 0.0735 - acc: 0.3794 - val_loss: 0.1681 - val_acc: 0.3966

Epoch 00081: val_loss did not improve from 0.14949
Epoch 82/100
lr remains 0.000526787247509
321/321 [==============================] - 128s 398ms/step - loss: 0.0718 - acc: 0.3785 - val_loss: 0.1646 - val_acc: 0.3924

Epoch 00082: val_loss did not improve from 0.14949

The training settings were:
monitor='val_loss', steps_per_epoch = max(1,int(n_train_samples//batch_size))
In fact, the model has trained for more than 82 epochs, because I started from pretrained weights.
As shown above, acc and val_acc are still very low and do not improve.
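As a small worked check of these settings, the sketch below evaluates the steps_per_epoch expression and derives the learning-rate decay factor from the two values in the log. The sample counts passed to steps_per_epoch are hypothetical, since the thread does not state n_train_samples or the training batch size.

```python
def steps_per_epoch(n_train_samples, batch_size):
    # Same expression as in the training settings quoted above.
    return max(1, int(n_train_samples // batch_size))

# Hypothetical sizes, only to show the rounding behaviour.
print(steps_per_epoch(10, 4))  # 2
print(steps_per_epoch(3, 4))   # 1

# Learning rates copied from the log lines for epochs 80 and 81.
lr_epoch_80 = 0.000572594814003
lr_epoch_81 = 0.000526787228882
decay = lr_epoch_81 / lr_epoch_80
print(round(decay, 4))  # 0.92
```

So in this part of the log the learning rate is multiplied by about 0.92 when it changes, and it is already below 0.0006 by epoch 80.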

In the test part, I set
NMS_CONF = 0.4
NMS_IOU = 0.6
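For readers unfamiliar with these two thresholds, here is a minimal greedy NMS sketch. It assumes that NMS_CONF is the minimum confidence a detection needs to be kept and that NMS_IOU is the overlap above which a lower-scoring box is suppressed; the thread does not define them. The functions iou and greedy_nms are hypothetical, and boxes use the (xmin, xmax, ymin, ymax) order from this thread.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, xmax, ymin, ymax)."""
    inter_w = min(a[1], b[1]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[2], b[2])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[1] - a[0]) * (a[3] - a[2])
    area_b = (b[1] - b[0]) * (b[3] - b[2])
    return inter / float(area_a + area_b - inter)

def greedy_nms(detections, conf_thresh=0.4, iou_thresh=0.6):
    """detections: list of (score, box). Returns the detections that survive."""
    # Drop low-confidence detections, then visit the rest from best to worst.
    candidates = sorted((d for d in detections if d[0] >= conf_thresh),
                        key=lambda d: d[0], reverse=True)
    kept = []
    for score, box in candidates:
        # Keep a box only if it does not overlap a better kept box too much.
        if all(iou(box, k[1]) <= iou_thresh for k in kept):
            kept.append((score, box))
    return kept

detections = [
    (0.9, (10, 110, 10, 110)),
    (0.8, (15, 115, 10, 110)),    # overlaps the first box heavily
    (0.7, (200, 250, 200, 250)),
    (0.3, (0, 50, 0, 50)),        # below the confidence threshold
]
kept = greedy_nms(detections)
```

Here the second detection is suppressed by the first and the last one is filtered out by the confidence threshold, so two detections remain.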
The final result is as follows:
num_tp [ 0. 685. 260. 605. 253. 220. 289. 715. 1084. 531. 221. 178.
1035. 360. 331. 4981. 125. 479. 226. 422. 316.]
num_fp [ 0. 1125. 904. 577. 650. 315. 1088. 1348. 1615. 939. 397. 293.
1631. 853. 754. 5541. 266. 987. 1025. 684. 545.]
num_pos [ 0. 527. 456. 680. 539. 789. 377. 1282. 793. 1461. 398. 392.
911. 472. 473. 9375. 645. 515. 342. 393. 512.]
mAP: 0.3449509485065984

num_tp[i]: number of true positive boxes in classes[i]
num_fp[i]: number of false positive boxes in classes[i]
num_pos[i]: number of positive ground truth boxes in classes[i]
classes[0] stands for background
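The counts above can be turned into per-class precision and recall at the chosen thresholds. This is only a sketch of that arithmetic, not the author's mAP code, and per_class_precision_recall is a hypothetical helper.

```python
num_tp = [0, 685, 260, 605, 253, 220, 289, 715, 1084, 531, 221, 178,
          1035, 360, 331, 4981, 125, 479, 226, 422, 316]
num_fp = [0, 1125, 904, 577, 650, 315, 1088, 1348, 1615, 939, 397, 293,
          1631, 853, 754, 5541, 266, 987, 1025, 684, 545]
num_pos = [0, 527, 456, 680, 539, 789, 377, 1282, 793, 1461, 398, 392,
           911, 472, 473, 9375, 645, 515, 342, 393, 512]

def per_class_precision_recall(num_tp, num_fp, num_pos):
    rows = []
    for cls in range(1, len(num_tp)):   # skip classes[0], the background
        detections = num_tp[cls] + num_fp[cls]
        precision = num_tp[cls] / float(detections) if detections else 0.0
        recall = num_tp[cls] / float(num_pos[cls]) if num_pos[cls] else 0.0
        rows.append((cls, precision, recall))
    return rows

rows = per_class_precision_recall(num_tp, num_fp, num_pos)
# Classes where more true positives were counted than ground-truth boxes exist.
over_counted = [cls for cls, _, recall in rows if recall > 1.0]
print(over_counted)  # [1, 8, 12, 19]
```

For four classes num_tp is larger than num_pos. That cannot happen if each ground-truth box is matched at most once, as in the Pascal VOC protocol, so the evaluation code may be counting duplicate detections as true positives and is worth checking as well.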

Q1: It seems that the training code works correctly, right? @bruceyang2012

We can also see from the results above that the mAP is very low,
but another project, https://github.com/chuanqi305/MobileNet-SSD, reports a much better result:
VOC0712 mAP=0.727.

Q2: If the code is implemented correctly, how can I improve my result? @bruceyang2012

Thank you very very very much.

hongrui16 commented on July 20, 2024

@bruceyang2012 many thanks

bruceyang2012 commented on July 20, 2024

@hongrui16 This code is only a demo for learning, and I did not run more experiments to evaluate it. If you want to get better results with this code, you can refer to the settings used by other projects, such as data augmentation, anchor settings, learning rate and so on. I don't have time to run these experiments now. You'll have to try them yourself.

hongrui16 commented on July 20, 2024

@bruceyang2012 Okay, many thanks

hongrui16 commented on July 20, 2024

@bruceyang2012
Finally, I found where the problem was.
Indeed, there is an error in the box encoder code in 'ssd_box_encode_decode_utils.py'. Because the box output format is set to box_output_format = ['class_id', 'xmin', 'xmax', 'ymin', 'ymax'], the correct code should be as follows:

if self.limit_boxes:
    x_coords = boxes_tensor[:,:,:,[0, 1]]
    x_coords[x_coords >= self.img_width] = self.img_width - 1
    x_coords[x_coords < 0] = 0
    boxes_tensor[:,:,:,[0, 1]] = x_coords
    y_coords = boxes_tensor[:,:,:,[2, 3]]
    y_coords[y_coords >= self.img_height] = self.img_height - 1
    y_coords[y_coords < 0] = 0
    boxes_tensor[:,:,:,[2, 3]] = y_coords

if self.normalize_coords:
    boxes_tensor[:, :, :, :2] /= self.img_width
    boxes_tensor[:, :, :, 2:] /= self.img_height

After I made these modifications, it worked well.
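The point of the fix is that the indices used for clipping and normalizing must match the coordinate order of the boxes. The sketch below makes that order an explicit argument; clip_and_normalize is a hypothetical helper, not the repository's encoder, and the non-square image size is chosen only so that a wrong order gives visibly different numbers.

```python
import numpy as np

def clip_and_normalize(boxes, img_width, img_height, x_idx=(0, 1), y_idx=(2, 3)):
    """Clip box coordinates to the image and scale them to [0, 1).

    boxes: array whose last axis holds 4 coordinates.
    x_idx / y_idx: positions of the x and y coordinates on the last axis.
    The defaults assume the (xmin, xmax, ymin, ymax) order from this thread.
    """
    out = np.array(boxes, dtype=np.float64)   # work on a copy
    x_idx, y_idx = list(x_idx), list(y_idx)
    out[..., x_idx] = np.clip(out[..., x_idx], 0, img_width - 1) / img_width
    out[..., y_idx] = np.clip(out[..., y_idx], 0, img_height - 1) / img_height
    return out

# One box that sticks out of a 400x200 image, given as (xmin, xmax, ymin, ymax).
box = np.array([[-10.0, 450.0, 20.0, 250.0]])
minmax = clip_and_normalize(box, img_width=400, img_height=200)
# Reading the same numbers as (xmin, ymin, xmax, ymax) gives a different result.
corners = clip_and_normalize(box, img_width=400, img_height=200,
                             x_idx=(0, 2), y_idx=(1, 3))
```

With the matching order the box becomes (0, 0.9975, 0.1, 0.995); with the mismatched order it becomes (0, 0.995, 0.05, 0.995), which no longer describes the same box.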

bruceyang2012 commented on July 20, 2024

@hongrui16 You are right, WIDER FACE and VOC use different label formats. I forgot about that.

hongrui16 commented on July 20, 2024

@bruceyang2012 Got it, thanks.
