Comments (13)

hongrui16 commented on July 20, 2024

@bruceyang2012 @zjuyang

from face-detection-with-mobilenet-ssd.

bruceyang2012 commented on July 20, 2024

@hongrui16 Maybe there is something wrong with your data loader. There is too little information for me to point out a specific problem.

hongrui16 commented on July 20, 2024

@bruceyang2012
Thank you very much.
Here is a further explanation.

I tried to train the model in Keras.
I converted face_train.ipynb to a .py file. I made some modifications, but they were minor, and after these revisions the training code runs.

More importantly, I revised the image loading part, face_generator.py, so that it loads the Pascal VOC dataset directly.
But I am sure that 'yield (np.array(batch_X), y_true)' is in the right format.
https://github.com/bruceyang2012/Face-detection-with-mobilenet-ssd/blob/master/face_generator.py#L1098

I think the key parts are the model structure, loss calculation, box generator and box encoder. I did not change any of them.

I also rewrote the mAP calculation code in order to evaluate the model performance.

Bruce, I want to confirm that the key parts listed above are correct.

After that, I can narrow down where the error is.

In short, the key parts are correct, right?

hongrui16 commented on July 20, 2024

@bruceyang2012
This is my email address: [email protected]
Could I add you on WeChat?
If possible, could you send your WeChat ID to my email address?

hongrui16 commented on July 20, 2024

@bruceyang2012

bruceyang2012 commented on July 20, 2024

@hongrui16 I think you should check the parts you have changed. I haven't done any experiment on voc or coco. I think you can first do an experiment with a subset of these data and visualize your label on the image to ensure that your input is correct. If you haven't made any changes, the model part of the code should be fine.

hongrui16 commented on July 20, 2024

@bruceyang2012 Hello Bruce, thank you very much.
I uncommented the last several lines of the generator function in face_generator.py and added some print statements, as follows
(to see the input format clearly, I set batch_size = 4):
if train:
    if diagnostics:
        yield (np.array(batch_X), y_true, batch_y, this_filenames, original_images, original_labels)
    else:
        #print "============================", np.ndim((np.array(batch_X), y_true))
        print "Batch Shape :", np.array(batch_X).shape
        print 'this_filenames:', this_filenames
        print "batch_y:", batch_y
        print 'y_true.shape:', y_true.shape
        yield (np.array(batch_X), y_true)

The input information printed was as follows:
Batch Shape : (4, 300, 300, 3)
this_filenames: ['VOC2012/JPEGImages/2011_001514.jpg', 'VOC2012/JPEGImages/2011_001146.jpg', 'VOC2012/JPEGImages/2009_005015.jpg', 'VOC2012/JPEGImages/2011_003545.jpg']
batch_y: [array([[ 1, 41, 231, 90, 190]]), array([[ 6, 14, 282, 15, 285],
[ 6, 277, 299, 106, 154]]), array([[ 4, 97, 263, 31, 156],
[ 4, 39, 57, 117, 151]]), array([[ 15, 0, 194, 34, 300]])]
y_true.shape: (4, 2268, 33)
(The images have been resized to (300, 300).)

Apart from the y_true values, I have checked the batch_y values and format, and they are all correct.
So it seems that I have loaded the data correctly.
y_true is produced by the box encoder. You mentioned that this part is correct, and I did not change it, so we can assume that the y_true values are correct.
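One way to back up the claim that batch_y is correct is a quick mechanical check of the printed boxes. The sketch below is not from the repository: box_is_valid and the two layout tuples are hypothetical names, it assumes the 300x300 image size from the printout, and it treats a coordinate equal to the image size as in bounds. It checks each box under the [class_id, xmin, xmax, ymin, ymax] layout that this generator is stated to use later in the thread, and under the other common [class_id, xmin, ymin, xmax, ymax] layout for comparison.

```python
def box_is_valid(box, layout, img_width=300, img_height=300):
    """Return True if a labelled box is well-formed under the given layout.

    `box` is [class_id, c1, c2, c3, c4]; `layout` names the four coordinates.
    """
    coords = dict(zip(layout, box[1:]))
    return (0 <= coords["xmin"] < coords["xmax"] <= img_width
            and 0 <= coords["ymin"] < coords["ymax"] <= img_height)


# Boxes copied from the batch_y printout above, one inner list per image.
batch_y = [
    [[1, 41, 231, 90, 190]],
    [[6, 14, 282, 15, 285], [6, 277, 299, 106, 154]],
    [[4, 97, 263, 31, 156], [4, 39, 57, 117, 151]],
    [[15, 0, 194, 34, 300]],
]

MINMAX = ("xmin", "xmax", "ymin", "ymax")   # layout assumed for this generator
CORNERS = ("xmin", "ymin", "xmax", "ymax")  # a common alternative layout

boxes = [box for image_boxes in batch_y for box in image_boxes]
valid_minmax = [box_is_valid(b, MINMAX) for b in boxes]
valid_corners = [box_is_valid(b, CORNERS) for b in boxes]

print("valid as xmin,xmax,ymin,ymax:", valid_minmax)
print("valid as xmin,ymin,xmax,ymax:", valid_corners)
```

All six boxes pass under the first layout, while three of them fail under the second. A check like this confirms the generator output, but it says nothing about whether downstream code such as the box encoder reads the coordinates in the same order.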

The following was the output during training:

Epoch 80/100
lr remains 0.000572594814003
321/321 [==============================] - 123s 384ms/step - loss: 0.0753 - acc: 0.3788 - val_loss: 0.1655 - val_acc: 0.3933

Epoch 00080: val_loss did not improve from 0.14949
Epoch 81/100
lr changed to 0.000526787228882
321/321 [==============================] - 127s 397ms/step - loss: 0.0735 - acc: 0.3794 - val_loss: 0.1681 - val_acc: 0.3966

Epoch 00081: val_loss did not improve from 0.14949
Epoch 82/100
lr remains 0.000526787247509
321/321 [==============================] - 128s 398ms/step - loss: 0.0718 - acc: 0.3785 - val_loss: 0.1646 - val_acc: 0.3924

Epoch 00082: val_loss did not improve from 0.14949

The training settings are:
monitor='val_loss', steps_per_epoch = max(1,int(n_train_samples//batch_size))
In fact, the total number of epochs is more than 82, because I started from pretrained weights.
As the log above shows, acc and val_acc are still very low and do not improve.

In the test part, I set
NMS_CONF = 0.4
NMS_IOU = 0.6
The final result is as follows:
num_tp [ 0. 685. 260. 605. 253. 220. 289. 715. 1084. 531. 221. 178.
1035. 360. 331. 4981. 125. 479. 226. 422. 316.]
num_fp [ 0. 1125. 904. 577. 650. 315. 1088. 1348. 1615. 939. 397. 293.
1631. 853. 754. 5541. 266. 987. 1025. 684. 545.]
num_pos [ 0. 527. 456. 680. 539. 789. 377. 1282. 793. 1461. 398. 392.
911. 472. 473. 9375. 645. 515. 342. 393. 512.]

mAP: 0.3449509485065984

num_tp[i]: number of true positive boxes in classes[i]
num_fp[i]: number of false positive boxes in classes[i]
num_pos[i]: number of positive ground truth boxes in classes[i]
classes[0] stands for background
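The three count arrays are enough to derive per-class precision and recall, which can be easier to read than the raw counts. This is a minimal sketch, not the evaluation code used in the thread: precision_recall is a hypothetical helper, and the numbers are copied from the arrays above.

```python
# Counts copied from the result above; index 0 is the background class.
num_tp = [0, 685, 260, 605, 253, 220, 289, 715, 1084, 531, 221, 178,
          1035, 360, 331, 4981, 125, 479, 226, 422, 316]
num_fp = [0, 1125, 904, 577, 650, 315, 1088, 1348, 1615, 939, 397, 293,
          1631, 853, 754, 5541, 266, 987, 1025, 684, 545]
num_pos = [0, 527, 456, 680, 539, 789, 377, 1282, 793, 1461, 398, 392,
           911, 472, 473, 9375, 645, 515, 342, 393, 512]


def precision_recall(tp, fp, pos):
    """Precision = TP / (TP + FP), recall = TP / number of ground-truth boxes."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / pos if pos else 0.0
    return precision, recall


# Skip class 0 (background).
per_class = {
    i: precision_recall(num_tp[i], num_fp[i], num_pos[i])
    for i in range(1, len(num_tp))
}

# Recall cannot exceed 1 when each ground-truth box is matched at most once.
suspicious = [i for i, (_, recall) in per_class.items() if recall > 1.0]

for i, (precision, recall) in per_class.items():
    print("class %2d  precision %.3f  recall %.3f" % (i, precision, recall))
print("classes with recall > 1:", suspicious)
```

With these counts, classes 1, 8, 12 and 19 have more true positives than ground-truth boxes. That should not happen if each ground-truth box is matched at most once, so the rewritten evaluation code may be counting duplicate detections as true positives. This is only a reading of the numbers, not something confirmed in the thread.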

Q1: It seems that the training code works correctly, right? @bruceyang2012

We can also see from the above results that the mAP is very low,
but another project, https://github.com/chuanqi305/MobileNet-SSD, reports a much better result:
VOC0712 mAP=0.727.

Q2: If the code is implemented correctly, how can I improve my result? @bruceyang2012

Thank you very much.

hongrui16 commented on July 20, 2024

@bruceyang2012 many thanks

bruceyang2012 commented on July 20, 2024

@hongrui16 This code is only a demo for learning, and I did not do more experiments to evaluate it. If you want to use this code to get better results, you can refer to the settings used by other projects, such as data augmentation, anchor settings, learning rate and so on. I don't have time to do these experiments now. You'll have to try them yourself.

hongrui16 commented on July 20, 2024

@bruceyang2012 Okay, many thanks.

hongrui16 commented on July 20, 2024

@bruceyang2012
Finally, I found the problem.
Indeed, there is an error in the box encoder code in 'ssd_box_encode_decode_utils.py'. Because the box output format is set to box_output_format = ['class_id', 'xmin', 'xmax', 'ymin', 'ymax'], the correct code should be as follows:

if self.limit_boxes:
    x_coords = boxes_tensor[:,:,:,[0, 1]]
    x_coords[x_coords >= self.img_width] = self.img_width - 1
    x_coords[x_coords < 0] = 0
    boxes_tensor[:,:,:,[0, 1]] = x_coords
    y_coords = boxes_tensor[:,:,:,[2, 3]]
    y_coords[y_coords >= self.img_height] = self.img_height - 1
    y_coords[y_coords < 0] = 0
    boxes_tensor[:,:,:,[2, 3]] = y_coords

if self.normalize_coords:
    boxes_tensor[:, :, :, :2] /= self.img_width
    boxes_tensor[:, :, :, 2:] /= self.img_height

After I made these modifications, it worked well.
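To show why the coordinate order matters for this step, here is a small pure-Python sketch of the same clip-and-normalize logic applied to a single box. It is not the repository's code: clip_and_normalize and the layout tuples are hypothetical names, and the box values are made up. A non-square image is used so that the difference between the two layouts is visible.

```python
def clip_and_normalize(box, layout, img_width, img_height):
    """Clip one box to the image and scale each coordinate to [0, 1].

    `box` holds four coordinates; `layout` says which entry is which.
    """
    out = []
    for name, value in zip(layout, box):
        # x coordinates are bounded by the width, y coordinates by the height.
        size = img_width if name.startswith("x") else img_height
        clipped = min(max(value, 0), size - 1)  # same bounds as the snippet above
        out.append(clipped / size)
    return out


MINMAX = ("xmin", "xmax", "ymin", "ymax")   # layout stated in this thread
CORNERS = ("xmin", "ymin", "xmax", "ymax")  # layout the code would otherwise assume

# Made-up box in xmin, xmax, ymin, ymax order on a 600x300 image.
box = [-5, 650, 20, 100]

right = clip_and_normalize(box, MINMAX, img_width=600, img_height=300)
wrong = clip_and_normalize(box, CORNERS, img_width=600, img_height=300)

print("read as xmin,xmax,ymin,ymax:", right)
print("read as xmin,ymin,xmax,ymax:", wrong)
```

Reading the second entry as a y coordinate clips xmax against the image height and divides it by the height instead of the width, so the encoded box no longer matches the label.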

bruceyang2012 commented on July 20, 2024

@hongrui16 You are right, WIDER FACE and VOC have different labeling formats. I forgot that.

hongrui16 commented on July 20, 2024

@bruceyang2012 got it. thanks.
