happynear / amsoftmax
A simple yet effective loss function for face verification.
License: MIT License
Should we save the norm1 layer for deployment, or just take the output from fc5?
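For what it's worth, the two options are equivalent for verification: norm1 only L2-normalizes the fc5 output, and cosine similarity ignores feature magnitude anyway. A minimal NumPy sketch (hypothetical features, not the repository's code):

```python
import numpy as np

def l2_normalize(feat, eps=1e-10):
    # The same operation the norm1 layer applies to the fc5 output.
    return feat / (np.linalg.norm(feat) + eps)

# Hypothetical 512-d fc5 features extracted from two images.
feat_a = np.random.randn(512).astype(np.float32)
feat_b = np.random.randn(512).astype(np.float32)

# Normalizing inside the net (norm1) or outside gives the same cosine score.
score = float(np.dot(l2_normalize(feat_a), l2_normalize(feat_b)))
```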
I trained my model on WebFace with the parameters s=30, m=0.35. The result on LFW is 98.53%. I tried to change the parameters, but that gave worse results. Are these parameters the best in your tests? Can data augmentation help me improve the result? Thanks for your advice.
Hello, I see that your experiments all use the 64-layer SphereFace structure for training. Have you trained on other structures? I see that ArcFace uses your loss, and the ArcFace structure can reach 99.7~99.8% (using only VGG2 or MS).
I tried to add batch normalization to your modified ResNet-20, but the loss became 87.3365. As far as I know, BN helps the network learn more quickly. Is it possible to use batch normalization with AM-Softmax?
Here is the prototxt:
layer {
name: "input"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 3
dim: 160
dim: 160
}
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 2
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv1_1/bn"
type: "BatchNorm"
bottom: "conv1_1"
top: "conv1_1"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv1_1/scale"
type: "Scale"
bottom: "conv1_1"
top: "conv1_1"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu1_1"
type: "PReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv1_2/bn"
type: "BatchNorm"
bottom: "conv1_2"
top: "conv1_2"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv1_2/scale"
type: "Scale"
bottom: "conv1_2"
top: "conv1_2"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu1_2"
type: "PReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "conv1_3"
type: "Convolution"
bottom: "conv1_2"
top: "conv1_3"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv1_3/bn"
type: "BatchNorm"
bottom: "conv1_3"
top: "conv1_3"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv1_3/scale"
type: "Scale"
bottom: "conv1_3"
top: "conv1_3"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu1_3"
type: "PReLU"
bottom: "conv1_3"
top: "conv1_3"
}
layer {
name: "res1_3"
type: "Eltwise"
bottom: "conv1_1"
bottom: "conv1_3"
top: "res1_3"
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "res1_3"
top: "conv2_1"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 2
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv2_1/bn"
type: "BatchNorm"
bottom: "conv2_1"
top: "conv2_1"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv2_1/scale"
type: "Scale"
bottom: "conv2_1"
top: "conv2_1"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu2_1"
type: "PReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv2_2/bn"
type: "BatchNorm"
bottom: "conv2_2"
top: "conv2_2"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv2_2/scale"
type: "Scale"
bottom: "conv2_2"
top: "conv2_2"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu2_2"
type: "PReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "conv2_3"
type: "Convolution"
bottom: "conv2_2"
top: "conv2_3"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv2_3/bn"
type: "BatchNorm"
bottom: "conv2_3"
top: "conv2_3"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv2_3/scale"
type: "Scale"
bottom: "conv2_3"
top: "conv2_3"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu2_3"
type: "PReLU"
bottom: "conv2_3"
top: "conv2_3"
}
layer {
name: "res2_3"
type: "Eltwise"
bottom: "conv2_1"
bottom: "conv2_3"
top: "res2_3"
}
layer {
name: "conv2_4"
type: "Convolution"
bottom: "res2_3"
top: "conv2_4"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv2_4/bn"
type: "BatchNorm"
bottom: "conv2_4"
top: "conv2_4"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv2_4/scale"
type: "Scale"
bottom: "conv2_4"
top: "conv2_4"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu2_4"
type: "PReLU"
bottom: "conv2_4"
top: "conv2_4"
}
layer {
name: "conv2_5"
type: "Convolution"
bottom: "conv2_4"
top: "conv2_5"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv2_5/bn"
type: "BatchNorm"
bottom: "conv2_5"
top: "conv2_5"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv2_5/scale"
type: "Scale"
bottom: "conv2_5"
top: "conv2_5"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu2_5"
type: "PReLU"
bottom: "conv2_5"
top: "conv2_5"
}
layer {
name: "res2_5"
type: "Eltwise"
bottom: "res2_3"
bottom: "conv2_5"
top: "res2_5"
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "res2_5"
top: "conv3_1"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 2
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_1/bn"
type: "BatchNorm"
bottom: "conv3_1"
top: "conv3_1"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_1/scale"
type: "Scale"
bottom: "conv3_1"
top: "conv3_1"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_1"
type: "PReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_2/bn"
type: "BatchNorm"
bottom: "conv3_2"
top: "conv3_2"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_2/scale"
type: "Scale"
bottom: "conv3_2"
top: "conv3_2"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_2"
type: "PReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_3/bn"
type: "BatchNorm"
bottom: "conv3_3"
top: "conv3_3"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_3/scale"
type: "Scale"
bottom: "conv3_3"
top: "conv3_3"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_3"
type: "PReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "res3_3"
type: "Eltwise"
bottom: "conv3_1"
bottom: "conv3_3"
top: "res3_3"
}
layer {
name: "conv3_4"
type: "Convolution"
bottom: "res3_3"
top: "conv3_4"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_4/bn"
type: "BatchNorm"
bottom: "conv3_4"
top: "conv3_4"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_4/scale"
type: "Scale"
bottom: "conv3_4"
top: "conv3_4"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_4"
type: "PReLU"
bottom: "conv3_4"
top: "conv3_4"
}
layer {
name: "conv3_5"
type: "Convolution"
bottom: "conv3_4"
top: "conv3_5"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_5/bn"
type: "BatchNorm"
bottom: "conv3_5"
top: "conv3_5"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_5/scale"
type: "Scale"
bottom: "conv3_5"
top: "conv3_5"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_5"
type: "PReLU"
bottom: "conv3_5"
top: "conv3_5"
}
layer {
name: "res3_5"
type: "Eltwise"
bottom: "res3_3"
bottom: "conv3_5"
top: "res3_5"
}
layer {
name: "conv3_6"
type: "Convolution"
bottom: "res3_5"
top: "conv3_6"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_6/bn"
type: "BatchNorm"
bottom: "conv3_6"
top: "conv3_6"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_6/scale"
type: "Scale"
bottom: "conv3_6"
top: "conv3_6"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_6"
type: "PReLU"
bottom: "conv3_6"
top: "conv3_6"
}
layer {
name: "conv3_7"
type: "Convolution"
bottom: "conv3_6"
top: "conv3_7"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_7/bn"
type: "BatchNorm"
bottom: "conv3_7"
top: "conv3_7"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_7/scale"
type: "Scale"
bottom: "conv3_7"
top: "conv3_7"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_7"
type: "PReLU"
bottom: "conv3_7"
top: "conv3_7"
}
layer {
name: "res3_7"
type: "Eltwise"
bottom: "res3_5"
bottom: "conv3_7"
top: "res3_7"
}
layer {
name: "conv3_8"
type: "Convolution"
bottom: "res3_7"
top: "conv3_8"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_8/bn"
type: "BatchNorm"
bottom: "conv3_8"
top: "conv3_8"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_8/scale"
type: "Scale"
bottom: "conv3_8"
top: "conv3_8"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_8"
type: "PReLU"
bottom: "conv3_8"
top: "conv3_8"
}
layer {
name: "conv3_9"
type: "Convolution"
bottom: "conv3_8"
top: "conv3_9"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv3_9/bn"
type: "BatchNorm"
bottom: "conv3_9"
top: "conv3_9"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv3_9/scale"
type: "Scale"
bottom: "conv3_9"
top: "conv3_9"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu3_9"
type: "PReLU"
bottom: "conv3_9"
top: "conv3_9"
}
layer {
name: "res3_9"
type: "Eltwise"
bottom: "res3_7"
bottom: "conv3_9"
top: "res3_9"
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "res3_9"
top: "conv4_1"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
stride: 2
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv4_1/bn"
type: "BatchNorm"
bottom: "conv4_1"
top: "conv4_1"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv4_1/scale"
type: "Scale"
bottom: "conv4_1"
top: "conv4_1"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu4_1"
type: "PReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv4_2/bn"
type: "BatchNorm"
bottom: "conv4_2"
top: "conv4_2"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv4_2/scale"
type: "Scale"
bottom: "conv4_2"
top: "conv4_2"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu4_2"
type: "PReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "conv4_3/bn"
type: "BatchNorm"
bottom: "conv4_3"
top: "conv4_3"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
layer {
name: "conv4_3/scale"
type: "Scale"
bottom: "conv4_3"
top: "conv4_3"
param {
lr_mult: 1
decay_mult: 0
}
param {
lr_mult: 2
decay_mult: 0
}
scale_param {
bias_term: true
}
}
layer {
name: "relu4_3"
type: "PReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "res4_3"
type: "Eltwise"
bottom: "conv4_1"
bottom: "conv4_3"
top: "res4_3"
}
layer {
name: "fc5"
type: "InnerProduct"
bottom: "res4_3"
top: "fc5"
inner_product_param {
num_output: 512
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
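One note on the number 87.3365 (an observation about the value, not from the original thread): it equals 126 · ln 2 ≈ −ln(FLT_MIN), the value −log(p) takes when the target-class softmax probability underflows to the smallest normal float32. A loss stuck at exactly this value typically indicates diverged logits rather than anything BN-specific. A quick check:

```python
import math

# 87.3365 == 126 * ln(2) == -ln(2**-126), i.e. -log(p) when the softmax
# probability p underflows to the smallest normal float32 value.
print(126 * math.log(2))       # ~87.3365
print(-math.log(2.0 ** -126))  # same value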
Hi, your paper shows the results of AM-Softmax w/o FN with m = 0.35 and 0.4.
(1) With FN: ψ = s · (cos θ − m), with s = 30, m = 0.35
#prototxt
layer {
name: "fc6_l2"
type: "InnerProduct"
bottom: "norm1"
top: "fc6"
param {
lr_mult: 1
}
inner_product_param{
num_output: 10516
normalize: true
weight_filler {
type: "xavier"
}
bias_term: false
}
}
layer {
name: "label_specific_margin"
type: "LabelSpecificAdd"
bottom: "fc6"
bottom: "label"
top: "fc6_margin"
label_specific_add_param {
bias: -0.35
}
}
layer {
name: "fc6_margin_scale"
type: "Scale"
bottom: "fc6_margin"
top: "fc6_margin_scale"
param {
lr_mult: 0
decay_mult: 0
}
scale_param {
filler{
type: "constant"
value: 30
}
}
}
layer {
name: "softmax_loss"
type: "SoftmaxWithLoss"
bottom: "fc6_margin_scale"
bottom: "label"
top: "softmax_loss"
loss_weight: 1
}
(2) Without FN: s is not needed, ψ = ‖x‖ · cos θ − m. Should we still use m = 0.35?
#prototxt
layer {
name: "fc6_l2"
type: "InnerProduct"
bottom: "norm1"
top: "fc6"
param {
lr_mult: 1
}
inner_product_param{
num_output: 10516
normalize: false
weight_filler {
type: "xavier"
}
bias_term: false
}
}
layer {
name: "label_specific_margin"
type: "LabelSpecificAdd"
bottom: "fc6"
bottom: "label"
top: "fc6_margin"
label_specific_add_param {
bias: -0.35
}
}
layer {
name: "softmax_loss"
type: "SoftmaxWithLoss"
bottom: "fc6_margin"
bottom: "label"
top: "softmax_loss"
loss_weight: 1
}
Can you show your prototxt and training log? Thanks.
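To make the two variants concrete, here is a minimal NumPy sketch of the logit computation described above (an illustration only, not the repository's Caffe implementation; features, weights, and labels are hypothetical inputs):

```python
import numpy as np

def am_softmax_logits(features, weights, labels, m=0.35, s=30.0, fn=True):
    """features: (N, D); weights: (D, C); labels: (N,) class indices.
    fn=True:  psi = s * (cos(theta) - m)      (with feature normalization)
    fn=False: psi = ||x|| * cos(theta) - m    (without FN; s is not used)
    """
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    rows = np.arange(len(labels))
    if fn:
        x = features / np.linalg.norm(features, axis=1, keepdims=True)
        logits = s * (x @ w)           # s * cos(theta) for every class
        logits[rows, labels] -= s * m  # target class gets s * (cos(theta) - m)
    else:
        logits = features @ w          # ||x|| * cos(theta)
        logits[rows, labels] -= m
    return logits  # feed into a standard softmax cross-entropy loss
```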
@happynear I can't get any download link for CASIA-WebFace, neither the cleaned CASIA dataset nor the original one.
The official download link and the Baidu Yun links I found on the Internet can no longer be accessed. Can you give me a download link? Thank you!
I want to test the caffemodel on a real-world problem, so I use MTCNN landmarks and align the image like below:
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;

Mat transform(Mat image,            // cropped face image
              vector<Point2f> dst)  // dst are the landmarks of the face
{
    // Scale factors from the original size to the 96x112 target (width x height).
    float scale_x = 96.0f / image.cols;
    float scale_y = 112.0f / image.rows;
    for (size_t i = 0; i < dst.size(); ++i)
    {
        dst[i].x *= scale_x;  // scale the points to the new image size (96, 112)
        dst[i].y *= scale_y;
    }
    cv::resize(image, image, Size(96, 112));
    // Reference landmark positions for a 96x112 aligned face.
    vector<Point2f> src;
    src.push_back(Point2f(30.2946f, 51.6963f));
    src.push_back(Point2f(65.5318f, 51.5014f));
    src.push_back(Point2f(48.0252f, 71.7366f));
    src.push_back(Point2f(33.5493f, 92.3655f));
    src.push_back(Point2f(62.7299f, 92.2041f));
    cv::Mat R = cv::estimateRigidTransform(dst, src, false);
    Mat out;
    cv::warpAffine(image, out, R, Size(96, 112));
    return out;
}
What I got is something like the image below. As you can see, there is a black area at the top and on the right side of the image, so I'm confused: is this normal?
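One note on the black area (an aside, not from the thread): cv::warpAffine fills destination pixels that have no source counterpart with the border value, black by default, so some black region is expected whenever the estimated transform maps part of the 96x112 output outside the input image. A minimal Python/OpenCV sketch of the same alignment (the dst landmarks and the input file name are hypothetical):

```python
import cv2
import numpy as np

# Hypothetical detected landmarks (dst) and the standard 96x112 reference (src).
dst = np.array([[38.0, 55.0], [70.0, 53.0], [55.0, 75.0],
                [42.0, 95.0], [66.0, 94.0]], dtype=np.float32)
src = np.array([[30.2946, 51.6963], [65.5318, 51.5014], [48.0252, 71.7366],
                [33.5493, 92.3655], [62.7299, 92.2041]], dtype=np.float32)

img = cv2.imread('face.jpg')  # hypothetical cropped face image
M, _ = cv2.estimateAffinePartial2D(dst, src)  # similarity transform
# Destination pixels outside the source get borderValue -- the black area.
aligned = cv2.warpAffine(img, M, (96, 112),
                         borderMode=cv2.BORDER_CONSTANT, borderValue=(0, 0, 0))
```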
I only see the layer named fc6_l2 in face_train_test.prototxt.
Hi, thanks for your AMSoftmax.
Could you tell us if the ResNet-20 is a pretrained model, and if so, whether it was pretrained with AMSoftmax?
Dear @happynear, first of all thanks for your work and the uploaded results! I would like to ask about the alignment step: is it really important for good performance? I have not tried your code (I will do it this week), but I tried ResNet-18 on VGG2 without the alignment step. Softmax and center loss both gave about 90%; center loss also provided much better localization, but surprisingly the ArcFace result was only 70%. I will try CosFace this week, but I expect more or less the same. Did you try to train anything without alignment?
Thank you! I will post my results with CosFace.
Hi, first of all thanks for the great job.
I compiled your Caffe and wanted to test the pretrained weights, so I fed an image to the model's input; it gave me a 512-float array in 0.4 seconds on a 1080 Ti GPU (is this OK?). After that I set a mini-batch as the model input, but for 10 images it takes 3.6 seconds, which is only slightly faster than feeding the images one at a time.
There is a gnap prototxt. I am not sure what it is for; is there any documentation?
Thank you.
Dear AMSoftmax team,
Thanks for the great work. I've read your paper and am preparing to try my own data with this repo, but I'm a bit confused. Would you mind telling me the relation between AMSoftmax and SphereFace? AMSoftmax seems to be the latest result of your experiments, but there are not many usage instructions. As far as I understand, is the only difference between SphereFace and AMSoftmax the prototxt file? And instead of using only the AMSoftmax repo, can I keep the SphereFace repo and follow its steps to train?
Hi, I read your paper and your target-logit-curve code, and I am puzzled. Your paper says that Wf is also called the target logit, but your target logit curves do not seem to correspond to the Wf as defined; they seem not to take f into account.
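For readers comparing the curves, a small NumPy sketch of the two target-logit definitions, as I understand them from the SphereFace and AM-Softmax papers (the curves plot ψ(θ) alone; whether ‖f‖ should be factored in is exactly the question raised above):

```python
import numpy as np

def sphereface_psi(theta, m=4):
    # A-Softmax / SphereFace: piecewise-monotonic version of cos(m * theta),
    # psi = (-1)**k * cos(m*theta) - 2k  for theta in [k*pi/m, (k+1)*pi/m].
    k = np.floor(theta * m / np.pi)
    return (-1.0) ** k * np.cos(m * theta) - 2.0 * k

def am_softmax_psi(theta, m=0.35):
    # AM-Softmax: additive cosine margin.
    return np.cos(theta) - m

theta = np.linspace(0.0, np.pi, 200)
curves = {'SphereFace': sphereface_psi(theta), 'AM-Softmax': am_softmax_psi(theta)}
```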
Hello. I looked online for information about the iter_size parameter in the solver configuration file, but the official Caffe documentation does not describe it. My question is: if setting this parameter is equivalent to adjusting batch_size, shouldn't the number of iterations be reduced accordingly? After I set this parameter to n, training became n times slower. Looking forward to an answer.
Does the AMS loss have a convergence curve similar to the softmax loss? In my experiments, the AMS loss (with m=1) changes little during training, even after many iterations.
I didn't find any comparison between AMSoftmax and ArcFace on the Internet. Could you tell me your results?
As I don't have enough GPU memory, I set iter_size: 8 in face_solver.prototxt and batch_size: 32 in face_train_test.prototxt. After 30000 iterations it didn't converge on CASIA-WebFace. I am wondering whether I have done anything wrong.
Thanks in advance.
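For context (a note on Caffe behavior, not from the thread): iter_size accumulates gradients over iter_size forward/backward passes before each weight update, so the effective batch size is batch_size × iter_size (here 32 × 8 = 256), and the solver's iteration count still counts weight updates; each iteration simply costs iter_size times more compute. A rough sketch of the idea, with hypothetical net methods:

```python
batch_size, iter_size = 32, 8
effective_batch = batch_size * iter_size  # 256 samples per weight update

def train_iteration(net, lr):
    grads = None
    for _ in range(iter_size):                 # accumulate over mini-batches
        batch = net.next_batch(batch_size)     # hypothetical data loader
        g = net.backward(net.forward(batch))   # hypothetical fwd/bwd pass
        grads = g if grads is None else [a + b for a, b in zip(grads, g)]
    # Caffe normalizes the accumulated gradient by iter_size before updating.
    net.apply_update([g / iter_size for g in grads], lr)
```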
Hi, thanks for your great job. I wonder how I can tune the values of the margin and scale to get better results. I used the default setting (m=0.35, scale=30) on my face recognition dataset, and the final training loss is about 3 and can't decrease further. So I came here to ask the question. Thank you!
Hello. I want to reproduce the figures of the spherical feature distributions of different loss functions on the MNIST dataset. I trained with the different loss functions and obtained the corresponding models, but I still cannot produce the plots. How was your jet.mat file generated? Thanks.
AVE 60.93%
Did I make a mistake somewhere? Why does the accuracy differ so much from what is described in your paper?
Hi @happynear,
I read the code of the inner-product layer and found that you only normalize the weights in the forward pass. Why is no corresponding backward pass needed?
Thanks.
Hi @xisi789,
Could you please share the CASIA-WebFace database or the Replay-Attack database for face anti-spoofing again? I could not download it since the shared file has expired.
Hi, I found that a model trained with AMS may give a higher similarity for a pair of abnormal enroll and probe images (where the probe is low quality, wrongly aligned, or even not a face, and the enroll is not a good ID photo). The similarity may be around 0.4 or even higher, while models trained with softmax give only around 0. Have you ever met the same problem? Is it because the margin makes the feature space much more compact than softmax does?
Thanks!
Sorry to bother you.
Never mind, the problem is solved.
Hi, thanks for your work. Did you ever try putting them together? I think the combination shouldn't make things worse, but when I used it to train LeNet (dataset: MNIST), I actually got worse results.
Thanks very much for your contribution. I am using gray images to train the model, with only about 1000 identities. How should I set m and s? Should both of them be set smaller?
I fed an image to AMSoftmax, using the net prototxt face_deploy_mirror_normalize.prototxt and your pretrained weights. After loading the weights I set an image as the net input and ran the forward() method. Then I wanted to explore how the flip layer works, but after plotting the output of the flip_data blob I saw something wrong: the flip layer has flipped the data vertically (I mean up-down)! Is that okay?
The code is something like the below:
import caffe
import numpy as np
import matplotlib.pyplot as plt

net = caffe.Net('face_deploy_mirror_normalize.prototxt',
                'face_train_test_iter_30000.caffemodel',
                caffe.TEST)

def return_layer_name(layer_name, i):
    # Note: swapaxes(0, 2) turns a (C, H, W) blob into (W, H, C), which
    # transposes the spatial axes as well as moving channels last.
    output = net.blobs[layer_name].data[i]
    output = np.swapaxes(output, 0, 2)
    return output

img = caffe.io.load_image('Anthony_Hopkins_0002.jpg')
img = caffe.io.resize(img, (96, 112))
img = np.expand_dims(img, 0)
img = np.swapaxes(img, 1, 3)
net.blobs['data'].data[...] = img
net.forward()

output = net.blobs['norm1'].data[0]
out1 = return_layer_name('data_input_0_split_0', 0)
out2 = return_layer_name('flip_data', 0)

fig = plt.figure(figsize=(15, 15))
plt.subplot(1, 2, 1)
plt.imshow(out1)
plt.subplot(1, 2, 2)
plt.imshow(out2)
plt.show()
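A likely explanation for the up-down flip (my reading of the code above, not an answer from the maintainer): np.swapaxes(output, 0, 2) maps a (C, H, W) blob to (W, H, C), which transposes the two spatial axes, so a horizontal mirror viewed through that transpose looks like a vertical flip. A channels-last conversion that preserves orientation would be:

```python
import numpy as np

def blob_to_image(blob):
    # Convert a (C, H, W) Caffe blob to (H, W, C) for plt.imshow
    # without swapping the spatial axes.
    return np.transpose(blob, (1, 2, 0))
```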
Could you give a short introduction to this https://github.com/happynear/AMSoftmax/tree/master/prototxt/auto, please?
Great work, appreciate it. Can I use the same caffe.binding project without any modification?