Comments (39)

HsuTzuJen commented on August 20, 2024

I just use the 50-layer ResNet from slim.nets.resnet_v2.py:

resnet_v2_50(inputs,
             num_classes=None,
             is_training=True,
             global_pool=False,
             output_stride=None,
             reuse=None,
             scope='resnet_v2_50')

I achieved 0.9922 (0.99216666666) with batch size 128.
I continued training with the cutout method (batch size 128) and reached 0.9942 (0.9941666666666666).
Finally, I achieved 0.9955 with batch size 256 at iteration 787150.
I am now looking forward to achieving higher accuracy using batch size 1024.
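
For reference, a minimal sketch of wiring that slim network up as an embedding extractor; the 512-d output and the flatten/dense head are my assumptions, not code from this repo:

import tensorflow as tf
from tensorflow.contrib.slim.nets import resnet_v2
slim = tf.contrib.slim

images = tf.placeholder(tf.float32, [None, 112, 112, 3])
with slim.arg_scope(resnet_v2.resnet_arg_scope()):
    # num_classes=None and global_pool=False return the raw feature map.
    net, _ = resnet_v2.resnet_v2_50(images, num_classes=None,
                                    is_training=True, global_pool=False)
# Assumed head: flatten the feature map into a 512-d face embedding.
embedding = slim.fully_connected(slim.flatten(net), 512,
                                 activation_fn=None, scope='embedding')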

TengliEd commented on August 20, 2024

The momentum optimizer increments the global step when applying gradients, so inc_op is redundant.
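
A minimal sketch of the redundancy being pointed out (the variables here are illustrative):

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
w = tf.Variable(1.0)
loss = tf.square(w)
opt = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
# Passing global_step makes apply_gradients increment it once per step...
train_op = opt.minimize(loss, global_step=global_step)
# ...so a separate increment op like this would double-count steps:
# inc_op = tf.assign_add(global_step, 1)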

auroua commented on August 20, 2024

I don't know why, so I added the Max Batch Size Test.

HsuTzuJen commented on August 20, 2024

The number of weights in the logit layer is 85164 * 512, which is about 43.6M! But we do not need the logits after training, right? I am wondering whether it is possible to save only the network, without the logit layer.
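
A hedged sketch of one way to do that in TF 1.x: build a Saver over only the backbone's variables, excluding the logit scope (the scope name used for the filter is an assumption):

import tensorflow as tf

all_vars = tf.global_variables()
# Keep everything except variables under the (assumed) logit scope.
backbone_vars = [v for v in all_vars
                 if not v.name.startswith('arcface_loss')]
saver = tf.train.Saver(var_list=backbone_vars)
# saver.save(sess, 'backbone.ckpt') then stores the net without the logits.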

HsuTzuJen commented on August 20, 2024

Hi, I use the network "tensorflow/tensorflow/contrib/slim/python/slim/nets/resnet_v2.py" instead, and it works! The model size is down to 630 MB for MS1M and 330 MB for VGGFace2, and I can train the network with batch size up to 384 on a single GTX 1080 Ti (11 GB).

auroua commented on August 20, 2024

Could you test the model size after adding PReLU and the same number of batch-norm layers as in the paper?

HsuTzuJen commented on August 20, 2024

I have not read the paper yet, but I will do it after reading it.
I am now interested in implementing NASNet and training it with your code.

HsuTzuJen commented on August 20, 2024

Hi, I added the same number of PReLU and batch-norm layers to resnet_v2.py.
The model size is 640 MB with the MS1M dataset.
Differences:
1. It seems that there is no gamma in batch_norm in 'tensorflow/tensorflow/contrib/layers/python/layers/layers.py' (see the sketch after this list).
2. There are 14 units in block 3 of L_Resnet_E_IR.py, but only 6 units in block 3 of resnet_v2.py.
3. There is one more conv in each bottleneck of resnet_v2.py.
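
On point 1, a minimal illustration (TF 1.x; the shapes are arbitrary): tf.contrib.layers.batch_norm creates a gamma variable only when scale=True is passed.

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 56, 56, 64])
# Default scale=False: only a trainable beta is created, no gamma.
bn_default = tf.contrib.layers.batch_norm(x, is_training=True, scope='bn_a')
# scale=True adds the trainable gamma alongside beta.
bn_scaled = tf.contrib.layers.batch_norm(x, is_training=True, scale=True,
                                         scope='bn_b')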

I notice that there are 2 BN operations between units in L_Resnet_E_IR.py. Maybe we should keep only one.
Below are the details of the trainable variables. Note that preact is batch norm followed by ReLU.

Trainable weights in L_Resnet_E_IR.py:

resnet_v1_50/conv1/W_conv2d:0
resnet_v1_50/bn0/beta:0
resnet_v1_50/bn0/gamma:0
resnet_v1_50/prelu0/alphas:0
resnet_v1_50/block1/unit_1/bottleneck_v1/shortcut_conv/W_conv2d:0
resnet_v1_50/block1/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta:0
resnet_v1_50/block1/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/gamma:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block1/unit_1/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block1/unit_1/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block1/unit_2/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block1/unit_2/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block1/unit_3/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block1/unit_3/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block2/unit_1/bottleneck_v1/shortcut_conv/W_conv2d:0
resnet_v1_50/block2/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta:0
resnet_v1_50/block2/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/gamma:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block2/unit_1/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block2/unit_1/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block2/unit_2/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block2/unit_2/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block2/unit_3/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block2/unit_3/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block2/unit_4/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block2/unit_4/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_1/bottleneck_v1/shortcut_conv/W_conv2d:0
resnet_v1_50/block3/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_1/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_1/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_2/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_2/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_3/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_3/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_4/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_4/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_5/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_5/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_6/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_6/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_7/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_7/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_8/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_8/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_9/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_9/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_10/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_10/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_11/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_11/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_12/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_12/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_13/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_13/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block3/unit_14/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block3/unit_14/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block4/unit_1/bottleneck_v1/shortcut_conv/W_conv2d:0
resnet_v1_50/block4/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta:0
resnet_v1_50/block4/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/gamma:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block4/unit_1/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block4/unit_1/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block4/unit_2/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block4/unit_2/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv1_bn1/beta:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv1_bn1/gamma:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv1/W_conv2d:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv1_bn2/beta:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv1_bn2/gamma:0
resnet_v1_50/block4/unit_3/bottleneck_v1/prelu_layer/alphas:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv2/W_conv2d:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv2_bn/BatchNorm/beta:0
resnet_v1_50/block4/unit_3/bottleneck_v1/conv2_bn/BatchNorm/gamma:0
resnet_v1_50/E_BN1/beta:0
resnet_v1_50/E_BN1/gamma:0
resnet_v1_50/E_DenseLayer/W:0
resnet_v1_50/E_DenseLayer/b:0
resnet_v1_50/E_BN2/beta:0
237 trainable weights

Trainable weights in resnet_v2.py (added PReLU and BN2):

resnet_v2_50/conv1/weights:0
resnet_v2_50/conv1/biases:0
resnet_v2_50/block1/unit_1/bottleneck_v2/preact/beta:0
resnet_v2_50/block1/unit_1/bottleneck_v2/shortcut/weights:0
resnet_v2_50/block1/unit_1/bottleneck_v2/shortcut/biases:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv1/weights:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv1/biases:0
resnet_v2_50/block1/unit_1/bottleneck_v2/BN2/beta:0
resnet_v2_50/block1/unit_1/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv2/weights:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv2/biases:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv3/weights:0
resnet_v2_50/block1/unit_1/bottleneck_v2/conv3/biases:0
resnet_v2_50/block1/unit_2/bottleneck_v2/preact/beta:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv1/weights:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv1/biases:0
resnet_v2_50/block1/unit_2/bottleneck_v2/BN2/beta:0
resnet_v2_50/block1/unit_2/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv2/weights:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv2/biases:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv3/weights:0
resnet_v2_50/block1/unit_2/bottleneck_v2/conv3/biases:0
resnet_v2_50/block1/unit_3/bottleneck_v2/preact/beta:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv1/weights:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv1/biases:0
resnet_v2_50/block1/unit_3/bottleneck_v2/BN2/beta:0
resnet_v2_50/block1/unit_3/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv2/weights:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv2/biases:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv3/weights:0
resnet_v2_50/block1/unit_3/bottleneck_v2/conv3/biases:0
resnet_v2_50/block2/unit_1/bottleneck_v2/preact/beta:0
resnet_v2_50/block2/unit_1/bottleneck_v2/shortcut/weights:0
resnet_v2_50/block2/unit_1/bottleneck_v2/shortcut/biases:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv1/weights:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv1/biases:0
resnet_v2_50/block2/unit_1/bottleneck_v2/BN2/beta:0
resnet_v2_50/block2/unit_1/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv2/weights:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv2/biases:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv3/weights:0
resnet_v2_50/block2/unit_1/bottleneck_v2/conv3/biases:0
resnet_v2_50/block2/unit_2/bottleneck_v2/preact/beta:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv1/weights:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv1/biases:0
resnet_v2_50/block2/unit_2/bottleneck_v2/BN2/beta:0
resnet_v2_50/block2/unit_2/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv2/weights:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv2/biases:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv3/weights:0
resnet_v2_50/block2/unit_2/bottleneck_v2/conv3/biases:0
resnet_v2_50/block2/unit_3/bottleneck_v2/preact/beta:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv1/weights:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv1/biases:0
resnet_v2_50/block2/unit_3/bottleneck_v2/BN2/beta:0
resnet_v2_50/block2/unit_3/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv2/weights:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv2/biases:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv3/weights:0
resnet_v2_50/block2/unit_3/bottleneck_v2/conv3/biases:0
resnet_v2_50/block2/unit_4/bottleneck_v2/preact/beta:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv1/weights:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv1/biases:0
resnet_v2_50/block2/unit_4/bottleneck_v2/BN2/beta:0
resnet_v2_50/block2/unit_4/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv2/weights:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv2/biases:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv3/weights:0
resnet_v2_50/block2/unit_4/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_1/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_1/bottleneck_v2/shortcut/weights:0
resnet_v2_50/block3/unit_1/bottleneck_v2/shortcut/biases:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_1/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_1/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_1/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_2/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_2/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_2/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_2/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_3/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_3/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_3/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_3/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_4/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_4/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_4/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_4/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_5/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_5/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_5/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_5/bottleneck_v2/conv3/biases:0
resnet_v2_50/block3/unit_6/bottleneck_v2/preact/beta:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv1/weights:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv1/biases:0
resnet_v2_50/block3/unit_6/bottleneck_v2/BN2/beta:0
resnet_v2_50/block3/unit_6/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv2/weights:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv2/biases:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv3/weights:0
resnet_v2_50/block3/unit_6/bottleneck_v2/conv3/biases:0
resnet_v2_50/block4/unit_1/bottleneck_v2/preact/beta:0
resnet_v2_50/block4/unit_1/bottleneck_v2/shortcut/weights:0
resnet_v2_50/block4/unit_1/bottleneck_v2/shortcut/biases:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv1/weights:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv1/biases:0
resnet_v2_50/block4/unit_1/bottleneck_v2/BN2/beta:0
resnet_v2_50/block4/unit_1/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv2/weights:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv2/biases:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv3/weights:0
resnet_v2_50/block4/unit_1/bottleneck_v2/conv3/biases:0
resnet_v2_50/block4/unit_2/bottleneck_v2/preact/beta:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv1/weights:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv1/biases:0
resnet_v2_50/block4/unit_2/bottleneck_v2/BN2/beta:0
resnet_v2_50/block4/unit_2/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv2/weights:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv2/biases:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv3/weights:0
resnet_v2_50/block4/unit_2/bottleneck_v2/conv3/biases:0
resnet_v2_50/block4/unit_3/bottleneck_v2/preact/beta:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv1/weights:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv1/biases:0
resnet_v2_50/block4/unit_3/bottleneck_v2/BN2/beta:0
resnet_v2_50/block4/unit_3/bottleneck_v2/prelu_layer/alphas:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv2/weights:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv2/biases:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv3/weights:0
resnet_v2_50/block4/unit_3/bottleneck_v2/conv3/biases:0
resnet_v2_50/postnorm/beta:0
resnet_v2_50/DenseLayer/W:0
resnet_v2_50/DenseLayer/b:0
resnet_v2_50/BatchNorm/beta:0
158 trainable weights

HsuTzuJen commented on August 20, 2024

Hi, I just found that the first conv and BN should be followed by a 3x3, stride-2 pooling layer, but it is missing in L_Resnet_E_IR.py.
After adding the pooling layer, the model size is down to 1.31 GB for MS1M and 1.08 GB for VGG2.
I can finally train L_Resnet_E_IR.py with batch size 128 on VGG2 (not sure about MS1M), instead of only batch size 16.
However, the training speed is one-fifth that of resnet_v2.py.
By the way, the number of trainable parameters is about 141M in L_Resnet_E_IR.py and about 40M in resnet_v2.py.
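
A minimal sketch of the missing layer in slim style; the placement after the stem conv/BN is from the comment above, and the padding choice is an assumption:

import tensorflow as tf
slim = tf.contrib.slim

# Stand-in for the stem output (first conv + BN); shape is illustrative.
net = tf.placeholder(tf.float32, [None, 56, 56, 64])
# The 3x3, stride-2 pooling layer the comment says was missing.
net = slim.max_pool2d(net, [3, 3], stride=2, padding='SAME', scope='pool1')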

auroua commented on August 20, 2024

@HsuTzuJen Thanks for comparing the different models.

TengliEd commented on August 20, 2024

Is the implemented architecture of L_ResnetE_IR the one described in the paper? As you said, @HsuTzuJen, there are some differences. In particular, the implemented model is quite large.

TengliEd commented on August 20, 2024

The paper says the stride of the second conv in each bottleneck is 2, but in this implementation it is 1 except in the first unit of each block. Is that OK?

HsuTzuJen commented on August 20, 2024

@TengliEd I am using ResNetV2, and I now achieve 0.995 accuracy on the LFW test. I think L_ResnetE_IR improves on ResNet, but I have not tried to train it further yet, since the model is too big. I will try to implement L_ResnetE_IR myself and train it.
I think the model size of L_ResnetE_IR should be similar to ResNetV2, about 160 MB.

TengliEd commented on August 20, 2024

So cool, @HsuTzuJen. Did you get the exact architecture of L_ResNet100E_IR? I am still confused about the structure of the ResNet part. Also, please don't forget to mention me as soon as you finish your own implementation. Cheers!

HsuTzuJen commented on August 20, 2024

@TengliEd I think you can read the ResNet paper first: https://arxiv.org/abs/1512.03385.
There are network models in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim/python/slim/nets,
and you can try to revise that code. It is simpler to implement it this way.

TengliEd commented on August 20, 2024

Thanks @HsuTzuJen. I have read the paper, but the ArcFace implementation here does not seem to follow the original structure. For instance, you said, "There are 14 units in block 3 in L_Resnet_E_IR.py, but only 6 units in block 3 of resnet_v2.py." ArcFace modified not only each bottleneck unit but also the number of units, is that true?

HsuTzuJen commented on August 20, 2024

@TengliEd Since each bottleneck unit is modified, the number of units should be too.
They count layers by the number of convolution layers; a quick check follows below.
There are 2 convs in each unit of L_Resnet_E_IR but 3 convs in each unit of ResNetV2.
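
A back-of-envelope check of that counting (the per-block unit counts are taken from the differences listed earlier in this thread):

# ResNetV2-50: units per block [3, 4, 6, 3], 3 convs per bottleneck unit.
# L_Resnet_E_IR-50: units per block [3, 4, 14, 3], 2 convs per unit.
print(sum([3, 4, 6, 3]) * 3)   # 48 convs inside the blocks
print(sum([3, 4, 14, 3]) * 2)  # 48 convs inside the blocks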

HsuTzuJen commented on August 20, 2024

I just compared the two models:
ResNetV2 (50 layers): 16 3x3 convs and 32 1x1 convs across the 4 blocks.
L_Resnet_E_IR (50 layers): 48 3x3 convs across the 4 blocks.
A 3x3 conv has 9 times the parameters of a 1x1 conv (at the same channel counts), and there are more batch-norm layers in L_Resnet_E_IR than in ResNetV2. For these reasons, it makes sense that L_Resnet_E_IR is much bigger than ResNetV2.
By the way, the pre-trained L_Resnet_E_IR (50 layers) provided by InsightFace (MXNet version) is only 160 MB, which is close to ResNetV2 (50 layers, TF version). I am curious what makes them so different, since they should be the same architecture (L_Resnet_E_IR implemented in TF and MXNet).

TengliEd commented on August 20, 2024

Are you sure L_ResNet_E_IR was implemented in TF the same way as in MXNet?

TengliEd commented on August 20, 2024

@HsuTzuJen How is your own implementation going? Are there any changes from this one? :)

HsuTzuJen commented on August 20, 2024

@TengliEd No, I got the same result as L_Resnet_E_IR.py.
I think we should try InsightFace (MXNet), but I do not know MXNet.

TengliEd commented on August 20, 2024

@HsuTzuJen I just checked the MXNet code. There is no bottleneck in L_ResNet_E_IR, so the filter number in each unit of a block is the same; there is no need for the 4 * base_depth expansion.
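
A hedged sketch of what dropping the 4 * base_depth expansion looks like in a slim-style block spec; the dict keys and the block_spec helper are illustrative, not the repo's actual code:

def block_spec(base_depth, num_units, stride, bottleneck=False):
    # With a bottleneck, the unit's output depth is 4 * base_depth;
    # the MXNet-style L_ResNet_E_IR unit keeps it at base_depth.
    depth = 4 * base_depth if bottleneck else base_depth
    return ([{'depth': depth, 'stride': stride}]
            + [{'depth': depth, 'stride': 1}] * (num_units - 1))

blocks = [block_spec(64, 3, 2), block_spec(128, 4, 2),
          block_spec(256, 14, 2), block_spec(512, 3, 2)]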

HsuTzuJen commented on August 20, 2024

@TengliEd Thank you very much. I just checked, and I found that the model size is only 166 MB without the 4 * base_depth expansion.
@auroua I think we can use your code to achieve good performance now!

TengliEd commented on August 20, 2024

My L_ResNet100_E is 402 MB and my L_ResNet50_E is 320 MB. How can you achieve just 166 MB?

HsuTzuJen commented on August 20, 2024

It is 320 MB when the model includes both L_ResNet50_E and the logit layer "arcface_loss(embedding, labels, out_num, w_init=None, s=64., m=0.5)" (a fully connected layer for the softmax, useful only during training), but it is 166 MB with L_ResNet50_E alone.
I save only the trained L_ResNet50_E, since we test the model with L_ResNet50_E only.
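
For context, a hedged sketch of what that logit layer computes in the standard ArcFace formulation (scale s, additive angular margin m); this is the textbook version, not necessarily this repo's exact code:

import math
import tensorflow as tf

def arcface_logits(embedding, labels, out_num, s=64.0, m=0.5):
    # Normalize embeddings and class weights so the matmul yields cosines.
    w = tf.get_variable('arcface_w', [int(embedding.shape[-1]), out_num],
                        initializer=tf.contrib.layers.xavier_initializer())
    cos_t = tf.matmul(tf.nn.l2_normalize(embedding, axis=1),
                      tf.nn.l2_normalize(w, axis=0))
    sin_t = tf.sqrt(tf.maximum(1.0 - tf.square(cos_t), 1e-9))
    cos_mt = cos_t * math.cos(m) - sin_t * math.sin(m)  # cos(theta + m)
    mask = tf.one_hot(labels, depth=out_num)
    # The margin applies only to the ground-truth class; all logits scale by s.
    return s * (mask * cos_mt + (1.0 - mask) * cos_t)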

TengliEd commented on August 20, 2024

OK. Do you use WeChat or another similar app? I want to ask you for some details about how to get 99.5%+ accuracy on LFW; I only got 99.02%. :)

TengliEd commented on August 20, 2024

Or please provide your train_nets.py and the other modified parts. Thanks.

TengliEd commented on August 20, 2024

@HsuTzuJen Is your result better when using no bottleneck?

TengliEd commented on August 20, 2024

How can you achieve such a large batch size? I am using a P100 12 GB GPU, and the maximum batch size is only 50 when employing L_ResNet_100E_IR. Besides, by "cutout method" do you mean dropout?

HsuTzuJen commented on August 20, 2024

I have a GPU server with four 1080 Ti cards.
I do not use dropout.
Cutout is an image preprocessing method that is easy to implement.
The paper: https://arxiv.org/abs/1708.04552
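
A minimal numpy sketch of cutout (the patch size is an arbitrary choice here):

import numpy as np

def cutout(image, size=28):
    # Zero out a square patch centered at a random pixel, clipped at edges.
    h, w = image.shape[:2]
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x1, x2 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()
    out[y1:y2, x1:x2] = 0
    return out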

TengliEd commented on August 20, 2024

Thanks @HsuTzuJen. I also have a server with 4 GPUs but don't know how to increase the batch size. It seems there is no difference in maximum batch size between 4 GPUs and 1 GPU. Can you share your code so I can reproduce your result? Cheers!

TengliEd commented on August 20, 2024

Only one GPU's memory is fully used. I have set CUDA_VISIBLE_DEVICES to 0,1,2,3. Do you use train_nets.py to train your model? How can I make full use of all the GPU memory?
[screenshot: GPU usage, 2018-05-04 2:03 PM]

HsuTzuJen commented on August 20, 2024

Please use train_nets_mgpu.py and set parser.add_argument('--num_gpus', default=4, help='the num of gpus').

TengliEd commented on August 20, 2024

OK. Does parser.add_argument('--batch_size', default=32, help='batch size to train network') mean the actual batch size is 4 * 32?

auroua commented on August 20, 2024

train_nets_mgpu.py has some bugs; you should make some modifications before using this code. For how to modify it, you can refer to train_nets.py.

TengliEd commented on August 20, 2024

Are the lines added on top of train_nets.py for multi-GPU training correct? I suppose you mean the accuracy computation line. When I want a batch size of 128, do I just set batch_size to 32 here?

HsuTzuJen commented on August 20, 2024

Just set 128 if you want a batch size of 128; it will be split into num_gpus parts.
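
A hedged sketch of the usual TF 1.x data-parallel pattern behind that behavior; build_tower is a hypothetical stand-in for the per-GPU model and loss construction:

import tensorflow as tf

num_gpus, batch_size = 4, 128
images = tf.placeholder(tf.float32, [batch_size, 112, 112, 3])
labels = tf.placeholder(tf.int64, [batch_size])
optimizer = tf.train.MomentumOptimizer(0.01, 0.9)

def build_tower(img, lab):
    # Hypothetical stand-in for the real backbone + arcface loss of a tower.
    logits = tf.layers.dense(tf.layers.flatten(img), 10)
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=lab, logits=logits))

tower_grads = []
# The global batch is split into num_gpus shards, one tower per device.
for i, (img, lab) in enumerate(zip(tf.split(images, num_gpus),
                                   tf.split(labels, num_gpus))):
    with tf.device('/gpu:%d' % i), tf.variable_scope('net', reuse=(i > 0)):
        loss = build_tower(img, lab)
        tower_grads.append(optimizer.compute_gradients(loss))
# The tower gradients are then averaged and applied once per step.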

TengliEd commented on August 20, 2024

OK, I got it. Do you use other tricks, like a learning-rate schedule?

erlendd commented on August 20, 2024

Just took a look at L_Resnet_E_IR_fix_issue9.py, and it appears that both dropout and batch normalization are kept on during testing. In particular, batch norm is being passed is_train=True as a hard-coded parameter throughout. Is this intentional? If I've understood correctly, it means you're actually doing Adaptive Batch Normalization, which is a domain adaptation technique.
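
For what it's worth, a minimal sketch of the usual fix: drive both layers from an is_training placeholder instead of a hard-coded True.

import tensorflow as tf

is_training = tf.placeholder(tf.bool, name='is_training')
x = tf.placeholder(tf.float32, [None, 112, 112, 3])
net = tf.contrib.layers.batch_norm(x, is_training=is_training)
net = tf.contrib.layers.dropout(net, keep_prob=0.6, is_training=is_training)
# Feed {is_training: True} during training and False at test time, so BN
# uses its moving averages and dropout becomes a no-op.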
