auroua / insightface_tf

Insight Face on TensorFlow

License: MIT License

Python 100.00%

facerecognition tensorflow tensorlayer

insightface_tf's Issues

bgr or rgb

Does the pretrained model D use BGR or RGB images in the preprocessing step? Thanks!

training output accuracy always zero

epoch 0, total_step 2020, total loss is 204.23 , inference loss is 23.59, weight deacy loss is 180.64, training accuracy is 0.000000, time 31.271 samples/sec
epoch 0, total_step 2040, total loss is 202.41 , inference loss is 21.94, weight deacy loss is 180.47, training accuracy is 0.000000, time 32.147 samples/sec
epoch 0, total_step 2060, total loss is 210.42 , inference loss is 30.12, weight deacy loss is 180.30, training accuracy is 0.000000, time 32.219 samples/sec
epoch 0, total_step 2080, total loss is 198.15 , inference loss is 18.02, weight deacy loss is 180.13, training accuracy is 0.125000, time 32.244 samples/sec
epoch 0, total_step 2100, total loss is 205.67 , inference loss is 25.72, weight deacy loss is 179.96, training accuracy is 0.000000, time 30.409 samples/sec

Sampling images via saved model

Hi, can you please guide me on how I could use the model to make predictions? I can't seem to make it work. Thank you.

Not Found Error in Model D Checkpoint (710K)

I get this error when trying to run eval_ckpt_file.py:

NotFoundError (see above for traceback): Key resnet_v1_50/block1/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
	 [[Node: save/RestoreV2/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_84_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Pre-training model

When I load resnet_v1_50.ckpt into my model, I get this error in PyCharm:
NotFoundError (see above for traceback): Tensor name "arcface_loss/embedding_weights" not found in checkpoint files ./output/ckpt/resnet_v1_50.ckpt
[[Node: save_1/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2/tensor_names, save_1/RestoreV2/shape_and_slices)]]

What should I do?
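A "not found in checkpoint" error of this kind usually means the graph contains a variable that the checkpoint file does not. Here the missing name, arcface_loss/embedding_weights, belongs to the loss layer, while the file is a backbone checkpoint. A minimal sketch of one common workaround follows: restore only the variables whose names exist in the checkpoint and initialize the rest. The variable names below are hypothetical stand-ins; in TensorFlow 1.x the two lists would typically come from tf.global_variables() and tf.train.list_variables(), and the restorable subset would be passed to tf.train.Saver(var_list=...).

```python
def split_restorable(model_var_names, ckpt_var_names):
    """Split model variable names into those present in a checkpoint and those missing."""
    ckpt = set(ckpt_var_names)
    restorable = [name for name in model_var_names if name in ckpt]
    missing = [name for name in model_var_names if name not in ckpt]
    return restorable, missing


# Hypothetical names for illustration only; real names depend on the graph and checkpoint.
model_vars = [
    "resnet_v1_50/conv1/kernel",
    "resnet_v1_50/bn0/beta",
    "arcface_loss/embedding_weights",
]
ckpt_vars = ["resnet_v1_50/conv1/kernel", "resnet_v1_50/bn0/beta"]

restorable, missing = split_restorable(model_vars, ckpt_vars)
print("restore from checkpoint:", restorable)
print("initialize from scratch:", missing)
```

Printing the missing list also helps with the other "Key ... not found in checkpoint" reports on this page, since it shows whether the graph and the checkpoint were built from the same network definition.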

How to test whether two images are identical face?

Hello. First, thanks for the good project! 👍

I cloned it and downloaded the pretrained "model A".
But I wonder how to test whether two images show the same face.

Here is my code.

import numpy as np
import tensorflow as tf
from PIL import Image

sess = tf.Session()
saver = tf.train.import_meta_graph('./pretrained_models/InsightFace_iter_1060000.ckpt.meta')
saver.restore(sess, './pretrained_models/InsightFace_iter_1060000.ckpt')

input_images = tf.get_default_graph().get_tensor_by_name('img_inputs:0')

image = Image.open('./test_image.jpg')
image = image.resize((112, 112))

images = [np.array(image)]

outputs = sess.run('arcface_loss/embedding_weights:0', feed_dict={input_images: images})

But the output shape is always (512, 85164)

Would you let me know how to test images?
Thanks!
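The shape (512, 85164) suggests that the fetched tensor is the classification weight matrix of the loss layer (one 512-dimensional column per training identity) rather than a per-image embedding, which would explain why it never changes with the input. Which tensor holds the embedding is not stated in this issue. Assuming a 512-dimensional embedding has been obtained for each image, a minimal sketch of the comparison step could look like this; the random vectors and the threshold are placeholders, not values from this repository.

```python
import numpy as np


def l2_normalize(x, eps=1e-10):
    """Scale a vector to unit length, guarding against division by zero."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def cosine_similarity(emb1, emb2):
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.sum(l2_normalize(emb1) * l2_normalize(emb2), axis=-1))


def same_face(emb1, emb2, threshold=0.5):
    """Decide 'same identity' by thresholding; the threshold is a placeholder to be tuned."""
    return cosine_similarity(emb1, emb2) >= threshold


# Stand-in embeddings: emb_b is a noisy copy of emb_a, emb_c is unrelated.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=512)
emb_b = emb_a + 0.1 * rng.normal(size=512)
emb_c = rng.normal(size=512)

print(cosine_similarity(emb_a, emb_b), same_face(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c), same_face(emb_a, emb_c))
```

In practice the threshold would be chosen on labelled validation pairs, which is what a verification script typically does.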

Cannot download model from baidu

Hello,

Could someone upload model D to a host besides baidu?

Google Drive, OneDrive, or anything would work fine.

Thank you!

About cosinface_loss in the face_losses.py

@auroua
Is this implemented as cos θ - m (Additive Margin Softmax for Face Verification)?
Do you have any idea how to achieve cos θ1 - m = cos θ2 (CosFace: Large Margin Cosine Loss for Deep Face Recognition)?

about distance calculation in verification.py script

Looking at lines 50 and 51, the script uses
diff = np.subtract(embeddings1, embeddings2)
dist = np.sum(np.square(diff), 1)
to calculate the distance between two embeddings, but why not add an extra sqrt to dist,
like this?
dist = np.sqrt(np.sum(np.square(diff), 1))

Euclidean distance needs an extra sqrt operation, doesn't it?
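For thresholding and ranking, the missing sqrt does not change the outcome, because the square root is monotonic on non-negative values: comparing the squared distance against a squared threshold gives the same decisions. A small sketch with random stand-in embeddings (the data and the threshold are arbitrary, not taken from the repository):

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings1 = rng.normal(size=(5, 4))
embeddings2 = rng.normal(size=(5, 4))

diff = np.subtract(embeddings1, embeddings2)
sq_dist = np.sum(np.square(diff), 1)   # squared Euclidean distance, as in the script
dist = np.sqrt(sq_dist)                # true Euclidean distance

threshold = 1.5                        # arbitrary threshold on the true distance
same_by_dist = dist < threshold
same_by_sq_dist = sq_dist < threshold ** 2

# Both criteria accept exactly the same pairs, and both give the same ranking.
print(np.array_equal(same_by_dist, same_by_sq_dist))
print(np.array_equal(np.argsort(dist), np.argsort(sq_dist)))
```

The sqrt only matters when the numeric distance itself is reported or when a threshold tuned for one form is reused with the other.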

About accuracy?

Hi, I am so interested in your code. Can you please tell me if it can achieve the same accuracy as in the paper?

Regarding licence of model file and dataset.

Hi Chen Wei,
I have gone through the repository and am unable to find the license used here.
1. May I know the license used here? Is the model file also covered under that license?
2. To your knowledge, what is the licensing of the datasets you used?

error nan loss

When I use the ArcFace loss on MNIST, my loss becomes NaN. Have you tested the accuracy of your code?

face_losses.py

Maybe there is a mistake in cosineface_losses()?
output = tf.add(s * tf.multiply(cos_t, inv_mask), s * tf.multiply(cos_t_m, mask), name='cosineface_loss_output')
But in AMSoftmax the code is: if(top_data[i * dim + gt] > -bias_) top_data[i * dim + gt] += bias_; so the margin is not subtracted from every value.

About the training accuracy

It is still 0.000000 after a few epochs (e.g. after 7 epochs).
Should we use
acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(pred, axis=1), tf.argmax(tf.one_hot(labels, depth=args.num_output), 1)), dtype=tf.int64))
instead of
acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(pred, axis=1), labels_s[i]), dtype=tf.int64))
?
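One possible cause, independent of how the labels are compared, is the cast to an integer type. The TensorFlow documentation for tf.reduce_mean notes that the result type is inferred from the input, so the mean of an integer tensor is computed in integer arithmetic and is truncated. If that applies here, the accuracy would print as 0 unless every prediction in the batch were correct, and casting to tf.float32 instead would avoid it. The sketch below emulates the two behaviours in NumPy with made-up predictions; it is an illustration of the arithmetic, not the repository's code.

```python
import numpy as np

# Made-up predictions and labels: 4 of 8 are correct.
pred = np.array([3, 1, 2, 0, 4, 4, 1, 2])
labels = np.array([3, 0, 2, 1, 0, 4, 1, 3])
correct = pred == labels

# Integer mean, emulated with floor division: the fraction 4/8 is truncated to 0.
int_mean = int(correct.astype(np.int64).sum() // correct.size)

# Floating-point mean: the fraction is preserved.
float_mean = float(correct.astype(np.float32).mean())

print(int_mean)
print(float_mean)
```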

I still do not know how to fix it.

the tensorlayer's Droupout layer has trouble working with tf.placeholder?

Line 138 in L_Resnet_E_IR_MGPU.py, net = tl.layers.DropoutLayer(net, keep=keep_rate, name='E_Dropout'), throws some errors related to placeholders.

megaface challenge1

Hi, can you provide the lr and lr_step parameters for model D? I used your model D on MegaFace Challenge 1 and got rank 1: 93.0523%. Have you run MegaFace Challenge 1 on your model? If so, can you give me your rank-1 result?

do the model to finish the face verification or image similary compare?

Can the model do face verification or image similarity comparison?
If so, please tell me how to do it. Thanks.

mm = sin_m * m * s should be mm = sin_m * m

InsightFace_TF/losses/face_losses.py

Line 16 in 3dc8883

mm = sin_m * m * s

Maybe it's a tiny bug?

about save models

How can I save the model without the loss layer? @auroua

About the pre-processing of the input faces

I want to use the pre-trained model to test my data. However, I don't know the pre-processing method. For example, what is the normalization style, i.e., subtracting the RGB mean of the images, or normalizing to [0, 1]? And what alignment method do you use?
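Another issue on this page quotes verification.py as applying data -= 127.5 followed by data *= 0.0078125 before feeding images to the network. Assuming the pre-trained models expect the same scaling (this is not confirmed here, and the alignment method is not covered), a minimal sketch of that normalization is:

```python
import numpy as np


def normalize_face(image_uint8):
    """Map pixel values from [0, 255] to roughly [-1, 1], leaving the input untouched."""
    data = image_uint8.astype(np.float32)  # astype returns a new array
    data -= 127.5
    data *= 0.0078125                      # 0.0078125 == 1 / 128
    return data


# A single made-up pixel covering the extremes and the middle of the range.
image = np.array([[[0, 127, 255]]], dtype=np.uint8)
normalized = normalize_face(image)
print(normalized)
```

The helper name normalize_face is hypothetical; only the two arithmetic steps come from the quoted script.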

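Cannot converge when using arcfaceloss

I used the arcfaceloss function in my own code. The mobilefacenet network does not converge, and other networks give a NaN loss, but softmax works without problems. Is there anything I need to pay attention to here? Thanks.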

The capacity of parameters

Hi, the model size of InsightFace_MX is about 160MB, but IL_Resnet_E_IR_GBN is about 1.6GB.
Since they have almost the same architecture, what makes the sizes so different?
Thank you for your response!
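One piece of the size can be estimated from information elsewhere on this page: another issue reports a tensor named arcface_loss/embedding_weights with shape (512, 85164). Assuming it is stored as float32, its size works out as below. Whether this, together with any optimizer state saved alongside the weights, accounts for the full gap is an assumption and is not established here.

```python
embedding_dim = 512      # embedding size reported in another issue on this page
num_classes = 85164      # number of classes reported in another issue on this page
bytes_per_float32 = 4    # assumption: weights are stored as float32

weight_bytes = embedding_dim * num_classes * bytes_per_float32
print(weight_bytes)                     # 174415872
print(round(weight_bytes / 2**20, 1))   # size in MiB
```

The byte count matches the "Allocation of 174415872" warning quoted in a training log further down this page, which is consistent with that allocation being the classification weight matrix.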

verification.py

Hi, thanks for your work!
I found a bug in your code.
In verification.py -> test, the lines
data -= 127.5
data *= 0.0078125
modify the batch in place, so you have to work on a copy so that datas does not change at the same time.

Here is the example:

for idx, data in enumerate(data_iter(datas, batch_size)):
    if idx % 100 == 0:
        print('idx=', idx)
    data_2 = data.copy()
    data_2 -= 127.5
    data_2 *= 0.0078125
    feed_dict[input_placeholder] = data_2
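The effect can be reproduced in isolation. In the sketch below, data_iter is a hypothetical stand-in for the repository's batching helper and is assumed to yield slices of the array; NumPy slices are views, so in-place operators on a batch write through to the original data.

```python
import numpy as np


def data_iter(datas, batch_size):
    """Hypothetical batching helper: yields slices, which are views into datas."""
    for start in range(0, len(datas), batch_size):
        yield datas[start:start + batch_size]


# Without a copy: the in-place operators modify the source array.
datas_in_place = np.full((4, 2), 255.0, dtype=np.float32)
for data in data_iter(datas_in_place, 2):
    data -= 127.5
    data *= 0.0078125

# With a copy: the source array keeps its original pixel values.
datas_copied = np.full((4, 2), 255.0, dtype=np.float32)
for data in data_iter(datas_copied, 2):
    data_2 = data.copy()
    data_2 -= 127.5
    data_2 *= 0.0078125

print(datas_in_place[0])  # already normalized, so a second pass would normalize twice
print(datas_copied[0])    # still raw pixel values
```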

pertrained models is lost

Dear auroua:
The pretrained models you provide (Baidu Yun) cannot be downloaded (is a password needed?). Can you fix this?

pb file size too big

Hi @auroua

Thanks for all your work. I tried to freeze the model C ckpt as a pb file. The pb file I got is ~700MB, which is much bigger than the MXNet model size (~100MB). Is this normal? Is there any way to reduce the file size?

About the parse_function

Hi, could you tell me how to modify values in a tensor the way one does with numpy?
I want to add the cutout method to the parse_function, as in the paper "https://arxiv.org/abs/1708.04552"
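TensorFlow tensors do not support item assignment, so a common workaround is to build a 0/1 mask from index comparisons and multiply the image by it. The sketch below shows the idea in NumPy; the function name and the patch parameters are hypothetical, and each step has a direct TensorFlow counterpart (tf.range, comparisons, tf.logical_and, tf.cast, multiplication), so the same pattern should carry over to a parse function.

```python
import numpy as np


def cutout_mask(height, width, top, left, size):
    """Return an (H, W, 1) float mask that is 0 inside a square patch and 1 elsewhere."""
    rows = np.arange(height)[:, None]   # column vector of row indices
    cols = np.arange(width)[None, :]    # row vector of column indices
    inside = (rows >= top) & (rows < top + size) & (cols >= left) & (cols < left + size)
    return (~inside).astype(np.float32)[..., None]


# Apply a 16x16 cutout to a dummy 112x112 RGB image of ones.
image = np.ones((112, 112, 3), dtype=np.float32)
mask = cutout_mask(112, 112, top=40, left=30, size=16)
cut = image * mask  # broadcasting zeroes the patch in all three channels

print(cut[40, 30], cut[39, 30])
```

In the cutout paper the patch position is sampled at random for each image; here it is fixed to keep the example deterministic.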

vgg16 network performance with arcface loss

Hi, considering speed, I want to use VGG16 trained on a dataset without alignment. I have trained a VGG16 model without FC layers on the MSCeleb aligned face images (using the top 2000 classes) and achieved a testing accuracy of 86.2% on one fold of unaligned LFW. Now I want to use the ArcFace loss to improve the performance, with VGGFace2 as the training set (no alignment). The VGG has no FC layers and is followed by the ArcFace loss and softmax, the margin is set to 0.3 (as mentioned in the issues), and the trained model will be tested on one fold of unaligned LFW. The optimizer is the same as yours.
Question 1: Have you tried the VGG16 model with the ArcFace loss, and what accuracy did you get? Could my network's accuracy reach more than 97%?
Question 2: Do you have any suggestions for me?
Many thanks!

Accuracy remains 0.00000

Hi the accuracy is constantly 0.00, the output is as follows:

epoch 0, total_step 5340, total loss is 41.79 , inference loss is 34.00, weight deacy loss is 7.79, training accuracy is 0.000000, time 98.567 samples/sec

I have the new code as well as the bug fix from issue #10, yet the accuracy remains 0. Should I wait for more steps, or is this some other issue? My number of output classes is 10575 rather than the usual 85164.

train_nets.py

net = get_resnet(images,.....)
net = BatchNormLayer()
Layer.__init__(self, name=name)
tensorlayer/decorators/deprecated_alias.py, line 24, in wrapper
return f(*args, **kwargs)
TypeError: __init__() missing 1 required positional argument: 'prev_layer'

How can I test my model?

I have trained my model, but I can't find the code to test it. Can you help me? @auroua

Do you know why the accuracy is lower than the original project

@auroua For example, the accuracy on LFW in the original project can reach 99.8%. Do you know what's missing here?

samples/sec is around 39, but original is up to 400

My GPU is a Tesla P40 with 24 GB of memory.
When I run train_nets.py on a single GPU, it reaches about 39 samples/sec, and it is not much faster on 4 GPUs with train_nets_mgpu_new.py (still around 40). I also noticed that CPU utilization is very low (100%) compared to the original insightface (1200%) on a 48-core CPU.

At first I thought this was due to slow data flow, so I changed the data feeding code to tensorflow.python.ops.data_flow_ops.FIFOQueue, borrowed from facenet, but it did not help and was even slower.

For comparison, I get 200 samples/sec on facenet with inception_resnet_v2
and 400 samples/sec on the original insightface with resnet_100.
I wonder what the key factor slowing things down is.

Could you give some tips? Thanks.

mm = sin_m * m

mm = sin_m * m
cos_t - mm

What does this mean? Thank you!
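A sketch of where these two lines fit, following the way the ArcFace margin is commonly implemented. The target logit cos(θ + m) only keeps decreasing with θ while θ + m ≤ π; past that point, implementations of this style switch to the fallback cos θ - mm, with mm = sin(π - m) · m (which equals sin(m) · m). The function name, the default values m = 0.5 and s = 64, and the sample inputs are assumptions for illustration, not the repository's exact code.

```python
import math

import numpy as np


def arcface_target_logit(cos_t, m=0.5, s=64.0):
    """Scaled target-class logit with an additive angular margin (illustrative sketch)."""
    cos_m, sin_m = math.cos(m), math.sin(m)
    threshold = math.cos(math.pi - m)   # cos(theta) below this means theta + m > pi
    mm = math.sin(math.pi - m) * m      # fallback penalty, equal to sin_m * m
    sin_t = np.sqrt(1.0 - np.square(cos_t))
    cos_mt = cos_t * cos_m - sin_t * sin_m   # cos(theta + m) via the angle-sum identity
    keep_val = cos_t - mm                    # used where cos(theta + m) would stop decreasing
    # The scale s is applied exactly once, to whichever branch is selected.
    return s * np.where(cos_t > threshold, cos_mt, keep_val)


cos_t = np.array([0.9, 0.5, -0.95])
print(arcface_target_logit(cos_t))
```

Because keep_val is multiplied by s at the end, folding another factor of s into mm would scale the penalty twice, which is the point raised in the "mm = sin_m * m * s" issue above.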

unzip error of ckpt_model_b.zip

Hi,
I got an error when I unzipped the file ckpt_model_b.zip, which was downloaded from Baidu Wangpan:
F:\BaiduNetdiskDownload\ckpt_model_b(1).zip: Checksum error in file F:\BaiduNetdiskDownload\ckpt_model_b(1)\InsightFace_iter_1100000.ckpt.data-00000-of-00001. The file is corrupt.
F:\BaiduNetdiskDownload\ckpt_model_b(1).zip: The archive is corrupt
I downloaded it again and again, but the result was the same.

what is the node name to get embeddings in model D?

should it be "resnet_v1_50/E_BN2/Identity"
or "arcface/norm_embedding"?
@auroua

about consine face implementation

Hi, I checked all the issues and thought a lot about cosine face, but I still cannot understand the logic behind tf.multiply(cos_t, inv_mask). Is there any [reference] for this implementation?

face_losses.py ---> function cosineface_losses ---> line:76

thanks ~
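A plausible reading of the expression quoted in an earlier issue, s * tf.multiply(cos_t, inv_mask) + s * tf.multiply(cos_t_m, mask): if mask is the one-hot encoding of the labels and inv_mask is its complement (an assumption, since their definitions are not shown on this page), then the first product keeps the plain cosine for the non-target classes and the second inserts the margin-penalised cosine for the target class only. A NumPy sketch under that assumption, with placeholder values for m and s:

```python
import numpy as np


def cosineface_logits(cos_t, labels, m=0.35, s=30.0):
    """Subtract the margin m from the target-class cosine only, then scale by s."""
    num_classes = cos_t.shape[1]
    mask = np.eye(num_classes, dtype=cos_t.dtype)[labels]  # one-hot: 1 at the target class
    inv_mask = 1.0 - mask                                   # 1 everywhere else
    cos_t_m = cos_t - m
    return s * cos_t * inv_mask + s * cos_t_m * mask


# Two samples, three classes; targets are class 0 and class 1.
cos_t = np.array([[0.8, 0.1, -0.2],
                  [0.3, 0.6, 0.0]])
labels = np.array([0, 1])
logits = cosineface_logits(cos_t, labels)
print(logits)
```

Only one entry per row differs from s * cos_t, so the margin is not subtracted from every value, only from the ground-truth class.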

Why increase global_step twice?

In train_nets.py, the variable 'global_step' is increased twice in each training loop:

inc_op = tf.assign_add(global_step, 1, name='increment_global_step')
train_op = opt.apply_gradients(grads, global_step=global_step)

Why?

UnicodeEncodeError in mx2tfrecords.py

When I execute mx2tfrecords.py and it reaches "images, labels = sess.run(next_element)", an error occurs:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_util.py:560: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
return np.fromstring(tensor.tensor_content, dtype=dtype).reshape(shape)
Traceback (most recent call last):

File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2927, in run_code
self.showtraceback(running_compiled_code=True)

File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 1833, in showtraceback
self._showtraceback(etype, value, stb)

File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 558, in _showtraceback
dh.parent_header, ident=topic)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 737, in send
to_send = self.serialize(msg, ident)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 625, in serialize
content = self.pack(content)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 103, in
ensure_ascii=False, allow_nan=False,

File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\utils\jsonapi.py", line 43, in dumps
s = s.encode('utf8')

UnicodeEncodeError: 'utf-8' codec can't encode character '\udca8' in position 2124: surrogates not allowed

net = get_resnet TypeError: __init__() takes at least 2 arguments (2 given)

[TL] InputLayer resnet_v1_50/input_layer: (?, 112, 112, 3)
[TL] Conv2d resnet_v1_50/conv1: shape:(3, 3, 3, 64) strides:(1, 1, 1, 1) pad:SAME act:identity
Traceback (most recent call last):
File "train_nets.py", line 77, in <module>
net = get_resnet(images, args.net_depth, type='ir', w_init=w_init_method, trainable=True, keep_rate=dropout_rate)
File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 399, in get_resnet
scope='resnet_v1_%d' % num_layers)
File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 301, in resnet
net = BatchNormLayer(net, act=tf.identity, name='bn0', is_train=True, trainable=trainable)
File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 110, in __init__
Layer.__init__(self, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorlayer/deprecation.py", line 24, in wrapper
return f(*args, **kwargs)
TypeError: __init__() takes at least 2 arguments (2 given)

my version:
tensorflow 1.4 tensorlayer 1.8.4

If someone has had the same problem, please give me some solutions. Thanks.

About the new model mobilefacenet of InsightFace

Hi, I just ported the code for the new mobile-facenet model of InsightFace "https://github.com/deepinsight/insightface/blob/master/src/symbols/fmobilefacenet.py"
to a TF version as follows:
"https://github.com/HsuTzuJen/Face_Recognition_Practice_with_TF/blob/master/Nets/mobilefacenet_test.py"
I set the training parameters as in the original paper "https://arxiv.org/abs/1804.07573", but it does not converge correctly.
Can you please help me find out what is going on?

Key resnet_v1_50 not found in checkpoint

I ran eval_ckpt_file.py and the console shows:

NotFoundError (see above for traceback): Key resnet_v1_50/block1/unit_1/bottleneck_v1/PReluLayer/alpha not found in checkpoint [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]] [[Node: save/RestoreV2/_301 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_306_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

What should I do to fix the problem?

tran.tfrecords not found error

Hi, sir
When I run "train_nets.py", I get the following errors:

2018-06-11 16:48:15.835525: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
2018-06-11 16:48:15.836087: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
2018-06-11 16:48:24.161151: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
Traceback (most recent call last):
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_nets.py", line 167, in <module>
images_train, labels_train = sess.run(next_element)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'IteratorGetNext', defined at:
File "train_nets.py", line 65, in <module>
next_element = iterator.get_next()
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 370, in get_next
name=name)), self._output_types,
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1466, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Where can I get "tran.tfrecords"? Thank you.
Jimmy

Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION

Hi, when I run train_nets.py, it fails with a CUDA_ERROR_ILLEGAL_INSTRUCTION error. I tried my best to solve it by searching the TensorFlow issues on GitHub, but nothing worked. I wonder whether the problem is my computer: AMD Ryzen 1600, GTX 1060 6G. Would it work with an Intel CPU? I don't know. Please help me, thanks.

__init__() missing 1 required positional argument: 'prev_layer'

__init__() missing 1 required positional argument: 'prev_layer'

what is output_node_names about mode D

I want to produce a pb file of the model.

Question (問題請教)

May I ask why this train_op is used instead of the other train_op below?

grads = opt.compute_gradients(total_loss)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = opt.apply_gradients(grads, global_step=global_step)

and not this one?
# train_op = opt.minimize(total_loss, global_step=global_step)

zero accuracy

I trained your post r50 model from scratch on ms1m dataset (run train_nets.py). After one epoch (170k steps), the printed training accuracy is still 0.000000.

Accuracy result?

Hello, have you finished training? Are you going to publish results compared to the original InsightFace?
Also, the pretrained model links are dead. Could you please fix them? Thank you.

How can i restore the model_c to continue train and do test?

I mean, which network definition was used to train model_c: L_Resnet_E_IR.py or some other *.py?
When I use the latest code to restore it as follows:

if args.ckpt_file:
    model_path = tf.train.latest_checkpoint(args.ckpt_file)
    print('restore model from model_path:{} {}'.format(model_path, args.ckpt_file))
    saver.restore(sess, model_path)
else:
    print('re train model')
    sess.run(tf.global_variables_initializer())

where model_path is InsightFace_iter_best_1950000.ckpt, it fails, and TensorFlow reports:
Key resnet_v1_50/block1/unit_1/bottleneck_v1/conv1/kernel not found in checkpoint

So can you tell me the right way to restore from model_c? And is there any difference in the model layers between the multi-GPU and single-GPU versions? Thank you very much!

datetime

https://github.com/auroua/InsightFace_TF#2017-03-30

Is the date really 2017?