
insightface_tf's People

Contributors

auroua

insightface_tf's Issues

Accuracy result?

Hello, have you finished training? Are you going to publish results compared to the original InsightFace?
Also, the pretrained model links are dead. Could you please fix them? Thank you.

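Cannot converge when using arcfaceloss

I used the arcfaceloss function in my own code. The mobilefacenet network does not converge, and with other networks the loss is NaN, but softmax works without problems. Is there anything I need to watch out for here? Thanks.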

The capacity of parameters

Hi, the model size of InsightFace_MX is about 160MB, but IL_Resnet_E_IR_GBN is about 1.6GB.
Since they have almost the same architecture, what makes them differ in size?
Thank you for your response!!

How can I restore model_c to continue training and run tests?

I mean, which model definition was model_c trained with? L_Resnet_E_IR.py or some other *.py?
When I use the latest code to restore it as follows:

if args.ckpt_file:
    model_path = tf.train.latest_checkpoint(args.ckpt_file)
    print('restore model from model_path:{} {}'.format(model_path, args.ckpt_file))
    saver.restore(sess, model_path)
else:
    print('re train model')
    sess.run(tf.global_variables_initializer())
where model_path is InsightFace_iter_best_1950000.ckpt, it fails and TensorFlow reports:
Key resnet_v1_50/block1/unit_1/bottleneck_v1/conv1/kernel not found in checkpoint

So, can you tell me how to restore model_c correctly? And is there any difference
in the model layers between the multi-GPU and single-GPU versions? Thank you very much!

unzip error of ckpt_model_b.zip

Hi,
I got an error when unzipping ckpt_model_b.zip, which I downloaded from Baidu Wangpan:
F:\BaiduNetdiskDownload\ckpt_model_b(1).zip: Checksum error in file F:\BaiduNetdiskDownload\ckpt_model_b(1)\InsightFace_iter_1100000.ckpt.data-00000-of-00001. The file is corrupt.
F:\BaiduNetdiskDownload\ckpt_model_b(1).zip: The archive is corrupt.
I downloaded it again and again, but the result was the same.

Question

May I ask why this train_op is used instead of the other train_op below?

grads = opt.compute_gradients(total_loss)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = opt.apply_gradients(grads, global_step=global_step)

rather than this one?
# train_op = opt.minimize(total_loss, global_step=global_step)
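
A note on the two variants: in TF1, opt.minimize() is documented as compute_gradients() followed by apply_gradients(), so the visible difference in the snippet is the tf.control_dependencies(update_ops) wrapper. Assuming the batch-norm layers in this network register their moving-average updates in the UPDATE_OPS collection (an assumption, not confirmed on this page), those updates only run when something depends on them. The toy below is plain Python, not TensorFlow; ToyBatchNorm and train are hypothetical names used only to show what happens to a moving mean when the update step is never executed.

```python
class ToyBatchNorm:
    """Toy stand-in for a batch-norm layer with a moving mean."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.moving_mean = 0.0  # value that would be used at inference time

    def update_op(self, batch):
        # Exponential moving average of the batch mean.
        batch_mean = sum(batch) / len(batch)
        self.moving_mean = (self.decay * self.moving_mean
                            + (1.0 - self.decay) * batch_mean)


def train(batches, run_update_ops):
    bn = ToyBatchNorm()
    for batch in batches:
        # The gradient step itself is omitted in this toy.
        if run_update_ops:
            bn.update_op(batch)
    return bn.moving_mean


batches = [[10.0, 12.0]] * 50
with_updates = train(batches, run_update_ops=True)
without_updates = train(batches, run_update_ops=False)
print(with_updates, without_updates)
```

With the updates skipped, the moving mean never leaves its initial value, which in a real network would tend to hurt evaluation while training loss still looks normal.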

About accuracy?

Hi, I am so interested in your code. Can you please tell me if it can achieve the same accuracy as in the paper?

Pretrained models are missing

Dear auroua:
The pretrained models you provide (Baidu Yun) cannot be downloaded (is a password needed?). Can you fix this?

net = get_resnet TypeError: __init__() takes at least 2 arguments (2 given)

[TL] InputLayer resnet_v1_50/input_layer: (?, 112, 112, 3)
[TL] Conv2d resnet_v1_50/conv1: shape:(3, 3, 3, 64) strides:(1, 1, 1, 1) pad:SAME act:identity
Traceback (most recent call last):
  File "train_nets.py", line 77, in <module>
    net = get_resnet(images, args.net_depth, type='ir', w_init=w_init_method, trainable=True, keep_rate=dropout_rate)
  File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 399, in get_resnet
    scope='resnet_v1_%d' % num_layers)
  File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 301, in resnet
    net = BatchNormLayer(net, act=tf.identity, name='bn0', is_train=True, trainable=trainable)
  File "/home/ai/InsightFace_TF/nets/L_Resnet_E_IR_fix_issue9.py", line 110, in __init__
    Layer.__init__(self, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorlayer/deprecation.py", line 24, in wrapper
    return f(*args, **kwargs)
TypeError: __init__() takes at least 2 arguments (2 given)

my version:
tensorflow 1.4 tensorlayer 1.8.4

If anyone has had the same problem, please share a solution. Thanks.

face_losses.py

Maybe there is a mistake in cosineface_losses()?
output = tf.add(s * tf.multiply(cos_t, inv_mask), s * tf.multiply(cos_t_m, mask), name='cosineface_loss_output')
But in AMSoftmax, if(top_data[i * dim + gt] > -bias_) top_data[i * dim + gt] += bias_; so not every value has the margin subtracted.
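
To make the quoted line easier to reason about, here is a minimal plain-Python sketch of what the mask arithmetic does for a single sample. It assumes mask is the one-hot encoding of the ground-truth label, inv_mask is 1 - mask, and cos_t_m is cos_t - m; those definitions are not shown in the issue, and the function name and default values of s and m are hypothetical.

```python
def cosineface_logits(cos_t, label, s=30.0, m=0.35):
    """Apply an additive cosine margin to the target class only."""
    logits = []
    for j, c in enumerate(cos_t):
        mask = 1.0 if j == label else 0.0   # one-hot entry for class j
        inv_mask = 1.0 - mask
        cos_t_m = c - m                     # margin-shifted cosine
        # Same structure as: s * cos_t * inv_mask + s * cos_t_m * mask
        logits.append(s * c * inv_mask + s * cos_t_m * mask)
    return logits


print(cosineface_logits([0.8, 0.1, -0.2], label=0))
```

Under these assumptions only the ground-truth entry is shifted by the margin and every other class keeps s * cos_t, so the two masked terms act as a per-class selection rather than subtracting the margin everywhere.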

About the pre-processing of the input faces

I want to use the pre-trained model to test my own data, but I don't know the pre-processing method. For example, what is the normalization scheme: subtracting the RGB mean of the images, or scaling to [0, 1]? And which alignment method do you use?
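
For reference, the verification.py lines quoted in another issue on this page subtract 127.5 and multiply by 0.0078125. The sketch below only shows what range that arithmetic produces; normalize_pixel is a hypothetical helper name, and whether the pretrained models expect exactly this scaling, which channel order, and which alignment method is not stated here.

```python
def normalize_pixel(value):
    """Map a pixel value in [0, 255] to roughly [-1, 1]."""
    return (value - 127.5) * 0.0078125   # 0.0078125 == 1 / 128


for v in (0, 127.5, 255):
    print(v, normalize_pixel(v))
```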

training output accuracy always zero

epoch 0, total_step 2020, total loss is 204.23 , inference loss is 23.59, weight deacy loss is 180.64, training accuracy is 0.000000, time 31.271 samples/sec
epoch 0, total_step 2040, total loss is 202.41 , inference loss is 21.94, weight deacy loss is 180.47, training accuracy is 0.000000, time 32.147 samples/sec
epoch 0, total_step 2060, total loss is 210.42 , inference loss is 30.12, weight deacy loss is 180.30, training accuracy is 0.000000, time 32.219 samples/sec
epoch 0, total_step 2080, total loss is 198.15 , inference loss is 18.02, weight deacy loss is 180.13, training accuracy is 0.125000, time 32.244 samples/sec
epoch 0, total_step 2100, total loss is 205.67 , inference loss is 25.72, weight deacy loss is 179.96, training accuracy is 0.000000, time 30.409 samples/sec

samples/sec is around 39, but original is up to 400

My GPU is a Tesla P40 with 24GB of memory.
When I run train_net.py on a single GPU I get about 39 samples/sec, and train_nets_mgpu_new.py on 4 GPUs is not much faster (still around 40). I also notice that CPU utilization is very low (100%) compared to the original insightface (1200%) on a 48-core CPU.

At first I thought the data feed was slow, so I changed the data feed code to tensorflow.python.ops.data_flow_ops.FIFOQueue, borrowed from facenet, but it did not help and was even slower.

For comparison, I get 200 samples/sec on facenet with inception_resnet_v2
and 400 samples/sec on the original insightface with resnet_100.
I wonder what the key factor slowing things down is.

Could you give some tips? Thanks.

zero accuracy

I trained your posted r50 model from scratch on the ms1m dataset (running train_nets.py). After one epoch (170k steps), the printed training accuracy is still 0.000000.

About the training accuracy

It is still 0.000000 after a few epochs (e.g. after 7 epochs).
Should we use
acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(pred, axis=1), tf.argmax(tf.one_hot(labels, depth=args.num_output), 1)), dtype=tf.int64))
instead of
acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(pred, axis=1), labels_s[i]), dtype=tf.int64))
?

I still do not know how to fix it.
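
One thing that may be worth checking in both variants above is the dtype=tf.int64 cast. tf.reduce_mean keeps the dtype of its input, so averaging an integer tensor uses integer arithmetic and truncates the result. The plain-Python sketch below mimics that effect with floor division; it is an illustration of the arithmetic, not the repository's code, and it does not rule out that accuracy is simply very low early in training with tens of thousands of classes.

```python
# 2 of 8 predictions are correct.
correct = [1, 0, 0, 1, 0, 0, 0, 0]

# Integer mean, analogous to averaging a tensor cast to an integer dtype.
int_mean = sum(correct) // len(correct)

# Float mean, analogous to casting to a float dtype before averaging.
float_mean = sum(float(c) for c in correct) / len(correct)

print(int_mean, float_mean)  # 0 0.25
```

If this is the cause, casting the comparison result to a float type before averaging would be the change to try.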

Accuracy remains 0.00000

Hi, the accuracy is constantly 0.00. The output is as follows:

epoch 0, total_step 5340, total loss is 41.79 , inference loss is 34.00, weight deacy loss is 7.79, training accuracy is 0.000000, time 98.567 samples/sec

I have the new code as well as the bug fix from issue #10, yet the accuracy remains 0. Should I wait for more steps, or is this some other issue? My number of output classes is 10575, rather than the usual 85164.

About the new model mobilefacenet of InsightFace

Hi, I just modified the code for the new mobile-facenet model of InsightFace, "https://github.com/deepinsight/insightface/blob/master/src/symbols/fmobilefacenet.py",
into a TF version as follows:
"https://github.com/HsuTzuJen/Face_Recognition_Practice_with_TF/blob/master/Nets/mobilefacenet_test.py"
I set the training parameters as in the original paper, "https://arxiv.org/abs/1804.07573", but it does not converge correctly.
Can you please help me find out what is going on?

mm = sin_m * m

mm = sin_m * m
cos_t - mm

What does this mean? Thank you!
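
A possible reading, sketched in plain Python: the additive angular margin replaces cos(theta) with cos(theta + m), but cos(theta + m) stops decreasing once theta + m passes pi. A common workaround is to switch to a linear penalty cos(theta) - mm in that region. The threshold cos(pi - m) and the function name below are assumptions for illustration; only mm = sin_m * m and cos_t - mm come from the issue.

```python
import math


def arc_margin_cosine(cos_t, m=0.5):
    """Margin-adjusted cosine for the target class (before scaling by s)."""
    cos_m, sin_m = math.cos(m), math.sin(m)
    mm = sin_m * m                       # linear penalty for the fallback
    threshold = math.cos(math.pi - m)    # below this, theta + m exceeds pi
    sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
    if cos_t > threshold:
        # cos(theta + m) = cos(theta)cos(m) - sin(theta)sin(m)
        return cos_t * cos_m - sin_t * sin_m
    return cos_t - mm


print(arc_margin_cosine(math.cos(1.0)))  # regular branch, equals cos(1.5)
print(arc_margin_cosine(math.cos(3.0)))  # fallback branch, cos(3.0) - mm
```

Under this reading, cos_t - mm keeps the adjusted value below cos_t for large angles, so the target logit is still penalised there.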

megaface challenge1

Hi, can you provide the lr and lr_step parameters for model D? I used your model D on MegaFace Challenge 1 and got rank 1: 93.0523%. Have you run MegaFace Challenge 1 on your model? If so, can you give me the rank 1 result?

train_nets.py

net = get_resnet(images,.....)
net = BatchNormLayer()
Layer.__init__(self, name=name)
tensorlayer/decorators/deprecated_alias.py, line 24, in wrapper
return f(*args, **kwargs)
TypeError: __init__() missing 1 required positional argument: 'prev_layer'

error nan loss

When I use arcface loss on MNIST, the loss becomes NaN. Have you tested the accuracy of your code?

about cosine face implementation

Hi, I checked all the issues and thought a lot about cosine face, but I still cannot understand the logic behind tf.multiply(cos_t, inv_mask). Is there any [reference] for this implementation?

face_losses.py ---> function cosineface_losses ---> line:76

thanks ~

UnicodeEncodeError in mx2tfrecords.py

When I execute mx2tfrecords.py and it reaches "images, labels = sess.run(next_element)", an error occurs:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_util.py:560: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
return np.fromstring(tensor.tensor_content, dtype=dtype).reshape(shape)
Traceback (most recent call last):

File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2927, in run_code
self.showtraceback(running_compiled_code=True)

File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 1833, in showtraceback
self._showtraceback(etype, value, stb)

File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 558, in _showtraceback
dh.parent_header, ident=topic)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 737, in send
to_send = self.serialize(msg, ident)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 625, in serialize
content = self.pack(content)

File "C:\ProgramData\Anaconda3\lib\site-packages\jupyter_client\session.py", line 103, in
ensure_ascii=False, allow_nan=False,

File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\utils\jsonapi.py", line 43, in dumps
s = s.encode('utf8')

UnicodeEncodeError: 'utf-8' codec can't encode character '\udca8' in position 2124: surrogates not allowed

about distance calculation in verification.py script

Looking at lines 50 and 51, the script uses
diff = np.subtract(embeddings1, embeddings2)
dist = np.sum(np.square(diff), 1)
to calculate the distance between two embeddings, but why not apply an extra sqrt to dist,
like this?
dist = np.sqrt(np.sum(np.square(diff), 1))

Euclidean distance needs an extra sqrt operation, doesn't it?
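
A short sketch of why the square root may be optional for verification: sqrt is monotonic on non-negative numbers, so comparing the squared distance against a threshold t*t gives the same decisions as comparing the true Euclidean distance against t. The vectors and the threshold below are made-up illustration values, not taken from verification.py.

```python
import math


def squared_l2(a, b):
    """Sum of squared differences, as in np.sum(np.square(diff), 1)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


pairs = [
    ([0.1, 0.9], [0.2, 0.8]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.5, 0.5], [0.5, 0.4]),
]
t = 0.5  # hypothetical Euclidean threshold

by_squared = [squared_l2(a, b) < t * t for a, b in pairs]
by_euclid = [math.sqrt(squared_l2(a, b)) < t for a, b in pairs]
print(by_squared == by_euclid)
```

The numeric value of the threshold differs between the two conventions, so a threshold tuned on squared distances should not be reused on true distances.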

Not Found Error in Model D Checkpoint (710K)

I get this error when trying to run eval_ckpt_file.py:

NotFoundError (see above for traceback): Key resnet_v1_50/block1/unit_1/bottleneck_v1/shortcut_bn/BatchNorm/beta not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
	 [[Node: save/RestoreV2/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_84_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

verification.py

Hi, thanks for your work!
I found a bug in your code.
In verification.py->test
data-=127.5
data *= 0.0078125
You have to use a copy so that datas is not modified at the same time.

Here is the example:

for idx, data in enumerate(data_iter(datas, batch_size)):
    if idx % 100 == 0:
        print('idx=', idx)
    data_2 = data.copy()
    data_2 -= 127.5
    data_2 *= 0.0078125
    feed_dict[input_placeholder] = data_2
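
The effect described above can be reproduced with NumPy alone, assuming data_iter yields basic slices of datas (an assumption; its implementation is not shown here). Basic slicing returns a view, so in-place operators on the batch also change the source array, while .copy() gives an independent buffer.

```python
import numpy as np

datas = np.full((4, 2), 255.0, dtype=np.float32)

# Basic slicing returns a view: in-place ops write through to datas.
batch = datas[0:2]
batch -= 127.5
batch *= 0.0078125

# A copy owns its memory, so datas[2:4] is left untouched.
safe = datas[2:4].copy()
safe -= 127.5
safe *= 0.0078125

print(datas)
```

After this runs, the first two rows of datas hold normalised values and the last two still hold 255, which is why a second pass over the same array would normalise already-normalised data unless a copy is used.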

Regarding licence of model file and dataset.

Hi Chen Wei,
I have gone through the repository and was unable to find the license used here.
1. May I know the license used here? Is the model file also covered under that license?
2. To your knowledge, what is the licensing of the datasets you used?

Pre-training model

When I load resnet_v1_50.ckpt into my model, I get this error in PyCharm:
NotFoundError (see above for traceback): Tensor name "arcface_loss/embedding_weights" not found in checkpoint files ./output/ckpt/resnet_v1_50.ckpt
[[Node: save_1/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2/tensor_names, save_1/RestoreV2/shape_and_slices)]]

What should I do, please?

vgg16 network performance with arcface loss

Hi, considering speed, I want to use vgg16 trained on a dataset without alignment. I have trained the vgg16 model without fc layers on MSCeleb Aligned Face Images (using the top 2000 classes) and achieved a testing accuracy of 86.2% on one fold of unaligned LFW. Now I want to use arcface loss to improve the performance. I chose vggface2 as the training set (no alignment); the vgg network has no fc layers and is followed by arcface loss and softmax, with the margin set to 0.3 (as mentioned in the issues). The trained model will be tested on one fold of unaligned LFW, and the optimizer is the same as yours.
Question 1: have you tried the vgg16 model with arcface loss, and what was your accuracy? Could my network's accuracy reach more than 97%?
Question 2: do you have any suggestions for me?
Many thanks!

bgr or rgb

does pretrained model d use bgr or rgb image in preprocessing step? thanks!

About cosineface_losses in face_losses.py

@auroua
Is this implemented as cos θ - m (Additive Margin Softmax for Face Verification)?
Do you have any idea how to achieve cos θ1 - m = cos θ2 (CosFace: Large Margin Cosine Loss for Deep Face Recognition)?

pb file size too big

Hi @auroua

Thanks for all your work. I tried to freeze the model-C ckpt as a pb file. The pb file I got is ~700MB, which is much bigger than the MXNet model (~100MB). Is this normal? Is there any way to reduce the file size?
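
One thing that could be checked is whether training-only tensors end up in the frozen graph. Another issue on this page reports a tensor arcface_loss/embedding_weights of shape (512, 85164); the arithmetic below estimates its size, assuming float32 storage. Whether this tensor is actually part of the pb file is an assumption, and it would explain only part of the gap.

```python
embedding_dim = 512
num_classes = 85164
bytes_per_float32 = 4

weight_bytes = embedding_dim * num_classes * bytes_per_float32
print(weight_bytes)                   # 174415872
print(round(weight_bytes / 2**20))    # size in MiB, rounded
```

The same byte count, 174415872, appears in the allocator warnings quoted in the tran.tfrecords issue on this page, which suggests that a tensor of this shape is allocated at training time.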

tran.tfrecords not found error

Hi, sir
When I run "train_nets.py", I get the following errors:

2018-06-11 16:48:15.835525: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
2018-06-11 16:48:15.836087: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
2018-06-11 16:48:24.161151: W tensorflow/core/framework/allocator.cc:101] Allocation of 174415872 exceeds 10% of system memory.
Traceback (most recent call last):
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_nets.py", line 167, in <module>
images_train, labels_train = sess.run(next_element)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'IteratorGetNext', defined at:
File "train_nets.py", line 65, in <module>
next_element = iterator.get_next()
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 370, in get_next
name=name)), self._output_types,
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1466, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/jimmy/anaconda3/envs/mlwork/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): ./datasets/tfrecords/tran.tfrecords; No such file or directory
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,112,112,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Where can I get "tran.tfrecords"? Thank you.
Jimmy

Why increase global_step twice?

In train_nets.py, the variable 'global_step' is increased twice in each training loop:

  1. inc_op = tf.assign_add(global_step, 1, name='increment_global_step')
  2. train_op = opt.apply_gradients(grads, global_step=global_step)

Why?

Key resnet_v1_50 not found in checkpoint

I ran eval_ckpt_file.py and the console shows:

NotFoundError (see above for traceback): Key resnet_v1_50/block1/unit_1/bottleneck_v1/PReluLayer/alpha not found in checkpoint [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]] [[Node: save/RestoreV2/_301 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_306_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

What should I do to fix the problem?

How to test whether two images show the same face?

Hello. First, thanks for the good project! 👍

I cloned it and downloaded the pretrained "model A".
But I wonder how to test whether two images show the same face.

Here is my code.

sess = tf.Session()    
saver = tf.train.import_meta_graph('./pretrained_models/InsightFace_iter_1060000.ckpt.meta')
saver.restore(sess, './pretrained_models/InsightFace_iter_1060000.ckpt')

input_images = tf.get_default_graph().get_tensor_by_name('img_inputs:0')

image = Image.open('./test_image.jpg')
image = image.resize((112, 112))

images = [np.array(image)]

outputs = sess.run('arcface_loss/embedding_weights:0', feed_dict={input_images: images})    

But the output shape is always (512, 85164)

Would you let me know how to test images?
Thanks!
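
A hedged observation: a tensor of shape (512, 85164) looks like a classification weight matrix (embedding size by number of classes) rather than a per-image output, so the fetch target would probably need to be the network's embedding tensor instead. Its name is not given on this page, so the sketch below starts from two already-computed embedding vectors and compares them by cosine similarity in plain Python. The function names and the 0.5 threshold are hypothetical; a real threshold would have to be tuned on a validation set.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def same_person(emb1, emb2, threshold=0.5):
    """Decide a match from two embeddings; threshold is illustrative only."""
    return cosine_similarity(emb1, emb2) >= threshold


print(same_person([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # parallel vectors
print(same_person([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors
```

Both images would also need the same pre-processing the model was trained with before their embeddings are computed.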
