
arcface-tf2's People

Contributors

ali-fayzi, peteryuX


arcface-tf2's Issues

trouble in running dataset_cheaker.py

tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __inference_Dataset_map_parse_tfrecord_98}} Feature: image/encoded (data type: string) is required but could not be found.
[[{{node ParseSingleExample/ParseSingleExample}}]] [Op:IteratorGetNextSync]
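Note: this error usually means the tfrecord being read was not written with the 'image/encoded' feature the parser expects. Below is a minimal sketch of writing a compatible record; only the 'image/encoded' key is confirmed by the error message, and the 'image/source_id' label key is an assumption for illustration.

import tensorflow as tf

def make_example(img_path, source_id):
    # Store the already-encoded JPEG bytes; do not decode and re-serialize the array.
    img_bytes = open(img_path, 'rb').read()
    feature = {
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[img_bytes])),
        'image/source_id': tf.train.Feature(  # hypothetical label key
            int64_list=tf.train.Int64List(value=[source_id])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

with tf.io.TFRecordWriter('./data/sample.tfrecord') as writer:
    writer.write(make_example('face.jpg', 0).SerializeToString())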

bad result from lfw dataset

Alicia_Silverstone_0001
Alicia_Silverstone_0002

Hello @peteryuX, I tried to test accuracy on two images of one person from the LFW dataset.
The distance is large.

my test code:

from absl import app, flags, logging
from absl.flags import FLAGS
import cv2
import os
import numpy as np
import tensorflow as tf

from modules.evaluations import get_val_data, perform_val
from modules.models import ArcFaceModel
from modules.utils import set_memory_growth, load_yaml, l2_norm
from scipy.spatial.distance import cosine


flags.DEFINE_string('cfg_path', './configs/arc_res50.yaml', 'config file path')
flags.DEFINE_string('gpu', '0', 'which gpu to use')
flags.DEFINE_string('img_path', '', 'path to input image')


def main(_argv):
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
    os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.gpu

    logger = tf.get_logger()
    logger.disabled = True
    logger.setLevel(logging.FATAL)
    set_memory_growth()

    cfg = load_yaml(FLAGS.cfg_path)

    model = ArcFaceModel(size=cfg['input_size'],
                         backbone_type=cfg['backbone_type'],
                         training=False)

    ckpt_path = tf.train.latest_checkpoint('./checkpoints/')
    if ckpt_path is not None:
        print("[*] load ckpt from {}".format(ckpt_path))
        model.load_weights(ckpt_path)
    else:
        print("[*] Cannot find ckpt from {}.".format(ckpt_path))
        exit()
    image_fol = "./tmp"
    paths = os.listdir(image_fol)
    embeds = []
    images = []
    flip_images = []
    for path in paths:
        print(path)
        img = cv2.imread(os.path.join(image_fol, path))
        img = cv2.resize(img, (cfg['input_size'], cfg['input_size']))
        img = img.astype(np.float32) / 255.
        # if len(img.shape) == 3:
        #     img = np.expand_dims(img, 0)
        # embeds.append(l2_norm(model(img)))
        images.append(img)
    images = np.array(images)
    def hflip_batch(imgs):
        # Horizontally flip a batch of NHWC images.
        assert len(imgs.shape) == 4
        return imgs[:, :, ::-1, :]

    flip_images = hflip_batch(images)
    # Fuse embeddings of the original and flipped images, then re-normalize.
    embeds = model(images) + model(flip_images)
    embeds = l2_norm(embeds)

    dist = np.sum(np.square(embeds[0]-embeds[1]))

    print("dist: ", dist)
    # diff = np.subtract([embeds[0]], [embeds[1]])
    # dist = np.sum(np.square(diff), 1)
    # print("diff: ", diff)
    # scipy's cosine() returns the cosine distance, so 1 - cosine() is the similarity.
    acc = 1 - cosine(embeds[0], embeds[1])
    print("acc: ", acc)

the result is:

dist:  1.1708782
acc:  0.4145609736442566

How can I achieve 99.35% accuracy on LFW?
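Note: one likely contributor to the large distance is channel order: cv2.imread returns BGR, while the training pipeline decodes images as RGB (see the perform_val issue below), so adding img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) after imread is worth trying. The reported benchmark also assumes aligned 112x112 crops (lfw_align_112), not raw LFW images simply resized.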

get nan result for a whole batch

Hi,
Thanks for sharing this amazing work! I downloaded your model and loaded the weights (ResNet50, ccrop=true), and I tested it with a bunch of images. Some batches work fine, with no NaNs, but other batches return all-NaN results. What might cause this?

Use pretrained ResNet50 model for Face Recognition on my own dataset

I want to build a face recognizer using the pretrained models given in the repository. Currently I am facing an issue with the distance threshold: in most cases the distances between different faces come out very small, whereas the distances between the same face come out large. My questions are:

  1. Am I calculating the embeddings correctly?
  2. How should face comparison be done in order to reach LFW-level accuracy? (See the sketch after the code below.)

I have already referenced this #6 and #8 but couldn't come up with a concrete solution.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import cv2
import numpy as np
import tensorflow as tf

from .modules.models import ArcFaceModel
from .modules.utils import set_memory_growth, load_yaml, l2_norm


class ArcFaceResNet50:
    def __init__(self):
        set_memory_growth()
        self.cfg = load_yaml(os.path.join(os.path.dirname(os.path.abspath(__file__)), \
                                     './configs/arc_res50.yaml'))

        self.model = ArcFaceModel(size=self.cfg['input_size'],
                             backbone_type=self.cfg['backbone_type'],
                             training=False)

        ckpt_path = tf.train.latest_checkpoint(os.path.join(os.path.dirname(os.path.abspath(__file__)), \
                                                            './checkpoints/' + self.cfg['sub_name']))
        if ckpt_path is not None:
            print("[*] load ckpt from {}".format(ckpt_path))
            self.model.load_weights(ckpt_path)
        else:
            print("[*] Cannot find ckpt from {}.".format(ckpt_path))
            exit()
        
    def get_embeddings(self, frame_rgb, bounding_boxes):
        faces = []
        for x1, y1, x2, y2 in bounding_boxes:
            face_patch = frame_rgb[y1:y2, x1:x2, :]
            resized = cv2.resize(face_patch,
                                 (self.cfg['input_size'], self.cfg['input_size']),
                                 interpolation=cv2.INTER_AREA)
            normalized = resized.astype(np.float32) / 255.
            faces.append(normalized)
        # np.stack over 3-D patches already yields a 4-D batch (N, H, W, C),
        # so no extra expand_dims is needed.
        faces = np.stack(faces)
        # Run prediction
        embeddings = l2_norm(self.model(faces))
        return embeddings

Thanks in advance!
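Note: a minimal sketch of the usual comparison step on top of get_embeddings above; the threshold value is illustrative, not one confirmed for this model, and should be tuned on labeled pairs.

import numpy as np

def is_same_person(emb1, emb2, threshold=1.2):
    # Embeddings are L2-normalized, so the squared L2 distance lies in [0, 4]
    # and relates to cosine similarity by dist = 2 - 2 * cos.
    dist = np.sum(np.square(emb1 - emb2))
    return dist < threshold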

Question about training epochs

A question for the author: were you able to reach the numbers in the Verification results table after training for only 5 epochs on the MS-Celeb-1M dataset?

Question regarding evaluation metric

Hi, thanks for sharing the project !

I am wondering about the metric used for the distance between embeddings.
From what I understand, you use L2 in the following code:

diff = np.subtract(embeddings1, embeddings2)
dist = np.sum(np.square(diff), 1)

Meanwhile, the official code seems to use cosine similarity in all evaluations.
For example:

https://github.com/deepinsight/insightface/blob/1af6eeffdc1fe1d81c308fafe37f28883c5cf27f/Evaluation/IJB/IJB_1N.py#L117

Am I missing something? Is this intentional?
I'd appreciate your clarification.
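Note: for L2-normalized embeddings the two metrics agree up to a monotonic transform: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b = 2 - 2 cos(a, b). Thresholding the squared L2 distance is therefore equivalent to thresholding the cosine similarity; only the numeric threshold differs, not the resulting accuracy.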

Cannot find ckpt from None

I get this error when running python test.py --cfg_path="./configs/arc_res50.yaml". I already put the pretrained model checkpoint in the checkpoints folder, and I have read the closed issues but still can't understand it. Can you explain how to solve this problem?
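Note: tf.train.latest_checkpoint returns None when it finds no checkpoint index in the directory it is given. Judging from the snippets above, the script looks under './checkpoints/' + cfg['sub_name'] (e.g. ./checkpoints/arc_res50/), so the extracted checkpoint files need to sit in that subfolder rather than directly in ./checkpoints/. A quick sanity check:

import tensorflow as tf
# Should print a checkpoint prefix; None means the files are in the wrong place.
print(tf.train.latest_checkpoint('./checkpoints/arc_res50/'))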

good performance is not obtained

[three screenshots of training results]

Hi. I am very grateful for the help here.

I am currently training a model using your source code, but good performance is not obtained. These are the results after training for 2 epochs; I don't know whether I'm doing this correctly. I look forward to your comments.

Size of model

Thank you for this project.
I would like to know if there is a way to reduce the size of the model when using MobileNetV2 as the backbone architecture.
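Note: beyond picking backbone_type: 'MobileNetV2' in the config, post-training quantization is a common way to shrink a trained model further. A minimal sketch, assuming model is the inference-mode ArcFaceModel from the snippets above; custom layers may require extra converter settings:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables weight quantization
tflite_bytes = converter.convert()
open('arcface_mbv2.tflite', 'wb').write(tflite_bytes)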

Cosine Similarity, Best Threshold

I want to use this pretrained model to compare images. What is the best threshold for cosine similarity for this model?

accuracy related query

Hi friend,

I used the config below:

# general
batch_size: 128
input_size: 112
embd_shape: 512
sub_name: 'my_arc_res50_no_central_crop'
backbone_type: 'ResNet50' # 'ResNet50', 'MobileNetV2'
head_type: ArcHead # 'ArcHead', 'NormHead'
is_ccrop: False # central-cropping or not

# train
train_dataset: './data/ms1m_bin.tfrecord'
binary_img: True
num_classes: 85742
num_samples: 5822653
epochs: 10
base_lr: 0.01
w_decay: !!float 5e-4
save_steps: 1000

# test
test_dataset: 'test_dataset'

At the end of training the loss is 8.0511.

When I tried is_ccrop: True, the epochs were set to 5 at that time, and the loss was near 9.x.

Please suggest what can be done to improve this.

Thanks,
Vatsal

Question about test.py

Hi, thank you for your implementation.

I have one question about test.py. Can you explain the purpose of the output embedding vector for Brucelee.jpg?

Pre-model download

Hi, your pretrained model is on Google Drive and I cannot download it. Could you upload it to Baidu Drive or send it to my e-mail?
Thank you very much, and I wish you good health.
E-mail: [email protected]

How can I still get loss=nan

I'm using the MS-Celeb-1M dataset, downloaded from the link posted in README.md

  1. I converted the data to tfrecords following the steps provided in the documentation for binary images.

  2. My training config is:

# general
batch_size: 8
input_size: 112
embd_shape: 128
sub_name: 'arc_mbv2'
backbone_type: 'MobileNetV2' # 'ResNet50', 'MobileNetV2'
head_type: ArcHead # 'ArcHead', 'NormHead'
is_ccrop: False # central-cropping or not

# train
train_dataset: './data/imgs_full.tfrecord'
binary_img: True
num_classes: 85742
num_samples: 5822653
epochs: 100
base_lr: 0.01
w_decay: !!float 5e-4
save_steps: 100

# test
test_dataset: '.test/'

But I'm still getting loss=nan. Is this normal for the initial epochs? Is it a tfrecords error?
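Note: NaN losses this early often come from the learning rate being too high for the batch size (base_lr: 0.01 with batch_size: 8 is aggressive) or from numerical edge cases in the ArcFace logits. Two hedged mitigations to try: lower base_lr, and clip gradients. A minimal sketch of the latter; the clipnorm value is illustrative:

import tensorflow as tf

# clipnorm bounds each gradient tensor's norm before the update step.
optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.001, momentum=0.9, clipnorm=1.0)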

Asian-celeb dataset download link


[Asian-celeb dataset]

  • Training data (Asian-celeb)

The dataset consists of crawled images of celebrities on the web. The images are covered under a Creative Commons Attribution-NonCommercial 4.0 International license (please read the license terms at http://creativecommons.org/licenses/by-nc/4.0/).


[train_msra.tar.gz]

MD5:c5b668f2204c400099b14f367069aef5

Content: Train dataset called MS-Celeb-1M-v1c with 86,876 ids/3,923,399 aligned images cleaned from MS-Celeb-1M dataset.

This dataset has been excluded from both LFW and Asian-Celeb.

Format: *.jpg

Google: https://drive.google.com/file/d/1aaPdI0PkmQzRbWErazOgYtbLA1mwJIfK/view?usp=sharing

[msra_lmk.tar.gz]

MD5:7c053dd0462b4af243bb95b7b31da6e6

Content: A list of five-point landmarks for the 3,923,399 images in MS-Celeb-1M-v1c.

Format: <path> <label> <x1> <y1> <x2> <y2> <x3> <y3> <x4> <y4> <x5> <y5>

<path> is the path of the image in the tar file train_msceleb.tar.gz.

<label> is an integer ranging from 0 to 86,875.

(x, y) is the coordinate of a key point on the aligned images. The five points are, in order:

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1FQ7P4ItyKCneNEvYfJhW2Kff7cOAFpgk/view?usp=sharing
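Note: a minimal sketch for reading one line of such a landmark list, assuming the whitespace-separated layout described above (image path, integer label, then five (x, y) pairs); the exact column order is an assumption inferred from that description:

def parse_landmark_line(line):
    parts = line.split()
    path = parts[0]
    label = int(parts[1])
    coords = list(map(float, parts[2:12]))
    # Pair the ten numbers as (x, y) for: left eye, right eye,
    # nose tip, mouth left, mouth right.
    landmarks = list(zip(coords[0::2], coords[1::2]))
    return path, label, landmarks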

[train_celebrity.tar.gz]

MD5:9f2e9858afb6c1032c4f9d7332a92064

Content: Train dataset called Asian-Celeb with 93,979 ids/2,830,146 aligned images.

This dataset has been excluded from both LFW and MS-Celeb-1M-v1c.

Format: *.jpg

Google: https://drive.google.com/file/d/1-p2UKlcX06MhRDJxJukSZKTz986Brk8N/view?usp=sharing

[celebrity_lmk.tar.gz]

MD5:9c0260c77c13fbb32692fc06a5dbfaf0

Content: A list of five-point landmarks for the 2,830,146 images in Asian-Celeb.

Format: <path> <label> <x1> <y1> <x2> <y2> <x3> <y3> <x4> <y4> <x5> <y5>

<path> is the path of the image in the tar file train_celebrity.tar.gz.

<label> is an integer ranging from 86,876 to 196,319.

(x, y) is the coordinate of a key point on the aligned images. The five points are, in order:

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1sQVV9epoF_8jS3ge6DqbilpWk3UNE8U7/view?usp=sharing

[testdata.tar.gz]

MD5:f17c4712f7562ea6d45f0a158e59b792

Content: Test dataset with 1,862,120 aligned images.

Format: *.jpg

Google: https://drive.google.com/file/d/1ghzuEQqmUFN3nVujfrZfBx_CeGUpWzuw/view?usp=sharing

[testdata_lmk.tar]

MD5:7e4995eb9976a2cfd2b23db05d76572c

Content: A list of five-point landmarks for the 1,862,120 images in testdata.tar.gz.

Features should be extracted in the same order, and in the same quantity, as this list.

Format: <path> <x1> <y1> <x2> <y2> <x3> <y3> <x4> <y4> <x5> <y5>

<path> is the path of the image in the tar file testdata.tar.gz.

(x, y) is the coordinate of a key point on the aligned images. The five points are, in order:

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1lYzqnPyHXRVgXJYbEVh6zTXn3Wq4JO-I/view?usp=sharing

[feature_tools.tar.gz]

MD5:227b069d7a83aa43b0cb738c2252dbc4

Content: Feature format transform tool and a sample feature file.

Format: We use the same format as Megaface (http://megaface.cs.washington.edu/), except that we merge all files into a single binary file.

Google: https://drive.google.com/file/d/1bjZwOonyZ9KnxecuuTPVdY95mTIXMeuP/view?usp=sharing

loss = nan..what's the problem?

I am training the model with the ms1m dataset and the Asian-celeb dataset,
but loss = nan...
The model is not trained at all.
mode = 'fit' -> loss = nan
mode = 'eager_ft' -> loss = nan
mode = 'eager_fit' -> Out of Memory error
What's the problem?
Please help me, and thank you... have a nice day.
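Note: as in the loss=nan issue above, the usual first steps are lowering base_lr, enabling gradient clipping (see the sketch there), and re-checking the generated tfrecord; the out-of-memory error in eager mode can often be worked around by reducing batch_size.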

Test script

Hi @peteryuX, thank you for your amazing work. I want to know the recommended threshold value for validating whether two images show the same person.

Learning rate and loss value for small number of epoch

Thanks for the project. During your training, did you lower the base learning rate over the course of training? How many total epochs did you train for your pretrained model, and what was the loss value at the end of training? My training loss, at around 2.1, is decreasing very slowly; do you have any suggestions?
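Note: most ArcFace recipes drop the learning rate in fixed steps rather than keeping it constant. A minimal sketch of such a schedule; the boundaries and rates are illustrative, not the author's values:

import tensorflow as tf

schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[100000, 160000],   # global steps at which the rate drops
    values=[0.01, 0.001, 0.0001])  # rate before/between/after the boundaries
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)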

Fine-tuning ArcFace

Hi!
Is there a way to fine-tune your pretrained model, or does your training code not support a quick way to do it?

For example, I've noticed in other repositories that it can be achieved with this type of commands:

python -u train.py --network m1 --loss triplet --lr 0.005 --pretrained ./models/m1-softmax-emore,1

Thanks!
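Note: this repo has no --pretrained flag like the insightface command above, but a similar effect can be had by loading an existing checkpoint into the training-mode model before running the training loop, with a lowered base_lr. A minimal sketch under those assumptions, reusing the imports and cfg loading from the snippets above; the ArcFaceModel keyword names follow the config keys and are assumptions beyond those already shown:

model = ArcFaceModel(size=cfg['input_size'],
                     backbone_type=cfg['backbone_type'],
                     num_classes=cfg['num_classes'],
                     head_type=cfg['head_type'],
                     embd_shape=cfg['embd_shape'],
                     training=True)
ckpt_path = tf.train.latest_checkpoint('./checkpoints/arc_res50/')
if ckpt_path is not None:
    model.load_weights(ckpt_path)  # resume from the pretrained weights
# then continue train.py's normal loop with a smaller base_lr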

Issues with perform_val

The function perform_val (from modules.evaluations) seems to have two issues:

  • it evaluates the test data without converting from BGR to RGB: in the test data archives (lfw_align_112.zip, ...) the images are provided in BGR format, and that is what is used in evaluation. But the training procedure uses the RGB format (as obtained from tf.image.decode_jpeg). Evaluating on RGB images instead of BGR can slightly (but consistently) improve the results, e.g. with the pretrained ResNet50 from 99.35% to 99.42% for LFW and from 90.36% to 92.56% for CFP-FP (see the sketch after this list).
  • when providing the parameters is_ccrop=True, is_flip=False, center cropping is performed twice, drastically reducing the performance in that case.
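Note: a minimal sketch of the first fix, assuming the evaluation batches are float image arrays in NHWC layout with BGR channels as the issue describes; the variable names are illustrative:

# Reverse the last axis to turn a BGR batch into RGB before feeding the model.
batch_rgb = batch_bgr[..., ::-1]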

TensorBoard

Hi, great job in here! Could you please share TensorBoard logs if you still have?

visualization

Hi. Can you help me with how to visualize my outputs? Let's say I want the output label drawn on the given image while evaluating.

I can't achieve the accuracy in the benchmark, could somebody help?

[screenshot: test results at training loss 19.42]
I used the same train and test datasets as you proposed, but the best result I've got so far is what the picture shows.
I used the SGD optimizer with lr = 0.1, 0.05, 0.01, 0.0001, 0.00001, one epoch per learning rate. When I found the loss increasing rather than decreasing, I stopped training. The test result at loss 19.42 is shown in the upper picture.
Additionally, this is the test result when the train loss was 21.15, shown in the lower picture.
[screenshot: test results at training loss 21.15]

train.py AssertionError

Hello, this is very useful code,
but when I use ArcHead for training, an error like this occurs:

Traceback (most recent call last):
  File "train.py", line 83, in <module>
    train()
  File "train.py", line 55, in train
    logist = model(inputs, training=True)
  File "/home/dukim/env/tf2.1/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/home/dukim/env/tf2.1/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py", line 708, in call
    convert_kwargs_to_constants=base_layer_utils.call_context().saving)
  File "/home/dukim/env/tf2.1/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py", line 870, in _run_internal_graph
    assert str(id(x)) in tensor_dict, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("ArcHead/Identity:0", shape=(None, 1200), dtype=float32)
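Note: this assertion usually means the graph was built with two inputs but called with only one. When head_type is ArcHead, the model takes the ground-truth labels as a second input (ArcFace adds its margin to the target logit), so the training call should pass both, e.g. logist = model((inputs, labels), training=True); the inference model (training=False) drops the label input, as the test snippets above show.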

[BUG] lost GlobalAveragePooling

In modules/models.py, backbones are loaded without the pretrained classification head (include_top=False), and then a custom OutputLayer is added on top. Dropping the pretrained classifier also cuts off the GlobalAveragePooling layer, but OutputLayer doesn't contain one.

I propose something like this:

def OutputLayer(embd_shape, w_decay=5e-4, name='OutputLayer'):
    # Requires: from tensorflow.keras.layers import GlobalAveragePooling2D
    def output_layer(x_in):
        x = inputs = Input(x_in.shape[1:])
        x = BatchNormalization()(x)  # maybe this layer is redundant
        x = GlobalAveragePooling2D()(x)
        x = Dropout(rate=0.5)(x)
        x = Flatten()(x)  # a no-op after pooling, kept for parity with the original
        x = Dense(embd_shape, kernel_regularizer=_regularizer(w_decay))(x)
        x = BatchNormalization()(x)
        model = Model(inputs, x, name=name)
        return model(x_in)
    return output_layer

One effect of losing GlobalAveragePooling is that the MobileNetV2 backbone grows in size from 12 MB to 50 MB, though accuracy grows too; in any case, training MobileNetV2 well requires different hyperparameters, which will increase val accuracy.

How to get the classification result?

I trained a model on my own dataset and saved it as a checkpoint with the help of train.py.
Now the question is: how can I test the model on my own dataset? I tried simply using predict, but it only gives me a list of numbers. I guess these are embeddings? What I want is the classification result.

Really appreciate any help.
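Note: the inference-mode model outputs embeddings, not class scores, so a label usually comes from nearest-neighbor search against a gallery of reference embeddings (one, or an average, per known identity). A minimal sketch assuming L2-normalized embeddings; the names and the threshold are illustrative:

import numpy as np

def classify(query_emb, gallery_embs, gallery_labels, threshold=0.3):
    # With unit-norm vectors, cosine similarity is just a dot product.
    sims = gallery_embs @ query_emb
    best = int(np.argmax(sims))
    # Below the threshold, treat the face as unknown rather than forcing a label.
    return gallery_labels[best] if sims[best] > threshold else None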

Colab notebook does not work (for downloading the arc_res50.zip file)

Could you let me know how to get the zip file arc_res50.zip?
I attached the warning message.
Because of this error, I cannot create the checkpoints folder.

Downloading 1HasWQb86s4xSYy36YbmhRELg9LBmvhvt into ./arc_res50.zip... Done.
Unzipping...
/usr/local/lib/python3.7/dist-packages/google_drive_downloader/google_drive_downloader.py:78: UserWarning: Ignoring unzip since "1HasWQb86s4xSYy36YbmhRELg9LBmvhvt" does not look like a valid zip file
  warnings.warn('Ignoring unzip since "{}" does not look like a valid zip file'.format(file_id))
mv: cannot stat 'arc_res50': No such file or directory
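Note: this failure pattern is typical of Google Drive's virus-scan confirmation page for large files: the downloader saves an HTML page instead of the zip, so unzipping fails. Downloading the file manually in a browser (or with a tool that handles the confirmation token) and placing the extracted folder under ./checkpoints/ is a common workaround.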

How can i apply augmentation

I've never used tfrecords for training.

My question is: is there a way to apply augmentation, such as albumentations or imgaug, in the training pipeline?

And if so, where am I supposed to set this up: during the conversion to tfrecords, or while loading?
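Note: augmentation normally goes in the tf.data map step at load time, after decoding and before batching; the tfrecord itself stays unchanged. A minimal sketch with built-in ops; wiring in albumentations or imgaug would need tf.numpy_function and is slower:

import tensorflow as tf

def augment(img, label):
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_brightness(img, max_delta=0.1)
    img = tf.image.random_saturation(img, lower=0.8, upper=1.2)
    return img, label

# "dataset" stands for the repo's decoded (image, label) pipeline.
# dataset = dataset.map(augment, num_parallel_calls=tf.data.experimental.AUTOTUNE)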

Colab notebook does not work

Traceback (most recent call last):
  File "test.py", line 77, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "test.py", line 31, in main
    training=False)
  File "/content/arcface-tf2/modules/models.py", line 82, in ArcFaceModel
    x = Backbone(backbone_type=backbone_type, use_pretrain=use_pretrain)(x)
  File "/content/arcface-tf2/modules/models.py", line 32, in backbone
    weights=weights)(x_in)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/applications/__init__.py", line 46, in wrapper
    return base_fun(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/applications/resnet.py", line 33, in ResNet50
    return resnet.ResNet50(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/keras_applications/resnet_common.py", line 435, in ResNet50
    **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/keras_applications/resnet_common.py", line 411, in ResNet
    model.load_weights(weights_path)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/engine/training.py", line 234, in load_weights
    return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/engine/network.py", line 1222, in load_weights
    hdf5_format.load_weights_from_hdf5_group(f, self.layers)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 651, in load_weights_from_hdf5_group
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'
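Note: this AttributeError is a known incompatibility between older TensorFlow releases and h5py 3.x, which started returning str instead of bytes for HDF5 attributes; pinning h5py below 3.0 (pip install 'h5py<3.0.0') is the usual workaround.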
