peteryux / retinaface-tf2

RetinaFace (Single-stage Dense Face Localisation in the Wild, 2019) implemented (ResNet50, MobileNetV2 trained on single GPU) in Tensorflow 2.0+. This is an unofficial implementation. With Colab.

License: MIT License

Python 100.00%

colab colab-notebook face-detection facedetection mobilenetv2 resnet-50 retinaface retinaface-detector tensorflow tensorflow2 tf2


retinaface-tf2's Issues

How can I save the pretrained model in TensorFlow "SavedModel" format?

I am trying to load the weights from a checkpoint and save the entire model in SavedModel format. Here is my code:

cfg_path = os.path.join(os.path.dirname(__file__), 'configs/retinaface_res50.yaml')
checkpoint_path = os.path.join(os.path.dirname(__file__), 'checkpoints/cpkt-81')
saved_path = os.path.join(os.path.dirname(__file__), 'retinaface_res50')


def main(_):
    cfg = load_yaml(cfg_path)
    model = RetinaFaceModel(cfg, training=True)
    model.summary()
    model.load_weights(checkpoint_path).expect_partial()
    model.save(saved_path)

It seems the weights cannot be loaded with load_weights():

So how do I save the pretrained model to a file in TF2 format?

Error running test.py on one image

Hi,

Thank you for your RetinaFace repo.
I am having trouble using your implementation (my configuration: Python 3.7 / TensorFlow 2.4, CPU).
I simply wanted to run RetinaFace inference with your pretrained weights.
When I run the command
python test.py --cfg_path="./configs/retinaface_res50.yaml" --img_path=MYIMAGEPATH --down_scale_factor=1.0
I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [45056,2] vs. shape[1] = [22528,2] [Op:ConcatV2] name: concat

Do you have any idea what may cause this error?

Thank you in advance

I'm wondering how to fine-tune the model, for example by freezing part of it: which layers should be frozen or trained to get a good result? Is there any research on this?

Very slow inference

I ran this on a GTX 2080 with images of size 1920*1080*3.
The inference time was 250 to 300 ms per image! I don't know why, but it's terrible!
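
One thing worth ruling out with a number like this is whether the measurement includes one-off costs on the first calls (graph tracing, memory allocation, GPU initialisation). A minimal sketch of a timing harness that discards warm-up calls is below. It assumes a hypothetical `infer` callable that wraps the model call; `benchmark`, `warmup` and `runs` are names chosen for illustration and are not part of retinaface-tf2.

```python
import statistics
import time


def benchmark(infer, sample, warmup=3, runs=10):
    """Time `infer(sample)` after discarding warm-up calls.

    `infer` is a hypothetical callable wrapping the model; it is not part of
    retinaface-tf2. Returns (median_ms, worst_ms) over `runs` timed calls.
    """
    # Warm-up calls absorb one-off costs (tracing, allocation) and are not timed.
    for _ in range(warmup):
        infer(sample)

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms), max(timings_ms)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without TensorFlow.
    median_ms, worst_ms = benchmark(lambda x: sum(x), list(range(10000)))
    print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

If the callable returns a tensor, converting the result to a NumPy array inside `infer` is one way to make sure the work has actually finished before the clock stops.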

How to correctly convert a tf.checkpoint to pb

Thanks for this great work, but I get errors when converting the ckpt to pb and tflite.

Can this TF2 model be converted into a pb model as in TensorFlow 1.x and then be executed in TensorFlow 1.x?

Can you add an inference script?

Also, is there any plan to prune the model to speed it up?

Can we increase the batch size during inference?

classification loss (crossentropy)

    # classification loss (crossentropy)
    # 1. compute max conf across batch for hard negative mining
    loss_class = tf.where(mask_neg,
                          1 - class_pred[:, 0][..., tf.newaxis], 0)

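May I ask why the classification loss is computed this way here? If the detection targets are not only faces but several classes (for example, faces and pedestrians), does it need to be modified accordingly?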

How to train with custom dataset by using the pretrained model?

Hello,
I would like to ask a few questions about this github repo.

What do I need to do to train using the pretrained model?
How can I create my own custom dataset other than Wider-Face? Is there an annotation tool you recommend that does annotation in the same format?
What should I do to train with the dataset I created?
How can I convert the model to onnx after training?

train size

retinaface-tf2/modules/dataset.py

Line 167 in 9d45c0a

h = w = tf.cast(scale * short_side, tf.int32)

The crop is defined as a square crop (640x640).
If I change the network input size to 600x400, is this still a good way to crop?

Support batching at test time

Hi @peteryuX. Your code is awesome, but it only supports batch size 1 (models.py).

Can you help me support testing on dynamic batches of images?

Runtime performance

First things first: many thanks for the nice TensorFlow 2 implementation! I used your pre-trained model on my laptop (with my webcam) and I am getting ~1 FPS. How many FPS do you achieve? Have you tried to convert it to TensorFlow Lite yet? Do you have other ideas for making it run faster (i.e. get more FPS)?

SSH context module

Hi, I have noticed that in SSH context module all kernels are 3x3 rather than 3x3/5x5/7x7.
Is there a special reason for that?

Question about the inference time

The RetinaFace paper says that the inference time is less than 10 ms (1.4ms, GPU P40, resolution: 640*480). I tested the inference time of this model with an image of the same size on an NVIDIA P6000, and it takes about 240 ms. I would like to know what causes this time gap. Thanks.

can not find esrgan.yaml and --save_image in test.py

Test ResNet50 backbone model

python test.py --cfg_path="./configs/esrgan.yaml" --save_image=True

or

Test ResNet50 backbone model

python test.py --cfg_path="./configs/psnr.yaml" --save_image=True

Cannot find ckpt from ./checkpoints/retinaface_mbv2.

Hi,

I downloaded the RetinaFace MobileNetV2 model using the download link in the Models section of the README file. Unzipping the file extracted the following files:

checkpoint
ckpt-81.data-00000-of-00002
ckpt-81.data-00001-of-00002
ckpt-81.index

After that, I ran the suggested command:

python test.py --cfg_path="./configs/retinaface_mbv2.yaml" --img_path="./data/0_Parade_marchingband_1_149.jpg" --down_scale_factor=1.0

But I get the following output:

/home/usr/anaconda3/envs/env/lib/python3.6/site-packages/keras_applications/mobilenet_v2.py:294: UserWarning: input_shape is undefined or non-square, or rows is not in [96, 128, 160, 192, 224]. Weights for input shape (224, 224) will be loaded as the default.
warnings.warn('input_shape is undefined or non-square, '
[*] Cannot find ckpt from ./checkpoints/retinaface_mbv2.

How can I run the test.py script?
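
The message suggests the script looked in ./checkpoints/retinaface_mbv2 and found no checkpoint there, so one thing to check is whether the four extracted files actually sit inside that directory. A small sketch of such a check is below. It assumes (this is not confirmed in the issue) that the script resolves the directory as ./checkpoints/<name> and relies on the `checkpoint` state file that TensorFlow writes next to the `ckpt-*` files; `describe_checkpoint_dir` is a hypothetical helper, not part of the repo.

```python
import os


def describe_checkpoint_dir(ckpt_dir):
    """Return a list of problems found in `ckpt_dir` (empty list: layout looks usable)."""
    if not os.path.isdir(ckpt_dir):
        return [f"directory does not exist: {ckpt_dir}"]
    problems = []
    names = os.listdir(ckpt_dir)
    # The 'checkpoint' state file records which ckpt-N is the latest one.
    if "checkpoint" not in names:
        problems.append("missing 'checkpoint' state file")
    if not any(n.endswith(".index") for n in names):
        problems.append("missing '*.index' file")
    if not any(".data-" in n for n in names):
        problems.append("missing '*.data-*' shard files")
    return problems


if __name__ == "__main__":
    # Path taken from the log message above.
    for problem in describe_checkpoint_dir("./checkpoints/retinaface_mbv2"):
        print("[!]", problem)
```

An empty result only says the layout looks plausible; it does not prove the weights match the model.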

Size mismatch error while running the MobileNet model

ValueError: Received incompatible tensor with shape (1, 1, 32, 64) when attempting to restore variable with shape (1, 1, 192, 64) and name model/layer_with_weights-1/output1/conv/kernel/.ATTRIBUTES/VARIABLE_VALUE.

PiecewiseConstantWarmUpDecay incorrect implementation causes skipping to the almost minimal LR

See PR #34.
Steps to reproduce:
Set lr_decay_epoch to a list with length >= 3.
Observe that the second learning rate is wrong and never decreases from that value.
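
For reference, the behaviour one would expect from a piecewise-constant schedule with warm-up can be sketched in plain Python. This is an illustration only: it is not the repo's PiecewiseConstantWarmUpDecay and not the change in PR #34. The function name, the linear warm-up from `min_lr`, and all parameter names are assumptions made for this sketch.

```python
from bisect import bisect_right


def piecewise_constant_warmup_lr(step, boundaries, values, warmup_steps, min_lr):
    """Learning rate at `step` (hypothetical helper, not the repo's class).

    boundaries: increasing step indices at which the rate changes.
    values: len(boundaries) + 1 rates, one per interval.
    During the first `warmup_steps` steps the rate ramps linearly
    from `min_lr` to `values[0]`.
    """
    if len(values) != len(boundaries) + 1:
        raise ValueError("values must have exactly one more entry than boundaries")
    if step < warmup_steps:
        return min_lr + (values[0] - min_lr) * step / warmup_steps
    # bisect_right counts the boundaries already reached, which is the
    # interval index; a step equal to a boundary already uses the next rate.
    return values[bisect_right(boundaries, step)]


if __name__ == "__main__":
    boundaries = [100, 200, 300]          # three decay points
    values = [1e-2, 1e-3, 1e-4, 1e-5]     # one rate per interval
    for step in (0, 5, 50, 150, 250, 350):
        lr = piecewise_constant_warmup_lr(step, boundaries, values,
                                          warmup_steps=10, min_lr=1e-3)
        print(step, lr)
```

With three or more boundaries, each interval should select its own entry of `values`, so the rate keeps stepping down after the second boundary.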

GPU utilisation is 0 while running test.py

Hi, I am running test.py on a GPU, but it uses the CPU cores instead of the GPU.

inference time

I tested the inference time of this model with a 1920*1080 image on an NVIDIA V100. I use TensorFlow Serving and it takes about 170 ms for a single image. Do you know what causes the time gap between the paper and this model?
Thanks
@peteryuX @magikerwin1993

Inference with TF1.15

Hello, first of all thanks for this impressive code for training.
I want to ask if there is a way to run models trained with this code with TF1.15 instead of TF2?
Thanks,
Ilya

OSError: [WinError 126] The specified module could not be found

from retinaface import RetinaFace

OSError Traceback (most recent call last)
in
----> 1 from retinaface import RetinaFace

~\anaconda3\envs\myenv3_6\lib\site-packages\retinaface\__init__.py in
----> 1 from retinaface.src.retinaface import RetinaFace

~\anaconda3\envs\myenv3_6\lib\site-packages\retinaface\src\retinaface.py in
1 import tensorflow as tf
2 import numpy as np
----> 3 from utilpack.util import *
4 import os
5

~\anaconda3\envs\myenv3_6\lib\site-packages\utilpack\util\__init__.py in
1 # list of offering util class
----> 2 from .data_util import PyDataUtil
3 from .time_util import PyTimeUtil
4 from .image_util import PyImageUtil
5 from .debug_util import PyDebugUtil

~\anaconda3\envs\myenv3_6\lib\site-packages\utilpack\util\data_util.py in
19
20
---> 21 from .image_util import PyImageUtil
22 from .time_util import Timeout
23 import os

~\anaconda3\envs\myenv3_6\lib\site-packages\utilpack\util\image_util.py in
19
20 import os
---> 21 from utilpack.core import *
22
23 import cv2

~\anaconda3\envs\myenv3_6\lib\site-packages\utilpack\core\__init__.py in
8 from .error import PyError,ERROR_TYPES
9 from .crpyto import PyCrypto
---> 10 from .algorithm import PyAlgorithm

~\anaconda3\envs\myenv3_6\lib\site-packages\utilpack\core\algorithm.py in
19 import numpy as np
20 import random
---> 21 from shapely.geometry import box
22
23

~\anaconda3\envs\myenv3_6\lib\site-packages\shapely\geometry\__init__.py in
2 """
3
----> 4 from .base import CAP_STYLE, JOIN_STYLE
5 from .geo import box, shape, asShape, mapping
6 from .point import Point, asPoint

~\anaconda3\envs\myenv3_6\lib\site-packages\shapely\geometry\base.py in
16
17 from shapely.affinity import affine_transform
---> 18 from shapely.coords import CoordinateSequence
19 from shapely.errors import WKBReadingError, WKTReadingError
20 from shapely.geos import WKBWriter, WKTWriter

~\anaconda3\envs\myenv3_6\lib\site-packages\shapely\coords.py in
6 from ctypes import byref, c_double, c_uint
7
----> 8 from shapely.geos import lgeos
9 from shapely.topology import Validating
10

~\anaconda3\envs\myenv3_6\lib\site-packages\shapely\geos.py in
143 if os.getenv('CONDA_PREFIX', ''):
144 # conda package.
--> 145 _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
146 else:
147 try:

~\anaconda3\envs\myenv3_6\lib\ctypes\__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
346
347 if handle is None:
--> 348 self._handle = _dlopen(self._name, mode)
349 else:
350 self._handle = handle

OSError: [WinError 126] The specified module could not be found

Completely wrong prediction matrix!

When I tried to get the face prediction matrix(shape=(37840, 2)) under tf1.15 (without using tf.compat.v1.enable_eager_execution()), I found that the prediction matrix was completely wrong. Can you help me solve it? Thank you very much.

Test performance on large faces

The performance on large faces is very poor.

More than 5 landmarks

Is there a way to get more than 5 landmarks on this model?

frozen graph

I converted the retinaface savedmodel format to frozen graph by executing the following code:

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

path2savedmodel = './tf-serving-retinaface_mbv2/1'
path2frozengraph = './models/retinaface.pb'
loaded = tf.saved_model.load(path2savedmodel)
infer = loaded.signatures['serving_default']
f = tf.function(infer).get_concrete_function(input_image=tf.TensorSpec(shape=[None, None, None, 3], dtype=tf.float32))
f2 = convert_variables_to_constants_v2(f)
graph_def = f2.graph.as_graph_def()# Export frozen graph
# write frozen graph (single file) to disk
with tf.io.gfile.GFile(path2frozengraph, 'wb') as f:
   f.write(graph_def.SerializeToString())

While loading the frozen graph in opencv using the following code

print('\n\nloading frozen model...')
net = cv2.dnn.readNet(path2frozengraph)
#net_tf = cv2.dnn.readNetFromTensorflow(path2frozengraph)

I get the following error:

[ERROR:0] global /tmp/pip-req-build-tjxnaiom/opencv/modules/dnn/src/tensorflow/tf_importer.cpp (2804) parseNode DNN/TF: Can't parse layer for node='StatefulPartitionedCall/StatefulPartitionedCall/RetinaFaceModel/ClassHead_2/mul_1' of type='Mul'. Exception: OpenCV(4.5.3) /tmp/pip-req-build-tjxnaiom/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:1464: error: (-215:Assertion failed) scaleMat.type() == CV_32FC1 in function 'parseMul'

Traceback (most recent call last):
  File "/home/Projects/bitbucket/model_conversion/use_retinaface_pb_in_cv2.py", line 188, in <module>
    net = cv2.dnn.readNet(path2frozengraph)

error: OpenCV(4.5.3) /tmp/pip-req-build-tjxnaiom/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:1464: error: (-215:Assertion failed) scaleMat.type() == CV_32FC1 in function 'parseMul'

Typo in the README

Hi. Thank you for your effort.

In lines 176 and 179 of README.md, it seems the script file name is miswritten.

Shouldn't these

python test_widerface.py.py --cfg_path="./configs/retinaface_res50.yaml" --gpu=0
python test_widerface.py.py --cfg_path="./configs/retinaface_res50.yaml" --gpu=0

be changed to these? (.py.py -> .py)

python test_widerface.py --cfg_path="./configs/retinaface_res50.yaml" --gpu=0
python test_widerface.py --cfg_path="./configs/retinaface_res50.yaml" --gpu=0

About multiple GPUs

How do I train with multiple GPUs?

keras .h5 model

I need to use keras (.h5) model. Where can I download it? If not available, then how can I convert this model to .h5 format?
I already have the savedmodel format and tried to convert using the following command.
tf.keras.models.save_model(saved_model, 'retinaface_mbv2.h5')
But it saves retinaface_mbv2.h5 with only 800 Kbytes and is not working.

Does the image input shape not have to be the same as Input?

Does the image input shape not have to be the same as Input? For example, in test.py I did not see any reshape of img, but it works. How?

URL fetch failed

I am getting this error while running the code.
Exception: URL fetch failure on https://github.com/JonathanCMitchell/mobilenet_v2_keras/releases/download/v1.1/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5: None -- [Errno -2] Name or service not known

Please help.

CPU RAM consumption until depletion during training if shuffle is set to True

retinaface-tf2/modules/dataset.py

Line 107 in 9d45c0a

if shuffle:

Hi and Thanks for the awesome repo!
I noticed that during training the CPU RAM is consumed until it is depleted, which silently kills the process after some time. After debugging, I found that the issue is caused by tf.data.Dataset.shuffle being invoked. If I set shuffle to False, the training process runs as expected.
Thank you!

Pre model download

Hello, your pre-model is on Google Drive, I cannot download it, can you upload it to Baidu hard drive or send it to my mailbox? Thank you very much and I wish you good health.
E-mail [email protected]

hard negative mining

May I ask why sort is done twice here?

loss_class_idx = tf.argsort(loss_class, axis=1, direction='DESCENDING')
loss_class_idx_rank = tf.argsort(loss_class_idx, axis=1)
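
Applying argsort twice is a common way to turn values into ranks: the first argsort lists positions ordered by loss, and the argsort of that list gives, for each original position, its place in that ordering. Below is a pure-Python sketch of the idea for a single row. The snippet above uses tf.argsort along axis=1; the `argsort` helper, the sample numbers, and the final `rank < k` comparison are illustrative assumptions, since the code that consumes the rank is not shown in this issue.

```python
def argsort(values, descending=False):
    """Indices that would sort `values` (hypothetical stand-in for tf.argsort)."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=descending)


loss_class = [0.2, 0.9, 0.1, 0.5]

# First sort: positions ordered from highest loss to lowest.
loss_class_idx = argsort(loss_class, descending=True)   # [1, 3, 0, 2]
# Second sort: rank of each original position in that ordering.
loss_class_idx_rank = argsort(loss_class_idx)            # [2, 0, 3, 1]

# Keeping the k highest-loss entries is then an element-wise comparison.
k = 2
keep = [rank < k for rank in loss_class_idx_rank]        # [False, True, False, True]
print(loss_class_idx, loss_class_idx_rank, keep)
```

Position 1 has the largest loss, so its rank is 0; a single argsort would only say which position comes first, not what rank each position holds.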

Error in converting retinaface mobilenetv2 backend to CoreML

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/importer.py:496, in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
    495 try:
--> 496   results = c_api.TF_GraphImportGraphDefWithResults(
    497       graph._c_graph, serialized, options)  # pylint: disable=protected-access
    498   results = c_api_util.ScopedTFImportGraphDefResults(results)

InvalidArgumentError: Input 0 of node RetinaFaceModel/FPN/ConvBN/bn/AssignNewValue was passed float from RetinaFaceModel/FPN/ConvBN/bn/FusedBatchNormV3/ReadVariableOp/resource:0 incompatible with expected resource.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[19], line 1
----> 1 ct.convert(model, source='tensorflow')

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/_converters_entry.py:492, in convert(model, source, inputs, outputs, classifier_config, minimum_deployment_target, convert_to, compute_precision, skip_model_load, compute_units, package_dir, debug, pass_pipeline)
    489 if specification_version is None:
    490     specification_version = _set_default_specification_version(exact_target)
--> 492 mlmodel = mil_convert(
    493     model,
    494     convert_from=exact_source,
    495     convert_to=exact_target,
    496     inputs=inputs,
    497     outputs=outputs_as_tensor_or_image_types,  # None or list[ct.ImageType/ct.TensorType]
    498     classifier_config=classifier_config,
    499     skip_model_load=skip_model_load,
    500     compute_units=compute_units,
    501     package_dir=package_dir,
    502     debug=debug,
    503     specification_version=specification_version,
    504     main_pipeline=pass_pipeline,
    505 )
    507 if exact_target == 'milinternal':
    508     return mlmodel  # Returns the MIL program

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/converter.py:188, in mil_convert(model, convert_from, convert_to, compute_units, **kwargs)
    149 @_profile
    150 def mil_convert(
    151     model,
   (...)
    155     **kwargs
    156 ):
    157     """
    158     Convert model from a specified frontend `convert_from` to a specified
    159     converter backend `convert_to`.
   (...)
    186         See `coremltools.converters.convert`
    187     """
--> 188     return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/converter.py:212, in _mil_convert(model, convert_from, convert_to, registry, modelClass, compute_units, **kwargs)
    209     weights_dir = _tempfile.TemporaryDirectory()
    210     kwargs["weights_dir"] = weights_dir.name
--> 212 proto, mil_program = mil_convert_to_proto(
    213                         model,
    214                         convert_from,
    215                         convert_to,
    216                         registry,
    217                         **kwargs
    218                      )
    220 _reset_conversion_state()
    222 if convert_to == 'milinternal':

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/converter.py:285, in mil_convert_to_proto(model, convert_from, convert_to, converter_registry, main_pipeline, **kwargs)
    280 frontend_pipeline, backend_pipeline = _construct_other_pipelines(
    281     main_pipeline, convert_from, convert_to
    282 )
    284 frontend_converter = frontend_converter_type()
--> 285 prog = frontend_converter(model, **kwargs)
    286 PipelineManager.apply_pipeline(prog, frontend_pipeline)
    288 PipelineManager.apply_pipeline(prog, main_pipeline)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/converter.py:98, in TensorFlow2Frontend.__call__(self, *args, **kwargs)
     95 from .frontend.tensorflow2.load import TF2Loader
     97 tf2_loader = TF2Loader(*args, **kwargs)
---> 98 return tf2_loader.load()

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow/load.py:61, in TFLoader.load(self)
     59 outputs = self.kwargs.get("outputs", None)
     60 output_names = get_output_names(outputs)
---> 61 self._graph_def = self._graph_def_from_model(output_names)
     63 if self._graph_def is not None and len(self._graph_def.node) == 0:
     64     msg = "tf.Graph should have at least 1 node, Got empty graph."

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow2/load.py:133, in TF2Loader._graph_def_from_model(self, output_names)
    131 def _graph_def_from_model(self, output_names=None):
    132     """Overwrites TFLoader._graph_def_from_model()"""
--> 133     cfs, graph_def = self._get_concrete_functions_and_graph_def()
    134     if isinstance(self.model, _tf.keras.Model) and self.kwargs.get("outputs", None) is None:
    135         # For the keras model, check if the outputs is provided by the user.
    136         # If not, we make sure the coreml model outputs order is the same as
    137         # the original keras model
    138         cf = cfs[0]

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow2/load.py:127, in TF2Loader._get_concrete_functions_and_graph_def(self)
    124 else:
    125     raise NotImplementedError(msg.format(self.model))
--> 127 graph_def = self._graph_def_from_concrete_fn(cfs)
    129 return cfs, graph_def

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow2/load.py:328, in TF2Loader._graph_def_from_concrete_fn(self, cfs)
    325     raise NotImplementedError("Only a single concrete function is supported.")
    327 if _get_version(_tf.__version__) >= _StrictVersion("2.2.0"):
--> 328     frozen_fn = _convert_variables_to_constants_v2(cfs[0], lower_control_flow=False, aggressive_inlining=True)
    329 else:
    330     frozen_fn = _convert_variables_to_constants_v2(cfs[0], lower_control_flow=False)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/convert_to_constants.py:1083, in convert_variables_to_constants_v2(func, lower_control_flow, aggressive_inlining)
   1075 converter_data = _FunctionConverterData(
   1076     func=func,
   1077     lower_control_flow=lower_control_flow,
   1078     aggressive_inlining=aggressive_inlining)
   1080 output_graph_def, converted_input_indices = _replace_variables_by_constants(
   1081     converter_data=converter_data)
-> 1083 return _construct_concrete_function(func, output_graph_def,
   1084                                     converted_input_indices)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/convert_to_constants.py:1008, in _construct_concrete_function(func, output_graph_def, converted_input_indices)
   1005   if context.context().has_function(f.signature.name):
   1006     context.context().remove_function(f.signature.name)
-> 1008 new_func = wrap_function.function_from_graph_def(output_graph_def,
   1009                                                  new_input_names,
   1010                                                  new_output_names)
   1012 # Manually propagate shape for input tensors where the shape is not correctly
   1013 # propagated. Scalars shapes are lost when wrapping the function.
   1014 for input_tensor in new_func.inputs:

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/eager/wrap_function.py:650, in function_from_graph_def(graph_def, inputs, outputs)
    647 def _imports_graph_def():
    648   importer.import_graph_def(graph_def, name="")
--> 650 wrapped_import = wrap_function(_imports_graph_def, [])
    651 import_graph = wrapped_import.graph
    652 return wrapped_import.prune(
    653     nest.map_structure(import_graph.as_graph_element, inputs),
    654     nest.map_structure(import_graph.as_graph_element, outputs))

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/eager/wrap_function.py:621, in wrap_function(fn, signature, name)
    618 if name is not None:
    619   func_graph_name = "wrapped_function_" + name
    620 return WrappedFunction(
--> 621     func_graph.func_graph_from_py_func(
    622         func_graph_name,
    623         holder,
    624         args=None,
    625         kwargs=None,
    626         signature=signature,
    627         add_control_dependencies=False,
    628         collections={}),
    629     variable_holder=holder,
    630     signature=signature)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/func_graph.py:999, in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    996 else:
    997   _, original_func = tf_decorator.unwrap(python_func)
--> 999 func_outputs = python_func(*func_args, **func_kwargs)
   1001 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
   1002 # TensorArrays and `None`s.
   1003 func_outputs = nest.map_structure(convert, func_outputs,
   1004                                   expand_composites=True)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/eager/wrap_function.py:87, in VariableHolder.__call__(self, *args, **kwargs)
     86 def __call__(self, *args, **kwargs):
---> 87   return self.call_with_variable_creator_scope(self._fn)(*args, **kwargs)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/eager/wrap_function.py:93, in VariableHolder.call_with_variable_creator_scope.<locals>.wrapped(*args, **kwargs)
     91 def wrapped(*args, **kwargs):
     92   with variable_scope.variable_creator_scope(self.variable_creator_scope):
---> 93     return fn(*args, **kwargs)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/eager/wrap_function.py:648, in function_from_graph_def.<locals>._imports_graph_def()
    647 def _imports_graph_def():
--> 648   importer.import_graph_def(graph_def, name="")

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/util/deprecation.py:535, in deprecated_args.<locals>.deprecated_wrapper.<locals>.new_func(*args, **kwargs)
    527         _PRINTED_WARNING[(func, arg_name)] = True
    528       logging.warning(
    529           'From %s: calling %s (from %s) with %s is deprecated and will '
    530           'be removed %s.\nInstructions for updating:\n%s',
   (...)
    533           'in a future version' if date is None else ('after %s' % date),
    534           instructions)
--> 535 return func(*args, **kwargs)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/importer.py:400, in import_graph_def(***failed resolving arguments***)
    357 """Imports the graph from `graph_def` into the current default `Graph`.
    358 
    359 This function provides a way to import a serialized TensorFlow
   (...)
    397     it refers to an unknown tensor).
    398 """
    399 del op_dict
--> 400 return _import_graph_def_internal(
    401     graph_def,
    402     input_map=input_map,
    403     return_elements=return_elements,
    404     name=name,
    405     producer_op_list=producer_op_list)

File ~/SageMaker/envs/coreml_env/lib64/python3.8/site-packages/tensorflow/python/framework/importer.py:501, in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
    498     results = c_api_util.ScopedTFImportGraphDefResults(results)
    499   except errors.InvalidArgumentError as e:
    500     # Convert to ValueError for backwards compatibility.
--> 501     raise ValueError(str(e))
    503 # Create _DefinedFunctions for any imported functions.
    504 #
    505 # We do this by creating _DefinedFunctions directly from `graph_def`, and
   (...)
    510 # TODO(skyewm): fetch the TF_Functions directly from the TF_Graph
    511 # TODO(skyewm): avoid sending serialized FunctionDefs back to the TF_Graph
    513 _ProcessNewOps(graph)

ValueError: Input 0 of node RetinaFaceModel/FPN/ConvBN/bn/AssignNewValue was passed float from RetinaFaceModel/FPN/ConvBN/bn/FusedBatchNormV3/ReadVariableOp/resource:0 incompatible with expected resource.

While trying to convert the model to Core ML format, we get this error and the conversion stops. @peteryuX, can you let us know how to fix this?

Empty output on second inference call on exported saved model

Hey everyone, and thanks @peteryuX for the great work!

I'm experiencing a weird issue where after exporting the model to the saved_model format, on the second inference call I get an empty output. The first inference call always works though - I am seeing this with both tensorflow serving and regular inference.

Here's how to reproduce...

import tensorflow as tf

import cv2
import numpy as np

from modules.models import RetinaFaceModel
from modules.utils import set_memory_growth, load_yaml, draw_bbox_landm, pad_input_image, recover_pad_output

CONFIG_PATH = "<path-to>/configs/retinaface_res50.yaml"
CHECKPOINT_PATH = "<path-to>/retinaface-tf2/checkpoints/retinaface_res50"
OUTPUT_PATH = "<path-to>/retinaface-tf2/checkpoints/retinaface_res50_export"

def main():
    image = cv2.imread("an-image-path")
    image_infer = np.expand_dims(cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32), axis=0)

    config = load_yaml(CONFIG_PATH)
    model = RetinaFaceModel(config, training=False, iou_th=0.4, score_th=0.5)

    checkpoint = tf.train.Checkpoint(model=model)
    checkpoint.restore(tf.train.latest_checkpoint(CHECKPOINT_PATH))

    # Here, inference works on every call. 
    output_ckpt = model(image_infer)   # Get a result with shape: (4, 16) which is good. 
    output_ckpt2 = model(image_infer)  # Get the same result: (4, 16)

    # Save to file. I have tried all different ways to do this.. 
    # tf.saved_model.save(model, os.path.join(OUTPUT_PATH, "saved_model"), signatures=concrete_fn)
    # tf.keras.models.save_model(model, os.path.join(OUTPUT_PATH, "saved_model"))
    # tf.saved_model.save(model, os.path.join(OUTPUT_PATH, "saved_model"))
    # All have the same issue, let's just use the simple method...
    model.save(OUTPUT_PATH)
    
    # But if we export the model, (or make a concrete function), load it in and run twice, the second call will return an empty output.
    model_loaded = tf.saved_model.load(OUTPUT_PATH)
    infer = model_loaded.signatures["serving_default"]

    output1 = infer(**{"input_image": tf.convert_to_tensor(image_infer)})  # Get a result with shape: (4, 16) which is good. 
    output2 = infer(**{"input_image": tf.convert_to_tensor(image_infer)})  # Get a result like: (0, 16)

if __name__ == "__main__":
    main()

The same behaviour happens if I make a concrete function and export like so:

    concrete_fn = tf.function(model.call).get_concrete_function(
        tf.TensorSpec(
            shape=[None, None, None, 3], dtype=tf.float32, name="image_tensor"
        ),
        training=False
    )
    tf.saved_model.save(model, OUTPUT_PATH, signatures=concrete_fn)

I have a feeling this might have something to do with the code not allowing for a batch at this point in the decoding...

# only for batch size 1
preds = tf.concat(  # [bboxes, landms, landms_valid, conf]
    [bbox_regressions[0], landm_regressions[0],
     tf.ones_like(classifications[0, :, 0][..., tf.newaxis]),
     classifications[0, :, 1][..., tf.newaxis]], 1)
priors = prior_box_tf((tf.shape(inputs)[1], tf.shape(inputs)[2]),
                      cfg['min_sizes'],  cfg['steps'], cfg['clip'])
decode_preds = decode_tf(preds, priors, cfg['variances'])

It's so weird. Has anyone experienced this? And has anyone been able to export the model and make it run consistently, or am I completely missing something lol!

I'm thinking of rewriting the post-processing code to handle batching, but before I do that I'm just checking whether anyone has been through this.

How to convert to .tflite?

When using the following code:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
open("./tflite_models/face_retinaface_mobilenetv2.tflite", "wb").write(tflite_model)
print('saved tflite model!')
the error "Tensor 'input_image' has invalid shape '[None, None, None, 3]'." appears.
How can I convert to .tflite? Looking forward to your reply.

Error Testing on WIDER FACE Validation Set

[146 / 3226] det ./data/widerface/val\images/10--People_Marching/10_People_Marching_People_Marching_10_People_Marching_People_Marching_10_People_Marching_People_Marching_10_674.jpg
Traceback (most recent call last):
File "test_widerface.py", line 167, in <module>
app.run(main)
File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 312, in run
_run_main(main, args)
File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\app.py", line 258, in _run_main
sys.exit(main(argv))
File "test_widerface.py", line 93, in main
img_height_raw, img_width_raw, _ = img_raw.shape
AttributeError: 'NoneType' object has no attribute 'shape'

I get the above error on 8 images in the validation folder "10--People_Marching": for each of them, img_raw is None, so it has no attribute shape.
Update: I just realized this happens because the image names in this folder are too long. How can I fix it?

For the testing part, I followed the steps from the beginning.

I completed the data preparation section with "# Online Image Loading", because I ran into an error with "# Binary Image (recommend): need additional space", so I went with the second option, online image loading.

Then I downloaded the "Retinaface ResNet50" model and immediately ran the test on WIDER FACE, and got the error above, which only occurs in the "10--People_Marching" folder. Is it just me, or am I doing something wrong? I'm really confused.

Please help me
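To confirm that path length is the cause, a small diagnostic sketch like the one below lists every file whose absolute path reaches the classic Windows MAX_PATH limit of 260 characters (which includes the terminating NUL). The function name `find_long_paths` is hypothetical, and the dataset root in the usage comment is simply the one shown in the log above.

```python
import os

# Classic Windows MAX_PATH limit: 260 characters including the terminating NUL,
# so a path of 260 or more characters cannot be opened through legacy APIs.
WINDOWS_MAX_PATH = 260


def find_long_paths(root, limit=WINDOWS_MAX_PATH):
    """Return the absolute file paths under `root` whose length reaches `limit`."""
    too_long = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                too_long.append(full)
    return sorted(too_long)


# Example usage (dataset root assumed from the log above):
#     for path in find_long_paths("./data/widerface/val"):
#         print(len(path), path)
```

If the 8 failing images show up in that list, moving the dataset to a shorter root directory (for example directly under the drive root) is one possible workaround. Independently of the cause, cv2.imread returns None when it cannot read a file, so checking `img_raw is None` before using `img_raw.shape` makes it possible to skip or report such images instead of crashing.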

ValueError: Received incompatible tensor with shape (1, 1, 32, 64) when attempting to restore variable with shape (1, 1, 192, 64) and name model/layer_with_weights-1/output1/conv/kernel/.ATTRIBUTES/VARIABLE_VALUE.

Hi
Thanks for the great work. res50 works just fine for me, but mbv2 does not. I followed the same steps as in the README, and it keeps giving me this error:
ValueError: Received incompatible tensor with shape (1, 1, 32, 64) when attempting to restore variable with shape (1, 1, 192, 64) and name model/layer_with_weights-1/output1/conv/kernel/.ATTRIBUTES/VARIABLE_VALUE.

Access to widerface

Hi, we are students from USTC. Could you please grant us access to the widerface Google Drive folder? My email is [email protected]. Looking forward to your reply!

Could you provide a pre-trained model suitable for tensorflow1.10?

When I tried to run the model on TF 1.10,
I simply commented out the statement tf.compat.v1.enable_eager_execution() and ran the test under tf.Session(), and the following problem popped up:

FailedPreconditionError: Error while reading resource variable block_10_project/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/block_10_project/kernel)
[[node RetinaFaceModel/MobileNetV2_extrator/block_10_project/Conv2D/ReadVariableOp (defined at D:\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Error while running on tf2.4

When running the pretrained mobilenet model on tf2.4, I get the following error:


  14:45:53.439 > Traceback (most recent call last):
    File "main.py", line 19, in <module>
    File "_aftershoot.py", line 217, in _aftershoot.run_inference
    File "_aftershoot.py", line 60, in _aftershoot.init_jpegs
    File "raw_handler\img_init.py", line 31, in init raw_handler.img_init
    File "ml_modules\face_detector\FaceDetector.py", line 120, in ml_modules.face_detector.FaceDetector.BatchFaceDetector.__init__
    File "tensorflow\python\training\tracking\util.py", line 2118, in restore
    File "tensorflow\python\training\tracking\util.py", line 2035, in read
    File "tensorflow\python\training\tracking\util.py", line 1320, in restore
    File "tensorflow\python\training\tracking\base.py", line 209, in restore
    File "tensorflow\python\training\tracking\base.py", line 914, in _restore_from_checkpoint_position
    File "tensorflow\python\training\tracking\util.py", line 297, in restore_saveables
    File "tensorflow\python\training\saving\functional_saver.py", line 340, in restore
    File "tensorflow\python\training\saving\functional_saver.py", line 316, in restore_fn        
    File "tensorflow\python\training\saving\functional_saver.py", line 111, in restore
    File "tensorflow\python\training\saving\saveable_object_util.py", line 127, in restore       
    File "tensorflow\python\ops\resource_variable_ops.py", line 311, in shape_safe_assign_variable_handle
    File "tensorflow\python\framework\tensor_shape.py", line 1134, in assert_is_compatible_with  
  ValueError: Shapes (7, 7, 3, 64) and (3, 3, 3, 32) are incompatible

┗ ----------------------------
┏ Electron -------------------

  14:45:53.548 > Sentry is attempting to send 0 pending error messages
  Waiting up to 2 seconds
  Press Ctrl-Break to quit

┗ ----------------------------
┏ Electron -------------------

  14:45:55.347 > close: false

However, the resnet50 model works without any issues.

Any ideas as to what can be done to get the mobilenet model running on Tensorflow 2.4?

@peteryuX


