deepcam-cn / yolov5-face

YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931, ECCV Workshops 2022)

License: GNU General Public License v3.0

Python 96.55% Shell 2.73% Dockerfile 0.23% Cython 0.49%

arcface blazeface pytorch retinaface scrfd shufflenet tensorrt tinaface yolo yolov5 yolov7

yolov5-face's Issues

multi-class object detection with face landmark

I want to detect both human bodies and faces using yolov5-face.
I built the dataset by appending human-body annotations with a bounding box and -1 for every landmark. But the model only finds faces with landmarks and never detects human bodies.

wider face evaluation prediction dir

How do I get the prediction dir for wider_face?
Could you give me some advice? I look forward to your reply. Thanks.

wider_face evaluation Read me: python3 evaluation.py -p -g

Loss calculation

weight[torch.where(t == -1)] = 0
and lks_mask = torch.where(lks < 0, torch.full_like(lks, 0.), torch.full_like(lks, 1.0))

Shouldn't it be weight[torch.where(t == 0)] = 0 instead?
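
For readers following this question, here is a minimal sketch of how a validity mask keeps unlabeled landmarks out of the loss. It assumes missing coordinates are stored as negative values (other issues on this page mention -1), and it uses a plain L1 distance in pure Python for illustration rather than the repository's own loss; `landmark_mask` and `masked_l1` are hypothetical names.

```python
def landmark_mask(targets):
    """Return 1.0 for annotated coordinates and 0.0 for missing ones (< 0).

    Mirrors the idea of torch.where(lks < 0, 0, 1) from the question.
    """
    return [0.0 if t < 0 else 1.0 for t in targets]


def masked_l1(pred, targets):
    """Mean absolute error over annotated coordinates only (illustrative loss)."""
    mask = landmark_mask(targets)
    total = sum(m * abs(p - t) for p, t, m in zip(pred, targets, mask))
    count = sum(mask)
    # If nothing is annotated, the landmark term contributes no loss.
    return total / count if count else 0.0


# Two annotated coordinates followed by two missing ones (-1).
targets = [0.5, 0.25, -1.0, -1.0]
pred = [0.75, 0.25, 0.9, 0.1]
loss = masked_l1(pred, targets)
```

With a mask like this, a coordinate whose true value happens to be 0 still counts as annotated, whereas testing `t == 0` would silently drop it.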

Docker image

Hi,
Thank you for the impressive solution.
Please consider providing a Docker image or docker-compose file for simple use or evaluation.

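Landmark training

Your test results are great.
Which landmark dataset did you use?
As far as I can tell, WIDER FACE has no landmark labels, is that right?
Also, for training with widerface.yaml, how exactly are the image data and label file directories laid out?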

Train

Hello, after running val2yolo.py I started training and ran into the following problem:

How to modify the network?

Hi, I want to add a regression target alongside the existing labels, so my labels are [class x y width height (new_label)]. How can I change the network to train with this added feature? Thanks in advance.

Regarding conversion of bounding box to yolo format in val2yolo.py

Hello!

Thanks for releasing such an amazing project.

In val2yolo.py, is there a specific reason that 1 pixel is subtracted from the center_x and center_y values in lines 19-20?

Normally, only x = (box[0] + box[1]) / 2.0 and y = (box[2] + box[3]) / 2.0 should work as expected!

Also, why are the landmark coordinates set to -1 in the val dataset?

Thanks
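
To make the question concrete, here is a small sketch of the pixel-to-YOLO conversion being discussed. It assumes the box is ordered (xmin, xmax, ymin, ymax), as implied by the formulas quoted in the question; `to_yolo` and its `offset` parameter are hypothetical and only illustrate the effect of subtracting 1 pixel.

```python
def to_yolo(img_w, img_h, box, offset=0.0):
    """Convert a pixel box (xmin, xmax, ymin, ymax) to normalized (cx, cy, w, h).

    offset=1.0 reproduces the "subtract 1 pixel from the center" variant
    that the question asks about; offset=0.0 is the plain midpoint.
    """
    xmin, xmax, ymin, ymax = box
    cx = (xmin + xmax) / 2.0 - offset
    cy = (ymin + ymax) / 2.0 - offset
    w = xmax - xmin
    h = ymax - ymin
    # Normalize by image size so every value lies in [0, 1].
    return (cx / img_w, cy / img_h, w / img_w, h / img_h)


plain = to_yolo(100, 200, (20, 60, 40, 120))
shifted = to_yolo(100, 200, (20, 60, 40, 120), offset=1.0)
```

The two results differ only in the center: the shifted variant moves it by one pixel, which is what one would expect if the source annotations were 1-based. That motivation is a guess, not something confirmed in this thread.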

Question regarding bounding box

Hello, is it possible to draw bounding boxes for the eyes and the mouth? It seems to return only 5 landmarks.

Pretrained models on google-drive

Hi,
Can you please share pretrained models on google-drive or onedrive?

Thanks!

landmarks coordinate

In yolov5-face/models/yolo.py,
lines 71 to 75, for example
y[..., 5:7] = y[..., 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x1 y1
How did you get this formula?
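
One way to read that line: the raw network output for a landmark is treated as an offset measured in anchor widths and heights, added to the pixel position of the grid cell. A scalar sketch under that reading follows (this is an interpretation of the quoted line, not the authors' explanation; `decode_landmark` and its parameters are hypothetical).

```python
def decode_landmark(raw_xy, anchor_wh, cell_xy, stride):
    """Map one raw landmark prediction to image pixels.

    raw_xy    : network output for the landmark, in anchor units
    anchor_wh : anchor width and height in pixels (the anchor_grid entry)
    cell_xy   : integer grid cell indices (the grid entry)
    stride    : pixels per grid cell at this detection scale
    """
    rx, ry = raw_xy
    aw, ah = anchor_wh
    gx, gy = cell_xy
    # offset scaled by the anchor size + top-left corner of the cell in pixels
    return (rx * aw + gx * stride, ry * ah + gy * stride)


# Cell (10, 7) on a stride-8 feature map, anchor of 16 x 32 pixels.
point = decode_landmark((0.5, -0.25), (16, 32), (10, 7), 8)
```

Scaling the offset by the anchor size lets one output range cover small and large faces alike, since larger anchors turn the same raw value into a larger pixel displacement.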

Multi-class faces: only the first face class is detected?

Hello, I ran into a problem using this code for face detection with 3 classes. After training on multiple classes, the detection script (detect_face.py) only predicts the first class (its face boxes and landmarks are all normal); faces of the other two classes are never predicted. Printing the raw pred tensor before NMS shows that the last two columns are fixed at 0 (0.00000e+00). I then checked the training code and found that the classification scores and objectness scores for the other two classes are normal there, but precision and recall during training are very low. I don't know whether this is related to the 10 s NMS timeout. I have debugged for a long time without finding the cause. Could you help analyze it? Thanks.

Version of torch

Nice work. Which version of torch is this?

Training script

Single GPU
python3 train.py
2 GPUs
python3 -m torch.distributed.launch --nproc_per_node 2 train.py
Do the other parameters, such as --batch-size 32 --data widerface.yaml --weights yolov5s.pt, all follow the defaults set in train.py?

Share all weights on google drive

Hi, can you share all weights on Google Drive? I am not able to download the yolov5m and yolov5l models from pan.baidu. Thanks!

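Cannot reproduce the paper's accuracy with YOLOV5_n0.5

Hello, I tried training YOLOV5_n0.5 from scratch but could not reach the metrics in the paper. The test results on WiderFace after training from scratch are: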

yolov5n-0.5 from scratch
==================== Results ====================
Easy   Val AP: 0.8842580430169323
Medium Val AP: 0.8577707110186754
Hard   Val AP: 0.7363192129511129
=================================================

The accuracy reported in the paper is:

yolov5n-0.5 paper
==================== Results ====================
Easy   Val AP: 0.9076029138358125
Medium Val AP: 0.8812336140494236
Hard   Val AP: 0.7388013938649491
=================================================

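On the Easy and Medium subsets there is a gap of more than 2 points. My questions:

For YOLOV5_n0.5, were the paper's metrics obtained with the hyperparameters currently in this repository?
Are ShuffleNet_V2_0.5 pretrained weights required?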

How to label data?

Hi, please help me, I don't understand your label format, can you explain it to me? What tool do you use to label it?

Your label:

The yolov5 format I usually use:

Thanks!

Whether the paper used a coco pre-trained model

Hello!
I'm planning to write a paper related to face detection, and I would like to refer to your paper.
I wonder if you used a pre-trained model such as COCO when conducting the experiments in the paper.
Thank you!

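Accuracy of the provided yolov5-face model

Hello, great work, and thank you for sharing it. However, with the provided yolov5s-face model at single scale and an input resolution whose longest side is 1024, I get easy=94.4956, medium=93.2827, hard=87.9253, which still falls short of the easy=95.4, medium=94.6, hard=88.2 previously stated in your readme. What might be causing this?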

unable to download from baidu

Can you please share the weights on another cloud service?

Image size

What is the training image size of the pretrained models? I need to know it to convert to ONNX. Thank you.

Testing fails after the last training epoch

For example, when training for 250 epochs, a test runs after training ends. In test.py, at lines 180 and 181:
if plots:
confusion_matrix.process_batch(pred, torch.cat((labels[:, 0:1], tbox), 1))
the following error occurs:
utils/metrics.py", line 146, in process_batch
self.matrix[gc, detection_classes[m1[j]]] += 1 # correct
IndexError: index 3 is out of bounds for axis 1 with size 2

After commenting out lines 180 and 181, I can run the test and save the model from the last epoch.

Inference Speed

Thanks for sharing great code.

I have a question about yolov5-face inference speed.
Yolov5-face is more accurate than scrfd, but its inference is slower.

Is that true?

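Inference time unclear?

Hello, the paper only lists parameter counts and theoretical FLOPs, without the actual inference time of each model, whereas the SCRFD comparison does report per-model latency. What is the reason for this? Could you provide the inference time of each model?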

Could you describe the label file format used for landmark training, so it can be used to train on other data?

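Out of GPU memory during validation while training

@derronqi After each training epoch, GPU memory overflows during validation and model testing. I don't know the cause; it happens even with a very small batch_size. Have you run into this as well? Thanks.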

WiderFace BaiduYun Link will be helpful

Hi,
I really appreciate your efforts on this outstanding face detector. I am trying to train my own detector on other kinds of objects, but I am a little confused about the input label format of this project. Could you please share the input data format (before train2yolo.py/val2yolo.py) or a BaiduYun link? I am in mainland China right now and cannot use Google Drive.
Thanks

FLOPs count differs

Hello, the FLOPs reported in your readme seem to differ from what I measure with yolov5:
yolov5-n：Model Summary: 308 layers, 1705462 parameters, 1705462 gradients, 5.0 GFLOPS
yolov5-0.5n：Model Summary: 308 layers, 439734 parameters, 439734 gradients, 1.4 GFLOPS

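Real-time performance and related documentation

Roughly what FPS does this model reach for face detection and landmark detection? Is there a paper or blog post for this project that I can refer to?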

train2yolo

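The labels I downloaded are labelv2.txt. After modifying the code (only the paths were changed) and running train2yolo.py and the other scripts, the labels appear to be misaligned during training, as shown in the picture.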

landmarks calculation

in yolov5-face/utils/loss.py:
lines 169-170:
#landmarks loss
plandmarks = ps[:,5:15].sigmoid() * 8. - 4.

Why are the landmarks multiplied by 8 and then reduced by 4?
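
The arithmetic in that line can be checked directly. A sigmoid output lies in (0, 1), so multiplying by 8 and subtracting 4 rescales it to (-4, 4), centered on 0. The sketch below only demonstrates that range; why this particular range was chosen is not stated in this thread, and `squash` is a hypothetical name.

```python
import math


def squash(v):
    """Apply sigmoid(v) * 8 - 4, as in the quoted line, to a single value."""
    sigmoid = 1.0 / (1.0 + math.exp(-v))
    return sigmoid * 8.0 - 4.0


center = squash(0.0)    # sigmoid(0) = 0.5, so the result sits at the midpoint
low = squash(-20.0)     # approaches, but never reaches, the lower bound -4
high = squash(20.0)     # approaches, but never reaches, the upper bound +4
```

Under this mapping a predicted landmark offset is bounded, so it can never be more than 4 units away from its reference point, in whatever unit the surrounding code uses.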

RuntimeError: Sizes of tensors must match except in dimension 0. Got 0 and 222 (The offending index is 2)

Run detect_face.py

Namespace(image='data/images/test.jpg', img_size=640, weights='yolov5s.pt')
Fusing layers... 
Traceback (most recent call last):
  File "yolov5-face/detect_face.py", line 148, in <module>
    detect_one(model, opt.image, device)
  File "yolov5-face/detect_face.py", line 108, in detect_one
    pred = non_max_suppression_face(pred, conf_thres, iou_thres)
  File "yolov5-face/utils/general.py", line 424, in non_max_suppression_face
    x = torch.cat((box[i], x[i, j + 15, None], x[:, 5:15] ,j[:, None].float()), 1)
RuntimeError: Sizes of tensors must match except in dimension 0. Got 0 and 222 (The offending index is 2)

Process finished with exit code 1

How can I solve it?
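
The message says the third tensor passed to torch.cat (index 2, `x[:, 5:15]`) has 222 rows while the others have 0. The other pieces are indexed by `i`, but the landmark slice keeps every row. The pure-Python sketch below reproduces that kind of mismatch and shows that selecting the landmark rows with the same index removes it. Whether this is the right fix for the repository is an assumption, and all names here are hypothetical.

```python
def select(rows, keep):
    """Pick the rows whose indices are listed in keep."""
    return [rows[k] for k in keep]


def concat_columns(*parts):
    """Join row-aligned tables column-wise, similar in spirit to torch.cat(..., 1)."""
    n_rows = len(parts[0])
    for idx, part in enumerate(parts):
        if len(part) != n_rows:
            raise ValueError(
                f"row counts must match: got {n_rows} and {len(part)} "
                f"(offending index {idx})")
    return [[v for part in parts for v in part[r]] for r in range(n_rows)]


boxes = [[0, 0, 10, 10], [5, 5, 20, 20], [1, 1, 4, 4]]
landmarks = [[0.1] * 10, [0.2] * 10, [0.3] * 10]
keep = []  # no candidate passed the confidence threshold

# Mismatch: boxes are filtered by keep, landmarks are not.
try:
    concat_columns(select(boxes, keep), landmarks)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Consistent indexing: both pieces use the same rows.
fixed = concat_columns(select(boxes, keep), select(landmarks, keep))
```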

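Error when converting to an ONNX file

Hello, when using models/export.py to convert the trained yolov5s-face.pt to an ONNX file, I get the following error:
RuntimeError: step!=1 is currently not supported
Do you have a solution for this? Thanks!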

Flask python

Hello!

Can I read the model with "torch.hub.load" and deploy it in Flask?

Thank you!

pre-trained model for large face detection

Could you provide pre-trained model for large face detection? I found that you provide new training dataset Multi-Task-Facial without updating corresponding pre-trained models. Thanks.

About some details

Thank you for opening this repo. I have some questions as follows:

Do all models in the readme.md table use an input image size of 800? Yet evaluation uses an image size of 640?

yolov5-face/train.py

Line 439 in f4db424

parser.add_argument('--img-size', nargs='+', type=int, default=[800, 800], help='[train, test] image sizes')
Do you mean that small faces are filtered out by this code? Is there anything else?

As we explain before, the Mosaic has to work with the ignoring small faces, otherwise the performance degrades dramatically

yolov5-face/utils/datasets.py

Line 900 in f4db424

 def box_candidates(box1, box2, wh_thr=2, ar_thr=20, area_thr=0.1, eps=1e-16): # box1(4,n), box2(4,n) 
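
For context, here is a scalar sketch of what a filter with this signature typically checks after augmentation. It is modeled on the upstream YOLOv5 helper of the same name, so the exact conditions should be treated as an assumption about this repository; boxes are assumed to be (x1, y1, x2, y2), with box1 before and box2 after augmentation, and `box_candidate` is a hypothetical single-box variant.

```python
def box_candidate(box1, box2, wh_thr=2, ar_thr=20, area_thr=0.1, eps=1e-16):
    """Return True if the augmented box2 is still worth keeping as a label.

    wh_thr   : minimum width and height in pixels after augmentation
    ar_thr   : maximum aspect ratio (long side / short side)
    area_thr : minimum ratio of augmented area to original area
    """
    w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
    w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
    aspect = max(w2 / (h2 + eps), h2 / (w2 + eps))
    return (w2 > wh_thr
            and h2 > wh_thr
            and w2 * h2 / (w1 * h1 + eps) > area_thr
            and aspect < ar_thr)


kept = box_candidate((0, 0, 100, 100), (0, 0, 50, 50))       # large enough
too_small = box_candidate((0, 0, 100, 100), (0, 0, 1.5, 1.5))  # under wh_thr
shrunk = box_candidate((0, 0, 10, 10), (0, 0, 3, 3))          # area ratio 0.09
```

Raising `wh_thr` in a filter like this is one way small faces could be dropped after Mosaic; whether the repository relies on this function alone for that is not confirmed here.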

openvino

Can I convert your pretrained model to OpenVINO? What version of yolov5 do you use?

Multi-class training?

Will it work with multi-class training?

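Detecting a single face in an image

Hello, using the .pt file you provided, I found that the detection rate drops when the image contains only a single face that occupies a large share of the image area.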

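About the comparison of experimental results in the readme

Hello, I see that the results you report for other works are much lower than what those works published themselves.
What is the reason? Is the evaluation method different?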

About training

How much data was used to train the latest yolov5s model you provided? Would you be able to share the dataset?

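Large-face detection fails

When a single face covers a large area, detection is almost completely wrong: either multiple faces are detected or none at all. Face detection today focuses more and more on small faces, yet it cannot detect large faces accurately.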

The instructions should be more detailed

For example: how to prepare samples and labels, how to train, how to test, how to configure the running environment, and so on.

some issues for train_batch.jpg

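Training now runs normally, thank you for your help. However, in the train_batch.jpg under train/exp, many faces are not boxed correctly. Will this affect training (as shown in the picture)?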

Where is train2yolo.py?

Where is the train2yolo.py file for generating the dataset? @derronqi

Training results

Epoch gpu_mem box obj cls landmark total targets img_size
2/249 5.01G 0.06629 0.02983 0 0.04542 0.1415 1 800

The values under box, obj, and landmark here (0.06629 0.02983 0 0.04542) are the loss values.
Never mind, this is resolved.
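
As a quick check of how to read that log line, the sketch below pairs each header name with its value and compares the sum of the four loss columns with the printed total. That total is the plain sum of the components is an observation from this one line, not a statement about the training code.

```python
header = "Epoch gpu_mem box obj cls landmark total targets img_size"
row = "2/249 5.01G 0.06629 0.02983 0 0.04542 0.1415 1 800"

# Pair each column name with the value printed underneath it.
fields = dict(zip(header.split(), row.split()))

# The four per-component losses reported for this epoch.
losses = {name: float(fields[name]) for name in ("box", "obj", "cls", "landmark")}

summed = sum(losses.values())
printed_total = float(fields["total"])

# The log rounds to four significant digits, so compare with a tolerance.
matches = abs(summed - printed_total) < 1e-4
```

Here the components add up to about 0.14154, which agrees with the printed 0.1415 after rounding; the cls column is 0 because this run has a single class.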

The model does not detect large faces

I am training the model on the dataset with mixed images (small faces, and large faces). When I use the best weights and detect faces on the test set, there are no detections on large faces. I have tried using all versions including those with P6 output block.

where are the landmarks for the eyes, nose, & mouth?

so do your models output the landmark coordinates (i.e. eyes, mouth, nose coordinates) or do they just put a bounding box on the faces. When I tried your inference script on one of your pretrained models, it just outputted the bounding box of the face with the confidence score. How do you get the landmarks as described in your paper and as shown here https://github.com/deepcam-cn/yolov5-face/blob/master/data/images/result.jpg?

I look forward to your response, thank you.

When I train with widerface, an error occurs

Traceback (most recent call last):
File "train.py", line 513, in
train(hyp, opt, device, tb_writer, wandb)
File "train.py", line 299, in train
scaler.step(optimizer) # optimizer.step
File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 321, in step
retval = optimizer.step(*args, **kwargs)
File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
return wrapped(*args, **kwargs)
File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/optim/sgd.py", line 106, in step
buf.mul_(momentum).add_(d_p, alpha=1 - dampening)
RuntimeError: The size of tensor a (32) must match the size of tensor b (24) at non-singleton dimension 0

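The number of landmarks is hard-coded

Could you release a version where the number of landmarks is not hard-coded?
After I changed it myself, the model no longer trained correctly; there are many places to change and they are not easy to modify.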