deepcam-cn / yolov5-face
YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931, ECCV Workshops 2022)
License: GNU General Public License v3.0
I want to detect both human bodies and faces using yolov5-face.
I generated the dataset by appending human-body annotations with a bbox and using -1 for the landmarks. But the model only finds faces with landmarks and never detects human bodies.
How do I get the wider_face prediction directory?
Could you give me some advice? I look forward to your reply. Thanks.
The wider_face evaluation README says: python3 evaluation.py -p -g
weight[torch.where(t == -1)] = 0
and lks_mask = torch.where(lks < 0, torch.full_like(lks, 0.), torch.full_like(lks, 1.0))
Shouldn't it be weight[torch.where(t == 0)] = 0 instead?
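The question above turns on what value a missing landmark has by the time the weight line runs. Below is a minimal sketch in plain Python, with scalars instead of tensors. It assumes the targets are multiplied by lks_mask before the loss is computed, which the quoted lines do not show. The helper names landmark_mask and masked_target are hypothetical.

```python
def landmark_mask(value):
    """Mirror of the quoted lks_mask rule: 0.0 for negative targets, else 1.0."""
    return 0.0 if value < 0 else 1.0


def masked_target(value):
    """Hypothetical step (an assumption): the target multiplied by its mask."""
    return value * landmark_mask(value)


targets = [0.25, -1.0, 0.75, -1.0]  # -1 marks a missing landmark coordinate
masked = [masked_target(t) for t in targets]

# After masking, a missing coordinate compares equal to 0, never to -1.
matches_minus_one = [t == -1 for t in masked]
matches_zero = [t == 0 for t in masked]
print(matches_minus_one, matches_zero)
```

Under that assumption a check for t == -1 would match nothing, which may be what the poster is pointing at. Note that a check for t == 0 would also match a genuine coordinate that happens to be exactly 0.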
Hi,
Thank you for the impressive solution.
Please consider providing a Docker image or a docker-compose file for simple use or evaluation.
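Your test results are great.
Which dataset did you use for the landmarks?
As far as I can see, WIDER FACE has no landmark labels, is that right?
Also, for the widerface.yaml used in training, how exactly should the image data and label file directories be set up?
Hello, after running val2yolo.py I started training and ran into the following problem: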
Hi, I want to add a regression target alongside the existing labels, so my labels are [class x y width height (new_label)]. How can I change the network to train with this added feature? Thanks in advance.
Hello!
Thanks for releasing such an amazing project.
In val2yolo.py, is there any specific reason that 1 pixel is being subtracted from the center_x and center_y values in lines 19-20?
Normally, only x = (box[0] + box[1]) / 2.0 and y = (box[2] + box[3]) / 2.0 should work as expected!
Also, why are the coordinates for landmarks set to -1 in the val dataset?
Thanks
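For reference, here is a minimal sketch of the usual box-to-YOLO conversion without the 1-pixel offset. It assumes, as the formulas quoted above imply, that box is ordered (x_min, x_max, y_min, y_max) and that size is (image_width, image_height). The function name to_yolo is hypothetical and this is not the repository's code.

```python
def to_yolo(size, box):
    """Convert (x_min, x_max, y_min, y_max) in pixels to normalized
    (center_x, center_y, width, height)."""
    img_w, img_h = size
    x_min, x_max, y_min, y_max = box
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    width = x_max - x_min
    height = y_max - y_min
    return (center_x / img_w, center_y / img_h, width / img_w, height / img_h)


# A box spanning x in [100, 300] and y in [50, 150] inside a 640x480 image.
print(to_yolo((640, 480), (100, 300, 50, 150)))
```

Subtracting 1 from each center before normalizing would shift the normalized center by 1/img_w horizontally and 1/img_h vertically.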
Hello, is it possible to draw bounding boxes for the eyes and the mouth? It seems the model only returns 5 landmarks.
Hi,
Can you please share the pretrained models on Google Drive or OneDrive?
Thanks!
In yolov5-face/models/yolo.py, from the 71st to the 75th line, for example:
y[..., 5:7] = y[..., 5:7] * self.anchor_grid[i] + self.grid[i].to(x[i].device) * self.stride[i] # landmark x1 y1
How did you get this formula?
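Read literally, the quoted line scales the raw landmark output by the anchor size and adds the pixel position of the grid cell. Below is a scalar sketch of that arithmetic, assuming anchor_grid holds the anchor width and height in pixels and grid holds integer cell indices, as in upstream YOLOv5. The function name decode_landmark and the sample numbers are hypothetical.

```python
def decode_landmark(raw_xy, anchor_wh, cell_xy, stride):
    """Scalar version of the quoted line: raw * anchor + cell * stride."""
    return (
        raw_xy[0] * anchor_wh[0] + cell_xy[0] * stride,
        raw_xy[1] * anchor_wh[1] + cell_xy[1] * stride,
    )


# Cell (10, 7) on a stride-8 feature map, a 16x24 pixel anchor,
# and a raw network output of (0.5, -0.25) for the first landmark.
x1, y1 = decode_landmark((0.5, -0.25), (16.0, 24.0), (10, 7), 8)
print(x1, y1)
```

In this reading the raw output is an offset measured in anchor widths and heights from the corner of the cell, which is one common way to parameterize keypoints relative to an anchor.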
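Hello, I ran into a problem using this code for face detection with 3 classes. After training on multiple classes, the detection script only predicts the first class (its face boxes and landmarks are all fine); faces of the other two classes are never predicted. Printing the raw pred tensor before NMS shows that the last two columns are fixed at 0. I then checked the training code and found that the classification scores and obj scores for the other two classes are normal there, but precision and recall during training are very low. I don't know whether this is related to the NMS timeout of 10 s. I have debugged for a long time without finding the cause. Could you help analyze it? Thanks.
Hello, here is my question: I trained faces of 3 classes with this code, but when I run detect_face.py on a picture, only class 0 is detected successfully; the pred tensor for the other classes is always 0.00000e+00. I checked train.py and also found a problem there: precision and recall are too low, although the scores in the pred tensor for the other classes look fine.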
Nice work. Which version of torch does this use?
Single GPU
python3 train.py
2 GPUs
python3 -m torch.distributed.launch --nproc_per_node 2 train.py
Are the other parameters, such as --batch-size 32 --data widerface.yaml --weights yolov5s.pt, all left at the default settings in train.py?
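Hi, can you share all the weights on Google Drive? I am not able to download the yolov5m and yolov5l models from pan.baidu. Thanks!
Hello, I tried training YOLOV5_n0.5 from scratch but could not reach the metrics in the paper. The WiderFace test results after training from scratch are as follows: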
yolov5n-0.5 from scratch
==================== Results ====================
Easy Val AP: 0.8842580430169323
Medium Val AP: 0.8577707110186754
Hard Val AP: 0.7363192129511129
=================================================
The accuracy reported in the paper is:
yolov5n-0.5 paper
==================== Results ====================
Easy Val AP: 0.9076029138358125
Medium Val AP: 0.8812336140494236
Hard Val AP: 0.7388013938649491
=================================================
On the Easy and Medium subsets there is a gap of roughly 2.3 points. My question is:
Hello!
I'm planning to write a paper related to face detection, and I would like to refer to your paper.
I wonder whether you used a model pre-trained on a dataset such as COCO when conducting the experiments in the paper.
Thank you!
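Hello, this is excellent work, and thank you for sharing it. However, using your provided yolov5s-face model at a single scale with the longest input side set to 1024, I get easy=94.4956, medium=93.2827, hard=87.9253, which falls short of the easy=95.4, medium=94.6, hard=88.2 reported earlier in your README. What might be causing this?
Can you please share the weights on another cloud service?
What input size were the pretrained models trained with? I need to know this in order to convert them to ONNX. Thank you.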
For example, after training for 250 epochs, a test run is performed at the end. In test.py, lines 180-181 are:
if plots:
    confusion_matrix.process_batch(pred, torch.cat((labels[:, 0:1], tbox), 1))
This raises the following error:
utils/metrics.py", line 146, in process_batch
self.matrix[gc, detection_classes[m1[j]]] += 1 # correct
IndexError: index 3 is out of bounds for axis 1 with size 2
After commenting out lines 180 and 181, I could run the test and save the model from the last epoch.
Thanks for sharing great code.
I have a question about yolov5-face inference speed.
Yolov5-face is more accurate than scrfd, but its inference speed is slower.
Is it true?
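Hello, the paper only lists the parameter counts and theoretical computation (FLOPs) without giving the actual inference time of each model, whereas the scrfd comparison does state the latency of each model. What is the reason for this? Could you provide the latency of each model?
@derronqi After each training epoch ends, GPU memory overflows during validation and model testing. I don't know why; it happens even when I set batch_size very small. Have you run into this as well? Thanks.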
Hi,
I really appreciate your efforts on this outstanding face detector. I am trying to train my own detector on other kinds of objects, but I am a little confused about the input label format for this project. Could you please share the input data format (before train2yolo.py/val2yolo.py) or a BaiduYun link? I am in the mainland right now and cannot use Google Drive.
Thanks
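Hello, the FLOPs reported in your README seem to differ from what I measured with yolov5:
yolov5-n: Model Summary: 308 layers, 1705462 parameters, 1705462 gradients, 5.0 GFLOPS
yolov5-0.5n: Model Summary: 308 layers, 439734 parameters, 439734 gradients, 1.4 GFLOPS
Roughly what FPS does this model achieve for face detection and landmark detection? Is there a paper or blog post for this project that I can refer to?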
In yolov5-face/utils/loss.py, lines 169-170:
#landmarks loss
plandmarks = ps[:,5:15].sigmoid() * 8. - 4.
Why are the landmarks multiplied by 8 and then reduced by 4?
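The arithmetic alone shows what the expression does: since the sigmoid lies strictly between 0 and 1, multiplying by 8 and subtracting 4 maps the raw output into the open interval (-4, 4), centered on 0. Below is a scalar sketch in plain Python; the function name landmark_offset is hypothetical, and why that particular range was chosen is not stated in the quoted code.

```python
import math


def landmark_offset(raw):
    """Scalar version of the quoted expression: sigmoid(raw) * 8 - 4."""
    sigmoid = 1.0 / (1.0 + math.exp(-raw))
    return sigmoid * 8.0 - 4.0


# Very negative, zero, and very positive raw outputs.
samples = [landmark_offset(r) for r in (-20.0, 0.0, 20.0)]
print(samples)
```

A raw output of 0 gives an offset of 0, and large positive or negative outputs approach but never reach 4 and -4.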
Run detect_face.py
Namespace(image='data/images/test.jpg', img_size=640, weights='yolov5s.pt')
Fusing layers...
Traceback (most recent call last):
  File "yolov5-face/detect_face.py", line 148, in <module>
    detect_one(model, opt.image, device)
  File "yolov5-face/detect_face.py", line 108, in detect_one
    pred = non_max_suppression_face(pred, conf_thres, iou_thres)
  File "yolov5-face/utils/general.py", line 424, in non_max_suppression_face
    x = torch.cat((box[i], x[i, j + 15, None], x[:, 5:15] ,j[:, None].float()), 1)
RuntimeError: Sizes of tensors must match except in dimension 0. Got 0 and 222 (The offending index is 2)
Process finished with exit code 1
How can I solve it?
Hello, when using models/export.py to convert the trained yolov5s-face.pt to an ONNX file, I get the following error:
RuntimeError: step!=1 is currently not supported
Do you have a solution for this? Thanks!
Hello!
Can I load the model with "torch.hub.load" and deploy it in Flask?
Thank you!
Could you provide a pre-trained model for large-face detection? I found that you provided a new training dataset, Multi-Task-Facial, without updating the corresponding pre-trained models. Thanks.
Thank you for opening this repo. I have some questions:
Do all models in the readme.md table use an 800 input image size? If so, why is a 640 image size used in evaluation?
Line 439 in f4db424
Do you mean to filter out small faces with this code? Is there anything else?
As we explain before, the Mosaic has to work with the ignoring small faces, otherwise the performance degrades dramatically
Line 900 in f4db424
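Can I convert your pretrained model to OpenVINO? What version of yolov5 do you use?
Will it work with multi-class training?
Hello, using the .pt file you provided, I found that the recognition rate drops when the image contains only a single face that occupies a large share of the image area.
Hello, I see that the results you report for other works in your comparison are much lower than what those works published themselves.
What is the reason for this? Is the testing method different?
How much data was the latest yolov5s model trained on, and could you share which dataset was used?
When a single face covers a large area, detection fails almost completely: it either detects multiple faces or detects none. Face detection these days focuses more and more on small faces, yet cannot accurately detect large ones.
such as how to prepare samples and labels, how to train, how to test, how to configure the running environment, and so on.
Where is the train2yolo.py file for generating the dataset? @derronqi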
Epoch   gpu_mem   box       obj       cls   landmark   total    targets   img_size
2/249   5.01G     0.06629   0.02983   0     0.04542    0.1415   1         800
Here, the values 0.06629, 0.02983, 0, and 0.04542 under box, obj, cls, and landmark are the loss values.
Never mind, this is resolved.
I am training the model on the dataset with mixed images (small faces, and large faces). When I use the best weights and detect faces on the test set, there are no detections on large faces. I have tried using all versions including those with P6 output block.
Do your models output the landmark coordinates (i.e. eye, mouth, and nose coordinates), or do they just put a bounding box on the faces? When I tried your inference script with one of your pretrained models, it only output the bounding box of each face with the confidence score. How do you get the landmarks as described in your paper and as shown here https://github.com/deepcam-cn/yolov5-face/blob/master/data/images/result.jpg?
I look forward to your response, thank you.
Traceback (most recent call last):
  File "train.py", line 513, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 299, in train
    scaler.step(optimizer) # optimizer.step
  File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 321, in step
    retval = optimizer.step(*args, **kwargs)
  File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/home/fut/miniconda3/lib/python3.8/site-packages/torch/optim/sgd.py", line 106, in step
    buf.mul_(momentum).add_(d_p, alpha=1 - dampening)
RuntimeError: The size of tensor a (32) must match the size of tensor b (24) at non-singleton dimension 0
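Could you release a version in which the number of landmarks is not hard-coded?
After I changed it myself, the model no longer trained correctly; there are many places to change and they are not easy to modify.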