
retinaface_pytorch's Introduction

RetinaFace_Pytorch

A reimplementation of RetinaFace in PyTorch.

Installation

Clone and install requirements
$ git clone https://github.com/supernotman/RetinaFace_Pytorch.git
$ cd RetinaFace_Pytorch/
$ sudo pip install -r requirements.txt

PyTorch 1.1.0+ and torchvision 0.3.0+ are required.

Data
  1. Download the WIDER FACE dataset.

  2. Download the annotations (face bounding boxes and five facial landmarks) from baidu cloud or dropbox.

  3. Organise the dataset directory as follows:

  widerface/
    train/
      images/
      label.txt
    val/
      images/
      label.txt
    test/
      images/
      label.txt
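
Before training, it can help to confirm that the directory matches the tree above. The following is a minimal sketch, not part of the repository: `missing_entries` is a hypothetical helper that only checks for the `images/` folders and `label.txt` files listed in the layout.

```python
from pathlib import Path

# Split names taken from the directory tree above.
EXPECTED_SPLITS = ("train", "val", "test")


def missing_entries(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    missing = []
    for split in EXPECTED_SPLITS:
        images = root / split / "images"
        label = root / split / "label.txt"
        if not images.is_dir():
            missing.append(str(images))
        if not label.is_file():
            missing.append(str(label))
    return missing
```

Calling `missing_entries("widerface")` returns an empty list when every folder and label file is in place, and otherwise lists what still needs to be created or moved.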

Train

$ python train.py [-h] [--data_path DATA_PATH] [--batch BATCH]
                [--epochs EPOCHS]
                [--shuffle SHUFFLE] [--img_size IMG_SIZE]
                [--verbose VERBOSE] [--save_step SAVE_STEP]
                [--eval_step EVAL_STEP]
                [--save_path SAVE_PATH]
                [--depth DEPTH]

Example

For multi-GPU training, run:

$ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python train.py --data_path /widerface --batch 32 --save_path ./out

Training log

---- [Epoch 39/200, Batch 400/403] ----
+----------------+-----------------------+
| loss name      | value                 |
+----------------+-----------------------+
| total_loss     | 0.09969855844974518   |
| classification | 0.09288528561592102   |
| bbox           | 0.0034053439740091562 |
| landmarks      | 0.003407923271879554  |
+----------------+-----------------------+
-------- RetinaFace Pytorch --------
Evaluating epoch 39
Recall: 0.7432201780921814
Precision: 0.906913273261629

Pretrained model

You can download the model from baidu cloud or dropbox

Detect

Image
$ python detect.py --model_path model.pt --image_path 4.jpg
Video
$ python video_detect.py --model_path model.pt 

Pose

I found something interesting and added it to the code: pose estimation with Hopenet (https://github.com/natanielruiz/deep-head-pose). You can now estimate head pose with RetinaFace and Hopenet. Download the pose model, then run:

$ python pose_detect.py --f_model model.pt --p_model hopenet.pkl --image_path test.jpg

You can also run pose detection on video:

$ python pose_detect.py --f_model model.pt --p_model hopenet.pkl --type video --video_path test.avi

Todo:

  • Wider Face mAP calculation
  • Deformable Convolution
  • More models support
  • Random crop and color distortion
  • Graph Convolution
  • Bug fix


retinaface_pytorch's Issues

focal_loss = False

focal_loss = False
# focal loss
if focal_loss:
    alpha = 0.25
    gamma = 2.0
    alpha_factor = torch.ones(targets.shape).cuda() * alpha

    alpha_factor = torch.where(torch.eq(targets, 1.), alpha_factor, 1. - alpha_factor)
    focal_weight = torch.where(torch.eq(targets, 1.), 1. - classification, classification)
    focal_weight = alpha_factor * torch.pow(focal_weight, gamma)

    bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))

    cls_loss = focal_weight * bce

    cls_loss = torch.where(torch.ne(targets, -1.0), cls_loss, torch.zeros(cls_loss.shape).cuda())

    classification_losses.append(cls_loss.sum()/torch.clamp(num_positive_anchors.float(), min=1.0))
else:
    if positive_indices.sum() > 0:
        classification_losses.append(positive_losses.mean() + sorted_losses.mean())
    else:
        classification_losses.append(torch.tensor(0).float().cuda())

Since focal_loss is hard-coded to False, is focal loss never used?

Loading the model fails at inference

RuntimeError: Error(s) in loading state_dict for RetinaFace
Missing key(s) in state_dict: "body.conv1.weight", "body.bn1.weight", "body.bn1.bias",....
Unexpected key(s) in state_dict: "module.body.conv1.weight", "module.body.bn1.weight",...
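
The "module." prefix in the unexpected keys is what torch.nn.DataParallel adds to parameter names when a wrapped model's state_dict is saved. A common workaround, sketched here under that assumption, is to strip the prefix before loading into an unwrapped model. `strip_module_prefix` is a hypothetical helper, not a function from this repository, and the example uses stand-in values instead of real tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in values; a real checkpoint maps these keys to tensors.
saved = OrderedDict([
    ("module.body.conv1.weight", 0),
    ("module.body.bn1.weight", 1),
])
print(list(strip_module_prefix(saved)))
# ['body.conv1.weight', 'body.bn1.weight']
```

With a real checkpoint this would be used as `model.load_state_dict(strip_module_prefix(torch.load(path)))`. The alternative is to wrap the model in torch.nn.DataParallel before calling load_state_dict, so that the key names match as saved.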

The pre-train model

Hi, I can't reproduce the reported precision. Could you share model_epoch_200.pt? Thanks.

About context module

I think there may be a mistake in the channel counts of the context module:

x1 = self.det_conv1(x) # 256 channels
x_ = self.det_context_conv1(x) # 128 channels
x2 = self.det_context_conv2(x_) # 128 channels
x3_ = self.det_context_conv3_1(x_) # 128 channels
x3 = self.det_context_conv3_2(x3_) # 128 channels

After concatenating x1, x2 and x3 I get 512 channels, which is inconsistent with the paper (256 channels).
Am I misunderstanding something?
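
The arithmetic behind this question can be checked directly. The sketch below sums the channel counts annotated in the snippet, then shows one hypothetical half/quarter/quarter split that would add up to 256. That split is an assumption for illustration only; it is not confirmed to be what the paper or this repository intends.

```python
def concat_channels(*branch_channels):
    """Channel count after concatenating feature maps along the channel axis."""
    return sum(branch_channels)


# Channel counts as annotated in the snippet above: x1, x2, x3.
print(concat_channels(256, 128, 128))  # 512

# Hypothetical half/quarter/quarter split for a 256-channel target.
target = 256
print(concat_channels(target // 2, target // 4, target // 4))  # 256
```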

Landmark won't converge

I am currently trying to train with Caffe myself, but the landmark regression is very poor. Do you have any experience or tips to share? 🙏

Where are the landmark labels?

Hi, I could not find the landmarks in your annotation data. I'm training a model with resnet18 and the landmark loss does not decrease. Are landmarks and bounding boxes trained separately?

element 0 of tensors does not require grad and does not have a grad_fn

Thank you for open-sourcing this project. I ran into the following problem at epoch 104 of training. Can you help me? Thanks.

Traceback (most recent call last):
  File "train.py", line 156, in <module>
    main()
  File "train.py", line 111, in main
    loss.backward()
  File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

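Can input images be of any size?

detect.py has a padded-image step. Have you considered that, for images that are not 640×640, the face boxes and landmarks produced by the model after padding and resizing will be offset relative to the original image? Should this offset be corrected when the results are displayed?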
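
The correction asked about here amounts to undoing the padding and the resize. The sketch below assumes the original image was first resized by a uniform factor and then padded; `to_original_coords`, `scale`, `pad_left` and `pad_top` are hypothetical names, and the actual order of operations in detect.py may differ.

```python
def to_original_coords(points, scale, pad_left=0.0, pad_top=0.0):
    """Map (x, y) points from the resized and padded image back to the original.

    Assumes the original was first resized by a uniform factor `scale`
    and then padded by `pad_left` / `pad_top` pixels on the left and top.
    """
    return [((x - pad_left) / scale, (y - pad_top) / scale) for x, y in points]


# Example: a 1280x960 image scaled by 0.5 to 640x480, then padded
# at the bottom only to reach 640x640 (so both pads are zero).
box_corners = [(100.0, 50.0), (200.0, 150.0)]
print(to_original_coords(box_corners, scale=0.5))
# [(200.0, 100.0), (400.0, 300.0)]
```

Padding added only on the right or bottom needs no offset, only the division by `scale`. The same mapping applies to each of the five landmark points.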

Have you tested on widerface val?

I got the following results with image size (1200, 1200):
Easy Val AP: 0.721983363755764
Medium Val AP: 0.742308954563704
Hard Val AP: 0.6196879642610857

Is there something wrong?

Validation error

Hello everyone,

Please, I need help. I get this error when I try to run train.py:
Evaluating epoch 0
0%| | 0/3226 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 136, in main
    recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
  File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
    for data in tqdm(iter(val_data)):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in __iter__
    for obj in iterable:
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 348, in __getitem__
    annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.

0%| | 0/3226 [00:00<?, ?it/s]

About data augmentation

Hello,
Which data augmentation did you use in your actual training? I saw several methods that are commented out, but I am not sure which ones you actually used.

By the way, many of them do not work and have bugs.

For example, add this at line 297 of dataloader.py:

pad = torch.from_numpy(np.array(pad))

before this line:

padded_img = F.pad(img, pad, "constant", value=0)

Otherwise it raises:

TypeError: narrow(): argument 'start' (position 2) must be int, not numpy.int64

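Is multi-class detection feasible with RetinaFace?

Hello, I am using your code for face detection. I had a sudden idea: using it to detect human bodies with body keypoints as well as faces with facial landmarks. Would this be feasible?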

Allow for dynamic input sizes / anchor sizes

Currently when tracing the model, the following two warnings apply:

/d/dev/RetinaFace_Pytorch/anchors.py:27: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
image_shape = np.array(image_shape)
/d/dev/RetinaFace_Pytorch/anchors.py:40: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return torch.from_numpy(all_anchors.astype(np.float32)).cuda()

As a result, the traced model uses a hardcoded 640x640 input size and fixed anchors, whereas the input size should be dynamic.
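
To illustrate why the anchors depend on the input size, the sketch below computes the feature-map shape at each pyramid level from the image shape using ceiling division. The strides (8, 16, 32) are an assumption based on typical ResNet feature levels, and `pyramid_shapes` is a hypothetical helper rather than code from anchors.py.

```python
def pyramid_shapes(height, width, strides=(8, 16, 32)):
    """Feature-map (height, width) per pyramid level, using ceiling division."""
    return [((height + s - 1) // s, (width + s - 1) // s) for s in strides]


print(pyramid_shapes(640, 640))  # [(80, 80), (40, 40), (20, 20)]
print(pyramid_shapes(480, 720))  # [(60, 90), (30, 45), (15, 23)]
```

Because the anchor grid is laid out over these shapes, a trace recorded with a 640x640 example freezes the 640x640 grid as a constant, which matches the two warnings above.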

No prior_box?

There seems to be no prior_box part in this code. Is it unnecessary?

Question about focal loss

@supernotman Hi, thank you for this great project.
May I ask why you use cross-entropy loss for the classification head rather than focal loss? Focal loss is the key feature of RetinaNet.
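
For reference, here is a scalar sketch of the two losses being compared. The values alpha = 0.25 and gamma = 2.0 are the ones that appear in the repository snippet quoted in the "focal_loss = False" issue above; the function names are hypothetical and the code works on single probabilities rather than tensors.

```python
import math


def cross_entropy(p, target):
    """Binary cross-entropy for one prediction `p` in (0, 1)."""
    return -math.log(p) if target == 1 else -math.log(1.0 - p)


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Focal loss: cross-entropy scaled by a weight that shrinks for easy examples."""
    if target == 1:
        weight = alpha * (1.0 - p) ** gamma
    else:
        weight = (1.0 - alpha) * p ** gamma
    return weight * cross_entropy(p, target)


# Compare an easy negative (p near 0) with a hard negative (p near 1).
for p in (0.1, 0.9):
    print(p, cross_entropy(p, 0), focal_loss(p, 0))
```

The weight term is what suppresses the contribution of the many easy background anchors. The non-focal branch in the quoted snippet instead combines `positive_losses` with `sorted_losses`; how those are computed is not shown in this document.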

About labels and training

Hello,
I am following your instructions to train the network. However, the label file on the WIDER FACE website does not match the layout described in the instructions. I renamed the bounding box annotation txt file to label.txt, but dataloader.py cannot read it. What is the solution to this problem?
To be clear, the file from the WIDER FACE website looks like this:

0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_799.jpg
21
78 221 7 8 2 0 0 0 0 0
78 238 14 17 2 0 0 0 0 0
113 212 11 15 2 0 0 0 0 0
134 260 15 15 2 0 0 0 0 0
163 250 14 17 2 0 0 0 0 0
201 218 10 12 2 0 0 0 0 0
182 266 15 17 2 0 0 0 0 0

And the output of train.py looks like this:

Traceback (most recent call last):
  File "train.py", line 150, in <module>
    main()
  File "train.py", line 53, in main
    dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
  File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in __init__
    label = [float(x) for x in line]
  File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in <listcomp>
    label = [float(x) for x in line]
ValueError: could not convert string to float: '0--Parade/0_Parade_marchingband_1_849.jpg'
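
The traceback shows dataloader.py calling float() on every field of a line, which fails on the image-path lines of the official file. The official file also lacks the five facial landmarks, which the README says come from the separately downloaded annotations. As a minimal sketch for inspecting the official format shown above, the hypothetical `parse_wider_gt` below groups its lines by image. It does not produce the label.txt layout that dataloader.py expects, since that layout is not shown in this document.

```python
def parse_wider_gt(lines):
    """Group official WIDER FACE ground-truth lines into {image_path: [(x, y, w, h), ...]}."""
    records = {}
    current = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if len(fields) == 1 and not fields[0].isdigit():
            # A lone non-numeric token is an image path.
            current = fields[0]
            records[current] = []
        elif len(fields) == 1:
            # A lone integer is the face count; the box lines follow.
            continue
        elif current is not None:
            # Only the first four fields (x, y, w, h) are kept.
            x, y, w, h = (float(v) for v in fields[:4])
            records[current].append((x, y, w, h))
    return records


sample = """0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0""".splitlines()

for path, boxes in parse_wider_gt(sample).items():
    print(path, boxes)
```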

Have you ever used another backbone?

Thanks for your great work!
I used MobileNet V1 0.25 to replace your ResNet, but I found it really hard to make it converge.
The loss was quite low even in the first several epochs, but it then stayed at that level indefinitely.
Have you tried other lightweight backbones with your code? Could you share some details of your training?
Also, I am trying to increase the number of landmarks to 68 using the 300W dataset with your code. Have you ever tried that?
Thanks!

Out of memory

How much memory do you estimate this project needs?
I'm using a Titan V with 12GB and this goes out of memory with a batch size of 16 (default was 32), which seems quite small for WIDER face.

I had to use a batch size of 8, which used 10GB.

Evaluation problem

Hello,
I tried to run your code but hit a problem that I cannot find a solution for.
Can you please help me?
I downloaded the WIDER FACE dataset as you explain and ran this command on Windows:
set CUDA_VISIBLE_DEVICES=0 & python train.py --data_path dataset/widerface --batch 1 --save_path ./out
but I get this error:
Namespace(batch=1, data_path='dataset/widerface', depth=50, epochs=1, eval_step=3, img_size=512, save_path='./out', save_step=10, shuffle=True, verbose=10)
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 55, in main
    dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in __init__
    label = [float(x) for x in line]
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in <listcomp>
    label = [float(x) for x in line]
ValueError: could not convert string to float: '/24--Soldier_Firing/24_Soldier_Firing_Soldier_Firing_24_329.jpg'

When I replace the val images with the train images, training starts, but then I get this error:

---- [Epoch 0/1, Batch 12870/12880] ----
+----------------+---------------------+
| loss name      | value               |
+----------------+---------------------+
| total_loss     | 2.6635076999664307  |
| classification | 1.5447975397109985  |
| bbox           | 0.34370726346969604 |
| landmarks      | 0.7750030159950256  |
+----------------+---------------------+
-------- RetinaFace Pytorch --------
Evaluating epoch 0
0%| | 0/12880 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 136, in main
    recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
  File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
    for data in tqdm(iter(val_data)):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in __iter__
    for obj in iterable:
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 347, in __getitem__
    annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.

0%|

I don't know what is going on.
I would really appreciate your help.

Fine tune pre-trained model

I was trying to fine-tune the pre-trained model, but I think the current code does not provide this facility. I added a few lines to train.py; see the code below. If you think it should be part of the project, kindly add it in the next commit. Thanks for your good work.


import argparse
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms
from dataloader import TrainDataset, ValDataset, collater, RandomCroper, RandomFlip, Resizer, PadToSquare
from torch.utils.data import Dataset, DataLoader
from terminaltables import AsciiTable, DoubleTable, SingleTable
from tensorboardX import SummaryWriter
from torch.optim import lr_scheduler
import torch.distributed as dist
import eval_widerface
import torchvision
import model
import os
from torch.utils.data.distributed import DistributedSampler
import torchvision_model

def get_args():
    parser = argparse.ArgumentParser(description="Train program for retinaface.")
    parser.add_argument('--data_path', type=str, help='Path for dataset,default WIDERFACE')
    parser.add_argument('--batch', type=int, default=16, help='Batch size')
    parser.add_argument('--epochs', type=int, default=200, help='Max training epochs')
    parser.add_argument('--shuffle', type=bool, default=True, help='Shuffle dataset or not')
    parser.add_argument('--img_size', type=int, default=640, help='Input image size')
    parser.add_argument('--verbose', type=int, default=10, help='Log verbose')
    parser.add_argument('--save_step', type=int, default=10, help='Save every save_step epochs')
    parser.add_argument('--eval_step', type=int, default=3, help='Evaluate every eval_step epochs')
    parser.add_argument('--save_path', type=str, default='./out', help='Model save path')
    parser.add_argument('--depth', help='Resnet depth, must be one of 18, 34, 50, 101, 152', type=int, default=50)
    parser.add_argument('--pretrained_model_path', type=str, default='./out', help='Pre-Trained Model Path')
    args = parser.parse_args()
    print(args)
    return args


def main():
    args = get_args()
    if not os.path.exists(args.save_path):
        os.mkdir(args.save_path)
    log_path = os.path.join(args.save_path,'log')
    if not os.path.exists(log_path):
        os.mkdir(log_path)

    writer = SummaryWriter(log_dir=log_path)

    data_path = args.data_path
    train_path = os.path.join(data_path,'train/label.txt')
    val_path = os.path.join(data_path,'val/label.txt')
    # dataset_train = TrainDataset(train_path,transform=transforms.Compose([RandomCroper(),RandomFlip()]))
    dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_train = DataLoader(dataset_train, num_workers=8, batch_size=args.batch, collate_fn=collater,shuffle=True)
    # dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
    dataset_val = ValDataset(val_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_val = DataLoader(dataset_val, num_workers=8, batch_size=args.batch, collate_fn=collater)
    
    total_batch = len(dataloader_train)

    # Create the model
    # if args.depth == 18:
    #     retinaface = model.resnet18(num_classes=2, pretrained=True)
    # elif args.depth == 34:
    #     retinaface = model.resnet34(num_classes=2, pretrained=True)
    # elif args.depth == 50:
    #     retinaface = model.resnet50(num_classes=2, pretrained=True)
    # elif args.depth == 101:
    #     retinaface = model.resnet101(num_classes=2, pretrained=True)
    # elif args.depth == 152:
    #     retinaface = model.resnet152(num_classes=2, pretrained=True)
    # else:
    #     raise ValueError('Unsupported model depth, must be one of 18, 34, 50, 101, 152')

    # Create torchvision model
    return_layers = {'layer2':1,'layer3':2,'layer4':3}
    retinaface = torchvision_model.create_retinaface(return_layers)


    retinaface = retinaface.cuda()
    retinaface = torch.nn.DataParallel(retinaface).cuda()
    retinaface.training = True
    
    # Optionally load previously saved weights so training resumes from them.
    try:
        state_dict = torch.load(args.pretrained_model_path)
        retinaface.load_state_dict(state_dict)
        print("Previous model loaded successfully.")
    except Exception as e:
        # Report the reason instead of silently hiding it, then train from scratch.
        print("Error while loading previous model:", e)

    optimizer = optim.Adam(retinaface.parameters(), lr=1e-3)
    # optimizer = optim.SGD(retinaface.parameters(), lr=1e-2, momentum=0.9, weight_decay=0.0005)
    # scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3, verbose=True)
    # scheduler  = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    #scheduler  = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10,30,60], gamma=0.1)

    print('Start to train.')

    epoch_loss = []
    iteration = 0

    for epoch in range(args.epochs):
        retinaface.train()

        # Training
        for iter_num,data in enumerate(dataloader_train):
            optimizer.zero_grad()
            classification_loss, bbox_regression_loss,ldm_regression_loss = retinaface([data['img'].cuda().float(), data['annot']])
            classification_loss = classification_loss.mean()
            bbox_regression_loss = bbox_regression_loss.mean()
            ldm_regression_loss = ldm_regression_loss.mean()

            # loss = classification_loss + 1.0 * bbox_regression_loss + 0.5 * ldm_regression_loss
            loss = classification_loss + bbox_regression_loss + ldm_regression_loss

            loss.backward()
            optimizer.step()
            
            if iter_num % args.verbose == 0:
                log_str = "\n---- [Epoch %d/%d, Batch %d/%d] ----\n" % (epoch, args.epochs, iter_num, total_batch)
                table_data = [
                    ['loss name','value'],
                    ['total_loss',str(loss.item())],
                    ['classification',str(classification_loss.item())],
                    ['bbox',str(bbox_regression_loss.item())],
                    ['landmarks',str(ldm_regression_loss.item())]
                    ]
                table = AsciiTable(table_data)
                log_str +=table.table
                print(log_str)
                # write the log to tensorboard
                writer.add_scalar('losses:',loss.item(),iteration*args.verbose)
                writer.add_scalar('class losses:',classification_loss.item(),iteration*args.verbose)
                writer.add_scalar('box losses:',bbox_regression_loss.item(),iteration*args.verbose)
                writer.add_scalar('landmark losses:',ldm_regression_loss.item(),iteration*args.verbose)
                iteration +=1

        # Eval
        if epoch % args.eval_step == 0:
            print('-------- RetinaFace Pytorch --------')
            print ('Evaluating epoch {}'.format(epoch))
            recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
            print('Recall:',recall)
            print('Precision:',precision)

            writer.add_scalar('Recall:', recall, epoch*args.eval_step)
            writer.add_scalar('Precision:', precision, epoch*args.eval_step)

        # Save model
        if (epoch + 1) % args.save_step == 0 or iter_num>=100:
            torch.save(retinaface.state_dict(), args.save_path + '/model_epoch_{}.pt'.format(epoch + 1))

    writer.close()


if __name__=='__main__':
    main()
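
One detail in the argument parser above is worth a note: with `type=bool`, argparse passes the raw string to `bool()`, so any non-empty value, including "False", parses as True. The sketch below demonstrates this and shows one possible alternative. `str2bool` is a hypothetical helper, not part of the repository.

```python
import argparse


def str2bool(text):
    """Parse common true/false spellings (hypothetical helper)."""
    value = text.strip().lower()
    if value in ("1", "true", "yes", "y"):
        return True
    if value in ("0", "false", "no", "n"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % text)


naive = argparse.ArgumentParser()
naive.add_argument("--shuffle", type=bool, default=True)
print(naive.parse_args(["--shuffle", "False"]).shuffle)  # True: bool("False") is truthy

fixed = argparse.ArgumentParser()
fixed.add_argument("--shuffle", type=str2bool, default=True)
print(fixed.parse_args(["--shuffle", "False"]).shuffle)  # False
```

In the script above the training DataLoader is created with shuffle=True regardless of `args.shuffle`, so the flag currently has no effect either way.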
