
objectdetection-onestagedet's Introduction

Contents

Introduction

This repo currently implements Yolov2 and Yolov3 within a general one-stage object detection framework named OneStageDet (OSD). In the future, we plan to implement Yolo and SSD in the same framework.

Requirements

  • python 3.6
  • pytorch 0.4.0

Features

  • Include both Yolov2 and Yolov3
  • Good performance
           544x544 VOC2007 Test (mAP)   Time per forward (batch size = 1)
Yolov2     77.6%                        11.5ms
Yolov3     79.6%                        23.1ms

The models are trained with this implementation from weights pretrained on ImageNet.

  • Train as fast as darknet

  • A lot of efficient backbones on hand

    Such as tiny yolov2, tiny yolov3, mobilenet, mobilenetv2, shufflenet(g2), shufflenetv2(1x), squeezenext(1.0-SqNxt-23v5), light xception, xception, etc.

    Check folder vedanet/network/backbone for details.

                 416x416 VOC2007 Test (mAP)   Time per forward (batch size = 1)
    TinyYolov2   57.5%                        2.4ms
    TinyYolov3   61.3%                        2.3ms

    The models are trained from scratch with this implementation.

Getting Started

Installation

1) Code

git clone xxxxx/ObjectDetection-OneStageDet
cd ObjectDetection-OneStageDet/
yolo_root=$(pwd)
cd ${yolo_root}/utils/test
make -j32   # builds the NMS extensions under utils/test
2) Data

wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar

tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar

cd VOCdevkit
VOCdevkit_root=$(pwd)

There will now be a VOCdevkit subdirectory with all the VOC training data in it.

mkdir ${VOCdevkit_root}/onedet_cache
cd ${yolo_root}

Open examples/labels.py and set the variable ROOT to ${VOCdevkit_root}, then run:

python examples/labels.py

Open cfgs/yolov2.yml and set data_root_dir to ${VOCdevkit_root}/onedet_cache.

Open cfgs/yolov3.yml and set data_root_dir to ${VOCdevkit_root}/onedet_cache.
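
For concreteness, the change is a single line in each file (a sketch; the surrounding keys are omitted and the path is a placeholder):

data_root_dir: /data/VOCdevkit/onedet_cache   # i.e. the expanded ${VOCdevkit_root}/onedet_cache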

3) Weights

Download model weights from baidudrive or googledrive.

Or download darknet19_448.conv.23 and darknet53.conv.74 from the darknet website:

wget https://pjreddie.com/media/files/darknet19_448.conv.23

wget https://pjreddie.com/media/files/darknet53.conv.74

Then move all the model weights into the ${yolo_root}/weights directory.

Training

cd ${yolo_root}

1) Yolov2

1.1) Open cfgs/yolov2.yml and set weights in the train block to the pretrained weights (a sketch of the edits in steps 1.1-1.3 follows step 1.4).

1.2) Open cfgs/yolov2.yml and set gpus in the train block to an available GPU id.

1.3) If you want the log printed to the screen, set stdout in the train block of cfgs/yolov2.yml to True.

1.4) run

python examples/train.py Yolov2
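
Putting steps 1.1-1.3 together, the train block of cfgs/yolov2.yml would look roughly like this (a sketch: the key names follow the steps above, the values are placeholders):

train:
  weights: weights/darknet19_448.conv.23   # pretrained weights from the Weights step
  gpus: '0'                                # an available GPU id
  stdout: True                             # print the log to the screen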

2) Yolov3

2.1) Open cfgs/yolov3.yml and set weights in the train block to the pretrained weights (the edits mirror the Yolov2 sketch above).

2.2) Open cfgs/yolov3.yml and set gpus in the train block to an available GPU id.

2.3) If you want the log printed to the screen, set stdout in the train block of cfgs/yolov3.yml to True.

2.4) run

python examples/train.py Yolov3

3) Results

The logs and weights will be in ${yolo_root}/outputs.

4) Other models

There are many other models, such as tiny yolov2, tiny yolov3, mobilenet, mobilenetv2, shufflenet(g2), shufflenetv2(1x), squeezenext(1.0-SqNxt-23v5), light xception, xception, etc. You can train them following the same steps as the Yolov2 part above.


Evaluation

cd ${yolo_root}

1) Yolov2

1.1) Open cfgs/yolov2.yml and set gpus in the test block to an available GPU id.

1.2) run

python examples/test.py Yolov2

2) Yolov3

2.1) Open cfgs/yolov3.yml and set gpus in the test block to an available GPU id.

2.2) run

python examples/test.py Yolov3

3) Results

The output bounding boxes will be written to ${yolo_root}/results. Every line of each file in ${yolo_root}/results has the format: img_name confidence xmin ymin xmax ymax
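
For example, a line might look like this (hypothetical values; coordinates are in pixels):

000004 0.731245 123.4 56.7 234.5 178.9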

4) Other models

There are many other models, such as tiny yolov2, tiny yolov3, mobilenet, mobilenetv2, shufflenet(g2), shufflenetv2(1x), squeezenext(1.0-SqNxt-23v5), light xception, xception, etc. You can evaluate them following the same steps as the Yolov2 part above.


Benchmarking network speed

cd ${yolo_root}

1) Yolov2

1.1) Open cfgs/yolov2.yml and set gpus in the speed block to an available GPU id (a sketch follows step 1.2).

1.2) run

python examples/speed.py Yolov2
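
As with training, this is a one-line edit (a sketch; the block and key names follow step 1.1):

speed:
  gpus: '0'   # an available GPU id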

2) Yolov3

2.1) Open cfgs/yolov3.yml and set gpus in the speed block to an available GPU id.

2.2) run

python examples/speed.py Yolov3

3) Tiny Yolov2

3.1) Open cfgs/tiny_yolov2.yml and set gpus in the speed block to an available GPU id.

3.2) run

python examples/speed.py TinyYolov2

4) Tiny Yolov3

4.1) Open cfgs/tiny_yolov3.yml and set gpus in the speed block to an available GPU id.

4.2) run

python examples/speed.py TinyYolov3

5) Mobilenet

5.1) Open cfgs/region_mobilenet.yml and set gpus in the speed block to an available GPU id.

5.2) run

python examples/speed.py RegionMobilenet

6) Other backbones with region loss

You can benchmark the other backbones following the same steps as the Mobilenet part above.


Credits

Much of the code comes from lightnet; thanks to EAVISE.

objectdetection-onestagedet's People

Contributors

lijiannuist


objectdetection-onestagedet's Issues

About mAP

Hi, thanks to Tencent YouTu for sharing, but I have a few questions:
1. How are "trained from pretrained weights" and "trained from scratch" defined? For example, is mobilenet trained from ImageNet pretrained weights or from scratch?
2. Yolov3's performance should be at least on par with SSD's: SSD trained from VGG16 can reach 78+ mAP, but this Yolov3 seems to reach only around 60-70. Any thoughts on why?

IoU between gt boxes and anchors

Why are the center coordinates of both set to 0 when computing the IoU between gt boxes and anchors?

anchors = torch.cat([torch.zeros_like(self.anchors), self.anchors], 1)  # anchors as (0, 0, w, h)
...
gt_wh = gt.clone()
gt_wh[:, :2] = 0  # zero the gt centers too, so only width/height enter the IoU
iou_gt_anchors = bbox_ious(gt_wh, anchors)

Thanks in advance!

Multi-GPU training

Hi, two questions: 1. How do I train with multiple GPUs, and what code needs to be modified? 2. With your original network configuration, training yolov3 always runs out of memory; it only barely trains with batchsize set to 16, which is clearly too small. How did you train it?

No module named 'utils.test.nms.gpu_nms'

Hi, I set up VOC2007 following your instructions, but I ran into the following problem.

ObjectDetection-OneStageDet-master/yolo$ python examples/train.py Yolov3
Traceback (most recent call last):
  File "examples/train.py", line 15, in <module>
    import vedanet as vn
  File "./vedanet/__init__.py", line 13, in <module>
    from . import engine
  File "./vedanet/engine/__init__.py", line 10, in <module>
    from ._voc_test import *
  File "./vedanet/engine/_voc_test.py", line 10, in <module>
    from utils.test import voc_wrapper
  File "./utils/test/voc_wrapper.py", line 4, in <module>
    from .fast_rcnn.nms_wrapper import nms, soft_nms
  File "./utils/test/fast_rcnn/nms_wrapper.py", line 8, in <module>
    from ..nms.gpu_nms import gpu_nms
ModuleNotFoundError: No module named 'utils.test.nms.gpu_nms'

I looked at the file and found that gpu_nms is a .pyx file. Could you provide the original .py file, or tell me how to solve this problem? Many thanks.

Any tips for choosing input_shape?

I'm confused about how input_shape affects the results; please advise.
I'm testing on my own dataset, whose resolution is [1280, 960]. During training, input_shape = [608, 456] works better than [608, 608]. At test time, the larger the input_shape, the better the results, e.g. [960, 960].

Undefined name 'hsv' in ./yolo/vedanet/data/transform/_preprocess.py

hsv is an undefined name in this context: https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet/blob/master/yolo/vedanet/data/transform/_preprocess.py#L470

flake8 testing of https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./yolo/vedanet/data/transform/_preprocess.py:470:33: F821 undefined name 'hsv'
        img[:, :, 0] = wrap_hue(hsv[:, :, 0] + (360.0 * dh))
                                ^
1     F821 undefined name 'hsv'
1

E901,E999,F821,F822,F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. Most other flake8 issues are merely "style violations" -- useful for readability but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree

About negative anchors

What exactly are the negative samples when computing the loss?

My understanding is:
Take yolov2 and _regionloss.py as an example. Suppose batch_size = 16, the 16 images contain 50 gt boxes in total, and the feature map is 15x15. Then there are 15x15x5x16 = 18000 anchors in total.
Since there are 50 gt boxes, there should be 50 positive samples.
The number of negatives should be reflected in the parameter conf_neg_mask. conf_neg_mask is initialized to 1 and then modified in two steps:
 1. when an anchor's IoU with a gt exceeds the threshold 0.5, the element at that anchor's position is set to 0;
 2. the elements at the positive samples' positions are set to 0.
After these two steps, if conf_neg_mask (shape [16, 5, 361]) has 17920 elements equal to 1, does that mean that of the 18000 anchors over the 16 images, 17920 are negatives?

Is my understanding correct?
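
For reference, a minimal sketch of the two masking steps described in this question (all shapes and tensors are hypothetical stand-ins, not the repo's actual code):

import torch

nB, nA, nCells = 16, 5, 15 * 15             # batch, anchors per cell, feature-map cells
conf_neg_mask = torch.ones(nB, nA, nCells)  # start with everything marked negative

best_iou = torch.rand(nB, nA, nCells)       # stand-in for each anchor's best IoU with any gt
conf_neg_mask[best_iou > 0.5] = 0           # step 1: anchors overlapping a gt are ignored

pos_b = torch.randint(0, nB, (50,))         # step 2: the anchor responsible for each of the
pos_a = torch.randint(0, nA, (50,))         # 50 gt boxes is a positive, not a negative
pos_c = torch.randint(0, nCells, (50,))
conf_neg_mask[pos_b, pos_a, pos_c] = 0

# whatever is still 1 in conf_neg_mask is treated as a negative sample
print(int(conf_neg_mask.sum().item()), "negatives out of", nB * nA * nCells, "anchors")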

About the benchmark of network speed

Why do you evaluate the network's speed by feeding randomly generated data rather than averaging the time over the real test dataset? And does the final speed include the time for post-processing (NMS or Soft-NMS)?

MobileNetv2 yolov3

Thanks for your work, and happy Chinese New Year!
Can I use MobileNetv2 instead of darknet53 for training?

Error in Evaluation: RuntimeError

I followed the instructions and ran "Training" and "Benchmarking network speed" successfully, but I get an error when running "python examples/test.py Yolov3" in "Evaluation".
The error information is shown below:
(screenshot of the error; image not preserved)
I tried some changes in the vedanet module, like adding .float(), but the error still exists. Does anyone know how to fix it? Thanks!

Training on my own dataset

Thanks for your contribution. I'd like to ask: if I convert my own dataset into standard VOC format and then train with Yolov3, the changes I can think of are:
(1) change "labels" and "data_root_dir" in yolov3.yml, plus some hyperparameters;
(2) in _yolov3.py, set num_classes to the number of classes in my dataset.
Is there anything else that needs to be changed?

Also, could you provide the weight files for mobilenet, mobilenetv2, shufflenet(g2), shufflenetv2(1x), squeezenext(1.0-SqNxt-23v5), light xception, xception?

dataset, dataloader

Hello, and thank you for your excellent work.
I have a question about the Dataset class in your _dataloading.py: it returns self.input_dim rather than (image, anno), so how does the network get (image, anno)?
I want to implement multi-scale training, with different sizes (320~608) for different batches.

How to use the trained yolov3.weights?

I tried to use my own trained yolov3.weights instead of your weights, with the command
"$ python examples/test.py Yolov3"
The output is as follows, but I get nothing in the "results" folder.
2019-01-18 12:16:36,326:INFO:20/619
2019-01-18 12:16:38,887:INFO:40/619
2019-01-18 12:16:41,446:INFO:60/619
2019-01-18 12:16:44,014:INFO:80/619
2019-01-18 12:16:46,582:INFO:100/619
2019-01-18 12:16:49,152:INFO:120/619
2019-01-18 12:16:51,719:INFO:140/619
2019-01-18 12:16:54,290:INFO:160/619
2019-01-18 12:16:56,932:INFO:180/619
2019-01-18 12:16:59,510:INFO:200/619
2019-01-18 12:17:02,083:INFO:220/619
2019-01-18 12:17:04,723:INFO:240/619
2019-01-18 12:17:07,311:INFO:260/619
2019-01-18 12:17:09,898:INFO:280/619
2019-01-18 12:17:12,492:INFO:300/619
2019-01-18 12:17:15,086:INFO:320/619
2019-01-18 12:17:17,689:INFO:340/619
2019-01-18 12:17:20,282:INFO:360/619
2019-01-18 12:17:22,885:INFO:380/619
2019-01-18 12:17:25,495:INFO:400/619
2019-01-18 12:17:28,110:INFO:420/619
2019-01-18 12:17:30,718:INFO:440/619
2019-01-18 12:17:33,321:INFO:460/619
2019-01-18 12:17:35,918:INFO:480/619
2019-01-18 12:17:38,522:INFO:500/619
2019-01-18 12:17:41,131:INFO:520/619
2019-01-18 12:17:43,732:INFO:540/619
2019-01-18 12:17:46,349:INFO:560/619
2019-01-18 12:17:48,957:INFO:580/619
2019-01-18 12:17:51,565:INFO:600/619

windows version

Thanks for your open source code, but I have two questions:

  1. How do I run this project on Windows?
  2. I trained on the VOC data but didn't get a log file. What's the problem?

A question about the training process

I'm using the yolov3-tiny model. Starting from the official initial weights, I trained on my own dataset for 500k iterations with width:height = 416:416 (all parameters in [net] unchanged). Testing with final_weight, I found width:height = 512:320 works best (I tried 416:416, 512:256, and 512:320). So I took the 100k-iteration checkpoint (picked at random) from that training as my new initial weights and retrained with width:height = 512:320. Is this approach sound? If so, how should I adjust the learning rate, and is there a way to better choose the initial weights? The main goal is to iterate quickly.

ImportError: /home/lele/pytorch/ObjectDetection-OneStageDet/yolo/utils/test/nms/gpu_nms.so: undefined symbol: _Py_ZeroStruct

Hi,
compiling utils/test succeeded, but when I run python train.py Yolov3 I hit the error below.
I changed the default python version from 2.7 to 3.6, recompiled, and ran again, but the same error still occurs.
Looking forward to your reply, thanks.
Traceback (most recent call last):
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/examples/train.py", line 15, in <module>
    import vedanet as vn
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/vedanet/__init__.py", line 13, in <module>
    from . import engine
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/vedanet/engine/__init__.py", line 10, in <module>
    from ._voc_test import *
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/vedanet/engine/_voc_test.py", line 10, in <module>
    from utils.test import voc_wrapper
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/utils/test/voc_wrapper.py", line 4, in <module>
    from .fast_rcnn.nms_wrapper import nms, soft_nms
  File "/home/lele/pytorch/ObjectDetection-OneStageDet/yolo/utils/test/fast_rcnn/nms_wrapper.py", line 8, in <module>
    from ..nms.gpu_nms import gpu_nms
ImportError: /home/lele/pytorch/ObjectDetection-OneStageDet/yolo/utils/test/nms/gpu_nms.so: undefined symbol: _Py_ZeroStruct

The problem of eval speed

I wrote an img_detect function that runs detection image by image, so I don't use the dataloader; the rest of the code is the same as yours. Testing on a 1080 Ti, I only get 0.05 s per image. Could the problem be that I'm not using the dataloader?

Multi-GPU training of Yolov3

Training Yolov3 on two GPUs.

Changes to yolov3.yml:

gpus: "0,1"
mini_batch_size: 8

The GPU usage is as follows:

| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:05:00.0 Off |                    0 |
| N/A   58C    P0   237W / 250W |   8835MiB / 16160MiB |     100%     Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:89:00.0 Off |                    0 |
| N/A   36C    P0    25W / 250W |     11MiB / 16160MiB |      0%      Default |

One of the GPUs is idle.

mAP

How do I get the mAP?

burn_in

"burn_in=1000" in darknet version .cfg file = "??" in this pytorch version .yml file

How to train tinyyolov3?

Thanks for your program!
I get an error when trying to train tinyyolov3 with the command python examples/train.py TinyYolov3:
ImportError: cannot import name annotation
env: python 3.6.2

an illegal memory access

When I use tiny_yolov3.yml to train tiny yolov3 from scratch, I get this error:
(screenshot of the error; image not preserved)
What's going on?

Loss computation; SmoothL1 and BCE for coordinates

Thanks for this repo! I'm learning the loss function since it is quite different from other repos I've seen. Something specific that caught my attention is the loss computation for the coordinates:

https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet/blob/5f52c97e14378d4de25f59d4aaee7ffe43f3523d/yolo/vedanet/loss/_yololoss.py#L124-L125

So w/h use the SmoothL1 loss while the x/y coordinates use BCE, neither of which is mentioned in the paper. Some detectors (namely Faster R-CNN and its variants) use SmoothL1 for all coordinates, but using BCE for the x/y regression targets makes little sense to me. Can you please elaborate on this?

Thanks.
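
For reference, a minimal sketch of the combination being asked about (shapes, activations, and reductions are hypothetical, not the repo's exact code):

import torch
import torch.nn as nn

bce = nn.BCELoss(reduction='sum')             # for sigmoid-activated x/y offsets in [0, 1]
smooth_l1 = nn.SmoothL1Loss(reduction='sum')  # for w/h offsets

coord = torch.rand(8, 4)   # predicted x, y, w, h offsets for 8 matched anchors
tcoord = torch.rand(8, 4)  # the corresponding regression targets

loss_xy = bce(coord[:, :2], tcoord[:, :2])          # BCE on the x/y offsets
loss_wh = smooth_l1(coord[:, 2:4], tcoord[:, 2:4])  # SmoothL1 on the w/h offsets
loss_coord = loss_xy + loss_wh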

How to train from scratch on my own datasets?

Hi there,

First of all, to be fair, it is a great job indeed.

I want to train on my own datasets from scratch. Do I just need to modify 'cfg/yolo3.xml' and comment out the 'weights' entry? Is that correct?

I am looking forward to your reply, many thanks.

Kind regards

Wei

Why do you divide learning rate by batch size?

Usually, we increase the lr as the batch size increases.

But, in _voc_train.py

line 66: optim = torch.optim.SGD(net.parameters(), lr=learning_rate/batch, momentum=momentum, dampening=0, weight_decay=decay*batch)

line 101: self.add_rate('learning_rate', lr_steps, [lr/self.batch_size for lr in lr_rates])

Why did you decide to decrease the lr when using a large batch size?
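
One possible reading (an assumption, not confirmed by the authors): if the loss is summed over the batch rather than averaged, stepping with lr/batch is equivalent to stepping with lr on the mean loss, so the effective per-sample rate is unchanged. A small self-contained check:

import torch

B, lr = 16, 1e-3
w = torch.ones(3, requires_grad=True)
x = torch.randn(B, 3)

(x @ w).pow(2).sum().backward()           # loss summed over the batch
step_sum = (lr / B) * w.grad.clone()      # update with lr divided by batch size

w.grad.zero_()
(x @ w).pow(2).mean().backward()          # loss averaged over the batch
step_mean = lr * w.grad.clone()           # update with the plain lr

print(torch.allclose(step_sum, step_mean))  # True: the two updates match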

Very low-confidence bboxes appear in the detection results

Hi, while using the code I noticed that even after NMS deduplication, several bboxes with differing confidences, all covering the same object, are still written to the detection results.
For example, training YoloV3 on the VOC dataset and testing, the resulting "comp4_det_test_aeroplane.txt" contains 3 bboxes for 000521.jpg:

000521 0.735211 249.26878 120.09045 318.2519 236.4592
000521 0.136427 272.41382 144.24376 324.67673 253.31178
000521 0.011197 244.49088 127.17586 298.92145 170.71593

Some have relatively high confidence and some very low, yet 000521.jpg contains only a single aeroplane.

I noticed that when the same object is detected multiple times, evaluating with the official voc_eval.py script may "spuriously inflate" the mAP.

So my questions are:
1. When judging a detection model, is the mAP computed directly on these results, or should one further filter them with a manually set confidence threshold (the default is 0.005)?
2. How do papers usually handle this kind of "spurious mAP inflation"? Do they count each object only once?
