yatenglg / retinanet-pytorch

RetinaNet object detection in PyTorch: simple, clear, and easy to use, with full Chinese code comments, single-machine multi-GPU training, and video detection.

Python 100.00%

retinanet-pytorch's Introduction

Hi, I'm LG. 🧑‍💻

I'm an algorithm engineer focused on AI. My main research areas are computer vision and point cloud segmentation. I hope my work can make people's lives more convenient.

I occasionally share articles about Python and AI.

retinanet-pytorch's People

wuyx chaoso stivensss royzon cenchaojun 2017tjm yizhex tiber2013 zhang1in tobenum yuuzhao hptitan gggtttyyy joene-zhou xuejianjia xzycr7 bhsphd liyang0520 jie3820165 cocol11 rc209972344 tjulitianyi1997 cezi127 jiangguangai jameskry kokoing123 zhanjunxiang blackjaxx 1834304120 biao321 liangruofei techmagic hitaitengteng liangxiaobo 201814476 superuichang lianghongli2 aaaaa-cmby yurongchen1998 mutiangua taochx argusswift xzxedu game-li limitmhw yangcyz gary828 annsmf captainfffsama zhixiangwang-cn picopon caihaocheng-caihaocheng daijuting douglas2code evenblue l0118 modestyjx collector-m kng-7 mark-zhood malixiaoguo xiangjun0103 xiaopengqiu jake-wei yimiz1995 stevenjokess tongtong-allure 521hellogithub qingniaoihep wilburd jiangzqyw lucky-light-sun zhouyingfeng wnxy lxssg1231 githubhfx changmengweiyang7 handsome-lu catofwei fangsq liang-zx lksssszz huangyan-jingguantian xiongjianping0417 baixiaohu tzlm0302 bookfangy pnme79 roar090 guyuex guoqingru0911 baichuan12138 knaico noticeable backermanaaa

retinanet-pytorch's Issues

Hello, I trained on my own dataset overnight. The loss is very low, but the test results are very poor. Could you give me some suggestions?

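Do the final 3x3 convolutions in the FPN all use conv1?

Hi, I'm studying your code. In fpn.py, do the final 3x3 convolutions all use self.top_down_conv1? Doesn't that mean the weights are shared? If so, why are conv2 and conv3 defined above? I'd appreciate an explanation.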
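
The difference the question points at can be sketched as follows. This is a minimal PyTorch illustration, not the repository's fpn.py: the class `TinyFPNHead` and its `smooth_convs` attribute are hypothetical names. Applying one `nn.Conv2d` to every pyramid level shares its weights across levels, while the usual FPN layout gives each level its own 3x3 smoothing convolution.

```python
import torch
import torch.nn as nn


class TinyFPNHead(nn.Module):
    """Illustrative 3x3 smoothing stage of an FPN (hypothetical, not the repo's code)."""

    def __init__(self, channels=256, num_levels=3, share=False):
        super().__init__()
        self.share = share
        count = 1 if share else num_levels
        self.smooth_convs = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3, padding=1) for _ in range(count)]
        )

    def forward(self, features):
        outputs = []
        for level, feature in enumerate(features):
            # share=True reuses conv 0 everywhere; otherwise each level has its own conv.
            conv = self.smooth_convs[0] if self.share else self.smooth_convs[level]
            outputs.append(conv(feature))
        return outputs


def count_parameters(module):
    return sum(p.numel() for p in module.parameters())


shared = TinyFPNHead(channels=8, share=True)
separate = TinyFPNHead(channels=8, share=False)
features = [torch.randn(1, 8, size, size) for size in (16, 8, 4)]
shared_out = shared(features)
separate_out = separate(features)
print(count_parameters(shared), count_parameters(separate))
```

With separate convolutions the parameter count is three times that of the shared variant, which is one quick way to check which behavior a given fpn.py actually has.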

Error

IndexError: The shape of the mask [8, 1] at index 1 does not match the shape of the indexed tensor [8, 67995] at index 1
I changed batch_size to 8. Where should I make changes to fix this error?

RuntimeError: Found dtype Double but expected Float

--- load weight finish ---
Setting up a new session...
Max_iter = 120000, Batch_size = 20
Model will train on cuda:[0]
--- Focal_loss alpha = 0.25 ,将对背景类进行衰减,请在目标检测任务中使用 ---
--- Multiboxloss : α=0.25 γ=2 num_classes=21
Set optimizer : SGD (
Parameter Group 0
dampening: 0
initial_lr: 0.001
lr: 0.001
momentum: 0.9
nesterov: False
weight_decay: 0.0005
)
Set scheduler : <torch.optim.lr_scheduler.MultiStepLR object at 0x7f7d7f196e20>
Set lossfunc : multiboxloss(
(loc_loss_fn): SmoothL1Loss()
(cls_loss_fn): focal_loss()
)
Start Train......

/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Data/Transfroms_utils.py:263: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
mode = random.choice(self.sample_options)
/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Data/Transfroms_utils.py:263: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
mode = random.choice(self.sample_options)
Traceback (most recent call last):
  File "/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Demo_train.py", line 36, in <module>
    trainer(net, train_dataset)
  File "/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Model/trainer.py", line 122, in __call__
    loss.backward()
  File "/home/pdj/anaconda3/envs/lyy/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/pdj/anaconda3/envs/lyy/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Found dtype Double but expected Float

Process finished with exit code 1
Why does this happen? I can clearly see ConvertFromInts() in Transforms.py,
and in Transforms_utils.py I can clearly see return image.astype(np.float32), boxes, labels.
So why is RuntimeError: Found dtype Double but expected Float raised?
Could it be caused by the VisibleDeprecationWarning above?
Library versions:
python 3.8.5 h7579374_1
pytorch 1.7.0 py3.8_cuda10.1.243_cudnn7.6.3_0 pytorch
torchvision 0.8.1 py38_cu101 pytorch
numpy 1.19.4 pypi_0 pypi
opencv-python 4.4.0.46 pypi_0 pypi
yacs 0.1.8 pypi_0 pypi
visdom 0.1.8.9 pypi_0 pypi
vizer 0.1.5 pypi_0 pypi
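
One possible cause, offered as an assumption since the issue does not confirm it: `torch.from_numpy` preserves the NumPy dtype, so any array that was silently promoted to float64 becomes a Double tensor even if the image itself was cast to float32 earlier. The NumPy-only sketch below shows how such a promotion can happen and how an explicit cast undoes it. The helper `to_float32` and the sample values are hypothetical.

```python
import numpy as np


def to_float32(array):
    """Return the data as float32 (hypothetical helper for the final cast)."""
    return np.asarray(array, dtype=np.float32)


image = np.zeros((4, 4, 3), dtype=np.uint8).astype(np.float32)
boxes = np.array([[10, 20, 110, 220]], dtype=np.float32)

# Dividing by a Python number keeps float32 ...
boxes_scaled = boxes / 300
# ... but mixing in a float64 array promotes the result to float64.
mean = np.array([123.0, 117.0, 104.0])  # float64 by default
image_shifted = image - mean

print(boxes_scaled.dtype, image_shifted.dtype, to_float32(image_shifted).dtype)
```

Printing the dtype of every array just before it is turned into a tensor is usually enough to locate the step that introduced float64.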

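Would it be possible to add P6 and P7 back to the FPN?

I noticed that the FPN only has P3 to P5, without P6 and P7.

How do results with this implementation compare with the original?
Would it be possible to add P6 and P7 back?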
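
For reference, the RetinaNet paper obtains P6 with a 3x3 stride-2 convolution on C5, and P7 by applying ReLU followed by a 3x3 stride-2 convolution on P6. A minimal PyTorch sketch of those two extra levels is shown below. The class name `ExtraPyramidLevels` and the channel counts are assumptions for illustration, and how the levels would be wired into this repository's FPN is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExtraPyramidLevels(nn.Module):
    """P6 and P7 as described in the RetinaNet paper (hypothetical module name)."""

    def __init__(self, c5_channels=2048, out_channels=256):
        super().__init__()
        self.p6_conv = nn.Conv2d(c5_channels, out_channels, kernel_size=3, stride=2, padding=1)
        self.p7_conv = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=2, padding=1)

    def forward(self, c5):
        p6 = self.p6_conv(c5)          # stride 64 relative to the input image
        p7 = self.p7_conv(F.relu(p6))  # stride 128 relative to the input image
        return p6, p7


extra = ExtraPyramidLevels(c5_channels=32, out_channels=16)
c5 = torch.randn(1, 32, 19, 19)  # small stand-in for a stride-32 backbone output
p6, p7 = extra(c5)
print(p6.shape, p7.shape)
```

The anchor generator and the prediction heads would also need to cover the two additional levels.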

Dataset download

Hello! Could you provide a download link for the dataset? I would also appreciate detailed training steps. Thank you!

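The AP I compute is high (0.83), but why does detection on a single image give almost no results, only a few boxes? The IoU threshold is set to 0.5.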

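How to fine-tune the network and do transfer learning

I trained on one dataset and obtained a set of weights. I would like to reuse these weights and continue training on another dataset. How should I do that?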
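
A common approach, sketched here under assumptions rather than taken from this repository, is to load only the pretrained tensors whose names and shapes still match the new model, and let layers that depend on the number of classes start from fresh initialization. The helper `filter_matching_weights` and the key names are hypothetical. NumPy arrays stand in for tensors so the sketch runs without a checkpoint.

```python
import numpy as np


def filter_matching_weights(pretrained, target):
    """Keep only pretrained entries whose name and shape both match the target."""
    kept, skipped = {}, []
    for name, value in pretrained.items():
        if name in target and tuple(value.shape) == tuple(target[name].shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


# Stand-ins for two state dicts; real ones would come from model.state_dict().
pretrained = {
    "backbone.conv.weight": np.zeros((16, 3, 3, 3)),
    "cls_head.weight": np.zeros((21 * 9, 256, 3, 3)),  # 21 classes in the old dataset
}
target = {
    "backbone.conv.weight": np.ones((16, 3, 3, 3)),
    "cls_head.weight": np.ones((5 * 9, 256, 3, 3)),  # 5 classes in the new dataset
}
kept, skipped = filter_matching_weights(pretrained, target)
print(sorted(kept), skipped)
```

With real PyTorch state dicts, the filtered dictionary could then be passed to `model.load_state_dict(kept, strict=False)`, after which training proceeds on the new dataset, typically with a lower learning rate.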

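Could you post a few result images, or the mAP on the COCO dataset?

In my experiments the final results were not very good. Are there details that need tuning?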

The visdom server web page cannot be opened

Checking for scripts.
It's Alive!
ERROR:root:initializing
INFO:root:Application Started
INFO:root:Working directory: /root/.visdom
You can navigate to http://874e8d336cd3:8097/

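Matrix operation on boxes raises IndexError: too many indices for array

The objects in my training dataset are fairly small. The loss is very low during training, but nothing is detected, and the AP is only a little over 3.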

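Too many false positives

Hello, and thank you very much for your code. I tried adding your focal loss to a RetinaFace model without changing any parameters, and I get a very large number of false detections. Do you know what the cause might be? Is it because I am not doing hard negative mining, or because my parameters are not tuned correctly? (I tried changing the threshold, but it had very little effect.)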
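
To make the discussion concrete, here is a minimal NumPy sketch of the binary focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), using the α = 0.25 and γ = 2 values that appear in the training logs quoted on this page. It is not the repository's focal_loss class, and the function name `sigmoid_focal_loss` is an assumption. It only shows how differently easy negatives and confident false positives are weighted, and does not settle whether hard negative mining is needed.

```python
import numpy as np


def sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Per-element binary focal loss; targets are 0 (background) or 1 (object)."""
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    p = 1.0 / (1.0 + np.exp(-logits))
    p_t = np.where(targets == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(np.clip(p_t, 1e-12, 1.0))


# Three background anchors: an easy negative, an uncertain one, a confident false positive.
logits = np.array([-4.0, 0.0, 3.0])
targets = np.array([0, 0, 0])
losses = sigmoid_focal_loss(logits, targets)
print(losses)
```

If confident false positives persist after training, the loss for them is large by construction, so it may be worth checking label assignment and loss normalization before changing α or γ.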

Final mAP on the VOC dataset

Have you done a complete training run on VOC? Roughly what was the final mAP?

Why does this error occur: RuntimeError: Expected object of scalar type float but got scalar type double for argument 'other'

RuntimeError

Traceback (most recent call last):
  File "F:/Retinanet-Pytorch-master/Demo_train.py", line 36, in <module>
    trainer(net, train_dataset)
  File "F:\Retinanet-Pytorch-master\Model\trainer.py", line 115, in __call__
    reg_loss, cls_loss = self.loss_func(cls_logits, bbox_preds, labels, boxes)
  File "D:\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "F:\Retinanet-Pytorch-master\Model\struct\MultiBoxLoss.py", line 57, in forward
    predicted_locations = predicted_locations[pos_mask, :].view(-1, 4)
RuntimeError: copy_if failed to synchronize: device-side assert triggered
In the config file I reduced batch_size to 1 and also changed the learning rate to 1e-4, but the error persists. What could be the cause?
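
CUDA device-side asserts are reported asynchronously, so the Python line in the traceback is not necessarily the operation that failed. A frequent cause is an out-of-range index, for example a class label that is not smaller than the configured number of classes. Whether that applies here is an assumption, as the issue gives no dataset details. The check below is a hypothetical helper that could be run on the CPU over the dataset's labels before training.

```python
import numpy as np


def check_labels(labels, num_classes):
    """Raise if any label falls outside [0, num_classes)."""
    labels = np.asarray(labels)
    bad = labels[(labels < 0) | (labels >= num_classes)]
    if bad.size:
        raise ValueError(f"labels outside [0, {num_classes}): {sorted(set(bad.tolist()))}")
    return True


print(check_labels([0, 5, 20], num_classes=21))
```

Running one batch on the CPU, or setting the environment variable CUDA_LAUNCH_BLOCKING=1, generally produces a more precise error location.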

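Negative samples in assign_priors

As the title says: in the original paper, anchors with 0.4 <= IoU < 0.5 appear to be labeled -1 so that they are ignored in the final loss, and only anchors with IoU < 0.4 count as negatives. In your code, every anchor with IoU < 0.5 is treated as a negative. What is the basis for this, or is there another reference?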
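
The assignment rule described in the question can be sketched as below. This is a NumPy illustration of the paper's thresholds, not the repository's assign_priors. The function name `assign_labels` and the convention of 0 for background and -1 for ignored anchors are assumptions, and the sketch expects at least one ground-truth box.

```python
import numpy as np


def assign_labels(iou, gt_labels, pos_thresh=0.5, neg_thresh=0.4):
    """iou has shape (num_anchors, num_gt).

    Returns one label per anchor: the matched class id for positives,
    0 for background, and -1 for anchors to be ignored by the loss.
    """
    iou = np.asarray(iou, dtype=np.float64)
    gt_labels = np.asarray(gt_labels)
    best_gt = iou.argmax(axis=1)
    best_iou = iou.max(axis=1)
    labels = np.full(iou.shape[0], -1, dtype=np.int64)  # default: ignored
    labels[best_iou < neg_thresh] = 0                   # background
    positive = best_iou >= pos_thresh
    labels[positive] = gt_labels[best_gt[positive]]
    return labels


iou = np.array([[0.72, 0.10], [0.45, 0.05], [0.20, 0.30], [0.05, 0.55]])
labels = assign_labels(iou, gt_labels=[3, 7])
print(labels)
```

Passing neg_thresh=0.5 removes the ignore band and reproduces the behavior the question attributes to the repository, where every anchor below 0.5 becomes a negative.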

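BUG: an image whose .xml contains no objects

When the training set contains an image with no objects, Transfroms_utils.py in the Data folder raises IndexError: too many indices for array when it computes boxes[:, 0] /= width.
The cause is that this image has no ground truth, that is, no bbox can be found in its xml file.
Such images are common in newer datasets, so I hope this can be fixed. Thanks!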
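
A small NumPy sketch of why this fails and one possible guard: an empty box list becomes an array of shape (0,), which cannot be indexed with two indices, while reshaping to (-1, 4) gives shape (0, 4) and lets the same arithmetic run as a no-op. The helper `normalize_boxes` is hypothetical. The rest of the pipeline (matching, loss) would still need to handle images with zero boxes.

```python
import numpy as np


def normalize_boxes(boxes, width, height):
    """Scale [xmin, ymin, xmax, ymax] boxes to [0, 1]; tolerates an empty list."""
    boxes = np.array(boxes, dtype=np.float32).reshape(-1, 4)  # (0,) becomes (0, 4)
    boxes[:, 0::2] /= width
    boxes[:, 1::2] /= height
    return boxes


empty = normalize_boxes([], width=500, height=375)
one = normalize_boxes([[50, 75, 250, 300]], width=500, height=375)
print(empty.shape, one)
```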

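Input size and predicted box sizes

Thanks for providing this model. Because I modified the FPN structure, training with an input size of 600 (on 8 GB of GPU memory) always runs into CUDA out of memory. I want to reduce the input size to 300, in which case the feature maps (5 levels) should become 38, 19, 10, 5, 3. How should the corresponding prediction box sizes be set?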
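
The feature map sizes quoted above follow from dividing the input size by the strides 8, 16, 32, 64, 128 and rounding up, assuming every stride-2 layer pads so that odd sizes round up. For anchor sizes, the RetinaNet paper uses base sizes 32 to 512 pixels on P3 to P7, each with three octave scales (2^0, 2^(1/3), 2^(2/3)) and three aspect ratios. The sketch below computes both. The function names are hypothetical, and whether this repository's config expects absolute pixels or values relative to the image size is not stated on this page.

```python
import math


def feature_map_sizes(image_size, strides=(8, 16, 32, 64, 128)):
    """Side length of each pyramid level for a square input."""
    return [math.ceil(image_size / stride) for stride in strides]


def anchor_sizes(base_sizes=(32, 64, 128, 256, 512),
                 octave_scales=(1.0, 2 ** (1 / 3), 2 ** (2 / 3))):
    """Anchor side lengths in input pixels, per level, following the paper's scheme."""
    return [[round(base * scale, 1) for scale in octave_scales] for base in base_sizes]


print(feature_map_sizes(300))
print(feature_map_sizes(600))
print(anchor_sizes()[0])
```

Because the paper's anchor sizes are tied to the strides rather than to the input size, halving the input does not by itself require new anchor sizes. If the objects also shrink by half in the resized images, scaling the base sizes down by the same factor is a reasonable starting point.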

Is there no net.load_pretrained_weight method?

AttributeError: 'RetainNet' object has no attribute 'load_pretrained_weight'

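How can the width and height of the input IMAGE_SIZE be set separately?

Thanks for providing this code! It currently resizes both the width and the height of the input image to the same IMAGE_SIZE (600 px). For datasets with extreme aspect ratios such as KITTI, where the original images are roughly 1200x375, resizing to a square loses so many pixels on objects such as pedestrians and cyclists that they can no longer be recognized. I would therefore like to set the width and height of IMAGE_SIZE separately. In that case, how should the anchors, the feature map sizes, and the related internal parameters be modified?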
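
In general, a rectangular input only changes the anchor grid: each pyramid level gets a (rows, cols) shape computed per axis, and anchor centers are laid out over that grid. The sketch below assumes strides 8, 16, 32 for P3 to P5 and a hypothetical resize to 384x1248 (close to KITTI's aspect ratio and divisible by 32). The function names are illustrative, and the repository's own anchor code may be organized differently.

```python
import math


def feature_map_shapes(image_height, image_width, strides=(8, 16, 32)):
    """(rows, cols) of each pyramid level for a rectangular input."""
    return [(math.ceil(image_height / s), math.ceil(image_width / s)) for s in strides]


def anchor_centers(image_height, image_width, stride):
    """(x, y) centers in input pixels, row by row, for one pyramid level."""
    rows = math.ceil(image_height / stride)
    cols = math.ceil(image_width / stride)
    return [((col + 0.5) * stride, (row + 0.5) * stride)
            for row in range(rows) for col in range(cols)]


shapes = feature_map_shapes(384, 1248)
centers = anchor_centers(384, 1248, stride=32)
print(shapes, len(centers), centers[0])
```

Box coordinates then need to be normalized by width and height separately wherever the code currently divides by a single IMAGE_SIZE.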

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.

--- load weight finish ---
Setting up a new session...
Max_iter = 120000, Batch_size = 20
Model will train on cuda:[0]
--- Focal_loss alpha = 0.25 ,将对背景类进行衰减,请在目标检测任务中使用 ---
--- Multiboxloss : α=0.25 γ=2 num_classes=21
Set optimizer : SGD (
Parameter Group 0
dampening: 0
initial_lr: 0.001
lr: 0.001
momentum: 0.9
nesterov: False
weight_decay: 0.0005
)
Set scheduler : <torch.optim.lr_scheduler.MultiStepLR object at 0x00000248040508B0>
Set lossfunc : multiboxloss(
(loc_loss_fn): SmoothL1Loss()
(cls_loss_fn): focal_loss()
)
Start Train......

Traceback (most recent call last):
  File "D:\software\PyCharm\PyCharm Community Edition 2022.1.3\plugins\python-ce\helpers\pydev\pydevd.py", line 1491, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "D:\software\PyCharm\PyCharm Community Edition 2022.1.3\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/code/ai/Retinanet/Retinanet-Pytorch-master/Demo_train.py", line 36, in <module>
    trainer(net, train_dataset)
  File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Model\trainer.py", line 112, in __call__
    for iteration, (images, boxes, labels, image_names) in enumerate(data_loader):
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Dataset_VOC.py", line 48, in __getitem__
    image, boxes, labels = self.transform(image, boxes, labels)
  File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Transfroms.py", line 40, in __call__
    img, boxes, labels = t(img, boxes, labels)
  File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Transfroms_utils.py", line 263, in __call__
    mode = random.choice(self.sample_options)
  File "mtrand.pyx", line 920, in numpy.random.mtrand.RandomState.choice
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.
What is causing this?
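
The traceback ends inside NumPy's `random.choice`, which first converts its argument to an array. If `sample_options` mixes `None` with tuples, as in the SSD-style crop augmentation, that conversion produces a ragged array: older NumPy releases only emit a VisibleDeprecationWarning, while newer ones raise this ValueError. The contents of `sample_options` below are an assumption based on that common augmentation, and `choose_option` is a hypothetical replacement that picks by index instead.

```python
import numpy as np

# SSD-style crop options (an assumption about what sample_options contains).
sample_options = (
    None,
    (0.1, None),
    (0.3, None),
    (0.7, None),
    (0.9, None),
    (None, None),
)


def choose_option(options, rng=np.random):
    """Pick one element by index so NumPy never converts `options` to an array."""
    return options[rng.randint(len(options))]


mode = choose_option(sample_options)
print(mode)
```

Replacing `random.choice(self.sample_options)` with an index-based pick like this keeps the sampling uniform and works across NumPy versions.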

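About the loss computation

As the title says: I see that when you compute the loc and cls losses, you compute the total loss over both positive and negative samples, but when returning you divide only by the number of positive samples. Could you explain why, or point me to a relevant link?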
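
For context, the RetinaNet paper sums the focal loss over all anchors and normalizes by the number of anchors assigned to a ground-truth box, on the grounds that the vast majority of anchors are easy negatives whose loss is negligible. Whether this repository follows the same reasoning is not stated on this page. The NumPy sketch below, with a hypothetical helper name and made-up loss values, shows how the two normalizations differ.

```python
import numpy as np


def normalize_by_positives(per_anchor_loss, labels):
    """Sum the loss over all anchors and divide by the number of positive anchors."""
    per_anchor_loss = np.asarray(per_anchor_loss, dtype=np.float64)
    num_pos = int((np.asarray(labels) > 0).sum())
    return per_anchor_loss.sum() / max(num_pos, 1)  # guard against images with no positives


# Two positive anchors and 1000 easy negatives with near-zero loss (illustrative values).
per_anchor_loss = [2.0, 1.0] + [1e-4] * 1000
labels = [1, 1] + [0] * 1000
by_positives = normalize_by_positives(per_anchor_loss, labels)
by_all_anchors = sum(per_anchor_loss) / len(per_anchor_loss)
print(by_positives, by_all_anchors)
```

Dividing by the total anchor count would shrink the loss by roughly the ratio of positives to all anchors, so its scale would depend on how many anchors the image happens to have rather than on how many objects it contains.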