deep-learning-for-image-processing's Issues

Unresolved reference 'ImageDataGenerator'

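Problem: when the project is opened in PyCharm, the following imports cannot be resolved by the IDE, even though the code runs successfully.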
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, Model, Sequential

Environment: PyCharm, TensorFlow 2.1, Python 3.6.10. Test script:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow import keras

print(tf.__path__)
print(keras.__path__)
print(ImageDataGenerator)
Output:
['C:\tools\Anaconda3\envs\tf-gpu-pytorch\lib\site-packages\tensorflow']
['C:\tools\Anaconda3\envs\tf-gpu-pytorch\lib\site-packages\tensorflow_core\python\keras\api\_v2\keras']
<class 'tensorflow.python.keras.preprocessing.image.ImageDataGenerator'>

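Pretrained weights

Thank you for making this tutorial, it is excellent!
Are trained weights available? I am only studying the code and would rather not train the models myself.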

One line of code missing in PyTorch GoogLeNet?

In train.py of the PyTorch GoogLeNet implementation, the code that saves the best model is:

        if accurate_test > best_acc:
            torch.save(net.state_dict(), save_path)

A line seems to be missing; it should probably be:

        if accurate_test > best_acc:
            best_acc = accurate_test
            torch.save(net.state_dict(), save_path)
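
A minimal sketch of why the extra assignment matters, using made-up accuracies and a hypothetical `save` callback in place of `torch.save`. Without updating `best_acc`, every epoch that beats the initial value overwrites the checkpoint, so the file on disk is not necessarily the best model.

```python
def train_loop(accuracies, save, update_best=True):
    """Simulate the checkpoint logic over per-epoch validation accuracies.

    `accuracies` and `save` are illustrative stand-ins, not names from the repo.
    """
    best_acc = 0.0
    for epoch, accurate_test in enumerate(accuracies):
        if accurate_test > best_acc:
            if update_best:
                best_acc = accurate_test  # the line the issue says is missing
            save(epoch, accurate_test)
    return best_acc


def run(accuracies, update_best):
    """Collect every (epoch, accuracy) pair that would have been saved."""
    saved = []
    train_loop(accuracies, lambda epoch, acc: saved.append((epoch, acc)), update_best)
    return saved


accs = [0.60, 0.75, 0.70, 0.72]
with_fix = run(accs, update_best=True)
without_fix = run(accs, update_best=False)
print(with_fix[-1])     # last checkpoint written with the fix
print(without_fix[-1])  # last checkpoint written without it
```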

Temporal Action Localization

I really admire your work. Could you find time to cover the paper "Rethinking the Faster R-CNN Architecture for Temporal Action Localization"? The original authors did not release their code.

cource_pdf

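Hello! Why can't I open the slide (PPT) files downloaded from your cource_pdf folder? The error says the file type is not supported or the file is corrupted. I am using WPS, and the download itself completed without problems. Does anyone know how to fix this? Thanks!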

train_mobilenet.py in faster_rcnn raises an error after finishing one epoch

System information

  • OS Platform(e.g., window10 ):window10
  • Python version:3.7
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):Pytorch1.5
  • Use GPU or not:GPU
  • CUDA/cuDNN version(if you use GPU):cuda10.1
  • The network you trained(e.g., Resnet34 network):mobilenet

Describe the current behavior

Error info / logs
Traceback (most recent call last):
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\numpy\core\function_base.py", line 117, in linspace
num = operator.index(num)
TypeError: 'numpy.float64' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/FirstYearMaster/ObjectDetection/Faster RCNN/faster_rcnn/train_mobilenet.py", line 146, in <module>
main()
File "D:/FirstYearMaster/ObjectDetection/Faster RCNN/faster_rcnn/train_mobilenet.py", line 93, in main
utils.evaluate(model, val_data_set_loader, device=device)
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "D:\FirstYearMaster\ObjectDetection\Faster RCNN\faster_rcnn\train_utils\train_eval_utils.py", line 70, in evaluate
coco_evaluator = CocoEvaluator(coco, iou_types)
File "D:\FirstYearMaster\ObjectDetection\Faster RCNN\faster_rcnn\train_utils\coco_eval.py", line 28, in __init__
self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 76, in __init__
self.params = Params(iouType=iouType) # parameters
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 527, in __init__
self.setDetParams()
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 507, in setDetParams
self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
File "<array_function internals>", line 6, in linspace
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\numpy\core\function_base.py", line 121, in linspace
.format(type(num)))
TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
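
The traceback shows pycocotools passing a `numpy.float64` as the sample count to `np.linspace`, which recent NumPy releases reject. A common workaround, sketched here under the assumption that patching the installed `cocoeval.py` locally (or installing a pycocotools release that already contains the cast) is acceptable, is to convert the count to `int`:

```python
import numpy as np

# Original expression from the traceback: the third argument is a numpy.float64,
# which recent NumPy releases refuse to treat as a sample count.
# np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)

# Casting the rounded value to int keeps the same thresholds.
num_thresholds = int(np.round((0.95 - 0.5) / 0.05)) + 1
iou_thrs = np.linspace(0.5, 0.95, num_thresholds, endpoint=True)
print(num_thresholds, iou_thrs)
```

The same cast would apply to the recall thresholds defined on the neighbouring line of `setDetParams`, if that line is written the same way in your installed version.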

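How should I modify the code?

When using VGG for prediction, how can I run inference on a batch of images? If an image contains several objects, can the model output a probability for each of them? I am currently using your TensorFlow VGG and PyTorch VGG code.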

NMS in Faster R-CNN

             rpn_pre_nms_top_n_train=2000, rpn_pre_nms_top_n_test=1000,    # number of proposals kept in the RPN before NMS (ranked by score)

             rpn_post_nms_top_n_train=2000, rpn_post_nms_top_n_test=1000,  # number of proposals kept in the RPN after NMS

Why does your code use the same number of proposals before and after NMS? What is the purpose of this setting?
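
One way to read the two settings is as two independent caps around NMS: the first limits how many boxes enter NMS, the second is only an upper bound on how many leave it. The sketch below is a plain-Python illustration of that order of operations, not the repository's implementation; `filter_proposals`, the boxes and the scores are made up.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_proposals(boxes, scores, pre_nms_top_n, post_nms_top_n, nms_thresh=0.7):
    """Top-k by score, then greedy NMS, then cap the survivors."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_top_n]           # cap before NMS
    keep = []
    for i in order:                         # greedy NMS in score order
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep[:post_nms_top_n]            # cap after NMS (an upper bound only)


boxes = [(0, 0, 10, 10), (0.5, 0.5, 10.5, 10.5), (20, 20, 30, 30), (0, 0, 9, 9)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = filter_proposals(boxes, scores, pre_nms_top_n=4, post_nms_top_n=4)
print(kept)
```

With equal caps, NMS still removes overlapping boxes, so the number that survives is usually smaller than the limit; the second cap only matters when more boxes than that survive.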

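IoU computation in YOLOv3_spp

In utils.py of the yolov3spp project, does wh_iou simply assume that the GT box and the anchor share the same center, and then compute the IoU from the smaller width and smaller height?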
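
The idea described in the question can be written down in a few lines. This is a sketch of the general width/height-only IoU, assuming both boxes are aligned at the same point; it is not copied from the repository's `utils.py`.

```python
def wh_iou(wh1, wh2):
    """IoU of two boxes described only by (width, height).

    Both boxes are treated as sharing the same corner (equivalently, the same
    center), so the overlap is min(w) * min(h).
    """
    w1, h1 = wh1
    w2, h2 = wh2
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


print(wh_iou((10, 20), (20, 10)))
```

Because position is ignored, the value only measures how well the anchor's shape matches the GT box, which is what anchor assignment needs.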

distributed

Hello, when I run your code for multi-GPU training I get the message "not using distributed mode". How can I resolve this?
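
In torchvision's detection reference scripts this message is printed, not raised, by the helper that initialises distributed training when it finds no `RANK` and `WORLD_SIZE` environment variables. That usually means the script was started with plain `python` instead of through a launcher such as `torch.distributed.launch`. Assuming this repository's helper follows the same pattern, the check looks roughly like the sketch below; `detect_distributed` is a hypothetical name.

```python
import os


def detect_distributed(env=None):
    """Return (rank, world_size) if launcher variables are present, else None."""
    env = os.environ if env is None else env
    if "RANK" in env and "WORLD_SIZE" in env:
        return int(env["RANK"]), int(env["WORLD_SIZE"])
    print("Not using distributed mode")
    return None


plain_run = detect_distributed({})
launched_run = detect_distributed({"RANK": "1", "WORLD_SIZE": "4"})
```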

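Training on single-channel images

I would like to train ResNet on single-channel grayscale images. How should I modify ResNet so that it accepts single-channel input?
Thanks!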

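Setting loss and metrics in AlexNet compile

If the dataset is produced by ImageDataGenerator, the ground-truth labels are one-hot encoded. If the output layer of the model is softmax, the predictions are probability distributions and the loss is categorical_crossentropy, so shouldn't the metric be categorical_accuracy rather than accuracy?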
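
For reference, categorical accuracy compares the arg-max of the one-hot label with the arg-max of the predicted distribution. The pure-Python sketch below shows that computation on made-up values. Keras documents that the string "accuracy" is resolved to the binary, categorical or sparse categorical variant based on the loss and output shape, so the two settings would normally report the same number; this is worth confirming for the TensorFlow version in use.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose arg-max prediction matches the one-hot label."""
    def argmax(row):
        return max(range(len(row)), key=lambda i: row[i])

    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
acc = categorical_accuracy(y_true, y_pred)
print(acc)
```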

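Questions about Faster R-CNN

Hello, how do you split the dataset in Faster R-CNN? In your video you describe a training set and a test set, but during training the test set is evaluated after every epoch, which looks like how a validation set is used. What, then, is the role of the test set in this split?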

Loss is nan, stopping training

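Hello! I watched your videos and found the explanations very clear. Your code runs fine for me on the VOC2012 dataset, but with my own annotated dataset I get this message: Loss is nan, stopping training. I am not sure what causes it and would like your advice. I am on Windows with one 2060 GPU, using res50_fpn.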

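Question about object detection metrics

Hello, I watched your videos on Bilibili and found them very well explained. I have run into some problems in my own work.
My task, temporal action localization, is similar to object detection (one is 2D, the other 1D) and the metrics are computed in much the same way. I now need to compute mAP.

Setup: 20 action classes, 213 test videos, 200 proposals extracted per video.
Problem: for one of the classes there are 41 GT instances and 2000 predictions.
First, proposals from videos that do not belong to this class are counted as FP. Then I compute the IoU between the remaining proposals and the GT and sort by IoU from high to low: IoU below 0.5 counts as FP and IoU above 0.5 counts as TP, but a later detection that overlaps an already matched GT also counts as FP. This way TP is at most 41 while FP is close to 2000, so the resulting precision is very low.
Is there anything wrong with computing it this way?

The parameter settings come from the paper, and the code is https://github.com/activitynet/ActivityNet/blob/master/Evaluation/eval_detection.py
Paper: BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

“On THUMOS14, we use top-2 video-level classes generated by UntrimmedNet [48] for proposals generated by BSN and other methods. Following previous works, on THUMOS14, we also implement SCNN-classifier on BSN proposals for proposal-level classification and adopt Greedy NMS as [7]. We use 100 and 200 proposals per video on ActivityNet-1.3 and THUMOS14 datasets separately.”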
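
Detection-style AP is normally computed by ranking predictions by confidence score rather than by IoU, and precision is accumulated along that ranking. A long tail of low-scoring false positives then lowers precision only at the end of the curve. The sketch below shows this for one class and one IoU threshold on made-up data. It uses a simple non-interpolated AP, so it is a simplified stand-in for the linked evaluation script, not a reimplementation of it.

```python
def temporal_iou(a, b):
    """IoU of two 1D segments given as (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truth, iou_thresh=0.5):
    """predictions: list of (video_id, (start, end), score);
    ground_truth: dict video_id -> list of (start, end)."""
    matched = {vid: [False] * len(segs) for vid, segs in ground_truth.items()}
    n_gt = sum(len(segs) for segs in ground_truth.values())
    tp = fp = 0
    ap = 0.0
    # Rank by confidence score, highest first.
    for vid, seg, _score in sorted(predictions, key=lambda p: p[2], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth.get(vid, [])):
            overlap = temporal_iou(seg, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_thresh and not matched[vid][best_idx]:
            matched[vid][best_idx] = True   # first match of this GT: true positive
            tp += 1
            ap += tp / (tp + fp)            # precision at this recall step
        else:
            fp += 1                         # low IoU, wrong video, or duplicate
    return ap / n_gt if n_gt else 0.0


gt = {"v1": [(0, 10)], "v2": [(5, 15)]}
preds = [("v1", (0, 9), 0.9), ("v1", (1, 10), 0.8),
         ("v2", (20, 30), 0.7), ("v2", (5, 14), 0.6)]
ap_value = average_precision(preds, gt)
print(ap_value)
```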

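Hello! After reading your ResNet training code there are a few things I do not understand. My questions may be very basic, so please bear with me; thank you for your guidance

What is the purpose of this piece of code in train.py of your TensorFlow ResNet? (Also, apologies: I have not read the ResNet papers yet, so I do not understand the theory.)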

  • train.py line 70
model = tf.keras.Sequential([feature,
                             tf.keras.layers.GlobalAvgPool2D(),
                             tf.keras.layers.Dropout(rate=0.5),
                             tf.keras.layers.Dense(1024),
                             tf.keras.layers.Dropout(rate=0.5),
                             tf.keras.layers.Dense(5),
                             tf.keras.layers.Softmax()])

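I do not understand why Sequential is used here to stack another network on top: feature is already the ResNet model, so its output shape has dim=2. I am not sure whether my understanding of the rest is correct: the following global pooling layer expects an input with dim=4, which does not match the output shape of feature, so it cannot be connected to feature. I therefore cannot work out what this code is meant to do, or how it should be adjusted. My task is binary classification of a set of 40*18 grayscale images with ResNet. I look forward to your reply, and thank you in advance!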

Question about the ResNet training code

In train.py of resnet, I do not understand the line missing_keys, unexpected_keys = net.load_state_dict(torch.load(model_weight_path), strict=False). If I comment it out the program trains, but with it in place I get TypeError: 'NoneType' object is not iterable
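
One possible cause, offered as an assumption rather than a diagnosis: in older PyTorch releases `load_state_dict` returned `None`, and the result holding the missing and unexpected keys was only added later, so unpacking the return value fails with exactly this `TypeError`. A defensive pattern, with `unpack_load_result` as a hypothetical helper:

```python
def unpack_load_result(result):
    """Return (missing_keys, unexpected_keys) from a load_state_dict result.

    Falls back to empty lists when the call returned None.
    """
    if result is None:
        return [], []
    missing_keys, unexpected_keys = result
    return list(missing_keys), list(unexpected_keys)


# Usage with a real model would be:
#   result = net.load_state_dict(torch.load(model_weight_path), strict=False)
#   missing_keys, unexpected_keys = unpack_load_result(result)
old_style = unpack_load_result(None)
new_style = unpack_load_result((["fc.weight", "fc.bias"], []))
```

Upgrading PyTorch to the version the tutorial targets is the simpler route if that is an option.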

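I ran train_res50_fpn directly on the VOC2012 data, once with and once without the pretrained model, and the results differ enormously.

I understand that results with and without pretraining will differ somewhat, but without the pretrained model the network does not seem to converge at all. What could be the cause? And if I modify the model for my own dataset and have no pretrained weights, what should I do?

First, without the pretrained model, the results are: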
DONE (t=4.33s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.068
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.164
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.030
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.086
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.164
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.245
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.248
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.038
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.119
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.297

Then, with the COCO pretrained model loaded, the results are:
DONE (t=1.48s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.468
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.740
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.511
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.160
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.335
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.523
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.424
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.580
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.584
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.247
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.640

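Evaluation module

I want per-class AP for SSD, so I adapted the eval code from the original SSD. During validation, the mAP I get is 3 to 4 points lower than the mAP from the COCO evaluation protocol you use. Where do you think the problem might be? (If convenient, could you add a way to report per-class AP? I would be very grateful.)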

googlenet

System information

  • Have I written custom code:
  • OS Platform(e.g., window10 or Linux Ubuntu 16.04):
  • Python version:
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):
  • Use GPU or not:
  • CUDA/cuDNN version(if you use GPU):
  • The network you trained(e.g., Resnet34 network):

Describe the current behavior

Error info / logs

faster rcnn

Hello, after adding FPN to Faster R-CNN (resnet50+fpn), does the way the loss function is computed change? From your code it appears unchanged.

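Question about proposals

Hello, I am confused about the proposals in Faster R-CNN. When I run prediction on a single image the final result is fine, but the intermediate proposal boxes look very different from what I expected, which is strange given that the prediction is correct. I have sent you a private message on Bilibili and hope we can discuss this in detail.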

RuntimeError: Arguments for call are not valid.

Thanks for your code. When I run 'python train_res50_fpn.py' I hit the following problem. How can I resolve it?
Traceback (most recent call last):
File "train_res50_fpn.py", line 3, in
from network_files.faster_rcnn_framework import FasterRCNN, FastRCNNPredictor
File "/home/yzhou/IpFPN/faster_rcnn/network_files/faster_rcnn_framework.py", line 4, in
from network_files.rpn_function import AnchorsGenerator, RPNHead, RegionProposalNetwork
File "/home/yzhou/IpFPN/faster_rcnn/network_files/rpn_function.py", line 6, in
from network_files import det_utils
File "/home/yzhou/IpFPN/faster_rcnn/network_files/det_utils.py", line 16, in
class BalancedPositiveNegativeSampler(object):
File "/home/yzhou/.pyenv/versions/anaconda3-5.3.1/envs/CenterNet/lib/python3.6/site-packages/torch/jit/init.py", line 1274, in script
_compile_and_register_class(obj, _rcb, qualified_name)
File "/home/yzhou/.pyenv/versions/anaconda3-5.3.1/envs/CenterNet/lib/python3.6/site-packages/torch/jit/init.py", line 1115, in _compile_and_register_class
_jit_script_class_compile(qualified_name, ast, rcb)
RuntimeError:
Arguments for call are not valid.
The following variants are available:

aten::index_put_(Tensor(a!) self, Tensor?[] indices, Tensor values, bool accumulate=False) -> (Tensor(a!)):
Expected a value of type 'Tensor' for argument 'values' but instead found type 'int'.

aten::index_put_(Tensor(a!) self, Tensor[] indices, Tensor values, bool accumulate=False) -> (Tensor(a!)):
Expected a value of type 'List[Tensor]' for argument 'indices' but instead found type 'List[Optional[Tensor]]'.

The original call is:
File "/home/yzhou/IpFPN/faster_rcnn/network_files/det_utils.py", line 85
)

        pos_idx_per_image_mask[pos_idx_per_image] = 1
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        neg_idx_per_image_mask[neg_idx_per_image] = 1

Running train_res50_fpn.py produces the error below. Environment: Windows 10, Python 3.7, PyTorch 1.5. Thanks!

RuntimeError:
Arguments for call are not valid.
The following variants are available:

aten::cat(Tensor[] tensors, int dim=0) -> (Tensor):
Expected a value of type 'List[Tensor]' for argument 'tensors' but instead found type 'Tensor'.

aten::cat.out(Tensor[] tensors, int dim=0, *, Tensor(a!) out) -> (Tensor(a!)):
Expected a value of type 'List[Tensor]' for argument 'tensors' but instead found type 'Tensor'.

The original call is:
File "C:\faster_rcnn\network_files\det_utils.py", line 216
assert isinstance(rel_codes, torch.Tensor)
boxes_per_image = [b.size(0) for b in boxes]
concat_boxes = torch.cat(boxes, dim=0)
~~~~~~~~~ <--- HERE

    box_sum = 0

This is my first time asking a question on GitHub, thank you.

RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

Thanks for sharing your code. When I run 'python train_mobilenet.py' I hit the following problem. How can I solve this error?

Traceback (most recent call last):
File "train_mobilenet.py", line 157, in
main()
File "train_mobilenet.py", line 91, in main
train_loss=train_loss, train_lr=learning_rate)
File "/home/dl/桌面/faster_rcnn/train_utils/train_eval_utils.py", line 33, in train_one_epoch
loss_dict = model(images, targets)
File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dl/桌面/faster_rcnn/network_files/faster_rcnn_framework.py", line 87, in forward
proposals, proposal_losses = self.rpn(images, features, targets)
File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 615, in forward
labels, matched_gt_boxes = self.assign_targets_to_anchors(anchors, targets)
File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 410, in assign_targets_to_anchors
matched_idxs = self.proposal_matcher(match_quality_matrix)
File "/home/dl/桌面/faster_rcnn/network_files/det_utils.py", line 347, in call
matches[below_low_threshold] = torch.tensor(self.BELOW_LOW_THRESHOLD) # -1
RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

RuntimeError: class '__torch__.network_files.image_list.ImageList' already defined.

Reloaded modules: transforms
Traceback (most recent call last):

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/train_res50_fpn.py", line 3, in
from network_files.faster_rcnn_framework import FasterRCNN, FastRCNNPredictor

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/faster_rcnn_framework.py", line 4, in
from network_files.rpn_function import AnchorsGenerator, RPNHead, RegionProposalNetwork

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/rpn_function.py", line 9, in
from network_files.image_list import ImageList

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/image_list.py", line 7, in
class ImageList(object):

File "/home/cx/anaconda3/envs/fcn_torch1.0/lib/python3.6/site-packages/torch/jit/init.py", line 1280, in script
_compile_and_register_class(obj, _rcb, qualified_name)

File "/home/cx/anaconda3/envs/fcn_torch1.0/lib/python3.6/site-packages/torch/jit/init.py", line 1108, in _compile_and_register_class
_jit_script_class_compile(qualified_name, ast, rcb)

RuntimeError: class '__torch__.network_files.image_list.ImageList' already defined.

Error when training the SSD model on my own data

When I train on my own dataset I run into the problem below. My images are in RGB format. Can you help me solve it?
System information

  • OS Platform(e.g., window10 or Linux Ubuntu 16.04):Linux Ubuntu 16.04
  • Python version:3.7
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch1.5
  • CUDA/cuDNN version(if you use GPU): CUDA10.1
  • The network you trained(e.g., Resnet34 network): SSD

Error info / logs
Namespace(batch_size=4, device='cuda:0', epochs=15, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using 4 dataloader workers
missing_keys: ['conf.0.weight', 'conf.0.bias', 'conf.1.weight', 'conf.1.bias', 'conf.2.weight', 'conf.2.bias', 'conf.3.weight', 'conf.3.bias', 'conf.4.weight', 'conf.4.bias', 'conf.5.weight', 'conf.5.bias', 'compute_loss.dboxes', 'postprocess.dboxes_xywh']
unexpected_keys: []
Epoch: [0] [ 0/104637] eta: 9 days, 4:57:54.872362 lr: 0.000500 total_losses: 10.8151 (10.8151) time: 7.6022 data: 7.0505 max mem: 1072
Epoch: [0] [ 50/104637] eta: 16:05:47.249883 lr: 0.000500 total_losses: 4.7562 (6.0402) time: 0.4383 data: 0.3700 max mem: 1169
Epoch: [0] [ 100/104637] eta: 15:38:36.884186 lr: 0.000500 total_losses: 4.0707 (5.2116) time: 0.4212 data: 0.3559 max mem: 1169
Traceback (most recent call last):
File "train_ssd300_object365.py", line 181, in
main(args)
File "train_ssd300_object365.py", line 121, in main
train_lr=learning_rate)
File "/data/ssd/train_utils/train_eval_utils.py", line 27, in train_one_epoch
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
File "/data/ssd/train_utils/distributed_utils.py", line 204, in log_every
for obj in iterable:
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in next
data = self._next_data()
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 838, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
data.reraise()
File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/ssd/object_dataset.py", line 82, in getitem
image, target = self.transform(image, target)
File "/data/ssd/transform.py", line 15, in call
image, target = trans(image, target)
File "/data/ssd/transform.py", line 181, in call
image = self.normalize(image)
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 166, in call
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 208, in normalize
tensor.sub_(mean).div_(std)
RuntimeError: output with shape [1, 300, 300] doesn't match the broadcast shape [3, 300, 300]
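
The final error says a 1-channel tensor is being normalised with 3-channel mean and std, which suggests that at least one image in the dataset is grayscale even if most are RGB. A common remedy is to convert every image to RGB when loading it, for example with PIL's `image.convert("RGB")`. The NumPy sketch below reproduces the same kind of shape mismatch and shows the channel-replication fix; the mean and std values are the usual ImageNet statistics, used here only for illustration.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)[:, None, None]
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)[:, None, None]

gray = np.random.rand(1, 300, 300).astype(np.float32)   # one-channel image, CHW

try:
    gray -= mean            # in-place op cannot grow [1, H, W] to [3, H, W]
    failed = False
except ValueError:
    failed = True

rgb = np.repeat(gray, 3, axis=0)     # replicate the single channel three times
normalized = (rgb - mean) / std
print(failed, normalized.shape)
```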

faster rcnn

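Why does your code reach good results after only a few thousand iterations, while other papers often need tens of thousands? What method are you using? Also, my task is fairly simple: how many iterations did you train for when using MobileNet on the VOC dataset? These questions are based on your Faster R-CNN tutorial. I look forward to your answer, thank you!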

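Question about transform in Faster R-CNN

Thanks for sharing. I have a question for the author.
In transform.py of Faster R-CNN there is an operation that packs several resized images into a batch.
However, the transforms are called in the dataset class's get_item:
if self.transforms is not None:
    image, target = self.transforms(image, target)
But the dataset class fetches one image at a time, and the batch size is only specified when the DataLoader is created. How are the images packed into a batch here?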
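
In torchvision-style detection code, two separate steps are involved. The dataset-level transforms run per image, the DataLoader's `collate_fn` groups samples into tuples without stacking them, and the resize-and-pad batching happens later inside the model. Assuming this repository follows that layout, the collate step can be sketched with plain Python objects standing in for tensors:

```python
def collate_fn(batch):
    """Turn [(img1, tgt1), (img2, tgt2), ...] into ((img1, img2, ...), (tgt1, tgt2, ...))."""
    return tuple(zip(*batch))


# Stand-ins for what Dataset.__getitem__ returns: images of different sizes.
samples = [
    ("image_375x500", {"boxes": [[10, 20, 100, 200]], "labels": [3]}),
    ("image_333x500", {"boxes": [[5, 5, 50, 60]], "labels": [7]}),
]
images, targets = collate_fn(samples)
print(len(images), len(targets))
```

Keeping the images as a tuple avoids stacking tensors of different sizes; the model-side transform can then resize and pad them to a common shape.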

Faster R-CNN training error

System information

  • Have I written custom code: no
  • OS Platform(e.g., window10 or Linux Ubuntu 16.04): linux
  • Python version: 3.8
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): torch1.6
  • Use GPU or not: yes
  • CUDA/cuDNN version(if you use GPU):
  • The network you trained(e.g., Resnet34 network): resnet50fpn

Describe the current behavior
Hello, I am training faster_rcnn on my own dataset with six object classes. I set num_classes=7 in create model, but this error still occurs. Nothing else was changed. How can I fix it?

Error info / logs

Namespace(batch_size=8, data_path='/research/dept8/qdou/zwang/data/robo/final', device='cuda:0', epochs=50, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using 8 dataloader workers
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [84,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train_res50_fpn.py", line 167, in <module>
    main(args)
  File "train_res50_fpn.py", line 99, in main
    utils.train_one_epoch(model, optimizer, train_data_loader,
  File "/research/dept8/qdou/zwang/faster_rcnn/train_utils/train_eval_utils.py", line 34, in train_one_epoch
    loss_dict = model(images, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/faster_rcnn_framework.py", line 93, in forward
    detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 367, in forward
    proposals, matched_idxs, labels, regression_targets = self.select_training_samples(proposals, targets)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 222, in select_training_samples
    matched_idxs, labels = self.assign_targets_to_proposals(proposals, gt_boxes, gt_labels)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 144, in assign_targets_to_proposals
    labels_in_image[bg_inds] = 0
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered
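
A device-side "index out of bounds" assert during target assignment is often triggered by a label outside the range the model expects, although the log alone does not confirm that here. One thing worth checking is that every label in the annotations lies between 1 and `num_classes - 1`, with 0 reserved for background. A sketch of such a check on made-up targets, with `find_bad_labels` as a hypothetical helper:

```python
def find_bad_labels(targets, num_classes):
    """Return (sample_index, label) pairs whose label is outside 1..num_classes-1.

    Index 0 is assumed to be reserved for background, as in torchvision's Faster R-CNN.
    """
    bad = []
    for idx, target in enumerate(targets):
        for label in target["labels"]:
            if not 1 <= label <= num_classes - 1:
                bad.append((idx, label))
    return bad


targets = [{"labels": [1, 6]}, {"labels": [7]}, {"labels": [0, 3]}]
bad = find_bad_labels(targets, num_classes=7)
print(bad)
```

Running once on the CPU, or with `CUDA_LAUNCH_BLOCKING=1`, usually gives a traceback that points at the actual failing line.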

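A small issue in draw_box_utils.py of the SSD code

1. While reproducing the code I found that, at line 91 of draw_box_utils.py in this repository's SSD code, the stored coordinates are relative coordinates, so they should be multiplied by the image width and height when drawing the boxes to restore absolute coordinates. Otherwise drawing after the crop operation can go wrong.
for box, color in box_to_color_map.items():
    xmin, ymin, xmax, ymax = box
    (left, right, top, bottom) = (xmin * 1, xmax * 1,
                                  ymin * 1, ymax * 1)
    draw.line([(left, top), (left, bottom), (right, bottom),
               (right, top), (left, top)], width=line_thickness, fill=color)

2. After the crop operation, should the cropped width and height be stored in target['height_width']? Leaving them out does not seem to affect anything, though.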
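
The scaling proposed in point 1 can be sketched as a small helper. `to_absolute` and its parameter names are illustrative, not taken from the repository, and the helper assumes the boxes really are stored as fractions of the image size.

```python
def to_absolute(box, image_width, image_height):
    """Scale a relative (xmin, ymin, xmax, ymax) box in [0, 1] to pixel coordinates."""
    xmin, ymin, xmax, ymax = box
    left, right = xmin * image_width, xmax * image_width
    top, bottom = ymin * image_height, ymax * image_height
    return left, right, top, bottom


print(to_absolute((0.25, 0.5, 0.75, 1.0), image_width=400, image_height=300))
```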

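Request for a new tutorial

Your videos are great; I follow you on Bilibili, CSDN and GitHub. Could you make an episode on how to make simple modifications to the yolov3spp network structure? There do not seem to be any detailed video tutorials on this online.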

About pytorch version

ssd_model.py
line4:from torch.jit.annotations import Optional, List, Dict, Tuple, Module
I just deleted “Optional” and ran the code directly under PyTorch 1.0. It seems the Optional import is not needed.

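Question about the number of COCO classes

When testing the official Faster R-CNN resnet50 rpn weights, why does num_classes have to be set to 91? Shouldn't it be 81 for COCO?

Also, where can I find the JSON file with the COCO categories?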
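
The two numbers can be reconciled by looking at the category ids. The COCO detection annotations define 80 categories whose ids run from 1 to 90 with ten ids unused, and weights trained on the raw ids need an output slot for every id plus background. The sketch below shows the count and one way to build a contiguous remapping; the list of unused ids should be verified against the `categories` field of your own annotation file (for example `instances_val2017.json`), which is also where the category names can be found.

```python
# Category ids that do not appear in the COCO detection annotations
# (verify against the "categories" field of your annotation file).
UNUSED_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

coco_ids = [i for i in range(1, 91) if i not in UNUSED_IDS]

# Raw ids need 91 output slots (0 = background, 1..90 = ids, gaps unused).
num_classes_raw = max(coco_ids) + 1
# A contiguous remapping needs only 81 (background + 80 classes).
raw_to_contiguous = {raw: idx for idx, raw in enumerate(coco_ids, start=1)}
num_classes_contiguous = len(raw_to_contiguous) + 1
print(len(coco_ids), num_classes_raw, num_classes_contiguous)
```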

aten::zeros_like.dtype(Tensor self, *, int dtype, int layout, Device device, bool pin_memory=False) -> (Tensor): Argument layout not provided.

Hello, I am training on a GPU. When execution reaches this code:
pos_idx_per_image_mask = torch.zeros_like(matched_idxs_per_image, dtype=torch.uint8)
neg_idx_per_image_mask = torch.zeros_like(matched_idxs_per_image, dtype=torch.uint8)
I get:
aten::zeros_like(Tensor self) -> (Tensor):
Keyword argument dtype unknown.

aten::zeros_like.dtype(Tensor self, *, int dtype, int layout, Device device, bool pin_memory=False) -> (Tensor):
Argument layout not provided.

How can I fix this? I am a beginner; many thanks!

What does self.include_top mean?

I'm confused about "self.include_top" in class ResNet(nn.Module) in Test5/model. Could you please explain what it does?
Thank you in advance~

VGGmodel structure

Hi, I am using the VGG model. Online references give the dense layers 4096 units, but your code uses 2048.

faster rcnn

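Hello, I have gone through your video and code carefully. In the resnet50+fpn training script the batch size is 4, and in the video each epoch runs for 178 iterations, so do you have only a little over 700 training images? In the mobilenet training script the batch size is 8, so each epoch runs 89 iterations; you trained for 25 epochs, so is the total just over two thousand iterations? What level of performance did your mobilenet reach after those two thousand or so iterations? I recall you saying in the video that the mobilenet pretrained weights only cover the feature extraction network, so is a few thousand iterations also enough when using mobilenet? My dataset has 700 images; with your code, results were already good after 2500 iterations. These questions are based on your Faster R-CNN code and video explanation. I look forward to your answer!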
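
The figures quoted in the question are mutually consistent, as a quick calculation shows. The sketch assumes the last partial batch of each epoch is kept; the image count of 712 is inferred from the iteration counts, not stated by the author.

```python
import math


def iterations_per_epoch(num_images, batch_size):
    """Number of optimizer steps in one epoch when the last partial batch is kept."""
    return math.ceil(num_images / batch_size)


# 178 iterations at batch size 4 implies between 709 and 712 training images.
low, high = 177 * 4 + 1, 178 * 4
print(low, high)
print(iterations_per_epoch(712, 8))         # iterations per epoch at batch size 8
print(iterations_per_epoch(712, 8) * 25)    # total iterations over 25 epochs
```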

predict issue

After training the model, the following error occurs when running predict:

Traceback (most recent call last):
File "/Users/mengxiangyu/Desktop/faster_rcnn/predict.py", line 47, in
model.load_state_dict(torch.load(train_weights)["model"])
File "/Users/mengxiangyu/anaconda3/envs/th1.6/lib/python3.7/site-packages/torch/serialization.py", line 577, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/Users/mengxiangyu/anaconda3/envs/th1.6/lib/python3.7/site-packages/torch/serialization.py", line 241, in init
super(_open_zipfile_reader, self).init(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:144] . PytorchStreamReader failed reading zip archive: failed finding central directory

What is the cause? Is there a problem with the model file being loaded?
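
This message generally means the file is not a complete zip archive, which is the checkpoint format `torch.save` writes by default from PyTorch 1.6 onward. A truncated copy or an interrupted save are typical causes. A quick standard-library check is sketched below; `looks_like_complete_zip` is a hypothetical helper, and the files are created only to demonstrate the check.

```python
import os
import tempfile
import zipfile


def looks_like_complete_zip(path):
    """True if the file has a readable zip end-of-archive record."""
    return zipfile.is_zipfile(path)


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.pth")
    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("data.pkl", b"x" * 1000)

    # Simulate an interrupted save by cutting the file in half.
    truncated = os.path.join(tmp, "truncated.pth")
    with open(good, "rb") as src, open(truncated, "wb") as dst:
        data = src.read()
        dst.write(data[: len(data) // 2])

    good_ok = looks_like_complete_zip(good)
    truncated_ok = looks_like_complete_zip(truncated)

print(good_ok, truncated_ok)
```

If the check fails on the weights file, saving or copying the checkpoint again is the first thing to try.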

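Confusion matrix

Hello,
Some of the precision values computed from the confusion matrix are nan. Is that because one row of the confusion matrix is all zeros?
I would appreciate an answer when you have time. Thank you!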
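
A row of zeros makes the corresponding row sum zero, so any metric that divides by it becomes 0/0. The sketch below guards that division and reports an undefined metric as `None` instead of nan. It assumes rows are predictions and columns are true labels; swap the roles if your matrix is laid out the other way.

```python
def per_class_precision_recall(matrix):
    """matrix[i][j]: count of samples with row index i and column index j.

    Rows are taken as predictions and columns as true labels (an assumption).
    Returns a list of (precision, recall); a metric is None when undefined.
    """
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[k])                       # row sum
        actual = sum(matrix[r][k] for r in range(n))     # column sum
        precision = tp / predicted if predicted else None
        recall = tp / actual if actual else None
        results.append((precision, recall))
    return results


cm = [[5, 1, 0],
      [0, 0, 0],    # class 1 is never predicted: row of zeros
      [1, 2, 4]]
metrics = per_class_precision_recall(cm)
print(metrics)
```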
