
deep-learning-for-image-processing's People

Contributors

dependabot[bot], wzmiaomiao


deep-learning-for-image-processing's Issues

Question about the transform in Faster R-CNN

Thanks for sharing; I have a question for the author.
In Faster R-CNN's transform.py, multiple images are resized and then packed into a batch.
However, the transform is called in the dataset class's __getitem__:
if self.transforms is not None:
    image, target = self.transforms(image, target)
But the dataset class only fetches one image at a time, and the batch size is only specified when the DataLoader is created, so how are the images packed into a batch here?
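A minimal sketch (assuming the torchvision-style pipeline this repository follows, not its exact code) of where the batching actually happens: the transform called in __getitem__ works on a single image, the DataLoader's collate_fn keeps the samples as lists, and the model's GeneralizedRCNNTransform is the part that resizes and pads that list into one batched tensor.

    from torch.utils.data import DataLoader

    def collate_fn(batch):
        # keep images and targets as per-sample lists; their sizes may still differ here
        return tuple(zip(*batch))

    # loader = DataLoader(dataset, batch_size=4, collate_fn=collate_fn)
    # for images, targets in loader:
    #     # inside the model, GeneralizedRCNNTransform resizes every image and then
    #     # pads/stacks the list into a single [N, C, H, W] tensor
    #     loss_dict = model(list(images), list(targets))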

What does self.include_top mean?

I'm confused about "self.include_top" in Test5/model, in class ResNet(nn.Module). Could you please give me some information about it?
Thank you in advance!
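A hedged sketch of what include_top usually controls in this style of ResNet implementation (the names and channel counts below are assumptions, not the repository's exact code): when include_top is True, the global average pooling and fully connected classification head are built; when it is False, the network only returns convolutional feature maps, which is convenient when the backbone is reused for transfer learning or detection.

    import torch
    import torch.nn as nn

    class ResNetSketch(nn.Module):
        def __init__(self, num_classes=1000, include_top=True):
            super().__init__()
            self.include_top = include_top
            # ... stem and residual blocks would be built here ...
            if self.include_top:
                self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
                self.fc = nn.Linear(512, num_classes)  # 512 is an assumed channel count

        def forward(self, x):
            # x = self.layers(x)  # convolutional feature maps
            if self.include_top:
                x = self.avgpool(x)
                x = torch.flatten(x, 1)
                x = self.fc(x)
            return x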

About pytorch version

ssd_model.py
line 4: from torch.jit.annotations import Optional, List, Dict, Tuple, Module
I just deleted "Optional" and ran it directly under PyTorch 1.0. It seems the Optional import is unnecessary...
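A hedged workaround sketch: on PyTorch versions where torch.jit.annotations does not export all of these names, the same type hints can generally be imported from the standard typing module instead (Module is the odd one out; for plain eager-mode use it is just torch.nn.Module):

    from typing import Optional, List, Dict, Tuple
    from torch.nn import Module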

Request for more tutorials

Your videos are excellent; I follow you on Bilibili, CSDN and GitHub. Could you make a tutorial on how to make simple modifications to the YOLOv3-SPP network structure? There don't seem to be any detailed video tutorials on this topic online.

Describe the current behavior

Error info / logs

aten::zeros_like.dtype(Tensor self, *, int dtype, int layout, Device device, bool pin_memory=False) -> (Tensor): Argument layout not provided.

Hello, I'm training on GPU, and when execution reaches this code:
pos_idx_per_image_mask = torch.zeros_like(matched_idxs_per_image, dtype=torch.uint8)
neg_idx_per_image_mask = torch.zeros_like(matched_idxs_per_image, dtype=torch.uint8)
I get: aten::zeros_like(Tensor self) -> (Tensor):
Keyword argument dtype unknown.

aten::zeros_like.dtype(Tensor self, *, int dtype, int layout, Device device, bool pin_memory=False) -> (Tensor):
Argument layout not provided.

How should I solve this? I'm a complete beginner, many thanks! 🙏
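A hedged workaround sketch, not the author's official fix: this overload error usually means the installed PyTorch is older than the version the scripted code targets. Besides upgrading PyTorch, an equivalent mask can be built with torch.zeros, passing shape, dtype and device explicitly:

    pos_idx_per_image_mask = torch.zeros(
        matched_idxs_per_image.shape,
        dtype=torch.uint8,
        device=matched_idxs_per_image.device,
    )
    neg_idx_per_image_mask = torch.zeros(
        matched_idxs_per_image.shape,
        dtype=torch.uint8,
        device=matched_idxs_per_image.device,
    )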

distributed

Hello, when I run your code for multi-GPU training I get the error: not using distributed mode. How should I fix this?
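A hedged sketch of why this message usually appears (assuming the script follows the common torchvision reference pattern): the multi-GPU training script checks environment variables that only the distributed launcher sets, so starting it with a plain python command falls into the non-distributed branch that prints "not using distributed mode".

    import os

    def launched_with_distributed():
        # RANK and WORLD_SIZE are set by torch.distributed.launch / torchrun
        return "RANK" in os.environ and "WORLD_SIZE" in os.environ

    # Typical launch command (script name and GPU count are assumptions):
    #   python -m torch.distributed.launch --nproc_per_node=2 --use_env train_multi_GPU.py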

Setting loss and metrics in AlexNet's compile()

If ImageDataGenerator is used to generate the dataset, the ground-truth labels produced are one-hot encoded. If the model's output layer is softmax, the predictions are probability distributions and the loss is categorical_crossentropy, so shouldn't the metric be categorical_accuracy rather than accuracy?
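A hedged sketch of the compile() configuration described above (the model object is the repo's AlexNet, assumed): with one-hot labels from ImageDataGenerator, categorical_crossentropy pairs with categorical_accuracy, although when the plain string "accuracy" is given, Keras normally resolves it to the matching accuracy metric based on the loss and label shape, so both settings usually report the same numbers.

    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])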

A question about the ResNet training code

In ResNet's train.py, I don't understand this line: missing_keys, unexpected_keys = net.load_state_dict(torch.load(model_weight_path), strict=False). If I comment it out, training runs, but with it I get TypeError: 'NoneType' object is not iterable.
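A hedged note in code form: older PyTorch releases returned None from load_state_dict(), so unpacking its result raises exactly this TypeError; newer releases return a named tuple of (missing_keys, unexpected_keys). Loading without unconditional unpacking works on both:

    ret = net.load_state_dict(torch.load(model_weight_path), strict=False)
    if ret is not None:                 # newer PyTorch returns the incompatible-keys tuple
        missing_keys, unexpected_keys = ret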

Faster R-CNN related question

Hello, I'd like to ask how the dataset is split in your Faster R-CNN. In the video you split a training set and a test set, but during training you evaluate on the test set after every epoch, which is how a validation set is normally used. So what exactly is the role of the test set you split off here?

Training on single-channel images

I'd like to train ResNet on single-channel grayscale images. How should I modify ResNet so it can take single-channel input?
Thanks!
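A hedged sketch (assuming a torchvision-style ResNet): either replace the first convolution so it accepts one input channel, or convert the grayscale images to 3-channel tensors before feeding them in.

    import torch.nn as nn
    from torchvision.models import resnet34

    net = resnet34(num_classes=5)   # num_classes is an assumption
    # the original conv1 expects 3 input channels; swap it for a 1-channel version
    net.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)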

FasterRCNN training error

System information

  • Have I written custom code: no
  • OS Platform(e.g., window10 or Linux Ubuntu 16.04): linux
  • Python version: 3.8
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): torch1.6
  • Use GPU or not: yes
  • CUDA/cuDNN version(if you use GPU):
  • The network you trained(e.g., Resnet34 network): resnet50fpn

Describe the current behavior
Hello, I'm training on my own dataset with faster_rcnn. There are six object categories and I set num_classes=7 in create_model, but I still get this error. Nothing else was changed. How should I fix this?

Error info / logs

Namespace(batch_size=8, data_path='/research/dept8/qdou/zwang/data/robo/final', device='cuda:0', epochs=50, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using 8 dataloader workers
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [84,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train_res50_fpn.py", line 167, in <module>
    main(args)
  File "train_res50_fpn.py", line 99, in main
    utils.train_one_epoch(model, optimizer, train_data_loader,
  File "/research/dept8/qdou/zwang/faster_rcnn/train_utils/train_eval_utils.py", line 34, in train_one_epoch
    loss_dict = model(images, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/faster_rcnn_framework.py", line 93, in forward
    detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 367, in forward
    proposals, matched_idxs, labels, regression_targets = self.select_training_samples(proposals, targets)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 222, in select_training_samples
    matched_idxs, labels = self.assign_targets_to_proposals(proposals, gt_boxes, gt_labels)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 144, in assign_targets_to_proposals
    labels_in_image[bg_inds] = 0
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered
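A hedged debugging sketch, not a confirmed diagnosis: device-side asserts are raised asynchronously, so the reported line may not be the real source, but in this kind of Faster R-CNN pipeline this assert very often means a ground-truth label is outside the expected range [1, num_classes - 1] (0 is reserved for background). Scanning the dataset on the CPU usually pinpoints the offending annotation; rerunning with CUDA_LAUNCH_BLOCKING=1 also gives a more precise traceback.

    num_classes = 7                            # background + 6 object categories
    for idx in range(len(train_data_set)):     # train_data_set: the assumed dataset object
        _, target = train_data_set[idx]
        labels = target["labels"]
        if labels.numel() and (labels.min() < 1 or labels.max() > num_classes - 1):
            print(f"sample {idx} has out-of-range labels: {labels.tolist()}")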

One line of code missing in PyTorch GoogLeNet?

In the PyTorch version of GoogLeNet's train.py, the code used to save the best model:

        if accurate_test > best_acc:
            torch.save(net.state_dict(), save_path)

seems to be missing one line:

        if accurate_test > best_acc:
            best_acc = accurate_test
            torch.save(net.state_dict(), save_path)

Hello author! After reading your ResNet training code there are a few places I don't understand. My questions may be very basic; please don't mind, and thanks for your guidance.

What is the purpose of this piece of code in your TensorFlow ResNet train.py? (Also, I'm sorry, I haven't read the ResNet papers yet, so I don't understand the theory.)

  • train.py line 70
model = tf.keras.Sequential([feature,
                             tf.keras.layers.GlobalAvgPool2D(),
                             tf.keras.layers.Dropout(rate=0.5),
                             tf.keras.layers.Dense(1024),
                             tf.keras.layers.Dropout(rate=0.5),
                             tf.keras.layers.Dense(5),
                             tf.keras.layers.Softmax()])

What I don't quite understand is why Sequential is used to stack another network here. feature is already the ResNet model, so its output should be 2-D. I'm not sure my understanding is correct, but I think the GlobalAvgPool2D layer that follows expects a 4-D input, which doesn't match feature's output shape, so it shouldn't be connectable to feature. So I don't understand what this block means, or how I should adjust it. My task is binary classification of 40x18 grayscale images with ResNet. Looking forward to your reply, and thanks in advance!
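A hedged sketch of the assumed setup: if `feature` is built as the backbone only (the equivalent of include_top=False), its output is a 4-D feature map [N, H, W, C], so stacking GlobalAvgPool2D and Dense layers on top is exactly how the classification head is re-attached. The Keras application below stands in for the repository's own backbone builder; for a 40x18 grayscale binary-classification task the input_shape and the final Dense layer would change accordingly.

    import tensorflow as tf

    feature = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                              input_shape=(224, 224, 3))
    model = tf.keras.Sequential([
        feature,                              # outputs a 4-D feature map, e.g. [N, 7, 7, 2048]
        tf.keras.layers.GlobalAvgPool2D(),    # -> [N, 2048]
        tf.keras.layers.Dropout(rate=0.5),
        tf.keras.layers.Dense(1024),
        tf.keras.layers.Dropout(rate=0.5),
        tf.keras.layers.Dense(5),
        tf.keras.layers.Softmax(),
    ])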

RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

Thanks for sharing your code. When I run 'python train_mobilenet.py', I run into this problem. What can I do to solve the error?

Traceback (most recent call last):
File "train_mobilenet.py", line 157, in
main()
File "train_mobilenet.py", line 91, in main
train_loss=train_loss, train_lr=learning_rate)
File "/home/dl/桌面/faster_rcnn/train_utils/train_eval_utils.py", line 33, in train_one_epoch
loss_dict = model(images, targets)
File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dl/桌面/faster_rcnn/network_files/faster_rcnn_framework.py", line 87, in forward
proposals, proposal_losses = self.rpn(images, features, targets)
File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 615, in forward
labels, matched_gt_boxes = self.assign_targets_to_anchors(anchors, targets)
File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 410, in assign_targets_to_anchors
matched_idxs = self.proposal_matcher(match_quality_matrix)
File "/home/dl/桌面/faster_rcnn/network_files/det_utils.py", line 347, in call
matches[below_low_threshold] = torch.tensor(self.BELOW_LOW_THRESHOLD) # -1
RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!
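A hedged, version-dependent workaround sketch (not the author's official fix; the more robust option is usually to match the PyTorch version the code was written for): the error comes from assigning a CPU 0-dim tensor into a CUDA tensor, so creating the scalar on the same device typically avoids it.

    matches[below_low_threshold] = torch.tensor(self.BELOW_LOW_THRESHOLD,
                                                device=matches.device)  # -1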

faster rcnn

Hi, may I ask why your code only needs a few thousand iterations to get good results, while other papers often need tens of thousands? What method do you use? My task is also fairly simple; how many iterations did you train for when training the MobileNet version on the VOC dataset? These questions are based on your Faster R-CNN tutorial. Looking forward to your answer, thank you!

cource_pdf

Hi! Why do the PPT slides downloaded from your cource_pdf always fail to open, with a message saying the file type is not supported or the file is corrupted? I'm using WPS and the download itself seems fine. Does anyone know how to fix this? Thanks!

Loss is nan, stopping training

Hello! I watched your videos and found them very well explained. Using your code I can train normally on the VOC2012 dataset, but with my own annotated dataset I get: Loss is nan, stopping training. I'm not sure what causes this and wanted to ask you about it. I'm on Windows with a single 2060 GPU, using res50_fpn.
[screenshot of the error attached in the original issue]
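A hedged debugging sketch of the most common causes, not a confirmed diagnosis: NaN losses on a custom dataset frequently come from degenerate boxes (xmax <= xmin or ymax <= ymin, often produced by annotation or coordinate-order mistakes) or from a learning rate that is too high for the batch size. A quick scan of the dataset can rule the first one out:

    for idx in range(len(train_data_set)):     # train_data_set: the assumed dataset object
        _, target = train_data_set[idx]
        boxes = target["boxes"]
        bad = (boxes[:, 2] <= boxes[:, 0]) | (boxes[:, 3] <= boxes[:, 1])
        if bad.any():
            print(f"sample {idx} has degenerate boxes: {boxes[bad].tolist()}")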

Temporal Action Localization

Bro, I really admire you. Could you find a chance to cover this paper: "Rethinking the Faster R-CNN Architecture for Temporal Action Localization"? The original authors did not release code.

IoU computation question in YOLOv3-SPP

In the yolov3spp project's utils.py, does wh_iou simply assume that the GT and the anchor share the same center, and then take the smaller w and h to compute the IoU?
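A hedged sketch of what a width/height-only IoU usually looks like (assuming the repo's wh_iou follows the common YOLO implementation): the centers are ignored, the intersection is min(w) * min(h), so only the box shapes are compared.

    import torch

    def wh_iou_sketch(wh1, wh2):
        # wh1: [N, 2] GT widths/heights, wh2: [M, 2] anchor widths/heights
        wh1 = wh1[:, None]                   # [N, 1, 2]
        wh2 = wh2[None]                      # [1, M, 2]
        inter = torch.min(wh1, wh2).prod(2)  # [N, M]
        return inter / (wh1.prod(2) + wh2.prod(2) - inter)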

Unresolved reference 'ImageDataGenerator'

Problem: when the project is opened in PyCharm, the modules below cannot be resolved, although the code runs successfully:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, Model, Sequential

Environment: PyCharm, TensorFlow 2.1, Python 3.6.10. Test as follows:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow import keras

print(tf.__path__)
print(keras.__path__)
print(ImageDataGenerator)
Output:
['C:\tools\Anaconda3\envs\tf-gpu-pytorch\lib\site-packages\tensorflow']
['C:\tools\Anaconda3\envs\tf-gpu-pytorch\lib\site-packages\tensorflow_core\python\keras\api\_v2\keras']
<class 'tensorflow.python.keras.preprocessing.image.ImageDataGenerator'>

How should the code be modified?

When running detection with VGG, how can I run prediction on a batch of images? And if there are multiple targets in an image, can the probabilities for multiple targets be output? I'm currently using your tensorflow-VGG and PYTORCH-VGG.
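A hedged sketch of batch prediction with the PyTorch classification VGG (variable names below are placeholders). Note that a plain classifier produces one probability distribution per image; getting separate probabilities for several objects in one image needs a detection network (Faster R-CNN, SSD, ...), not VGG alone.

    import torch

    model.eval()
    batch = torch.stack([img1, img2, img3])          # each img: a [3, 224, 224] tensor
    with torch.no_grad():
        probs = torch.softmax(model(batch), dim=1)   # [batch_size, num_classes]
    print(probs.argmax(dim=1))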

I ran the train_res50_fpn code directly on the VOC2012 data, training once with and once without the pretrained weights, and the results differ enormously.

I understand there should be some difference with and without pretrained weights, but right now it looks like the network cannot converge at all without them. What could be the reason? And if I adapt the model to my own dataset and have no pretrained weights, what should I do?

First, without the pretrained weights, the results are as follows:
DONE (t=4.33s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.068
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.164
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.030
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.086
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.164
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.245
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.248
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.038
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.119
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.297

Then, after loading the COCO pretrained weights, the results are as follows:
DONE (t=1.48s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.468
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.740
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.511
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.160
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.335
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.523
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.424
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.580
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.584
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.247
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.640

googlenet

System information

  • Have I written custom code:
  • OS Platform(e.g., window10 or Linux Ubuntu 16.04):
  • Python version:
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):
  • Use GPU or not:
  • CUDA/cuDNN version(if you use GPU):
  • The network you trained(e.g., Resnet34 network):

Describe the current behavior

Error info / logs

RuntimeError: class '__torch__.network_files.image_list.ImageList' already defined.

Reloaded modules: transforms
Traceback (most recent call last):

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/train_res50_fpn.py", line 3, in
from network_files.faster_rcnn_framework import FasterRCNN, FastRCNNPredictor

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/faster_rcnn_framework.py", line 4, in
from network_files.rpn_function import AnchorsGenerator, RPNHead, RegionProposalNetwork

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/rpn_function.py", line 9, in
from network_files.image_list import ImageList

File "/home2/cx/WeakDet/deep-learning-for-image-processing/pytorch_object_detection/faster_rcnn/network_files/image_list.py", line 7, in
class ImageList(object):

File "/home/cx/anaconda3/envs/fcn_torch1.0/lib/python3.6/site-packages/torch/jit/init.py", line 1280, in script
_compile_and_register_class(obj, _rcb, qualified_name)

File "/home/cx/anaconda3/envs/fcn_torch1.0/lib/python3.6/site-packages/torch/jit/init.py", line 1108, in _compile_and_register_class
_jit_script_class_compile(qualified_name, ast, rcb)

RuntimeError: class 'torch.network_files.image_list.ImageList' already defined.

Evaluation module

In SSD I want to obtain the AP of each class, so I modified the original SSD eval code. During evaluation I found that the resulting mAP is 3-4 points lower than the mAP obtained with the COCO evaluation metric you use. Where do you think the discrepancy might come from? (If convenient, could you add a way to get per-class AP? Many thanks.)
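A hedged note plus sketch: part of the gap is expected simply because the VOC-style eval and COCOeval do not compute AP the same way (different interpolation and recall sampling), so the numbers are not directly comparable. If the COCO pipeline is already in place, per-class AP can be pulled out of COCOeval's precision array instead of porting the old SSD eval code (coco_eval below is the assumed COCOeval instance; the array layout is the standard pycocotools one):

    import numpy as np

    coco_eval.accumulate()
    precisions = coco_eval.eval["precision"]            # [iou, recall, cls, area, maxDets]
    for k, cat_id in enumerate(coco_eval.params.catIds):
        p = precisions[:, :, k, 0, -1]                  # area=all, maxDets=100
        ap = np.mean(p[p > -1]) if (p > -1).any() else float("nan")
        print(cat_id, ap)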

Trained weights

Thank you for making this tutorial, it's great!
Are there already-trained weights available? For learning purposes I'd rather not train from scratch.

Question about the number of COCO classes

When testing with the official Faster R-CNN resnet50-FPN weights, why does num_classes have to be set to 91? Shouldn't COCO be 81?

By the way, where can I find the JSON file with the COCO categories?
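A hedged explanation in code form: the COCO annotation files use category ids 1-90 with gaps (only 80 ids are actually used), and torchvision's pretrained detector keeps all 90 slots plus background, hence 91 outputs. The category list itself can be read straight from the annotation JSON with pycocotools (the file path below is an assumption):

    from pycocotools.coco import COCO

    coco = COCO("annotations/instances_val2017.json")
    cats = coco.loadCats(coco.getCatIds())
    print(len(cats))                      # 80 named categories
    print(max(c["id"] for c in cats))     # ids go up to 90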

RuntimeError: Arguments for call are not valid.

Thanks for your code. When I run 'python train_res50_fpn.py', I meet this problem. How can I resolve it?
Traceback (most recent call last):
File "train_res50_fpn.py", line 3, in <module>
from network_files.faster_rcnn_framework import FasterRCNN, FastRCNNPredictor
File "/home/yzhou/IpFPN/faster_rcnn/network_files/faster_rcnn_framework.py", line 4, in <module>
from network_files.rpn_function import AnchorsGenerator, RPNHead, RegionProposalNetwork
File "/home/yzhou/IpFPN/faster_rcnn/network_files/rpn_function.py", line 6, in <module>
from network_files import det_utils
File "/home/yzhou/IpFPN/faster_rcnn/network_files/det_utils.py", line 16, in <module>
class BalancedPositiveNegativeSampler(object):
File "/home/yzhou/.pyenv/versions/anaconda3-5.3.1/envs/CenterNet/lib/python3.6/site-packages/torch/jit/__init__.py", line 1274, in script
_compile_and_register_class(obj, _rcb, qualified_name)
File "/home/yzhou/.pyenv/versions/anaconda3-5.3.1/envs/CenterNet/lib/python3.6/site-packages/torch/jit/__init__.py", line 1115, in _compile_and_register_class
_jit_script_class_compile(qualified_name, ast, rcb)
RuntimeError:
Arguments for call are not valid.
The following variants are available:

aten::index_put_(Tensor(a!) self, Tensor?[] indices, Tensor values, bool accumulate=False) -> (Tensor(a!)):
Expected a value of type 'Tensor' for argument 'values' but instead found type 'int'.

aten::index_put_(Tensor(a!) self, Tensor[] indices, Tensor values, bool accumulate=False) -> (Tensor(a!)):
Expected a value of type 'List[Tensor]' for argument 'indices' but instead found type 'List[Optional[Tensor]]'.

The original call is:
File "/home/yzhou/IpFPN/faster_rcnn/network_files/det_utils.py", line 85
)

        pos_idx_per_image_mask[pos_idx_per_image] = 1
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        neg_idx_per_image_mask[neg_idx_per_image] = 1
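A hedged workaround sketch (the usual underlying cause is running a PyTorch release older than the one the repository targets, whose TorchScript will not assign a Python int through advanced indexing): assigning a tensor of the right dtype satisfies the overload the error message asks for.

    pos_idx_per_image_mask[pos_idx_per_image] = torch.tensor(1, dtype=torch.uint8)
    neg_idx_per_image_mask[neg_idx_per_image] = torch.tensor(1, dtype=torch.uint8)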

Running train_res50_fpn.py gives the following error. Environment: Windows 10, Python 3.7, PyTorch 1.5. Thanks!

RuntimeError:
Arguments for call are not valid.
The following variants are available:

aten::cat(Tensor[] tensors, int dim=0) -> (Tensor):
Expected a value of type 'List[Tensor]' for argument 'tensors' but instead found type 'Tensor'.

aten::cat.out(Tensor[] tensors, int dim=0, *, Tensor(a!) out) -> (Tensor(a!)):
Expected a value of type 'List[Tensor]' for argument 'tensors' but instead found type 'Tensor'.

The original call is:
File "C:\faster_rcnn\network_files\det_utils.py", line 216
assert isinstance(rel_codes, torch.Tensor)
boxes_per_image = [b.size(0) for b in boxes]
concat_boxes = torch.cat(boxes, dim=0)
~~~~~~~~~ <--- HERE

    box_sum = 0

This is my first time asking a question on GitHub, thank you.

Question about object detection metrics

Hello, I watched your videos on Bilibili and found them excellent. I've also run into some problems in practice.
My problem is similar to object detection: I work on temporal action localization (one is 2-D, the other 1-D), and the way the metric is computed is almost the same. Now I need to compute mAP.

Setup: 20 action classes in total, 213 test videos, 200 proposals extracted per video.
Problem: for one of the classes, there are 41 GT instances and 2000 predictions.
First, proposals from videos that do not belong to this class are counted as FP. Then for the remaining proposals I compute the IoU with the GT, sort by IoU from high to low, count IoU < 0.5 as FP and IoU > 0.5 as TP, and any later detection that overlaps an already-matched GT is also counted as FP. Computed this way my TP is at most 41 while FP is roughly 2000, so the resulting precision is extremely low.
I'd like to ask whether there is anything wrong with computing it this way.

The parameter settings are the ones from the paper, and the code is https://github.com/activitynet/ActivityNet/blob/master/Evaluation/eval_detection.py
Paper: BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

“On THUMOS14, we use top-2 video-level classes generated by UntrimmedNet [48] for proposals generated by BSN and other methods. Following previous works, on THUMOS14, we also implement SCNN-classifier on BSN proposals for proposal-level classification and adopt Greedy NMS as [7]. We use 100 and 200 proposals per video on ActivityNet-1.3 and THUMOS14 datasets separately.”
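For reference, a hedged sketch of how detection-style AP is normally accumulated (as in the PASCAL VOC / ActivityNet evaluators): detections are sorted by confidence score rather than by IoU, each GT may be matched at most once, and AP is the area under the precision-recall curve, so a very low precision at rank ~2000 does not by itself mean the computation is wrong; it simply corresponds to the tail of the curve.

    import numpy as np

    def average_precision(tp, fp, num_gt):
        # tp, fp: 0/1 arrays over detections already sorted by descending score
        tp, fp = np.cumsum(tp), np.cumsum(fp)
        recall = tp / num_gt
        precision = tp / np.maximum(tp + fp, 1e-8)
        return np.trapz(precision, recall)   # area under the PR curve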

Confusion matrix

Hello:
Why does the accuracy computed from the confusion matrix come out as NaN? There is a row in the confusion matrix that is all zeros.
Thank you for taking the time to answer!

train_mobilenet.py in faster_rcnn reports an error after finishing one epoch of training

System information

  • OS Platform(e.g., window10 ):window10
  • Python version:3.7
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):Pytorch1.5
  • Use GPU or not:GPU
  • CUDA/cuDNN version(if you use GPU):cuda10.1
  • The network you trained(e.g., Resnet34 network):mobilenet

Describe the current behavior

Error info / logs
Traceback (most recent call last):
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\numpy\core\function_base.py", line 117, in linspace
num = operator.index(num)
TypeError: 'numpy.float64' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/FirstYearMaster/ObjectDetection/Faster RCNN/faster_rcnn/train_mobilenet.py", line 146, in <module>
main()
File "D:/FirstYearMaster/ObjectDetection/Faster RCNN/faster_rcnn/train_mobilenet.py", line 93, in main
utils.evaluate(model, val_data_set_loader, device=device)
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "D:\FirstYearMaster\ObjectDetection\Faster RCNN\faster_rcnn\train_utils\train_eval_utils.py", line 70, in evaluate
coco_evaluator = CocoEvaluator(coco, iou_types)
File "D:\FirstYearMaster\ObjectDetection\Faster RCNN\faster_rcnn\train_utils\coco_eval.py", line 28, in __init__
self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 76, in __init__
self.params = Params(iouType=iouType) # parameters
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 527, in __init__
self.setDetParams()
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\pycocotools\cocoeval.py", line 507, in setDetParams
self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
File "<__array_function__ internals>", line 6, in linspace
File "E:\Anaconda\envs\PyTorch\Lib\site-packages\numpy\core\function_base.py", line 121, in linspace
.format(type(num)))
TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
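A hedged note on the usual cause: newer NumPy releases refuse to accept a numpy.float64 as the num argument of np.linspace, while older pycocotools builds pass np.round(...) (a float) there. Upgrading pycocotools normally fixes it; patching the two lines in cocoeval.py's setDetParams to cast to int also works:

    self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
    self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)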

Question about proposals

Hello, I'm confused about the proposals in Faster R-CNN. When I run prediction on a single image the final result is fine, but the intermediate proposal boxes differ a lot from what I expected, which is strange given that the final prediction is correct. I also sent you a private message on Bilibili; I hope we can discuss this in detail.

The NMS operation in Faster R-CNN

rpn_pre_nms_top_n_train=2000, rpn_pre_nms_top_n_test=1000,    # number of proposals kept before NMS in the RPN (selected by score)

rpn_post_nms_top_n_train=2000, rpn_post_nms_top_n_test=1000,  # number of proposals kept after NMS in the RPN

Why does your code keep the same number of proposals before and after NMS? What is the purpose of setting it this way?

faster rcnn

Hello, after FPN is added to Faster R-CNN (resnet50+fpn), does the way its loss is computed change? As far as I can see it is unchanged in your code.

Own data training error using the SSD model

When I use my own dataset for training, I run into this problem. My images are in RGB format. Can you help me solve it?
System information

  • OS Platform(e.g., window10 or Linux Ubuntu 16.04):Linux Ubuntu 16.04
  • Python version:3.7
  • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch1.5
  • CUDA/cuDNN version(if you use GPU): CUDA10.1
  • The network you trained(e.g., Resnet34 network): SSD

Error info / logs
Namespace(batch_size=4, device='cuda:0', epochs=15, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using 4 dataloader workers
missing_keys: ['conf.0.weight', 'conf.0.bias', 'conf.1.weight', 'conf.1.bias', 'conf.2.weight', 'conf.2.bias', 'conf.3.weight', 'conf.3.bias', 'conf.4.weight', 'conf.4.bias', 'conf.5.weight', 'conf.5.bias', 'compute_loss.dboxes', 'postprocess.dboxes_xywh']
unexpected_keys: []
Epoch: [0] [ 0/104637] eta: 9 days, 4:57:54.872362 lr: 0.000500 total_losses: 10.8151 (10.8151) time: 7.6022 data: 7.0505 max mem: 1072
Epoch: [0] [ 50/104637] eta: 16:05:47.249883 lr: 0.000500 total_losses: 4.7562 (6.0402) time: 0.4383 data: 0.3700 max mem: 1169
Epoch: [0] [ 100/104637] eta: 15:38:36.884186 lr: 0.000500 total_losses: 4.0707 (5.2116) time: 0.4212 data: 0.3559 max mem: 1169
Traceback (most recent call last):
File "train_ssd300_object365.py", line 181, in <module>
main(args)
File "train_ssd300_object365.py", line 121, in main
train_lr=learning_rate)
File "/data/ssd/train_utils/train_eval_utils.py", line 27, in train_one_epoch
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
File "/data/ssd/train_utils/distributed_utils.py", line 204, in log_every
for obj in iterable:
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
data = self._next_data()
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 838, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
data.reraise()
File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/ssd/object_dataset.py", line 82, in __getitem__
image, target = self.transform(image, target)
File "/data/ssd/transform.py", line 15, in __call__
image, target = trans(image, target)
File "/data/ssd/transform.py", line 181, in __call__
image = self.normalize(image)
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 166, in __call__
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 208, in normalize
tensor.sub_(mean).div_(std)
RuntimeError: output with shape [1, 300, 300] doesn't match the broadcast shape [3, 300, 300]
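A hedged fix sketch: the broadcast error ([1, 300, 300] vs [3, 300, 300]) means at least one image in the dataset is actually single-channel even though the set is expected to be RGB. Forcing a conversion when the image is loaded (assuming PIL loading in the dataset's __getitem__) avoids it:

    from PIL import Image

    image = Image.open(img_path).convert("RGB")   # img_path: the dataset's image path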

VGG model structure

Hi, I am using the VGG model. I see online that the dense layers have 4096 units, but your code uses 2048.
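For reference, a hedged sketch of the classifier head as the original VGG paper (and torchvision's implementation) defines it, with two 4096-unit fully connected layers; the repository's 2048 is presumably a deliberate reduction to save memory, and it can be widened back like this:

    import torch.nn as nn

    classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 4096), nn.ReLU(True), nn.Dropout(p=0.5),
        nn.Linear(4096, 4096),        nn.ReLU(True), nn.Dropout(p=0.5),
        nn.Linear(4096, 1000),        # 1000 = ImageNet classes; adjust to your task
    )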

predict issue

After training finished, the following error appears when running predict:

Traceback (most recent call last):
File "/Users/mengxiangyu/Desktop/faster_rcnn/predict.py", line 47, in <module>
model.load_state_dict(torch.load(train_weights)["model"])
File "/Users/mengxiangyu/anaconda3/envs/th1.6/lib/python3.7/site-packages/torch/serialization.py", line 577, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/Users/mengxiangyu/anaconda3/envs/th1.6/lib/python3.7/site-packages/torch/serialization.py", line 241, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:144] . PytorchStreamReader failed reading zip archive: failed finding central directory

What is the cause? Is there a problem with the model file being loaded?

A small issue with draw_box_utils.py in the SSD code

1. While reproducing the code I noticed that at line 91 of draw_box_utils.py in this repository's SSD code, the stored coordinates are relative coordinates, so when drawing the boxes they should be multiplied by the image width and height to recover absolute pixel coordinates; otherwise the drawing after the crop augmentation goes wrong (a sketch of the change follows after point 2 below).
for box, color in box_to_color_map.items():
    xmin, ymin, xmax, ymax = box
    (left, right, top, bottom) = (xmin * 1, xmax * 1,
                                  ymin * 1, ymax * 1)
    draw.line([(left, top), (left, bottom), (right, bottom),
               (right, top), (left, top)], width=line_thickness, fill=color)

2. After the crop operation, should the cropped width and height be written back into target['height_width']? Although not storing them doesn't seem to cause problems.
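A hedged sketch of the change suggested in point 1 (assuming `image` is the PIL image being drawn on): scale the relative coordinates back to pixel coordinates before drawing.

    im_width, im_height = image.size
    for box, color in box_to_color_map.items():
        xmin, ymin, xmax, ymax = box
        (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
                                      ymin * im_height, ymax * im_height)
        draw.line([(left, top), (left, bottom), (right, bottom),
                   (right, top), (left, top)], width=line_thickness, fill=color)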

faster rcnn

Hello, I went through your video and code carefully. In the resnet50+fpn training script you set batch_size to 4, and in the video each epoch runs for 178 iterations, so do you have only 700-odd training images? In the mobilenet training script the batch_size is 8, so each epoch is 89 iterations, and with 25 epochs the total is a bit over two thousand iterations. What level of performance did your mobilenet reach after those two thousand-odd iterations? I remember you mentioned in the video that the mobilenet pretrained weights only cover the feature extractor, so is training for just a few thousand iterations also enough in that case? My dataset has 700 images; using your code I got decent results after 2500 iterations. All of the above is based on your Faster R-CNN code and video explanations. Looking forward to your answer!
