bubbliiiing / faster-rcnn-keras

This is a Keras implementation of Faster R-CNN. It can be trained on data in the VOC dataset format.

License: MIT License

Python 100.00%

faster-rcnn-keras's Introduction

Faster-Rcnn: a Keras implementation of a two-stage object detection model

Top News

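2022-04: Added multi-GPU training support and per-class object counting.

2022-03: Major update: added step and cosine learning-rate decay, a choice of Adam or SGD optimizer, automatic learning-rate adjustment based on batch_size, and image cropping.
The original repository shown in the BiliBili videos is at: https://github.com/bubbliiiing/faster-rcnn-keras/tree/bilibili

2021-10: Major update: added extensive comments and many more tunable parameters, restructured the code modules, and added FPS testing, video prediction, batch prediction, and other features.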

Performance

Training dataset	Weights file	Test dataset	Input image size	mAP 0.5:0.95	mAP 0.5
VOC07+12	voc_weights_resnet.h5	VOC-Test07	-	-	81.16
VOC07+12	voc_weights_vgg.h5	VOC-Test07	-	-	76.28

Requirements

tensorflow-gpu==1.13.1
keras==2.1.5

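Downloads

The voc_weights_resnet.h5 and voc_weights_vgg.h5 files needed for training, along with the backbone weights, can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1O5lTyservM2SHqMTz7YciA
Extraction code: fqv2

The VOC dataset can be downloaded from the link below. It already contains the training, test, and validation sets (the validation set is the same as the test set), so no further splitting is needed:
Link: https://pan.baidu.com/s/1-1Ej6dayrx3g0iAA88uY5A
Extraction code: ph32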

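Training steps

a. Training on the VOC07+12 dataset

Dataset preparation
This repository trains on VOC-format data. Before training, download the VOC07+12 dataset, extract it, and place it in the repository root.
Dataset processing
Set annotation_mode=2 in voc_annotation.py, then run voc_annotation.py to generate 2007_train.txt and 2007_val.txt in the root directory.
Start training
The default parameters in train.py are set up for the VOC dataset, so running train.py directly starts training.
Predicting with the trained model
Prediction uses two files, frcnn.py and predict.py. First edit model_path and classes_path in frcnn.py; both must be changed.
model_path points to the trained weights file in the logs folder.
classes_path points to the txt file listing the detection classes.
After making these changes, run predict.py and enter an image path when prompted to run detection.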

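b. Training on your own dataset

Dataset preparation
This repository trains on VOC-format data, so you need to prepare your own dataset in that format before training.
Place the annotation files in the Annotations folder under VOCdevkit/VOC2007.
Place the image files in the JPEGImages folder under VOCdevkit/VOC2007.
Dataset processing
Once the dataset is in place, use voc_annotation.py to generate the 2007_train.txt and 2007_val.txt files used for training.
Edit the parameters in voc_annotation.py. For a first training run, changing only classes_path is enough; classes_path points to the txt file listing the detection classes.
When training on your own dataset, you can create your own cls_classes.txt containing the classes you want to distinguish.
The contents of model_data/cls_classes.txt look like this: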

cat
dog
...

Set classes_path in voc_annotation.py to point to cls_classes.txt, then run voc_annotation.py.
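
As a rough illustration of the file format described above, a classes file with one class name per line could be read as follows. The helper name load_class_names is hypothetical and is not taken from the repository; the sketch only assumes that blank lines should be ignored.

```python
import tempfile
from pathlib import Path


def load_class_names(classes_path):
    """Read one class name per line, skipping blank lines (hypothetical helper)."""
    with open(classes_path, encoding="utf-8") as f:
        names = [line.strip() for line in f if line.strip()]
    return names, len(names)


# Demo on a temporary file so the sketch does not depend on the repository layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "cls_classes.txt"
    demo_path.write_text("cat\ndog\n", encoding="utf-8")
    class_names, num_classes = load_class_names(demo_path)

print(class_names, num_classes)
```

Because voc_annotation.py and train.py are both pointed at the same txt file, the order of the lines determines the class indices in both places.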

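Start training
There are many training parameters, all in train.py; read the comments carefully after downloading the repository. The most important one is still classes_path in train.py.
classes_path points to the txt file listing the detection classes, and it is the same txt file used in voc_annotation.py. It must be changed when training on your own dataset!
After changing classes_path, run train.py to start training. After several epochs, weights files are written to the logs folder.
Predicting with the trained model
Prediction uses two files, frcnn.py and predict.py. Edit model_path and classes_path in frcnn.py.
model_path points to the trained weights file in the logs folder.
classes_path points to the txt file listing the detection classes.
After making these changes, run predict.py and enter an image path when prompted to run detection.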

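Prediction steps

a. Using the pretrained weights

Download and extract the repository, download the pretrained weights (voc_weights_resnet.h5) from Baidu Netdisk, place them in model_data, run predict.py, and enter

img/street.jpg

Settings in predict.py enable FPS testing and video detection.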

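b. Using your own trained weights

Train the model by following the training steps.
In frcnn.py, edit model_path and classes_path in the section shown below so that they match your trained files. model_path is the weights file in the logs folder, and classes_path lists the classes that those weights were trained on.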

_defaults = {
    #--------------------------------------------------------------------------#
    #   To predict with your own trained model, you must change model_path and classes_path!
    #   model_path points to the weights file in the logs folder; classes_path points to the txt file in model_data
    #   If a shape mismatch occurs, also check the model_path and classes_path settings used during training
    #--------------------------------------------------------------------------#
    "model_path"    : 'model_data/voc_weights_resnet.h5',
    "classes_path"  : 'model_data/voc_classes.txt',
    #---------------------------------------------------------------------#
    #   Backbone feature extraction network: resnet50 or vgg
    #---------------------------------------------------------------------#
    "backbone"      : "resnet50",
    #---------------------------------------------------------------------#
    #   Only predicted boxes with a score above the confidence threshold are kept
    #---------------------------------------------------------------------#
    "confidence"    : 0.5,
    #---------------------------------------------------------------------#
    #   IoU threshold (nms_iou) used for non-maximum suppression
    #---------------------------------------------------------------------#
    "nms_iou"       : 0.3,
    #---------------------------------------------------------------------#
    #   Sizes of the prior (anchor) boxes
    #---------------------------------------------------------------------#
    'anchors_size'  : [128, 256, 512],
}

Run predict.py and enter

img/street.jpg

Settings in predict.py enable FPS testing and video detection.
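
To make the confidence and nms_iou settings above more concrete, here is a minimal NumPy sketch of score filtering followed by greedy non-maximum suppression. It is a generic illustration, not the repository's implementation; the function name filter_and_nms and the [x1, y1, x2, y2] box layout are assumptions.

```python
import numpy as np


def filter_and_nms(boxes, scores, confidence=0.5, nms_iou=0.3):
    """Keep boxes scoring above `confidence`, then greedily suppress overlaps.

    boxes: (N, 4) array of [x1, y1, x2, y2] (assumed layout); scores: (N,).
    Returns indices into the inputs, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    candidates = np.flatnonzero(scores > confidence)
    order = candidates[np.argsort(-scores[candidates])]
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter)
        # Drop remaining boxes that overlap the best box too much.
        order = rest[iou <= nms_iou]
    return keep


demo_boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30], [0, 0, 5, 5]]
demo_scores = [0.9, 0.8, 0.7, 0.2]
kept = filter_and_nms(demo_boxes, demo_scores)
print(kept)
```

In this demo the last box is dropped by the confidence threshold, and the second box is suppressed because its IoU with the first (about 0.68) exceeds nms_iou.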

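Evaluation steps

a. Evaluating on the VOC07+12 test set

This repository evaluates in VOC format. VOC07+12 already has a test split, so there is no need to use voc_annotation.py to generate the txt files under the ImageSets folder.
Edit model_path and classes_path in frcnn.py. model_path points to the trained weights file in the logs folder. classes_path points to the txt file listing the detection classes.
Run get_map.py to obtain the evaluation results, which are saved in the map_out folder.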

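b. Evaluating on your own dataset

This repository evaluates in VOC format.
If voc_annotation.py was run before training, the code has already split the dataset into training, validation, and test sets. To change the test-set proportion, edit trainval_percent in voc_annotation.py. trainval_percent sets the ratio of (training set + validation set) to test set; by default (training set + validation set) : test set = 9:1. train_percent sets the ratio of training set to validation set within (training set + validation set); by default training set : validation set = 9:1.
After splitting off the test set with voc_annotation.py, edit classes_path in get_map.py. classes_path points to the txt file listing the detection classes and is the same file used for training. It must be changed when evaluating your own dataset.
Edit model_path and classes_path in frcnn.py. model_path points to the trained weights file in the logs folder. classes_path points to the txt file listing the detection classes.
Run get_map.py to obtain the evaluation results, which are saved in the map_out folder.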
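
The two ratios described above can be sketched as a small two-stage split. This is only an illustration of the arithmetic, assuming random sampling; the function name split_ids and the seed argument are hypothetical and the repository's voc_annotation.py may differ in detail.

```python
import random


def split_ids(ids, trainval_percent=0.9, train_percent=0.9, seed=0):
    """Split ids into train / val / test using the two ratios (hypothetical helper)."""
    ids = list(ids)
    rng = random.Random(seed)
    num_trainval = int(len(ids) * trainval_percent)
    num_train = int(num_trainval * train_percent)
    # Stage 1: (train + val) versus test.
    trainval = rng.sample(ids, num_trainval)
    # Stage 2: train versus val, inside (train + val).
    train = rng.sample(trainval, num_train)
    train_set, trainval_set = set(train), set(trainval)
    val = [i for i in trainval if i not in train_set]
    test = [i for i in ids if i not in trainval_set]
    return train, val, test


train_ids, val_ids, test_ids = split_ids(range(1000))
print(len(train_ids), len(val_ids), len(test_ids))
```

With both ratios at 0.9, 1000 images end up as 810 training, 90 validation, and 100 test images.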

Reference

https://github.com/qqwweee/keras-yolo3/
https://github.com/pierluigiferrari/ssd_keras
https://github.com/kuhung/SSD_keras
https://github.com/jinfagang/keras_frcnn
https://github.com/Cartucho/mAP

faster-rcnn-keras's Issues

Dimension mismatch between the RoI targets from calc_iou and the predictions of model_classifier

A question about the calc_iou function in roi_helper. Line 102 contains this code:
labels[label_pos:4+label_pos] = [1, 1, 1, 1]
What is the purpose of setting these positions of labels to 1?
Line 115 is Y2 = np.concatenate([np.array(y_class_regr_label),np.array(y_class_regr_coords)],axis=1)
The resulting Y2 has shape [n_rois, 4*(n_classes-1)+4*(n_classes-1)], that is, [n_rois, 8*(n_classes-1)].
Then, on line 162 of train.py, Y2 is fed directly into model_classifier for training:
loss_class = model_classifier.train_on_batch([X, X2[:, sel_samples, :]], [Y1[:, sel_samples, :], Y2[:, sel_samples, :]])
However, the second element of model_classifier's output list comes from
out_regr = TimeDistributed(Dense(4 * (nb_classes-1), activation='linear', kernel_initializer='zero'), name='dense_regress_{}'.format(nb_classes))(out), whose shape is [1, n_rois, 4*(n_classes-1)]. So y_true and y_pred have different dimensions. Does this cause a problem when computing the loss?
It runs without errors for me, but I do not understand why. Could you explain?
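
One pattern that would explain the shapes in this question is a regression loss that treats the first half of y_true as a per-class mask and the second half as the coordinate targets; keras_frcnn, which this repository lists as a reference, uses a loss of that kind. The NumPy sketch below assumes that layout and is not taken from this repository's code; masked_smooth_l1 is a hypothetical name.

```python
import numpy as np


def masked_smooth_l1(y_true, y_pred, eps=1e-4):
    """Smooth L1 over the coordinates selected by the mask half of y_true.

    y_true: (..., 8*k) laid out as [mask (4*k) | coords (4*k)]  (assumed layout)
    y_pred: (..., 4*k)
    """
    width = y_pred.shape[-1]
    mask = y_true[..., :width]      # 1 for the 4 slots of the true class, else 0
    target = y_true[..., width:]    # regression targets
    diff = np.abs(target - y_pred)
    per_elem = np.where(diff <= 1.0, 0.5 * diff ** 2, diff - 0.5)
    return float(np.sum(mask * per_elem) / (eps + np.sum(mask)))


# One RoI, k = 2 foreground classes; the RoI belongs to the second class.
mask_part = [0, 0, 0, 0, 1, 1, 1, 1]
coord_part = [0, 0, 0, 0, 0.5, 0.5, 2.0, 0.0]
y_true = np.array([[mask_part + coord_part]], dtype=float)       # shape (1, 1, 16)
y_pred = np.array([[[9, 9, 9, 9, 0.5, 0.0, 0.0, 0.0]]], dtype=float)  # shape (1, 1, 8)
loss = masked_smooth_l1(y_true, y_pred)
print(loss)
```

Under this reading, the line that sets four label positions to 1 builds the mask, so only the four coordinates of the ground-truth class contribute to the loss; the large errors in the first class's slots are ignored.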

Is logs/model.h5 in utils/config.py the same as voc_weight.h5?

In utils/config.py, self.model_path = "logs/model.h5". Is that file voc_weight.h5? If not, could you provide it?

I found a logging error!

1. Line 164 of the author's train.py reads:
write_log(callback, ['detection_cls_loss', 'detection_reg_loss', 'detection_acc'], loss_class, train_step).

2. When I viewed the log in TensorBoard, the detection_acc plot looked rather like a loss curve,
so I printed the metric names:
print(model_classifier.metrics_names),
and the result was:
['loss', 'dense_class_2_loss', 'dense_regress_2_loss', 'dense_class_2_acc'],
so I suspect something is wrong here. Could the author please confirm?
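
The suspicion can be checked with plain Python. Keras returns one value per entry of metrics_names from train_on_batch, so pairing four returned values positionally with a three-name list shifts every label by one. The numbers below are made up for illustration, and label_metrics is a hypothetical helper, not part of the repository.

```python
def label_metrics(metrics_names, values):
    """Pair each returned value with its metric name, refusing length mismatches."""
    if len(metrics_names) != len(values):
        raise ValueError("expected %d values, got %d" % (len(metrics_names), len(values)))
    return dict(zip(metrics_names, values))


metrics_names = ['loss', 'dense_class_2_loss', 'dense_regress_2_loss', 'dense_class_2_acc']
loss_class = [1.2, 0.7, 0.5, 0.85]  # made-up values in metrics_names order

# Positional pairing with the three names from the write_log call:
shifted = dict(zip(['detection_cls_loss', 'detection_reg_loss', 'detection_acc'], loss_class))
# Pairing by the model's own metric names:
labelled = label_metrics(metrics_names, loss_class)

print(shifted['detection_acc'], labelled['dense_class_2_acc'])
```

In the shifted pairing, detection_acc receives the regression loss (0.5 here) rather than the accuracy (0.85), which would match a plot that looks like a loss curve.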

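Could you describe the environment?

Which versions of Keras and TensorFlow?

When training on my own dataset: IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

I do not know how to fix this.

Can batch_size or similar settings be adjusted? My GPU keeps running out of memory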

Error at line 115 of roi_helpers.py

8/2000 [..............................] - ETA: 2:14:03 - rpn_cls: 2.5315 - rpn_regr: 3.9138 - detector_cls: 1.7752 - detector_regr: 0.4206Traceback (most recent call last):
File "f:/deeplearning-exp/faster-rcnn-keras-master/train.py", line 119, in <module>
X2, Y1, Y2, IouS = calc_iou(R, config, boxes[0], width, height, NUM_CLASSES)
File "f:\deeplearning-exp\faster-rcnn-keras-master\utils\roi_helpers.py", line 115, in calc_iou
Y2 = np.concatenate([np.array(y_class_regr_label),np.array(y_class_regr_coords)],axis=1)
File "<array_function internals>", line 6, in concatenate
numpy.AxisError: axis 1 is out of bounds for array of dimension 1
It seems this line should be changed to Y2 = np.concatenate([np.array(y_class_regr_label),np.array(y_class_regr_coords)],axis=0). Is that right?
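
A plausible cause of this AxisError (an assumption, since the issue does not show the inputs) is that both lists are empty for some image: np.array([]) is one-dimensional, so axis=1 does not exist. Switching to axis=0 would silence the error but would stack rows instead of columns whenever the lists are not empty. The sketch below reproduces the error and shows one possible guard; as_2d and the width of 4 are hypothetical.

```python
import numpy as np

# Reproduce: two empty lists become 1-D arrays, so axis=1 is out of bounds.
try:
    np.concatenate([np.array([]), np.array([])], axis=1)
except ValueError as exc:  # AxisError is a subclass of ValueError
    error_name = type(exc).__name__


def as_2d(rows, width):
    """Return rows as a 2-D array, or an empty (0, width) array if there are none."""
    arr = np.asarray(rows, dtype=float)
    return arr if arr.ndim == 2 else np.empty((0, width))


# With explicit 2-D shapes, axis=1 works for both empty and non-empty inputs.
empty_y2 = np.concatenate([as_2d([], 4), as_2d([], 4)], axis=1)
full_y2 = np.concatenate([as_2d([[1, 1, 1, 1]], 4), as_2d([[0.1, 0.2, 0.3, 0.4]], 4)], axis=1)
print(error_name, empty_y2.shape, full_y2.shape)
```

Whether an image with no selected RoIs should produce an empty batch or simply be skipped is a separate design choice for the training loop.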

File "/mistgpu/miniconda/envs/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/base_layer.py", line 793, in compute_output_shape 'method on your layer (%s).' % self.__class__.__name__) NotImplementedError: Please run in eager mode or implement the `compute_output_shape` method on your layer (BatchNormalization).

File "/mistgpu/miniconda/envs/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/base_layer.py", line 793, in compute_output_shape
'method on your layer (%s).' % self.__class__.__name__)
NotImplementedError: Please run in eager mode or implement the compute_output_shape method on your layer (BatchNormalization).

Could the author update the code to TensorFlow 2?

Getting the confidence score

Which part of the code gives access to the confidence score of a box?

Error when running anchors.py

ModuleNotFoundError: No module named 'utils.config'; 'utils' is not a package

A question about line 260 of nets/frcnn_training.py

The statement classification[mask_neg][val_locs] = -1 does not seem to change the values of the original classification array.

I ran the following test:

import numpy as np
x = np.array([1,2,3,4,5,6])
print(x)                   # [1 2 3 4 5 6]
mask = x > 2
print(mask)            # [False False  True  True  True  True]
x[mask] = 100
print(x)                   # [  1   2 100 100 100 100]
x[mask][0] = -1
print(x)                   # [  1   2 100 100 100 100]

Output:
[1 2 3 4 5 6]
[False False True True True True]
[ 1 2 100 100 100 100]
[ 1 2 100 100 100 100]
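
The test above shows standard NumPy behaviour: indexing with a boolean mask returns a copy, so a second index applied to that copy never reaches the original array. One way to get the intended effect, sketched here for a 1-D array, is to turn the mask into integer positions first and index the original array once. The variable names are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
mask = x > 2

# Chained indexing: x[mask] is a temporary copy, so x is left unchanged.
x[mask][0] = -1
after_chained = x.copy()

# Single indexing step: convert the mask to positions, pick among them, assign once.
positions = np.flatnonzero(mask)   # array([2, 3, 4, 5])
val_locs = [0]                     # which of the masked entries to modify
x[positions[val_locs]] = -1

print(after_chained, x)
```

Applied to the line in question, the same idea would read roughly as classification[np.flatnonzero(mask_neg)[val_locs]] = -1, assuming classification is one-dimensional at that point.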

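Version question

Could you say which versions of TensorFlow and Keras this code uses?

A question about validation

Hello, can this code run validation during training?

Is nobody going to ask which Python version this is???

After training finishes, how do I view the training results?

c++

I have gotten to this step doing prediction with C++ TensorFlow, but I see that the Python code runs prediction twice. How should I handle this result in C++?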

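When I run get_map.py, I get "Error: No ground-truth files found!" What should I do?

mAP will not improve

I trained on my own dataset of 1000 images with 7 classes. The loss stabilizes around 0.3 and localization is good, but why are all the classifications wrong? Only one class has an mAP in the 70s; all the others are 0.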

Error when running resnet

Running nets/resnet.py directly produces the following error:
Traceback (most recent call last):
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1864, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 1 but is rank 0 for 'bn_conv1/Reshape_4' (op: 'Reshape') with input shapes: [1,1,1,64], [].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:/PycharmProjects/faster-rcnn-keras/nets/resnet.py", line 250, in <module>
model = ResNet50(inputs)
File "E:/PycharmProjects/faster-rcnn-keras/nets/resnet.py", line 162, in ResNet50
x = BatchNormalization(name='bn_conv1')(x)
File "D:\Program Files\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "E:/PycharmProjects/faster-rcnn-keras/nets/resnet.py", line 91, in call
epsilon=self.epsilon)
File "D:\Program Files\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 1908, in batch_normalization
mean = tf.reshape(mean, (-1))
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 9093, in reshape
"Reshape", tensor=tensor, shape=shape, name=name)
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2027, in __init__
control_input_ops)
File "D:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1867, in _create_c_op
raise ValueError(str(e))
ValueError: Shape must be rank 1 but is rank 0 for 'bn_conv1/Reshape_4' (op: 'Reshape') with input shapes: [1,1,1,64], [].

Process finished with exit code 1
The exception is raised from BatchNormalization. Has anyone else run into this problem?

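When will the code be upgraded to 2.1?

mAP will not improve when training on my own samples. Any tuning suggestions?

Looking at the output, recall is very low: even when precision has dropped to 0, recall is only a little over 80. I tried lowering the learning rate (which probably could not help anyway), with no effect. I suspected the RPN and increased the weights of its two losses, but that did not help either. Are there other ways to improve this? Also, should the epoch_length parameter be set equal to the number of my samples? If I set it to a value a that is smaller than the number of samples, are a samples drawn at random from my data for training? Any advice would be appreciated /(ㄒoㄒ)/~~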

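Hello, can the initial weights file of this model (voc_weights.h5) be replaced with another network, such as vgg16 or resnet101?

Hello, I have a few questions.
1. Can the initial weights file of this model (voc_weights.h5) be replaced with another network, such as vgg16 or resnet101?

2. You set 100 epochs with 2000 steps(?) per epoch. On my machine (Windows 10, GTX 1050 Ti, 4 GB), one epoch takes about half an hour. I had previously been studying Faster-RCNN-TensorFlow-Python3 on GitHub, where one iteration takes only 1-2 s (it ran over a thousand iterations while I took a bathroom break). However, I could not get its demo.py to produce prediction images no matter what I changed, and the training time seemed too short to be real, so I gave up on that model. Have you looked into that model? Why do the training times of the two models differ so much? Is it a problem with my environment?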

I am using
tf-gpu 1.12.0
cuda 9.0.176
cudnn 7.3.0.29

A problem when training on my own dataset; could the author take a look?

Epoch 1/50: 0%| | 0/114 [00:04<?, ?it/s<class 'dict'>]
Traceback (most recent call last):
File "F:/JetBrains/PycharmProjects/faster-rcnn-keras/train.py", line 208, in <module>
callback)
File "F:/JetBrains/PycharmProjects/faster-rcnn-keras/train.py", line 52, in fit_one_epoch
X2, Y1, Y2 = calc_iou(R, config, boxes[i], NUM_CLASSES)
File "F:\JetBrains\PycharmProjects\faster-rcnn-keras\utils\roi_helpers.py", line 122, in calc_iou
np.int32)] = 1
IndexError: index 1 is out of bounds for axis 1 with size 1

This is the line of code in question:
y_class_regr_label[
np.arange(np.shape(gt_roi_loc)[0])[:pos_roi_per_this_image], np.array(gt_roi_label[:pos_roi_per_this_image],
np.int32)] = 1

An error when running get_map.py

Traceback (most recent call last):
File "f:/dece/faster-rcnn-keras-master/get_map.py", line 706, in <module>
+ " ; Recall=" + "{0:.2f}%".format(rec[score05_idx]*100) + " ; Precision=" + "{0:.2f}%".format(prec[score05_idx]*100))
IndexError: index 0 is out of bounds for axis 0 with size 0

The error is shown above.

A question about encoding and decoding in utils.py

Hello!

In the Faster R-CNN paper, the formulas for computing tx, ty, tw, th are as shown in the figure.
In encode_boxes I see

    encoded_box[:, :2][assign_mask] = box_center - assigned_priors_center
    encoded_box[:, :2][assign_mask] /= assigned_priors_wh
    encoded_box[:, :2][assign_mask] *= 4
    encoded_box[:, 2:4][assign_mask] = np.log(box_wh / assigned_priors_wh)
    encoded_box[:, 2:4][assign_mask] *= 4

and in decode_boxes

    # Offset of the ground-truth box center from the prior box center along x and y
    decode_bbox_center_x = mbox_loc[:, 0] * prior_width / 4
    decode_bbox_center_x += prior_center_x
    decode_bbox_center_y = mbox_loc[:, 1] * prior_height / 4
    decode_bbox_center_y += prior_center_y
    
    # Compute the width and height of the ground-truth box
    decode_bbox_width = np.exp(mbox_loc[:, 2] / 4)
    decode_bbox_width *= prior_width
    decode_bbox_height = np.exp(mbox_loc[:, 3] /4)
    decode_bbox_height *= prior_height

Why is the encoded box multiplied by 4 and the decoded box divided by 4?
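
For reference, the paper's parameterization is tx = (x - xa)/wa, ty = (y - ya)/ha, tw = log(w/wa), th = log(h/ha), where the a subscript denotes the anchor. The quoted snippets apply exactly these formulas and then scale by 4 when encoding and by 1/4 when decoding, so the two factors cancel. The round-trip sketch below demonstrates this. The source does not say why 4 was chosen; one common reading, stated here as an assumption, is that it rescales the regression targets in the same way as dividing by a standard deviation of 0.25. The function names and the [cx, cy, w, h] layout are illustrative.

```python
import numpy as np

SCALE = 4.0  # the factor that appears in the quoted encode/decode snippets


def encode(box, prior, scale=SCALE):
    """box, prior: [cx, cy, w, h]. Returns the scaled targets (tx, ty, tw, th)."""
    box = np.asarray(box, dtype=float)
    prior = np.asarray(prior, dtype=float)
    t_xy = (box[:2] - prior[:2]) / prior[2:]
    t_wh = np.log(box[2:] / prior[2:])
    return np.concatenate([t_xy, t_wh]) * scale


def decode(targets, prior, scale=SCALE):
    """Invert encode(): undo the scale, then undo the paper's parameterization."""
    t = np.asarray(targets, dtype=float) / scale
    prior = np.asarray(prior, dtype=float)
    center = t[:2] * prior[2:] + prior[:2]
    size = np.exp(t[2:]) * prior[2:]
    return np.concatenate([center, size])


prior = [100.0, 100.0, 128.0, 128.0]
gt_box = [110.0, 90.0, 150.0, 100.0]
targets = encode(gt_box, prior)
recovered = decode(targets, prior)
print(targets, recovered)
```

Any scale works as long as encoding and decoding agree; it changes the magnitude of the targets the network regresses to, not the boxes that are finally decoded.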

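Issue deleted

Can pretrained weights be used when training on my own dataset?

When training on my own dataset, do I need to train from scratch, or can I continue training from the pretrained weights you provide?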

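How to train on the GPU

I set up the environment following the author's configuration.

(My other projects can train on the GPU.)
I also added the following to the author's train.py:
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
But training still runs on the CPU with no GPU acceleration, and it is far too slow.

Does the author know what the cause might be?
Any guidance would be appreciated!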

Hardware question: an RTX 2060 with 6 GB of memory runs out of memory at batch_size=2

I have an RTX 2060 with 6 GB of memory. Why does it run out of memory even with batch_size=1?

Can you explain what get_dr_txt and get_gt_txt mean?

Hello, after training is done, predict.py can plot the predicted boxes, but how can we evaluate recall, precision, and mAP?
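
The README's evaluation steps cover this by running get_map.py. As a rough picture of what such an evaluation computes for one class, the sketch below derives cumulative precision and recall from detections sorted by descending score and then takes the area under the interpolated precision-recall curve (VOC-style, all points). It is a generic illustration with made-up inputs, not the code in get_map.py, and both function names are hypothetical.

```python
def precision_recall(is_true_positive, num_ground_truth):
    """Cumulative precision and recall over detections sorted by descending score."""
    recalls, precisions = [], []
    true_positives = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += int(hit)
        recalls.append(true_positives / num_ground_truth)
        precisions.append(true_positives / rank)
    return recalls, precisions


def average_precision(recalls, precisions):
    """Area under the precision-recall curve after making precision non-increasing."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


# Made-up example: 4 detections for one class, 4 ground-truth objects.
hits = [True, False, True, True]
recalls, precisions = precision_recall(hits, num_ground_truth=4)
ap = average_precision(recalls, precisions)
print(recalls, precisions, ap)
```

mAP is then the mean of the per-class AP values; deciding which detections count as true positives requires matching boxes to ground truth at an IoU threshold such as 0.5.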

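Did you train the model yourself?

Did you train this model yourself, or did you download it from the original paper's authors? The results seem quite poor when running detection on video.

Tried porting to TF 2.0, but ran into OOM

Training runs, but after a few dozen steps it goes OOM. Reducing the channels in the classifier_layers function delays the OOM but does not fix the underlying problem.

Can several images be trained at once?

Does the author's program train on one image at a time? Can it train on several at once?

Why does training skip the frozen epochs and go straight to the unfrozen epochs even though Freeze_train = True? Answer: because I misread the setting! unfreeze_epoch is the total number of training epochs!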
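
The answer above says that unfreeze_epoch is the total number of training epochs rather than the number of unfrozen epochs. A small sketch of that interpretation follows. Only Freeze_train and unfreeze_epoch come from the issue text; init_epoch, freeze_epoch, and the function epoch_phases are assumptions made for illustration, and the repository's train.py may organise its loop differently.

```python
def epoch_phases(init_epoch, freeze_epoch, unfreeze_epoch, freeze_train=True):
    """Return (frozen_epochs, unfrozen_epochs) as ranges of epoch indices.

    unfreeze_epoch is treated as the index at which training ends, i.e. the
    total epoch count, not the length of the unfrozen phase.
    """
    if freeze_train:
        frozen_end = max(init_epoch, min(freeze_epoch, unfreeze_epoch))
    else:
        frozen_end = init_epoch
    return range(init_epoch, frozen_end), range(frozen_end, unfreeze_epoch)


frozen, unfrozen = epoch_phases(init_epoch=0, freeze_epoch=50, unfreeze_epoch=100)
print(len(frozen), len(unfrozen))

# With unfreeze_epoch equal to freeze_epoch, no unfrozen epochs remain.
frozen_only, nothing_left = epoch_phases(init_epoch=0, freeze_epoch=50, unfreeze_epoch=50)
print(len(frozen_only), len(nothing_left))
```

Under this reading, 50 frozen epochs followed by 50 unfrozen epochs requires unfreeze_epoch = 100.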

TypeError: Dimension value must be integer or None or have an index method, got value '(None, None, 1024)' with type '<class 'tuple'>'

./nets/frcnn.py
classifier = get_resnet50_classifier(feature_map_input, roi_input, 14, num_classes)

./nets/classifier.py

def get_resnet50_classifier(base_layers, input_rois, roi_size, num_classes=21):
    # batch_size, 38, 38, 1024 -> batch_size, num_rois, 14, 14, 1024
    out_roi_pool = RoiPoolingConv(roi_size)([base_layers, input_rois])

    # batch_size, num_rois, 14, 14, 1024 -> num_rois, 1, 1, 2048
    out = resnet50_classifier_layers(out_roi_pool)

    # batch_size, num_rois, 1, 1, 2048 -> batch_size, num_rois, 2048
    out = TimeDistributed(Flatten())(out)

    # batch_size, num_rois, 2048 -> batch_size, num_rois, num_classes
    out_class   = TimeDistributed(Dense(num_classes, activation='softmax', kernel_initializer=random_normal(stddev=0.02)), name='dense_class_{}'.format(num_classes))(out)
    # batch_size, num_rois, 2048 -> batch_size, num_rois, 4 * (num_classes-1)
    out_regr    = TimeDistributed(Dense(4 * (num_classes - 1), activation='linear', kernel_initializer=random_normal(stddev=0.02)), name='dense_regress_{}'.format(num_classes))(out)
    return [out_class, out_regr]

./nets/resnet.py

def conv_block_td(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base  = 'res' + str(stage) + block + '_branch'
    bn_name_base    = 'bn' + str(stage) + block + '_branch'

-----> here is the issue--------------------------------------------------------------
    x = TimeDistributed(Conv2D(nb_filter1, (1, 1), strides=strides, kernel_initializer='normal'), name=conv_name_base + '2a')(input_tensor)
--------------------------------------------------------------------------------------
    x = TimeDistributed(BatchNormalization(), name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)
...

def resnet50_classifier_layers(x):
    # batch_size, num_rois, 14, 14, 1024 -> batch_size, num_rois, 7, 7, 2048
    x = conv_block_td(x, 3, [512, 512, 2048], stage=5, block='a', strides=(2, 2))
    # batch_size, num_rois, 7, 7, 2048 -> batch_size, num_rois, 7, 7, 2048
    x = identity_block_td(x, 3, [512, 512, 2048], stage=5, block='b')
    # batch_size, num_rois, 7, 7, 2048 -> batch_size, num_rois, 7, 7, 2048
    x = identity_block_td(x, 3, [512, 512, 2048], stage=5, block='c')
    # batch_size, num_rois, 7, 7, 2048 -> batch_size, num_rois, 1, 1, 2048
    x = TimeDistributed(AveragePooling2D((7, 7)), name='avg_pool')(x)

I tried running this on Colab, but the error below occurs at the line marked with the arrow above. What is the cause?
TypeError: Dimension value must be integer or None or have an index method, got value '(None, None, 1024)' with type '<class 'tuple'>'

The Baidu Netdisk link for the dataset has expired 😭

The Baidu Netdisk link for the dataset has expired. Could you please share the dataset again?

mAP calculated with the Cartucho repository differs from the reported value

I calculated the mAP with the Cartucho repository (https://github.com/Cartucho/mAP) using the published weights, and the value is only 63, which differs from the reported 77. Thank you.

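How can I make the model train from scratch?

What is the format of the annotation files?

Why is the accuracy only 58 after 120 epochs of training on the VOC2007 dataset, with no further improvement? What hyperparameters do you use?

Hello, can the initial weights file of this model (voc_weights.h5) be replaced with another network, such as vgg16 or resnet101?

A question about the losses

Hello, when I train your model on the VOC dataset, the last two loss values, detector_cls and detector_regr, keep rising. I am using this model for my graduation project and would be grateful for a quick reply.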
1/2000 [..............................] - ETA: 11:45:41 - rpn_cls: 0.9137 - rpn_regr: 2.7919 - detector_cls: 0.1544 - detector_regr: 0.2115
2/2000 [..............................] - ETA: 7:20:18 - rpn_cls: 1.0191 - rpn_regr: 2.2316 - detector_cls: 0.2252 - detector_regr: 0.3208
3/2000 [..............................] - ETA: 5:54:12 - rpn_cls: 1.0463 - rpn_regr: 1.8886 - detector_cls: 0.2790 - detector_regr: 0.4301
4/2000 [..............................] - ETA: 5:11:04 - rpn_cls: 1.0501 - rpn_regr: 1.7989 - detector_cls: 0.3448 - detector_regr: 0.5418
5/2000 [..............................] - ETA: 4:41:25 - rpn_cls: 1.0798 - rpn_regr: 1.6981 - detector_cls: 0.3767 - detector_regr: 0.5938
6/2000 [..............................] - ETA: 4:21:43 - rpn_cls: 1.0718 - rpn_regr: 1.5993 - detector_cls: 0.4244 - detector_regr: 0.6206

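BatchNormalization in resnet.py has a problem. Shape must be rank 1 but is rank 0 for 'bn_conv1/Reshape_4'

Python 3.6

Running resnet raises this TypeError.
After checking, I found that this line of code may be the cause.
Line 77:
if sorted(reduction_axes) == range(K.ndim(x))[:-1]:

Should the range part be converted to a list?
After I converted it to a list, it ran without problems.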
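
The behaviour described here follows from Python 3 semantics: sorted() returns a list, range() returns a range object, and a list never compares equal to a range, so the condition on line 77 is always False on Python 3 (on Python 2, range() returned a list). The values below are illustrative.

```python
reduction_axes = [2, 0, 1]
ndim = 4  # stands in for K.ndim(x) on a 4-D tensor

# Python 3: list == range is always False, even when the elements match.
as_range = sorted(reduction_axes) == range(ndim)[:-1]

# Converting the range to a list restores the intended comparison.
as_list = sorted(reduction_axes) == list(range(ndim))[:-1]

print(as_range, as_list)
```

This supports the fix proposed in the issue: wrapping the range in list() makes the branch reachable again.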

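The author's code has changed!!

I only wanted to revert the training code a colleague had modified back to the author's version, but it turned into a road of no return. It took a lot of frantic copying to get back on track.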

Does this repo support pre-computed proposals?

Hi,
Does this repo support pre-computed proposals?
BTW, have you tested on an Nvidia RTX 3090 GPU?

Error in roi_helpers.py

File "train.py", line 207, in <module>
    fit_one_epoch(model_rpn, model_all, epoch, epoch_size, epoch_size_val, gen, gen_val, Interval_Epoch, callback)
  File "train.py", line 55, in fit_one_epoch
    X2, Y1, Y2 = calc_iou(R, config, boxes[i], NUM_CLASSES)
  File "/media/user/date/qianyi/deeplearningwork/faster-rcnn-keras/utils/roi_helpers.py", line 110, in calc_iou
    Y1 = np.eye(num_classes)[np.array(gt_roi_label,np.int32)]
IndexError: index 6 is out of bounds for axis 0 with size 2

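NUM_CLASSES=2. Can training be done with only one class? The error is shown above. Any help is appreciated, thanks.

Why can't I download it? The page cannot be found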
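
The traceback shows a label value of 6 being used to index np.eye(2), which suggests (as an assumption, since the annotation files are not shown) that the generated annotations were built with a longer class list than the one train.py uses. The README notes that the classes txt must be the same in voc_annotation.py and train.py. A small check like the following, with the hypothetical helper one_hot_labels, makes such a mismatch fail with a clearer message.

```python
import numpy as np


def one_hot_labels(labels, num_classes):
    """One-hot encode integer labels, validating their range first."""
    labels = np.asarray(labels, dtype=np.int32)
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        raise ValueError(
            "label %d is outside [0, %d); check that the annotations and "
            "training use the same classes file" % (labels.max(), num_classes)
        )
    return np.eye(num_classes)[labels]


encoded = one_hot_labels([0, 1, 1], num_classes=2)

try:
    one_hot_labels([6], num_classes=2)
except ValueError as exc:
    mismatch_message = str(exc)

print(encoded.shape, mismatch_message)
```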

ValueError: cannot reshape array of size 0 into shape (0,newaxis)

I am training the model on PASCAL VOC images, starting from basic ResNet-50 ImageNet weights. I got this error after a few epochs. I think it could come from images without enough positive and negative bounding boxes. Can you fix it?

faster-rcnn-keras-master\faster-rcnn-keras-master\utils\roi_helpers.py", line 122, in calc_iou

y_class_regr_label = np.reshape(y_class_regr_label, [np.shape(gt_roi_loc)[0], -1])
File "<array_function internals>", line 6, in reshape
lib\site-packages\numpy\core\fromnumeric.py", line 301, in reshape

return _wrapfunc(a, 'reshape', newshape, order=order)
ib\site-packages\numpy\core\fromnumeric.py", line 61, in _wrapfunc

return bound(*args, **kwds)
ValueError: cannot reshape array of size 0 into shape (0,newaxis)
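
The message matches how NumPy infers a -1 dimension: with zero elements and a target shape of (0, -1), the unknown dimension is ambiguous, so reshape raises. The sketch below reproduces this with an empty array and shows that giving the known width explicitly avoids the ambiguity. The array shapes and the width of 4 are assumptions chosen for the demo, not values from the repository.

```python
import numpy as np

gt_roi_loc = np.zeros((0, 4))              # no RoIs were sampled for this image
y_class_regr_label = np.zeros((0, 1, 4))   # hypothetical empty label array

# Same call shape as in the traceback: the row count is 0 and the rest is -1.
try:
    np.reshape(y_class_regr_label, [np.shape(gt_roi_loc)[0], -1])
except ValueError as exc:
    message = str(exc)

# Putting -1 on the row axis and naming the width lets NumPy infer 0 rows.
width = 4
safe = np.reshape(y_class_regr_label, [-1, width])

print(message, safe.shape)
```

This only removes the crash; whether an image that yields no RoIs should contribute an empty batch or be skipped is up to the training loop.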