aloyschen / tensorflow-yolo3

tensorflow implementation of yolov3

Python 99.90% Jupyter Notebook 0.10%

tensorflow yolov3

tensorflow-yolo3's Issues

Is there an efficient way to shuffle the data?

dataset = dataset.repeat().shuffle(70000).batch(batch_size).prefetch(batch_size)
I tested the shuffle function, and I believe buffer_size determines the maximum index of the original data that can be sampled. My dataset is huge, so when I train the model it gets stuck at around 40k, like this:
2018-11-21 21:07:14.170579: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 46287 of 70000
2018-11-21 21:07:24.262936: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 46432 of 70000
No more logs appear after this.
Any suggestions would be appreciated!
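
As a rough illustration of what buffer_size controls, the sketch below mimics a fixed-size shuffle buffer in pure Python. It is a simulation under the assumption that the buffer is filled first and then one random slot is emitted and refilled per step; it is not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical helper name. It shows that nothing is emitted until the buffer is full (the phase the "Filling up shuffle buffer" log reports) and that an element can appear at most buffer_size - 1 positions earlier than its original index.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill phase: nothing is emitted until the buffer is full.
            buffer.append(item)
            continue
        # Steady state: emit a random slot and refill it with the new item.
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    while buffer:
        idx = rng.randrange(len(buffer))
        yield buffer.pop(idx)


order = list(buffered_shuffle(range(1000), buffer_size=100))
# How far ahead of its output position an element's original index can be.
max_lookahead = max(value - position for position, value in enumerate(order))
print(max_lookahead < 100)
```

Under this model, a smaller buffer shortens the fill phase at the cost of weaker mixing, so shuffling the list of records or file names once up front and then using a modest buffer is one option worth trying.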

Some problems with yolo_loss...

When computing raw_true_wh in yolo_loss, the invalid grid cells are usually set to 0. Why are they set to 1 here?

raw_true_wh = tf.log(tf.where(tf.equal(y_true[index][..., 2:4] / anchors[anchor_mask[index]] * input_shape[::-1], 0), tf.ones_like(y_true[index][..., 2:4]), y_true[index][..., 2:4] / anchors[anchor_mask[index]] * input_shape[::-1]))

In other projects:
raw_true_wh = K.log(y_true[l][..., 2:4] / anchors[anchor_mask[l]] * input_shape[::-1])
raw_true_wh = K.switch(object_mask, raw_true_wh, K.zeros_like(raw_true_wh)) # avoid log(0)=-inf
Here the invalid wh values are set to 0.
Can anyone explain this? Thanks.
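
One way to compare the two snippets: the first replaces a zero ratio with 1 before taking the log, and log(1) = 0, so the invalid cells still end up as 0 after the log. The scalar sketch below checks this with plain Python; the function names are hypothetical, and the real code operates on tensors rather than single floats.

```python
import math


def raw_wh_where_one(ratio):
    """tf.where-style version: swap a zero ratio for 1, then take the log."""
    return math.log(1.0 if ratio == 0 else ratio)


def raw_wh_switch_zero(ratio, object_mask):
    """K.switch-style version: keep the log where an object exists, else 0."""
    return math.log(ratio) if object_mask else 0.0


ratios = [0.0, 0.5, 2.0]        # 0.0 marks a cell without a ground-truth box
masks = [False, True, True]     # object_mask for the same cells

a = [raw_wh_where_one(r) for r in ratios]
b = [raw_wh_switch_zero(r, m) for r, m in zip(ratios, masks)]
print(a == b)
```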

InternalError (see above for traceback): Blas SGEMM launch failed : m=173056, n=32, k=64

Hi, running your implementation produces the following error:
InternalError (see above for traceback): Blas SGEMM launch failed : m=173056, n=32, k=64
[[node darknet53/conv2d_3/Conv2D (defined at /ZFS4T/hitzht/tensorflow-yolo3-master-voc/model/yolo3_model.py:109) = Conv2D[T=DT_FLOAT, _class=["loc:@darknet53/batch_normalization_3/cond/FusedBatchNorm_1/Switch"], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](darknet53/LeakyRelu_1, darknet53/conv2d_3/kernel/read)]]
[[{{node while/strided_slice_1/stack/_615}} = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3963_while/strided_slice_1/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
When I run train() on the CPU, the error disappears. I hope you can give me some advice. Thanks.

Time problem

How long did training take to complete?
Please let me know the training details.

new_width and new_height

new_high = new_high * tf.minimum(input_width / new_width, input_high / new_high)
new_width = new_high * tf.minimum(input_width / new_width, input_high / new_high)
Line 2: I think it should be new_width * tf.minimum(...).
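
A minimal sketch of the aspect-preserving resize the two lines appear to aim for, assuming new_width and new_high start out as the original image size. Note that, as quoted, the second line would also evaluate the ratio with the already-updated new_high, so computing the scale once avoids both problems. `letterbox_size` is a hypothetical helper, not a function from the repository.

```python
def letterbox_size(image_width, image_high, input_width, input_high):
    """Return (new_width, new_high) that fit the input size, keeping aspect ratio."""
    # Compute the scale once, from the original dimensions only.
    scale = min(input_width / image_width, input_high / image_high)
    new_width = round(image_width * scale)
    new_high = round(image_high * scale)
    return new_width, new_high


print(letterbox_size(832, 624, 416, 416))
```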

Has anybody trained successfully without pretrained weights?

How about the loss? Thanks!

What about the performance of tensorflow-yolov3

Hi,
I have a problem. I ran on a single GTX 1080 Ti, followed the steps you described, and tested the image dog.jpg. However, prediction took a few seconds, while darknet takes only about 22 ms.
Is there anything wrong with my setup? What performance do you expect?
Thanks very much!

Why does memory usage keep increasing when training on GPU?

When I use this code to train a model on the COCO 2014 dataset, memory usage keeps increasing until the process is killed.

Is this a bug in the loss function?

In yolo3_model.py, at the end of the function yolo_loss, you divide by the shape of yolo_output:
class_loss = tf.reduce_sum(class_loss) / tf.cast(tf.shape(yolo_output[0])[0], tf.float32)
Shouldn't it be yolo_output[index] instead of yolo_output[0]? You are using it for all the losses.
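
A small sketch that may help frame the question, assuming each yolo_output[i] has the layout (batch, grid_h, grid_w, channels) with strides 32, 16 and 8. Under that assumption, tf.shape(yolo_output[i])[0] is the batch size for every scale, so indexing with 0 or with index would give the same divisor. The helper name and default values are illustrative only.

```python
def grid_output_shapes(batch_size, input_size=416, num_anchors=3, num_classes=80):
    """Return the assumed output shape at each of the three detection scales."""
    strides = (32, 16, 8)
    channels = num_anchors * (5 + num_classes)
    return [
        (batch_size, input_size // stride, input_size // stride, channels)
        for stride in strides
    ]


shapes = grid_output_shapes(batch_size=8)
# Axis 0 of every scale is the batch dimension.
batch_dims = [shape[0] for shape in shapes]
print(shapes)
print(batch_dims)
```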

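Why is the validation loss small when training on the VOC dataset, but the predictions are poor?

During training the validation loss drops to around 10, but in actual testing the computed object scores are very low, generally below 0.3, and the predicted boxes are not positioned very accurately either.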

In the wh_loss part of the bounding-box regression, log is applied only to the wh of true_box and not to the predicted width and height?

In the yolo_head function:
box_wh = tf.exp(predictions[..., 2:4]) * anchors_tensor / input_shape[::-1]

In the yolo_loss function:
raw_true_wh = tf.log(tf.where(tf.equal(y_true[index][..., 2:4] / anchors[anchor_mask[index]] * input_shape[::-1], 0), tf.ones_like(y_true[index][..., 2:4]), y_true[index][..., 2:4] / anchors[anchor_mask[index]] * input_shape[::-1]))
...
wh_loss = object_mask * box_loss_scale * 0.5 * tf.square(raw_true_wh - predictions[..., 2:4])

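My understanding is that the log here removes the influence of the absolute box wh on the correctness of the predicted box, and that it is not the logarithm from the bounding-box regression formula.
I hope someone can explain this.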
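
One possible reading of the two snippets: yolo_head applies exp to the raw prediction to obtain box_wh, and yolo_loss applies the inverse mapping (log) to the ground truth, so that wh_loss compares both sides in the same raw space. The scalar sketch below only checks that the two quoted formulas are inverses of each other; `decode_wh` and `encode_wh` are hypothetical names, and the anchor and input size values are examples.

```python
import math


def decode_wh(raw, anchor, input_size):
    """yolo_head direction: raw prediction -> box size relative to the input."""
    return math.exp(raw) * anchor / input_size


def encode_wh(box_wh, anchor, input_size):
    """yolo_loss direction: relative box size -> raw target (raw_true_wh)."""
    return math.log(box_wh / anchor * input_size)


anchor, input_size = 116.0, 416.0
true_wh = 0.25                               # box width as a fraction of the input
raw_target = encode_wh(true_wh, anchor, input_size)
round_trip = decode_wh(raw_target, anchor, input_size)
print(abs(round_trip - true_wh) < 1e-9)
```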

Something is wrong with weights loading

I tried to train this model without darknet53.weights. After finishing 5 epochs with the model saved, I found that I can't load the correct weights from my previously trained model. The loss starts from 0 and keeps rising. Can you tell me why?
I did not use darknet53 for pretraining and trained directly on the COCO dataset. After saving an intermediate checkpoint and then loading my previously trained parameters, the loss started from 0 and kept increasing. Why does this happen?

Export .ckpt file from training to .pb file?

I want to generate a .pb file from the .ckpt generated during training.
My issue: I can't find the input node and the output node(s) of this implementation of YOLOv3.

I exported all the nodes to a text file, but there are more than 10,000 of them, and I am now even more confused.

Customizing Number of Anchors to 5

I have modified your code to run and predict with 5 anchors. It works fine when training, but the detect part fails with:
AbortedError (see above for traceback): Operation received an exception:Status: 3, message: could not create a dilated convolution forward descriptor, in file tensorflow/core/kernels/mkl_conv_ops.cc:1111
[[node darknet53/conv2d_2/Conv2D (defined at D:\yolo_2\tensorflow-yolo3-master\tensorflow-yolo3-master\model\yolo3_model.py:110) ]]

Any help will be appreciated!

Thanks

has no attribute 'glorot_uniform_initializer'

Hello, when I run detect.py I get: AttributeError: module 'tensorflow' has no attribute 'glorot_uniform_initializer'. However, this function does exist in the TensorFlow files. How can I solve this?

Running the detector with a webcam

I'm running your code on a webcam and it is very slow (on CPU). I'm wondering whether there are parts of the detection code that I can optimize to make it run faster on CPU.

Something about Preprocess_true_boxes

From line 99 to 113 in dataReader.py, the IOU computation depends only on the wh of the boxes. As far as I know, when we calculate the IOU between two boxes, all of x, y, w, h should be used. Am I misunderstanding something?
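
For context, a wh-only IOU is what you get when two boxes are treated as sharing the same center, which is a common way to pick the best-matching anchor for a ground-truth box. Whether lines 99 to 113 do exactly this is an assumption here, since that code is not quoted. The sketch uses hypothetical helper names and example anchor sizes.

```python
def wh_iou(box_wh, anchor_wh):
    """IOU of two boxes that are assumed to share the same center."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape overlaps the box shape the most."""
    return max(range(len(anchors)), key=lambda i: wh_iou(box_wh, anchors[i]))


anchors = [(10, 13), (116, 90), (373, 326)]  # example anchor sizes in pixels
print(best_anchor((100, 90), anchors))
```

With a shared center, x and y cancel out of the overlap, so only the shapes matter when choosing an anchor; the box position still decides which grid cell the target is written to.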

Converged YOLO loss

@aloyschen I am trying to implement YOLOv3 using some ideas and modules from your code.
I can't get the loss of my model to converge. My training loss hovers around 10 even after 100 epochs on a dataset of 200 raccoon images.
I have dissected the model to contain only 2 scales, and I am using the pretrained darknet-53 weights with no optimization running over the feature extractor.

I was wondering which dataset you trained the model on, how many epochs you ran, what the training loss looked like, and any other details from the run in which your model converged and started giving reasonable predictions.

All the training details are provided in the following cfg:

num_parallel_calls = 4
input_shape = 416
max_boxes = 20
jitter = 0.3
hue = 0.1
sat = 1.0
cont = 0.8
bri = 0.1
norm_decay = 0.99
weight_decay = 5e-4
norm_epsilon = 1e-4
pre_train = True
train_last_layers_only = False
num_anchors = 6
num_classes = 1
training = True
disect = True
disect_scale = 1
ignore_thresh = .5
learning_rate = 1e-4
train_batch_size = 10
val_batch_size = 4
# train_num = 4761
# val_num = 250
train_num = 190
val_num = 10
Epoch = 200
obj_threshold = 0.3
nms_threshold = 0.5
gpu_index = "0"
log_dir = './logs'
data_dir = './dataset/'
model_dir = './converted/'
yolov3_cfg_path = './darknet_data/yolov3.cfg'
yolov3_weights_path = './darknet_data/yolov3.weights'
darknet53_weights_path = './darknet_data/darknet53.weights'
anchors_path = './yolo_anchors.txt'
classes_path = './model_data/raccoon_classes.txt'
train_annotations_file = './train.txt'
val_annotations_file = './val.txt'
output_dir = './tfrecords/'

Tensorboard screenshots are attached below

Why is bbox padded?

bbox = tf.cond(tf.greater(tf.shape(bbox)[0], 20), lambda: bbox[:20], lambda: tf.pad(bbox, paddings = [[0, 20 - tf.shape(bbox)[0]], [0, 0]], mode = 'CONSTANT'))

This is from Preprocess in data_reader.
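
A plausible reason, stated as an assumption: images contain different numbers of boxes, and batching needs every example to have the same shape, so the box list is truncated or zero-padded to a fixed length of 20. The pure-Python sketch below mirrors the tf.cond line with a hypothetical helper; the row length of 5 (four coordinates plus a class id) is also an assumption.

```python
def pad_or_truncate(boxes, max_boxes=20, box_len=5):
    """Return exactly max_boxes rows: extra boxes dropped, missing ones zero-filled."""
    rows = [list(box) for box in boxes[:max_boxes]]
    rows += [[0] * box_len for _ in range(max_boxes - len(rows))]
    return rows


few = pad_or_truncate([[10, 20, 50, 80, 1]])
many = pad_or_truncate([[0, 0, 1, 1, 0]] * 25)
print(len(few), len(many))
```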

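mAP (23.43) is much lower than with the pretrained weights yolov3.weights (48.14)

I used your code with pre_train set to true (using the darknet53 pretrained weights) to train the model, and the final loss kept fluctuating around 32. Testing with the eval function gives an mAP of 23.43. If I configure it to use yolov3.weights to test mAP, I get 48.14. The gap is large. Is there anything to watch out for during training? Is there any way to improve the mAP?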

How to generate .tfrecords files from COCO?

I want to know what rule is used to generate the .tfrecords file.
Thank you very much for your reply.

The w and h do not change after padding when processing images

The w and h do not change after padding when processing images, so xmax and ymax should not need the padding offset dx / dy added. I don't know if I am getting something wrong.

    xmin = xmin * new_width / image_width + dx  
    xmax = xmax * new_width / image_width + dx
    ymin = ymin * new_high / image_high + dy 
    ymax = ymax * new_high / image_high + dy
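
A small numeric check of the four lines above, written as a sketch with a hypothetical helper. Adding the same dx to xmin and xmax (and dy to ymin and ymax) moves the box without changing its width or height, because the offsets cancel in xmax - xmin and ymax - ymin. If the box is stored as corner coordinates, both corners need the offset to stay aligned with the padded image.

```python
def letterbox_box(box, image_width, image_high, new_width, new_high, dx, dy):
    """Map (xmin, ymin, xmax, ymax) from the original image to the padded image."""
    xmin, ymin, xmax, ymax = box
    xmin = xmin * new_width / image_width + dx
    xmax = xmax * new_width / image_width + dx
    ymin = ymin * new_high / image_high + dy
    ymax = ymax * new_high / image_high + dy
    return xmin, ymin, xmax, ymax


# A 100x50 image scaled to 416x208 and centered vertically in a 416x416 canvas.
shifted = letterbox_box((0, 0, 100, 50), 100, 50, 416, 208, dx=0, dy=104)
unshifted = letterbox_box((0, 0, 100, 50), 100, 50, 416, 208, dx=0, dy=0)
print(shifted)
print(shifted[3] - shifted[1] == unshifted[3] - unshifted[1])
```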
