rykov8 / ssd_keras
Port of Single Shot MultiBox Detector to Keras
License: MIT License
I am getting a file-not-found error in the generate function of SSD_Training.ipynb.
The stack trace is below:
Exception in thread Thread-4:
Traceback (most recent call last):
File "/home/vorale/anaconda3/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/vorale/anaconda3/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/home/vorale/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 429, in data_generator_task
generator_output = next(self._generator)
File "/home/vorale/Downloads/ssd_keras-master/ssd_training2.py", line 186, in generate
img = imread(img_path).astype('float32')
File "/home/vorale/anaconda3/lib/python3.5/site-packages/scipy/misc/pilutil.py", line 154, in imread
im = Image.open(name)
File "/home/vorale/anaconda3/lib/python3.5/site-packages/PIL/Image.py", line 2280, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '../../frames/frame02579.png'
Is there any folder I haven't included?
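The traceback above ends with the generator failing to open '../../frames/frame02579.png', so one thing worth checking is whether every key in the ground-truth pickle has a matching image under the path prefix. A minimal sketch, assuming the keys are image file names and that the generator joins a prefix with each key; `split_by_presence` is a hypothetical helper, not part of the repository:

```python
import os


def split_by_presence(keys, path_prefix):
    """Split ground-truth keys into those with an image on disk and those without."""
    present, missing = [], []
    for key in keys:
        # Assumption: the image path is simply the prefix joined with the key.
        if os.path.isfile(os.path.join(path_prefix, key)):
            present.append(key)
        else:
            missing.append(key)
    return present, missing
```

Training only on the `present` keys would avoid the exception; a non-empty `missing` list points either to images that were never downloaded or to a wrong prefix.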
In the examples, we hard-code in the image size. Is this uniformity a requirement of the algorithm, or is it possible for the algorithm to deal with images of varying size?
Hi,
Can you explain the gt_pascal.pkl format,
and how it is generated from the Pascal format shown below?
<?xml version="1.0" encoding="UTF-8" ?>
;;<annotations>
;; <folder>/home/user/path-to-kitti-root/training/image/</folder>
;; <filename>000000.png</filename>
;; <size>
;; <width>1224</width>
;; <height>370</height>
;; <depth>3</depth>
;; </size>
;; <object>
;; <name>Pedestrian</name>
;; <truncated>0</truncated>
;; <occluded>0</occluded>
;; <alpha>-0.20</alpha>
;; <bndbox>
;; <xmin>712.40</xmin>
;; <ymin>143.00</ymin>
;; <xmax>810.73</xmax>
;; <ymax>307.92</ymax>
;; </bndbox>
;; <dimensions>
;; <height>1.89</height>
;; <width>0.48</width>
;; <length>1.20</length>
;; </dimensions>
;; <location>
;; <x>1.84</x>
;; <y>1.47</y>
;; <z>8.41</z>
;; </location>
;; <rotation_y>0.01</rotation_y>
;; <property>-0.20,0.00,0</property>
;; </object>
;; <object>
;; .
;; .
;; .
;; </object>
;; <object>
;; .
;; .
;; .
;; </object>
;;</annotations>
Hello @rykov8, sorry for bothering you again.
Looking at your ssd_training.py file, it seems that the variances do not take part in the training loss, or in any other part of the training pipeline. However, in your ssd_utils.py, the method detection_out does change the predictions by multiplying them by their respective variances:
decode_bbox_center_x = mbox_loc[:, 0] * prior_width * variances[:, 0]
decode_bbox_center_x += prior_center_x
decode_bbox_center_y = mbox_loc[:, 1] * prior_width * variances[:, 1]
decode_bbox_center_y += prior_center_y
decode_bbox_width = np.exp(mbox_loc[:, 2] * variances[:, 2])
decode_bbox_width *= prior_width
decode_bbox_height = np.exp(mbox_loc[:, 3] * variances[:, 3])
I do understand that one has to decode the boxes, since they were encoded using the transformation described in equation 2 of the SSD paper (and in Faster R-CNN).
My main concern is that the variances explicitly change the values output by the CNN without being considered directly in the training procedure. Furthermore, I cannot find any reference to these variances in either the SSD or the Faster R-CNN paper. Maybe I am missing something in the papers or in the implementation; in that case I would be very grateful if you could tell me whether I am making a mistake, or elaborate on the use of these variances.
Thank you very much.
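The role of the variances is easier to see when encoding and decoding are written side by side. The sketch below is not the repository's code. It is a pure-Python illustration, assuming the Caffe SSD convention in which the regression targets are divided by the variances when the ground truth is encoded, so that multiplying by them at detection time simply undoes that scaling. The function names and the sample values (variances 0.1 and 0.2) are assumptions.

```python
import math


def _center_form(box):
    """Convert [xmin, ymin, xmax, ymax] to (cx, cy, w, h)."""
    w, h = box[2] - box[0], box[3] - box[1]
    return box[0] + w / 2.0, box[1] + h / 2.0, w, h


def encode(box, prior, variances):
    """Offsets of a ground-truth box relative to a prior, divided by the variances."""
    bcx, bcy, bw, bh = _center_form(box)
    pcx, pcy, pw, ph = _center_form(prior)
    return [
        (bcx - pcx) / pw / variances[0],
        (bcy - pcy) / ph / variances[1],
        math.log(bw / pw) / variances[2],
        math.log(bh / ph) / variances[3],
    ]


def decode(loc, prior, variances):
    """Inverse of encode(): multiply by the variances, then undo the offsets."""
    pcx, pcy, pw, ph = _center_form(prior)
    cx = loc[0] * pw * variances[0] + pcx
    cy = loc[1] * ph * variances[1] + pcy
    w = math.exp(loc[2] * variances[2]) * pw
    h = math.exp(loc[3] * variances[3]) * ph
    return [cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0]


prior = [0.2, 0.2, 0.6, 0.6]
box = [0.25, 0.3, 0.7, 0.65]
variances = [0.1, 0.1, 0.2, 0.2]
restored = decode(encode(box, prior, variances), prior, variances)
```

Under this assumption the variances act as a fixed rescaling of the targets the network is trained on, which would explain why they show up in decoding without appearing in the loss itself.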
After reading all the issues here, I still don't know what prior_boxes_ssd300.pkl is.
Hello,
it seems the Mega link for downloading the weights is dead.
Are you planning on making a new one?
Thanks for your work!
TensorFlow is a problem on Windows.
Is it possible to make the code work with the Theano backend?
Hi,
I want to run SSD_training.ipynb. I converted it to a .py file, and when I run it I get the following error. What am I missing?
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 429, in data_generator_task
generator_output = next(self._generator)
File "SSD_training.py", line 204, in generate
img = imread(img_path).astype('float32')
File "/usr/local/lib/python2.7/dist-packages/scipy/misc/pilutil.py", line 154, in imread
im = Image.open(name)
File "/usr/local/lib/python2.7/dist-packages/PIL/Image.py", line 2312, in open
fp = builtins.open(filename, "rb")
IOError: [Errno 2] No such file or directory: u'../../frames/frame01884.png'
Traceback (most recent call last):
File "SSD_training.py", line 285, in <module>
nb_worker=1)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1528, in fit_generator
str(generator_output))
ValueError: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None
Hi,
I am trying to replicate the same model but for the 500x500 version in the paper.
Apart from the input image_shape, and the priors, what other parameters need to be changed?
I am getting an error like this when running fit_generator
InvalidArgumentError: Incompatible shapes: [16,7308,4] vs. [16,28461,4]
[[Node: sub_1 = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](strided_slice_20, strided_slice_21)]]
[[Node: mul_11/_375 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_5128_mul_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
I tried seeing if there is anything hard-coded in the SSD300 object or generators, but perhaps I do not see it.
Your advice would be appreciated! :)
For the new priors, I am using https://gist.github.com/codingPingjun/aa54be7993ca6b2d484cccf5a2c5c3d4 but with 500x500 size.
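One side of the shape mismatch above is 7308, which is the number of prior boxes of the 300x300 model. That suggests (without proving it) that either the targets or the network were still built for SSD300. The arithmetic can be checked with a short sketch; the feature-map sizes and priors per cell below are assumptions based on the usual SSD300 layout, not values read from the repository.

```python
# Assumed SSD300 prediction layers: (name, feature-map side, priors per cell).
SSD300_LAYERS = [
    ("conv4_3", 38, 3),
    ("fc7", 19, 6),
    ("conv6_2", 10, 6),
    ("conv7_2", 5, 6),
    ("conv8_2", 3, 6),
    ("pool6", 1, 6),
]


def count_priors(layers):
    """Total priors = sum over layers of (cells in the feature map) * (priors per cell)."""
    return sum(side * side * per_cell for _, side, per_cell in layers)


print(count_priors(SSD300_LAYERS))  # 7308
```

For a 500x500 input the feature maps are larger, so the prior pickle passed to the box utility and the network's own prior boxes both have to be regenerated to give the same total.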
Hi,
Thanks for the great work!
When I tried to run the training script, I ran into a shape mismatch at this line: https://github.com/rykov8/ssd_keras/blob/master/ssd_utils.py#L149
Should assignment[:, 5:-8][best_iou_mask] be assignment[:, 4:-8][best_iou_mask]?
Hi rykov8,
First off, great job, and thanks a lot for sharing this repo with us.
I'm trying to shorten the SSD network a bit to see if I can gain speed during training.
I see you set the num_priors variable to either 3 or 6, and then use it to determine the number of filters nb_filter in the Conv2D layers responsible for the location and confidence multiboxes.
Now, in trying to make a shorter network, I end up with a shape mismatch coming from the last merge() layer (the prediction layer):
Exception: "concat" mode can only merge layers with matching output shapes except for the concat axis. Layer shapes: [(None, 247500, 4), (None, 247500, 2), (None, 337500, 8)]
That is, the shape of these three layers:
net['mbox_loc'] = Reshape((num_boxes, 4),
name='mbox_loc_final')(net['mbox_loc'])
net['mbox_conf'] = Reshape((num_boxes, num_classes),
name='mbox_conf_logits')(net['mbox_conf'])
net['mbox_priorbox'] = merge([net['conv1_2_mbox_priorbox'],
net['conv2_2_mbox_priorbox']],
mode='concat',
concat_axis=1,
name='mbox_priorbox')
I tried looking in the literature with no luck. Can you explain how to set this parameter? What does it depend on?
My input image shape is (300, 300, 3), just like in your example, and I'm using the same priors you pickled.
Thanks in advance!
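A plausible reading of num_priors is that it must equal the number of boxes the PriorBox layer generates per feature-map cell. The sketch below assumes the Caffe SSD convention: one box for min_size, one more if max_size is given, and two per extra aspect ratio when flipping is on. The helper names are hypothetical, and the rule itself is an assumption about the layer, not a quote from it.

```python
def priors_per_location(aspect_ratios, has_max_size, flip=True):
    """Count the boxes generated per cell under the assumed Caffe SSD rule."""
    ratios = [1.0]  # square box with side min_size
    if has_max_size:
        ratios.append(1.0)  # square box with side sqrt(min_size * max_size)
    for ar in aspect_ratios:
        if ar in ratios:
            continue  # ratio 1 is already covered
        ratios.append(float(ar))
        if flip:
            ratios.append(1.0 / ar)
    return len(ratios)


def head_filters(num_priors, num_classes):
    """Filters for the (location, confidence) convolutions of one prediction layer."""
    return num_priors * 4, num_priors * num_classes


print(priors_per_location([2], has_max_size=False))    # 3
print(priors_per_location([2, 3], has_max_size=True))  # 6
print(head_filters(6, 21))                             # (24, 126)
```

With this rule, the location and confidence convolutions of each layer and the PriorBox layer attached to the same feature map must agree on num_priors; otherwise the three tensors merged at the end have different box counts, as in the exception above.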
I want to ask two questions about the PriorBox layer in your work. What is the meaning of min_size and max_size? Maybe there is some relationship between these parameters and s_min, s_max in the paper. I really hope you can reply. Thank you! @rykov8
hi @rykov8
Thanks for your good work!
I think you should specify the Keras version in the README, because I have run into many issues caused by Keras version differences.
I got the following error:
Using TensorFlow backend.
Traceback (most recent call last):
File "ssd.py", line 9, in <module>
from keras.layers import GlobalAveragePooling2D
ImportError: cannot import name GlobalAveragePooling2D
In ssd.py,
I found
net['input'] = input_tensor
net['conv1_1'] = Convolution2D(64, 3, 3,
activation='relu',
border_mode='same',
name='conv1_1')(net['input'])
I know Convolution2D, but I cannot understand the (net['input']) that follows it (is it a multiplication?), and I did not find this usage in the Keras documentation. Can you provide more details about this?
Thank you!
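The second pair of parentheses is a function call, not a multiplication: Convolution2D(...) builds a layer object, and (net['input']) then calls that object on a tensor, which is how the Keras functional API connects layers. The toy class below is not Keras code; it only illustrates, with a hypothetical `Scale` class, how Python's `__call__` makes a configured object callable.

```python
class Scale(object):
    """Toy stand-in for a layer: configured once, then applied to inputs."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [self.factor * v for v in values]


layer = Scale(2)            # step 1: construct the layer object
out = layer([1, 2, 3])      # step 2: call it on an input
same = Scale(2)([1, 2, 3])  # both steps written on one line
print(out, same)
```

The one-line form on the last assignment has the same shape as `Convolution2D(64, 3, 3, ...)(net['input'])` in ssd.py.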
Hi rykov8,
Do you know how to pre-train the VGG part of the SSD network? And what dataset did you use to do that?
Thanks a lot.
For example, ssd_training.py: a lot of the code uses the TensorFlow API directly rather than the pure Keras API.
Great work! Thanks a lot!
Detection takes around 2 seconds per image on a Mac using only the CPU.
That is quite different from the test performance reported in the paper.
Apart from hardware, is it possible that it's caused by the overhead of Keras?
Also, is it possible to shrink the network somehow?
Thank you.
Hi, did you use a pretrained VGG model? If so, how did you subsample the parameters from fc6 and fc7?
Hi @rykov8. Firstly, thanks for this Keras port of SSD. You are amazing :)
I have been trying to train the model for hand detection, so I basically have a single class, that of a hand. I set NUM_CLASSES=2 as you specified in other issues. Can you please tell me about the input format? My data currently has a 5-tuple (label, x0, y0, width, height) specifying the coordinates of each bounding box, and I have generated one for each hand image in my dataset. Do we represent our input through prior_boxes_ssd300.pkl and gt_pascal.pkl? How exactly do I do that? What is prior_boxes_ssd300.pkl for? It would be great if you could help me out. Thanks in advance.
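A minimal sketch of one way to convert such tuples, assuming the ground-truth pickle is a dict that maps an image file name to one row per box, each row holding corner coordinates normalized to [0, 1] followed by a one-hot class vector without the background class. That layout is an assumption about gt_pascal.pkl, and `to_gt_row` and the file name are hypothetical. It also assumes (x0, y0) is the top-left corner in pixels.

```python
def to_gt_row(label, x0, y0, width, height, img_w, img_h, num_object_classes):
    """Build one ground-truth row: normalized corners plus a one-hot class vector."""
    xmin = x0 / float(img_w)
    ymin = y0 / float(img_h)
    xmax = (x0 + width) / float(img_w)
    ymax = (y0 + height) / float(img_h)
    one_hot = [0.0] * num_object_classes
    one_hot[label] = 1.0  # label is a 0-based index among the object classes
    return [xmin, ymin, xmax, ymax] + one_hot


# Hypothetical example: a single "hand" class, so the one-hot part has length 1.
gt = {"hand_0001.png": [to_gt_row(0, 30, 60, 150, 120, 300, 300, 1)]}
print(gt["hand_0001.png"][0])
```

If the repository expects a NumPy array per image, the nested lists would be wrapped with numpy.array before pickling. The prior-box pickle, by contrast, appears to describe the network's default boxes rather than the dataset, so under that reading it would not need to change for new data at the same input size.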
I just want to detect fish and crop them out of an image. Each picture contains only one kind of fish, and my dataset has images of 7 kinds of fish. How do I train on my dataset so that I can extract the fish from an image? Thanks.
How can I execute the code in SSD.ipynb and view output?
Hi, your implementation is really great!
But I have a question. If I want to detect some small objects and I want to detect from conv3, should I change prior_boxes_ssd300.pkl? I think this pkl is in the same format as the output of the PriorBox class, but I don't know how to generate it. Can you give me some advice?
Thank you in advance.
Hi @rykov8, I have run into an error.
When I try to train on my own image dataset (following the SSD_training notebook), I get the following error:
832/1264 [==================>...........] - ETA: 634s - loss: 2.3086Exception in thread Thread-12:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 404, in data_generator_task
generator_output = next(generator)
File "/home/optsai/文件/Object detection/test_training.py", line 197, in generate
img = jitter(img)
File "/home/optsai/文件/Object detection/test_training.py", line 102, in contrast
gs = self.grayscale(rgb).mean() * np.ones_like(rgb)
File "/home/optsai/文件/Object detection/test_training.py", line 86, in grayscale
return rgb.dot([0.299, 0.587, 0.114])
ValueError: shapes (300,300) and (3,) not aligned: 300 (dim 1) != 3 (dim 0)
Can you tell me how to fix it? This is my code:
train.txt
When I try the notebook, I get this error message:
ValueError: Filter must not be larger than the input: Filter: (3, 3) Input: (1, 1)
in this line:
x = Convolution2D(24, 3, 3, border_mode='same',
name='pool6_mbox_loc')(net['pool6'])
It passes when I change it to:
x = Convolution2D(24, 1, 1, border_mode='same',
name='pool6_mbox_loc')(net['pool6'])
but then loading the weights fails.
pool6 is indeed reshaped to (1, 1, 256), so what can I do to fit the (3, 3) convolution?
To those who have got this to work, could anyone point to some simple tutorial code? Admittedly, I am getting lost in all the dependencies. I have not fully explored it yet, but it's tough to even see what the inputs and outputs to the model are (images, ground truth bbox arrays, target class arrays, etc.). I am just looking for some simple starter code to begin exploring the true complexity of this amazing implementation.
Hi @rykov8, I get a training error when I train on my own image dataset.
1120/1133 [============================>.] - ETA: 15s - loss: 1.8311Exception in thread Thread-6:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 404, in data_generator_task
generator_output = next(generator)
File "/home/optsai/文件/Object detection/training(fine-tune).py", line 169, in generate
img = imread(img_path).astype('float32')
TypeError: float() argument must be a string or a number
Have you seen this error before?
After messing around a bit, I was able to convert prior_boxes_ssd300.pkl to a Python2 compatible format. I'm submitting it here in case it is useful for anybody.
prior_boxes_ssd300_python2.pkl.zip
The problem with the .pkl files included in this repo is that they use Pickle Protocol 3, which is only supported on Python3, not Python2. Everything else seems to work with Python2, so it's unfortunate to leave Python2 out just because of that.
I also want to take the time to thank you for porting SSD to Keras! It makes it much easier to work with, compared to the original Caffe implementation.
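For anyone who wants to repeat the conversion, a minimal sketch using only the standard pickle module: load the file under Python 3 and write it back with protocol 2, the highest protocol Python 2 can read. The function name and the paths are placeholders, and loading the repository's files additionally requires NumPy to be installed, since they contain arrays.

```python
import pickle


def repickle(src_path, dst_path, protocol=2):
    """Load a pickle and write it back using an older protocol."""
    with open(src_path, "rb") as src:
        obj = pickle.load(src)
    with open(dst_path, "wb") as dst:
        # Protocol 2 is readable by both Python 2 and Python 3.
        pickle.dump(obj, dst, protocol=protocol)
```

A call such as `repickle('prior_boxes_ssd300.pkl', 'prior_boxes_ssd300_python2.pkl')` would be run once under Python 3.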
Hi! I recently noticed that the detection_out method on BBoxUtility doesn't seem to work with actual ground truth data, as far as I can tell. I believe it works on output that I have gotten from the SSD network, but it seems strange that it wouldn't support the actual ground truth; if the network were trained to perfection, its output should match the ground truth, in theory (at least for some dataset).
The reason I care about this is that in my application I want to debug my generator class, which is similar to the generator in SSD_training.ipynb. I want to run the generator and visualize its output, which should always be a good way to make sure you're training the network on reasonable data. I can do that easily with the image data I'm feeding the network, but I have found no way to visualize the ground truth. The most obvious way would be to visualize it the same way as the network's output in SSD_training.ipynb, by running it through detection_out and then interpreting the output visually. However, detection_out doesn't seem to produce anything reasonable when fed ground truth data.
Here's some sample code that shows what I mean:
from ssd_utils import BBoxUtility
import pickle
import numpy as np
NUM_CLASSES = 4
priors = pickle.load(open('prior_boxes_ssd300.pkl', 'rb'))
bbox_util = BBoxUtility(NUM_CLASSES, priors)
gt_pascal = pickle.load(open('gt_pascal.pkl', 'rb'))
gt = gt_pascal[u'frame03196.png']
y = bbox_util.assign_boxes(gt)
print(y.shape)
# Visualization of y
#import cv2
#cv2.imshow("Y", y.reshape((7308/12,16*12)).transpose())
#cv2.waitKey(0)
gt2 = bbox_util.detection_out(y.reshape(1,7308,16))
print(gt)
print(gt2)
# Why are there no coordinates here?? class numbers and confidence seem okay
# but nowhere are actual coordinates of any box.
I've tried this with several different ground truth arrays, both from gt_pascal.pkl and also things I've constructed on my own. Always I get garbage positions of these boxes, with xmin = 1 and xmax = 0. As far as I can tell the class numbers and confidence (should always be 1 for the relevant boxes) seem fine, it's the coordinates that get messed up somehow.
Am I missing something, or is this a bug in BBoxUtility?
Hi.
Thanks a lot for your SSD implementation. I tested your code on images and it worked fine.
But when I tried to apply it to a webcam, I changed vid_test.run('path/to/your/video.mkv') to vid_test.run(0), and also tried vid_test.run(), and got the following error:
python videotest_example.py
Using TensorFlow backend.
Traceback (most recent call last):
File "videotest_example.py", line 25, in <module>
vid_test.run()
File "/Users/walidahmed/Desktop/Code/ssd_keras-master/testing_utils/videotest.py", line 87, in run
vidw = vid.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH)
AttributeError: 'module' object has no attribute 'cv'
My cv2 version is '3.2.0-dev'.
Can you please advise?
Walid
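The cv2.cv submodule belongs to OpenCV 2.x; OpenCV 3 exposes the capture property constants directly on the cv2 module, e.g. cv2.CAP_PROP_FRAME_WIDTH. A minimal sketch of a lookup that works with either version; `frame_size_props` is a hypothetical helper, and it takes the module as an argument only so that it can be exercised without OpenCV installed.

```python
def frame_size_props(cv2_module):
    """Return the (width, height) capture property ids for the given OpenCV module."""
    if hasattr(cv2_module, "CAP_PROP_FRAME_WIDTH"):
        # OpenCV 3 and later: constants live on the top-level module.
        return cv2_module.CAP_PROP_FRAME_WIDTH, cv2_module.CAP_PROP_FRAME_HEIGHT
    # OpenCV 2.4: constants live under the legacy cv2.cv submodule.
    legacy = cv2_module.cv
    return legacy.CV_CAP_PROP_FRAME_WIDTH, legacy.CV_CAP_PROP_FRAME_HEIGHT
```

In videotest.py this would be used as `w_prop, h_prop = frame_size_props(cv2)` followed by `vidw = vid.get(w_prop)`. Other constants in the same file may need the same treatment.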
Hello @rykov8 ,
In the file ssd_utils.py, in the assign_boxes method, you mention:
assignment[:, -7:] are all 0. See loss for more details
I believe that assignment[:, -8:] counts the number of positive examples (examples assigned to a prior box); however, I have not found any use of all the zeros in assignment[:, -7:-1] in the loss function.
Did I miss something regarding their use? Or should we change the loss function so that it contains only the counter at the end?
If so, we could probably replace the counter from y_true[:, :, -8] with the assigned probability values for the ground truth bounding boxes for the background class in y_true[:, :, 4], right?
Thanks
Hi, I have a question about assigning boxes to the ground truth (gt).
In the matching step, the paper first matches each gt to the default box with the maximum IoU (1 default box per gt).
Second, it assigns any default box whose IoU with some gt is larger than 0.5. So, can a default box be assigned to multiple classes, or just to the one class with the maximum IoU?
For example, a default box B has an IoU of 0.6 with an aeroplane and 0.7 with a person in an image. Did you assign box B only to the person, or to both the aeroplane and the person?
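One common reading of the matching rule is that each default box is assigned to at most one ground-truth box, namely the one with the highest IoU above the threshold. The sketch below illustrates that reading in pure Python; it is not the repository's assign_boxes, and `match_prior` is a hypothetical name.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_prior(prior, gt_boxes, threshold=0.5):
    """Index of the ground-truth box with the highest IoU above threshold, else None."""
    best_idx, best_iou = None, threshold
    for idx, gt in enumerate(gt_boxes):
        overlap = iou(prior, gt)
        if overlap > best_iou:
            best_idx, best_iou = idx, overlap
    return best_idx


prior = [0.0, 0.0, 10.0, 10.0]
aeroplane = [0.0, 0.0, 10.0, 6.0]  # IoU 0.6 with the prior
person = [0.0, 0.0, 10.0, 7.0]     # IoU 0.7 with the prior
print(match_prior(prior, [aeroplane, person]))  # 1
```

Under this reading, box B in the example would be assigned only to the person.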
When I use VOC2007 data to train, everything is perfect.
But when I download 'cat' images from the ImageNet website, the training result is wrong, even though I changed the number of classes to 1 and the other corresponding places to 2.
Could anyone give a hint?
As title.
Hi, I'm trying to use your framework on my data with 512x512 input.
How should I evaluate my model?
I'm currently monitoring val_loss; should I monitor val_acc?
What does val_acc mean when using class and location per prior box?
What does a val_acc of 0.11 mean?
The paper mentions mAP. How can I calculate it?
Thank you very much for the help!
hi
In the last part of ssd_train.py, you add pos_conf_loss and neg_conf_loss.
When you calculate neg_conf_loss, you just select the top_k boxes from max_conf. However, I think max_conf can include positive boxes (matched to a gt), because you did not restrict max_conf to boxes with y_pred[:,:,-8] equal to 0 or y_pred[:,:,4] equal to 1. What do you think?
Also, I'm implementing SSD300 on Pascal VOC on my own. However, when I draw the confusion matrix for each epoch, most of the samples are biased toward the negative (background) class. Do you have any comments?
Hi, I am trying to load the model with 'th' image_dim_ordering and the 'tensorflow' backend, but it raises some errors. Is it possible to run the model with 'th' image_dim_ordering?
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "./test.py", line 30, in <module>
model = SSD300(input_shape, num_classes=NUM_CLASSES)
File "/home/dummy/ssd_keras/ssd.py", line 77, in SSD300
net['conv4_3_norm'] = Normalize(20, name='conv4_3_norm')(net['conv4_3'])
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in __call__
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 166, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/home/dummy/ssd_keras/ssd_layers.py", line 43, in call
output *= self.gamma
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 814, in binary_op_wrapper
return func(x, y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 987, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1613, in mul
result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2242, in create_op
set_shapes_for_outputs(ret)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1617, in set_shapes_for_outputs
shapes = shape_func(op)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1568, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 38 and 512 for 'mul' (op: 'Mul') with input shapes: [?,512,38,38], [512].
How should I understand the box encoding? Can anyone point me to some papers or materials? Thanks.
Thanks for the work you have done here. I agree that Caffe is painful to use.
A quick question. How did you port the weights from Caffe to Keras? Do you have code to do this?
The reason I ask is that I would like to port the coco trained weights from the original repo.
Hi, Great repo!
Have you tried reproducing the results on PASCAL VOC2007 reported in the original paper?
This information in the README would be very helpful!
When I run the SSD.py file, this line fails:
model.load_weights('weights_SSD300.hdf5', by_name=True)
It gives the error:
ValueError: Dimension 0 in both shapes must be equal, but are 64 and 3 for 'Assign_4' (op: 'Assign') with input shapes: [64,300,3,3], [3,3,3,64].
It seems weights_SSD300.hdf5 doesn't match your model. Could you please help me with this? Thank you very much.
Hi, your code worked fine when I applied it to images, video and a webcam.
I am trying to classify vehicles and pedestrians. I checked the file gt_pascal.pkl and read one of its values with:
import pickle
with open("gt_pascal.pkl", "rb") as f:  # pickle files should be opened in binary mode
    data = pickle.load(f)  # <type 'dict'>
print(data.get('frame05183.png'))
I have several questions about training and I hope you can help me:
1- Where is frame05183.png stored?
2- To do my training with 3 classes, I believe I will have to edit "gt_pascal.pkl", but where should I store my images?
3- What objects are you actually trying to detect in SSD_training.ipynb?
Thanks a lot
Hi again!
I'm trying to use your implementation on a different problem from the PASCAL VOC dataset. In my case, I need to identify much smaller objects (ground truth boxes are 50x50 px in 768x1024 images).
From what I've seen so far, min_size and max_size determine the dimensions of the default boxes. Are these parameters expressed in pixels, or in something else? In the paper they talk about scales, with values ranging from 0 to 1, and I'm not sure whether you implemented a different version that is conceptually the same, or whether I'm mixing up concepts.
Thanks in advance!
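The two descriptions can be related with a small calculation. The SSD paper defines the scale of the k-th prediction layer as s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1) for k = 1..m, with s_min = 0.2 and s_max = 0.9, as a fraction of the input size. The sketch below assumes that min_size and max_size are the same quantity expressed in pixels of the network input (for example 300); that is an assumption about this port, and the values it actually uses may follow a different schedule.

```python
def paper_scales(num_layers, s_min=0.2, s_max=0.9):
    """Scales s_1..s_m from the SSD paper, as fractions of the input size."""
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def to_pixels(scale, image_size):
    """Convert a relative scale to a box side in pixels of the network input."""
    return scale * image_size


for s in paper_scales(6):
    print(round(s, 2), round(to_pixels(s, 300)))
```

For comparison, a 50-pixel object in a 1024-pixel-wide image covers roughly 0.05 of the width, well below s_min = 0.2, so smaller box sizes (or an earlier feature map) would likely be needed for such objects.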
Hello, thank you so much for this Keras implementation of SSD!
I have successfully ported the caffe weights for SSD trained on the COCO dataset (300x300 input, 80+1 classes), and now I'm trying to utilize these weights to help retrain SSD on my specific problem.
I need SSD to output 200-some attributes instead of 81 object classes, and since one object can have multiple attributes, I need SSD to output class scores that don't sum to 1.
So I tried just re-training without any major changes (had to randomly initialize the weights of 6 layers that relied on COCO's 81 class output, but loaded the rest just fine), and my training loss was stuck at around 200.
I then realized this would never train because the class score outputs are normalized to sum to 1, so I changed the Activation function on the last layer of SSD from "softmax" to "sigmoid" (maybe I should use "tanh" instead?), and I'm currently training successfully I think, but I won't know for a while. The loss started at 32 and is now down to 7 after 8000 samples, and still decreasing nicely.
Anyways, I was just wondering about SSD's custom loss function, since I see it uses a softmax loss for conf_loss, in ssd_training.py. Should I change this to some other loss function? If so, which one?
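For outputs that are independent attributes rather than mutually exclusive classes, a common general choice is a per-attribute binary cross-entropy on sigmoid outputs instead of softmax cross-entropy. The sketch below shows that loss for a single box in pure Python; it is a generic illustration, not the loss in ssd_training.py, and whether it is the right replacement inside SSD's hard-negative mining is not something the source confirms.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(targets, logits, eps=1e-7):
    """Sum of independent per-attribute log losses for one box."""
    total = 0.0
    for t, z in zip(targets, logits):
        # Clip the probability so that log() never receives 0.
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


# Two attributes, both predicted with probability 0.5: the loss is 2 * ln(2).
print(binary_cross_entropy([1.0, 0.0], [0.0, 0.0]))
```

Unlike softmax, each attribute's probability is computed independently here, so several attributes can be close to 1 for the same box.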
I found that the outputs are taken from conv4_3, fc7, conv6_2, conv7_2, conv8_2, pool6,
but in the paper they are taken from conv4_3, fc7, conv9_2, conv10_2, conv11_2.
The number of default boxes is also different: this model uses either 3 or 6, but the paper uses 4 or 6.
So does this model still reach the same level of performance as the paper?
Thank you
In ssd_training.ipynb, the following code in the function random_sized_crop seems problematic:
if (x_rel < cx < x_rel + w_rel and y_rel < cy < y_rel + h_rel):
xmin = (box[0] - x) / w_rel
ymin = (box[1] - y) / h_rel
xmax = (box[2] - x) / w_rel
ymax = (box[3] - y) / h_rel
Since the coordinates in box[:] are relative coordinates, I think these lines should be
if (x_rel < cx < x_rel + w_rel and y_rel < cy < y_rel + h_rel):
xmin = (box[0] - x_rel) / w_rel
ymin = (box[1] - y_rel) / h_rel
xmax = (box[2] - x_rel) / w_rel
ymax = (box[3] - y_rel) / h_rel
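A small worked example of the proposed version, assuming (as the post does) that both the box and the crop window are expressed in relative coordinates in [0, 1]. The function name is hypothetical, and the final clipping step is an assumption added for boxes that extend beyond the crop.

```python
def box_in_crop(box, x_rel, y_rel, w_rel, h_rel):
    """Re-express a relative-coordinate box in the frame of a relative crop window."""
    xmin = (box[0] - x_rel) / w_rel
    ymin = (box[1] - y_rel) / h_rel
    xmax = (box[2] - x_rel) / w_rel
    ymax = (box[3] - y_rel) / h_rel
    # Clip to the crop so coordinates stay within [0, 1].
    return [min(max(v, 0.0), 1.0) for v in (xmin, ymin, xmax, ymax)]


# A centred box, cropped by a window covering the central half of the image.
print(box_in_crop([0.4, 0.4, 0.6, 0.6], 0.25, 0.25, 0.5, 0.5))
```

The box that spanned 0.4 to 0.6 of the full image spans approximately 0.3 to 0.7 of the crop, which is what one would expect when the crop is half the image size and centred.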
How to change the model to ssd500?
Do I need to write a new SSD500 architecture, or just change the input size to 500?
Does prior_boxes_ssd300.pkl need to be changed?
Hi again
I have a question about calculating gradient
In ssd_training.py, it seems you return a loss for each element of the batch, not a loss averaged over the whole batch.
Is this correct?
First of all, thank you for your work and helping with all the issues.
I want to use the code commercially. Does the MIT license in LICENSE apply to the code only, or also to the downloadable weights? If not, what is the license on the weights?