zehaos / mobilenet

MobileNet built with TensorFlow

License: Apache License 2.0

Python 99.30% Shell 0.70%

detection mobilenets multigpu slim tensorflow

mobilenet's Issues

Sharing your frozen graph?

Hey Zehaos,
Thanks a lot for this repo! I see that in tools/test_forzen_graph.py, you seem to have a frozen graph with a placeholder inserted:

graph_filename = "../mobilenet-model/with_placeholder/frozen_graph.pb"

Would you mind sharing your frozen graph file? Thanks a lot in advance.

TypeError: __init__() got an unexpected keyword argument 'load_instance_masks'

I ran into this problem right after running train.py while training on my own dataset, and I don't know what causes it.

Problem on running train image classifier

I'm trying to run train_image_classifier.py on MNIST dataset using CPU, but I get the below error:

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, input must have 3 channels but input only has 1 channels.
	 [[Node: distort_image/distort_color_3/adjust_saturation/RGBToHSV = RGBToHSV[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](distort_image/distort_color_3/adjust_saturation/Identity)]]

What should I do?
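
One possible workaround, sketched under assumptions: the error says the colour-distortion step expects 3-channel input, so a single-channel MNIST image could be expanded to three identical channels before preprocessing. In TensorFlow, `tf.image.grayscale_to_rgb` does this on tensors. The plain-Python helper below (`gray_to_rgb` is a hypothetical name, not part of the repo) only illustrates the shape change on nested lists.

```python
def gray_to_rgb(image):
    """Repeat the single channel of an H x W x 1 image three times.

    `image` is a nested list indexed as image[row][col][channel].
    Hypothetical helper for illustration; not part of the repository.
    """
    rgb = []
    for row in image:
        new_row = []
        for pixel in row:
            if len(pixel) != 1:
                raise ValueError("expected 1 channel, got %d" % len(pixel))
            new_row.append([pixel[0]] * 3)  # copy the grey value into R, G, B
        rgb.append(new_row)
    return rgb


# A 2 x 2 single-channel image.
gray = [[[0.0], [0.5]],
        [[1.0], [0.25]]]
rgb = gray_to_rgb(gray)
print(len(rgb), len(rgb[0]), len(rgb[0][0]))
```

Whether replicated grey channels are a good fit for colour augmentations such as saturation changes is a separate question; disabling those distortions for MNIST may be the simpler route.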

train_mobilenet_on_imagenet.sh

Hello! Can train_mobilenet_on_imagenet.sh be used to train the MobileNet model on the flowers dataset?

Is there pb file available for downloading?

Or can you suggest how to convert a ckpt to a pb file?
Thanks!

Does your code apply ReLU first and then BN after the conv?

In your depthwise_separable function
`def _depthwise_separable_conv(inputs,
                              num_pwc_filters,
                              width_multiplier,
                              sc,
                              downsample=False):
    """ Helper function to build the depth-wise separable convolution layer.
    """
    num_pwc_filters = round(num_pwc_filters * width_multiplier)
    _stride = 2 if downsample else 1

    # skip pointwise by setting num_outputs=None
    depthwise_conv = slim.separable_convolution2d(inputs,
                                                  num_outputs=None,
                                                  stride=_stride,
                                                  depth_multiplier=1,
                                                  kernel_size=[3, 3],
                                                  scope=sc+'/depthwise_conv')

    bn = slim.batch_norm(depthwise_conv, scope=sc+'/dw_batch_norm')
    pointwise_conv = slim.convolution2d(bn,
                                        num_pwc_filters,
                                        kernel_size=[1, 1],
                                        scope=sc+'/pointwise_conv')
    bn = slim.batch_norm(pointwise_conv, scope=sc+'/pw_batch_norm')
    return bn`

slim.separable_convolution2d() applies the ReLU activation function as part of the layer, so your code runs ReLU first and then slim.batch_norm.
But the MobileNet paper applies BN first and then ReLU!
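
The two orderings are not interchangeable, which a small numeric check can show. This is a minimal sketch, assuming a simplified batch norm that only normalises a 1-D list to zero mean and unit variance (no learned scale or shift); `relu` and `batch_norm` here are illustrative helpers, not the slim functions.

```python
import math


def relu(xs):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in xs]


def batch_norm(xs, eps=1e-5):
    """Simplified batch norm: zero mean, unit variance, no scale/shift."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


conv_out = [-2.0, -1.0, 0.0, 1.0, 2.0]

relu_then_bn = batch_norm(relu(conv_out))  # order described in this issue
bn_then_relu = relu(batch_norm(conv_out))  # order described in the paper

print("ReLU -> BN:", relu_then_bn)
print("BN -> ReLU:", bn_then_relu)
```

With ReLU first, the block can output negative values, because BN re-centres the rectified activations; with BN first, the output is non-negative. In TF-Slim, one common way to get conv -> BN -> ReLU is to pass `normalizer_fn=slim.batch_norm` to the convolution layer, or to set `activation_fn=None` on the convolution and apply the activation after an explicit batch norm.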

Fused-BN

I saw inference time will be faster if Fused-BN is on. How do I turn it on?
Thanks

The TensorFlow MobileNet link needs updating; the existing one is dead

The existing link is https://github.com/tensorflow/models/tree/master/official, which returns 404 Not Found. It should be updated to
https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

Batch norm and ReLU reversed?

Hi, according to the paper (and as you posted in the README), the basic building block is:

But if I follow the TF-Slim code and your code, I think you have depthwise -> ReLU -> BN -> Pointwise -> ReLU -> BN.

The ReLU is part of slim convolution layers.

Also you seem to be using separable_convolution2d with num_outputs=None to get a depthwise convolution, you can just use https://www.tensorflow.org/api_docs/python/tf/nn/depthwise_conv2d instead, no?

trained model error

I have trained MobileNet on my own dataset. Training ran for 700000 iterations, the loss is 2e-3, and the images in TensorBoard look right. During training I got this warning:

WARNING:tensorflow:Error encountered when serializing LAYER_NAME_UIDS.
Type is unsupported, or the types of the items don't match field type in CollectionDef.

But when I run the eval_image_classifier test, the accuracy is very low:
2017-05-26 19:38:54.825458: I tensorflow/core/kernels/logging_ops.cc:79] eval/Accuracy[0.13553691]
2017-05-26 19:38:54.825467: I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_5[0.26291946]
I wrote the eval as:
network_fn = nets_factory.get_network_fn(
'mobilenet',
num_classes=(59532),
is_training=False,
width_multiplier=1.0)
logits, end_points = network_fn(input_tensor)
...
predict_values, logit_values = sess.run([end_points['squeeze'], logits], feed_dict={input_tensor: img})

It says MobileNet/Conv1/Weight2 is not initialized.

Is anything wrong?

Need for other pre-trained weights of MobileNet (width_multiplier=[0.25, 0.5, 0.75])

Hello,
Thanks a lot for releasing the implementation of MobileNets!
I downloaded mobilenet_width_1.0 and used the head of the model weights to initialize another task, and it performs well! I am now training MobileNet with width_multiplier=[0.25, 0.5, 0.75] on ImageNet to get pre-trained weights, because of the model size and inference time, but the loss converges too slowly. Would you please release the weights for the other width_multiplier values if you have trained them before? :)

Thanks!

Training on the VOC dataset

Hello, have you tried training on VOC2007? Which parameters in the config need to be changed? Please advise.

Mobilenet vs Mobilenetdet

What's the difference between mobilenet and mobilenetdet?

Bug in batch_delta

The batch_delta function in mobilenetdet.py divides delta_x by the box width instead of the anchor width. The SSD paper describes division by the anchor width, and that is also what interpre_detection uses.
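
For reference, the encoding described in the SSD paper can be sketched as follows. This is a minimal illustration with boxes given as (cx, cy, w, h); `encode_box` and `decode_box` are hypothetical names and this is not the repository's batch_delta code.

```python
import math


def encode_box(box, anchor):
    """Offsets of `box` relative to `anchor`; both are (cx, cy, w, h)."""
    cx, cy, w, h = box
    acx, acy, aw, ah = anchor
    return ((cx - acx) / aw,   # centre shift divided by the ANCHOR width
            (cy - acy) / ah,   # centre shift divided by the ANCHOR height
            math.log(w / aw),  # log ratio of box width to anchor width
            math.log(h / ah))  # log ratio of box height to anchor height


def decode_box(delta, anchor):
    """Inverse of encode_box: recover (cx, cy, w, h) from the offsets."""
    dx, dy, dw, dh = delta
    acx, acy, aw, ah = anchor
    return (acx + dx * aw,
            acy + dy * ah,
            aw * math.exp(dw),
            ah * math.exp(dh))


anchor = (50.0, 50.0, 20.0, 40.0)
box = (55.0, 46.0, 30.0, 35.0)

delta = encode_box(box, anchor)
restored = decode_box(delta, anchor)
print(delta)
print(restored)
```

Dividing by the anchor size in both directions is what makes the round trip consistent; if the encoder divided by the box width while the decoder multiplied by the anchor width, `restored` would no longer match `box`.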

How to train with my own images

Train MobileNet Detector (Debugging)

Prepare KITTI data.
After downloading the KITTI data, you need to split it into train/val sets.

cd /path/to/kitti_root
mkdir ImageSets
cd ./ImageSets
ls ../training/image_2/ | grep ".png" | sed s/.png// > trainval.txt
python ./tools/kitti_random_split_train_val.py
I don't quite understand this step. Can my own images be used with the KITTI format?

The pretrained weight link is closed.

Could you fix the link for the pretrained weight?

prediction time goes up gradually

I followed this tutorial to retrain MobileNet, and I am calling the Predict() function in a loop. I printed out how much time each prediction takes: it starts at around 0.6 seconds and goes up gradually. My PC specs are i7 4790, GTX 1070, 16 GB RAM.

Here's the python script using the model.

import operator

import numpy as np
import tensorflow as tf

def load_graph(model_file):
  graph = tf.Graph()
  graph_def = tf.GraphDef()

  with open(model_file, "rb") as f:
    graph_def.ParseFromString(f.read())
  with graph.as_default():
    tf.import_graph_def(graph_def)

  return graph

def read_tensor_from_image_file(file_name, input_height=299, input_width=299,
                input_mean=0, input_std=255):
  input_name = "file_reader"
  output_name = "normalized"
  file_reader = tf.read_file(file_name, input_name)

  image_reader = tf.image.decode_jpeg(file_reader, channels = 3,
                                        name='jpeg_reader')
  float_caster = tf.cast(image_reader, tf.float32)
  dims_expander = tf.expand_dims(float_caster, 0)
  resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
  normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
  sess = tf.Session()
  result = sess.run(normalized)

  return result

def load_labels(label_file):
  label = []
  proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
  for l in proto_as_ascii_lines:
    label.append(l.rstrip())
  return label


model_file = "tactic-model//retrained_graph.pb"
label_file = "tactic-model//retrained_labels.txt"

graph = load_graph(model_file)
labels = load_labels(label_file)

def Predict(file_name):

  input_height = 224
  input_width = 224
  input_mean = 128
  input_std = 128
  input_layer = "input"
  output_layer = "final_result"

  t = read_tensor_from_image_file(file_name,
                                  input_height=input_height,
                                  input_width=input_width,
                                  input_mean=input_mean,
                                  input_std=input_std)

  input_name = "import/" + input_layer
  output_name = "import/" + output_layer
  input_operation = graph.get_operation_by_name(input_name)
  output_operation = graph.get_operation_by_name(output_name)


  with tf.Session(graph=graph) as sess:
    results = sess.run(output_operation.outputs[0],
                      {input_operation.outputs[0]: t})
  results = np.squeeze(results)

  index, Value = max(enumerate(results), key = operator.itemgetter(1))

  return labels[index]

What are the default hyperparameters for training mobilenet on imagenet?

I think the MobileNet paper mentions that they trained in a similar way to Inception v3. In the Inception v3 paper, they used a learning rate decay of 0.94 every 2 epochs, which is what you defined here: https://github.com/Zehaos/MobileNet/blob/master/train_image_classifier.py

However, on the training script here: https://github.com/Zehaos/MobileNet/blob/fce86531a7a93ccee5e31893e8587b7076889f24/scripts/train_mobilenet_on_imagenet.sh

the learning rate is instead 0.1 (Inception v3 used 0.045), and the decay rate is 0.1 every 30 epochs. Which one gave you better performance? I am testing the retraining performance, and using the 0.1 learning rate with 0.1 decay every 30 epochs seems to give me a worse loss than the loss graph you showed. Also, how many epochs in total did you train the model for?
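
To see how far apart the two schedules are, they can be tabulated side by side. This is a minimal sketch that assumes a staircase exponential decay (the rate drops in steps every fixed number of epochs); `staircase_lr` is a hypothetical helper, and the values 0.045/0.94/2 and 0.1/0.1/30 are the ones quoted in this issue.

```python
def staircase_lr(initial_lr, decay_factor, epochs_per_decay, epoch):
    """Learning rate after `epoch` epochs with step-wise exponential decay."""
    num_decays = int(epoch // epochs_per_decay)
    return initial_lr * decay_factor ** num_decays


for epoch in (0, 10, 30, 60, 90):
    inception_style = staircase_lr(0.045, 0.94, 2, epoch)
    script_style = staircase_lr(0.1, 0.1, 30, epoch)
    print("epoch %3d   0.045 x 0.94/2 epochs: %.6f   0.1 x 0.1/30 epochs: %.6f"
          % (epoch, inception_style, script_style))
```

The first schedule decays smoothly from a small starting rate, while the second holds a larger rate for 30 epochs and then drops by a factor of ten, so loss curves from the two are not directly comparable epoch by epoch.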

Thanks for your help. The imagenet training is indeed not as easy with just 1 GPU.

Benchmark on GPU

Hello.

Thank you for your great implementation on MobileNet.

I am curious whether you benchmarked MobileNet against its original convolution version in TensorFlow. It is known that when the number of groups equals the number of output channels, the network performs worse on GPU than its full-convolution version. For example, PyTorch has an issue like this: pytorch/pytorch#1708.

Thank you for your help!

error in running scripts/train_mobilenetdet_on_kitti.sh

Hi,

When trying to run scripts/train_mobilenetdet_on_kitti.sh I'm getting
ValueError: Cannot reshape a tensor with 7884864 elements to shape [39,12,9,2] (8424 elements) for 'Reshape_7' (op: 'Reshape') with input shapes: [1971216,4], [4] and with input tensors computed as partial shapes: input[1] = [39,12,9,2].

Please advise,
Ofer
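
The numbers in the error message can be checked by hand: a reshape only succeeds when both shapes hold the same number of elements. A minimal sketch of that check, using the shapes quoted above (`num_elements` is a hypothetical helper):

```python
from functools import reduce
import operator


def num_elements(shape):
    """Total number of elements in a tensor of the given shape."""
    return reduce(operator.mul, shape, 1)


source_shape = (1971216, 4)     # input shape reported in the error
target_shape = (39, 12, 9, 2)   # requested shape reported in the error

have = num_elements(source_shape)
want = num_elements(target_shape)

print("source elements:", have)
print("target elements:", want)
print("whole multiple: ", have % want == 0, "ratio:", have // want)
```

The source tensor holds a whole multiple of the target's element count, which may hint that some dimension (for example the anchor grid or the batch) is sized differently from what the script expects. That reading is an assumption, not a diagnosis confirmed by the source.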

Running Feedforward

Hi,

Nice work! I am interested in benchmarking MobileNet vs. YOLOv2 for object detection on CPU. However, I found it a bit of a headache to find the correct way to run a forward pass to get the classification and bbox. Can you explain in a bit more detail how to do it?

Is there a graph.pbtxt and checkpoint file available after your imagenet training?

Thanks for your pre-trained model! I'd just like to ask whether you have the graph.pbtxt and checkpoint file (not the data file you already uploaded) available as well. When you run checkpoint_file = tf.train.latest_checkpoint(logdir), where logdir is the folder storing the files you uploaded, the correct checkpoint is not found. I believe there should be 5 files after training:

model.ckpt.meta
model.ckpt.index
model.ckpt.data-00000-of-00001
checkpoint
graph.pbtxt

The checkpoint file should contain the latest checkpoint information for the latest_checkpoint function to work.
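
As a possible workaround when only the weight files are available, the small `checkpoint` state file can be recreated by hand. It is a text file whose `model_checkpoint_path` entry names the checkpoint prefix. The sketch below assumes the prefix is `model.ckpt`, as in the file list above; `write_checkpoint_state` is a hypothetical helper, not a TensorFlow function.

```python
import os
import tempfile


def write_checkpoint_state(logdir, checkpoint_prefix):
    """Write a minimal `checkpoint` state file pointing at `checkpoint_prefix`."""
    path = os.path.join(logdir, "checkpoint")
    with open(path, "w") as f:
        f.write('model_checkpoint_path: "%s"\n' % checkpoint_prefix)
        f.write('all_model_checkpoint_paths: "%s"\n' % checkpoint_prefix)
    return path


# A temporary directory stands in for the folder holding the uploaded files.
logdir = tempfile.mkdtemp()
state_file = write_checkpoint_state(logdir, "model.ckpt")

with open(state_file) as f:
    content = f.read()
print(content)
```

Alternatively, passing the checkpoint prefix itself (for example the path ending in `model.ckpt`) to the restore call avoids the `latest_checkpoint` lookup altogether.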

Train on custom dataset

Hi,

I have a custom dataset containing 2 classes, each one has 20,000 image samples. How can I train MobileNet on this dataset?

Training - Hyper-parameters

Hello,

Thanks a lot for releasing the implementation and pre-trained weights of MobileNets!
I am also trying to reproduce the accuracy reported in the paper, and I am having a bit of trouble finding the optimal hyper-parameters for training. What is your experience on the subject? Any tips? :)

Thanks!

Custom image training

Hi,

I need to train MobileNet with some custom images. I tried the KITTI folder structure, but it doesn't work. Can you give me some help?

Federico

Can't find the proper output node name

Hi Zehaos,

First of all, thanks for sharing your work!

In order to run this model on an iOS device, I was trying to use freeze_graph.py to freeze your ckpt files, but I cannot find the right name for the output node.

Do you have any idea which output node I should use to freeze your ckpt file?

Many thanks!

def set_anchors(img_shape, fea_shape):

In mobilenetdet.py, the set_anchors(img_shape, fea_shape) function has a bug.

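Object detection implementation

Hello, we just tried fine-tuning your model on our own dataset (traffic sign classification), and the results look good. I see that the object detection part is still being debugged; which parts are currently incomplete? We are short on time and need to finish this functionality as soon as possible, so we would like to see whether we can complete the remaining parts ourselves.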

Error in python: double free or corruption(!prev): 0x00000000016abbd0 ***

The code that triggers the error is at line 179 of eval_image_classifier.py.
Environment: ubuntu 14.04. tensorflow version: 1.0.0

GPU memory size

Hello, did you use two GTX 1060s when training (batch_size=128)? When I use a Titan X, I get an error saying there is not enough GPU memory.

Questions about how to run

I saw many models in the nets folder. Most of them seem to have a corresponding test.py. What are they testing? They seem to run a forward pass on a random array. Is that testing speed? If so, why does it take 3~5 s per batch? That's quite slow. Hoping for help.

mobilenet performs worse than inceptionv1 on size and speed

I trained a mobilenet model using tensorflow flowers dataset. I'm not sure why the model size is larger than InceptionV1 and it costs more time on inference compared to inceptionV1.

Inference always returns 997 using pretrained weight

Hi, I am trying to use the pre-trained weights to test the model, but it always returns 997 as the result. The code I use is listed below.

sess = tf.Session()
input_tensor = tf.placeholder(tf.float32, [224, 224, 3])
preprocessed_image = preprocess_for_eval(input_tensor, 224, 224)
preprocessed_image_batch = tf.expand_dims(preprocessed_image, 0)
args_scope = mobilenet_arg_scope()
with slim.arg_scope(args_scope):
     logits, end_points = mobilenet(
                inputs=preprocessed_image_batch,
                num_classes=1001,
                is_training=False,
                width_multiplier=1,
                scope='MobileNet'
            )

saver = tf.train.Saver()
saver.restore(sess, checkpoint_file_path)

with tf.gfile.FastGFile(image_path, 'rb') as f:
      image_data = f.read()
image = tf.image.decode_jpeg(image_data)

predict_values, logit_values = self.sess.run(
            [end_points['Predictions'], logits],
            feed_dict={input_tensor: image.eval(session=sess)}
        )
print(np.argmax(predict_values[0]))

May I know where I should look to track down this weird behaviour?

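Fused Batch-Norm causes the quantized model to crash at inference

Hi,
Thanks for the code and models you provided. I just fine-tuned your model on the flowers dataset with Fused Batch-Norm enabled. After quantizing the fine-tuned model, the program crashes at inference with the following error: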
tensorflow.python.framework.errors_impl.InvalidArgumentError: requested_output_max must be >= requested_output_min, but got -nan and 0
[[Node: MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_requantize = Requantize[Tinput=DT_QINT32, out_type=DT_QUINT8, _device="/job:localhost/replica:0/task:0/cpu:0"](MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_quantized_conv, MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_quantized_conv:1, MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_quantized_conv:2, MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_requant_range, MobileNet/conv_ds_7/pointwise_conv/convolution_eightbit_requant_range:1)]]

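Have you done any testing of Fused-BN + Quantized? Or do you have any suggestions about the error above? Thanks!

Loss is large and keeps oscillating

@Zehaos Hello, MobileNet is great! My current work is to convert the last 7 convolution layers of my network to the mobileconv form, freeze the earlier parameters, initialize from the already-trained model, and then train again. The goal is to see whether the mobileconv form can replace the original conv. But some problems have come up.

With the solver hyperparameters set to the general settings you mentioned (--optimizer=sgd; --learning_rate=0.00005; --learning_rate_decay_factor=0.5), the network's loss never reaches the accuracy of the ordinary conv;
The loss keeps oscillating and no longer decreases.
Could you give some suggestions on how to solve these problems?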

Get location of objects in MobileNets

Hi,

I want to draw a bounding box around different objects which were used to train MobileNets.
Is it possible to get the location of different objects in an image using MobileNets?

Thanks.

About RMSPropOptimizer

Hello, I get this error during training: 'RMSPropOptimizer' object has no attribute 'apply_gradientsls'. What is the cause?

YellowFin optimizer has been integrated.

I have no GPU resources to train on ImageNet. I have validated it on the MNIST dataset and got 97.64% accuracy.

Call for training ~_~

Best,
Zehao

ValueError: Can not squeeze dim[1], expected a dimension of 1, got 3 for 'MobileNet/SpatialSqueeze' (op: 'Squeeze') with input shapes: [32,3,17,1024].

Dear Zehao Shi,

I am receiving the following error while running your model.

(?, ?, 3)
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "C:\ProgramData\Anaconda3\lib\contextlib.py", line 89, in __exit__
next(self.gen)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Can not squeeze dim[1], expected a dimension of 1, got 3 for 'MobileNet/SpatialSqueeze' (op: 'Squeeze') with input shapes: [32,3,17,1024].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/hp/PycharmProjects/Mobilenets/MobileNet-master/train_object_detector.py", line 628, in <module>
tf.app.run()
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "C:/Users/hp/PycharmProjects/Mobilenets/MobileNet-master/train_object_detector.py", line 524, in main
clones = model_deploy.create_clones(deploy_config, clone_fn, [batch_queue])
File "C:\Users\hp\PycharmProjects\Mobilenets\MobileNet-master\deployment\model_deploy.py", line 195, in create_clones
outputs = model_fn(*args, **kwargs)
File "C:/Users/hp/PycharmProjects/Mobilenets/MobileNet-master/train_object_detector.py", line 499, in clone_fn
end_points = network_fn(images)
File "C:\Users\hp\PycharmProjects\Mobilenets\MobileNet-master\nets\nets_factory.py", line 112, in network_fn
return func(images, num_classes, is_training=is_training, width_multiplier=width_multiplier)
File "C:\Users\hp\PycharmProjects\Mobilenets\MobileNet-master\nets\mobilenet.py", line 83, in mobilenet
net = tf.squeeze(net, [1, 2], name='SpatialSqueeze')
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2259, in squeeze
return gen_array_ops._squeeze(input, axis, name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3378, in _squeeze
squeeze_dims=squeeze_dims, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2338, in create_op
set_shapes_for_outputs(ret)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1719, in set_shapes_for_outputs
shapes = shape_func(op)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1669, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Can not squeeze dim[1], expected a dimension of 1, got 3 for 'MobileNet/SpatialSqueeze' (op: 'Squeeze') with input shapes: [32,3,17,1024].

Please suggest a solution. Thanks!
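
The shape rule behind this error can be sketched without TensorFlow. Squeezing axes [1, 2] only works when both have size 1, whereas the feature map here is 3 x 17, presumably because the detector's input images are larger than the classifier expects (an assumption; the source does not say). `squeeze_shape` and `global_pool_shape` below are hypothetical helpers that only track shapes.

```python
def squeeze_shape(shape, axes):
    """Shape after squeezing `axes`; fails unless each of them has size 1."""
    for axis in axes:
        if shape[axis] != 1:
            raise ValueError(
                "Can not squeeze dim[%d], expected a dimension of 1, got %d"
                % (axis, shape[axis]))
    return [d for i, d in enumerate(shape) if i not in axes]


def global_pool_shape(shape, axes):
    """Shape after averaging over `axes` (e.g. a mean over height and width)."""
    return [d for i, d in enumerate(shape) if i not in axes]


classifier_features = [32, 1, 1, 1024]   # what SpatialSqueeze expects
detector_features = [32, 3, 17, 1024]    # what the traceback reports

print(squeeze_shape(classifier_features, [1, 2]))
try:
    squeeze_shape(detector_features, [1, 2])
except ValueError as err:
    print("squeeze failed:", err)
print(global_pool_shape(detector_features, [1, 2]))
```

In TensorFlow, averaging over height and width (for instance `tf.reduce_mean(net, [1, 2])`) produces a [batch, channels] tensor for any spatial size. Whether the classification head should be applied at all when training the detector is not something this issue settles.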

Accuracy difference on PC and Phone

Hello everyone,
I have a question. I used MobileNets for classification and deployed the model on Android and on a PC. The problem is that I get different prediction values on Android and on the PC (Linux) for the same picture and model. The difference is about 5% on average.
What can cause this difference?
Is it because of the different decoding techniques used on the phone and the PC?

I checked the input tensor fed into MobileNets on the phone and the PC and found some differences in the pixel values.

regularization loss is large

I tried your pre-trained MobileNet model. The regularization loss is large, about 5.25. Have you noticed this? I'm not sure whether the large regularization loss means overfitting. If you trained the model using slim, I believe the weight_decay parameter can help reduce this loss.

MobileNet for semantic segmentation

Has anybody tested MobileNet for semantic segmentation?

Why is the CPU speed so fast?

Thanks.

Problem: OOM when allocating tensor with shape[4,7,7,1024]

I commented out the evaluation script in train_mobilenet_on_imagenet.sh,
and now running only the training raises this error: ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[64,32,112,112]
Then I changed to --batch_size=4, which runs, but after a while it still fails with OOM when allocating tensor with shape[4,7,7,1024]
Any guidance would be appreciated.
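
The size of the tensors named in the two messages can be estimated directly. This is a minimal sketch assuming float32 values (4 bytes per element); `tensor_bytes` is a hypothetical helper.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one dense tensor, assuming float32 by default."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


for shape in ([64, 32, 112, 112], [4, 7, 7, 1024]):
    size = tensor_bytes(shape)
    print(shape, "%d bytes" % size, "(%.2f MiB)" % (size / (1024.0 * 1024.0)))
```

The first tensor is roughly 98 MiB and the second is under 1 MiB. A failure on such a small allocation suggests that the memory is already taken up elsewhere, for example by other tensors in the graph or by another process on the same GPU, rather than by this tensor alone. That is an inference from the numbers, not something the logs confirm.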

Incorrect comment on the top of train_mobilenetdet_on_kitti.sh

This script performs the following operations:

1. Trains a MobileNet model on the Imagenet training set.

2. Evaluates the model on the Imagenet validation set.

"
https://github.com/Zehaos/MobileNet/blob/master/scripts/train_mobilenetdet_on_kitti.sh

Thank you for making these scripts available!

I get an error: TypeError: separable_convolution2d() got an unexpected keyword argument 'data_format'

Which TensorFlow version are you using?
I get an error:
TypeError: separable_convolution2d() got an unexpected keyword argument 'data_format'
@Zehaos

Does train_mobilenet_on_imagenet.py have to run on a GPU? On CPU, the following error occurs

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'clone_1/gradients/clone_1/MobileNet/conv_1/batch_norm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs': Could not satisfy explicit device specification '/device:GPU:1' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
[[Node: clone_1/gradients/clone_1/MobileNet/conv_1/batch_norm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/device:GPU:1"](clone_1/gradients/clone_1/MobileNet/conv_1/batch_norm/moments/sufficient_statistics/Sub_grad/Shape, clone_1/gradients/clone_1/MobileNet/conv_1/batch_norm/moments/sufficient_statistics/Sub_grad/Shape_1)]]

command line for training

What is the exact command line you used to train (for the run where you got 66.51% top-1 accuracy)?

Thanks

Error occurs: errors_impl.NotFoundError: /data/ShizehaoDataset/mobilenet-model when training on KITTI

I've changed all the required paths in the .sh file for KITTI training, but it throws the following error. The weirdest thing is that I've searched the whole project and there is no such string "/data/ShizehaoDataset/mobilenet-model".
So strange...

 File "train_object_detector.py", line 622, in main
    init_fn=_get_init_fn(),
  File "train_object_detector.py", line 371, in _get_init_fn
    checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1482, in latest_checkpoint
    if file_io.get_matching_files(v2_path) or file_io.get_matching_files(
  File "/usr/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 269, in get_matching_files
    compat.as_bytes(filename), status)]
  File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: /data/ShizehaoDataset/mobilenet-model

Loading mobilenet checkpoint issue:

 bash ./scripts/train_mobilenet_on_imagenet.sh
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
INFO:tensorflow:Fine-tuning from /home/gabbar/Downloads/mobilenet/graph.pbtxt
2017-06-28 12:42:27.146565: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-28 12:42:27.146602: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-28 12:42:27.146611: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-28 12:42:27.146619: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-28 12:42:27.146629: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-28 12:42:27.284000: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-06-28 12:42:27.284547: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:09:00.0
Total memory: 11.92GiB
Free memory: 11.47GiB
2017-06-28 12:42:27.284572: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 
2017-06-28 12:42:27.284581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y 
2017-06-28 12:42:27.284594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:09:00.0)
INFO:tensorflow:Restoring parameters from /home/gabbar/Downloads/mobilenet/graph.pbtxt
2017-06-28 12:42:28.056787: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057147: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057209: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057251: W tensorflow/core/framework/op_kernel.cc:1158] Data loss: Unable to open table file /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057356: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057430: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057446: W tensorflow/core/framework/op_kernel.cc:1158] Data loss: Unable to open table file /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-06-28 12:42:28.057653: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/gabbar/Downloads/mobilenet/graph.pbtxt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

I specified the .pbtxt file location in --checkpoint_path under train_image_classifier in train_mobilenet_on_imagenet.sh, but I get the above error. How do I use the pre-trained model you have uploaded?