Comments (29)
Unit testing is used for testing functionality; I don't know whether it is a standard way to test speed.
from mobilenet.
I see. It's quite slow, and I didn't find where you test the speed. Could you please show me that part of the code? Hoping for your help @Zehaos
I see~ You can use Python's 'time' package, or timeline, which seems to be a more powerful tool, but I have no experience with it yet.
@SunAnLan It probably depends on your batch size and whether you're using CPU or GPU. For me, the implementation trains at least 3x faster than Inception v3.
I see~ Thank you very much @Zehaos
Interesting...
@kwotsin, could you share your hardware and the inference time? Thanks.
Did you try to run the forward pass? For me it's quite slow. @kwotsin
You can measure inference time in the testForward function:
import time
start_time = time.time()
output = sess.run(logits)  # forward pass
duration = time.time() - start_time
@Zehaos What is the total training time for this model?
How many hours per epoch?
For me, it is very, very slow.
@Zehaos Yeah, that's exactly what I am doing now, and I got a forward speed of about 0.09s per image on a GTX 1080 8G. I don't think that's very fast.
@shicai Using 2 GTX 1080s in asynchronous mode, it takes 14d 14h 20min in total, about 2.8h per epoch.
Have you ever trained another network on ImageNet before? I have no sense of the typical training time.
@SunAnLan You'd better run it for several iterations and then compute the average.
Looking forward to your result!
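A minimal sketch of that averaged measurement, in pure Python. Here `run_once` is a hypothetical stand-in for the actual `sess.run(logits)` call; a warm-up run is done first so one-time initialization cost does not skew the average:

```python
import time

def run_once():
    # hypothetical stand-in for sess.run(logits)
    time.sleep(0.01)

# warm-up run so one-time initialization cost is excluded
run_once()

n_iters = 10
start = time.time()
for _ in range(n_iters):
    run_once()
avg = (time.time() - start) / n_iters
print("average forward time: %.4f s" % avg)
```

Averaging over many iterations smooths out scheduling jitter and first-run overheads such as kernel compilation.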
@SunAnLan I got 0.09s using a fully loaded GTX 1060... Maybe this is not the right way to benchmark.
I got 0.004 +/- 0.000 sec per image, using time_benchmark.py.
I also got 0.09s per image now, averaged over 100 iterations. @Zehaos
@SunAnLan Use time_benchmark.py instead.
Got it! @Zehaos
@Zehaos @shicai It is quite slow.
I haven't used TensorFlow for ImageNet training; however, each epoch takes about 35 mins when training ResNet-18 with MXNet (two GTX 1080s). As for MobileNet, it takes about 2h with MXNet, since currently the separable conv2d is implemented via grouped convolution.
@austingg @shicai I have done a benchmark on backward speed (README), and it seems that the backward pass of MobileNet is quite slow. Backward time is almost 4 times the forward time in GPU mode, whereas 2 times is common for other networks. CNN-Benchmarks
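One way to sanity-check such a ratio is to time the two phases separately and compare averages. A pure-Python sketch, where `forward` and `forward_backward` are hypothetical stand-ins for the corresponding `sess.run(logits)` and `sess.run(train_op)` calls:

```python
import time

def forward():
    # hypothetical stand-in for sess.run(logits)
    time.sleep(0.01)

def forward_backward():
    # hypothetical stand-in for sess.run(train_op), i.e. forward + backward
    time.sleep(0.04)

def time_it(fn, n=5):
    # average wall-clock time of fn over n runs
    start = time.time()
    for _ in range(n):
        fn()
    return (time.time() - start) / n

fwd = time_it(forward)
bwd = time_it(forward_backward)
print("backward+forward / forward ratio: %.1f" % (bwd / fwd))
```

With real session runs, both timings should follow a warm-up iteration, and on GPU the fetch itself forces synchronization, so the measured wall time covers the full step.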
@Zehaos I ran my model training on an NVIDIA GTX 860M, and my inference time is roughly 0.85s for an image of size 299x299 after freezing my graph.
@kwotsin Could you show me your code?
@Zehaos No problem. Here is the gist: https://gist.github.com/kwotsin/292eb12600be02b75bf69ff8010d07ce
Note: I did not train on ImageNet due to limited resources (no storage, and my GPU is not powerful enough). I only trained on the flowers dataset, which is small enough for me to handle.
I also included displaying the image, but the run time is computed before the image is shown. By the way, do you have a rough timeline for when your pretrained ImageNet model for MobileNet will be out? Looking forward to it! :D
Here's the output:
2017-05-04 11:52:56.022869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-05-04 11:52:56.023266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GTX 860M
major: 5 minor: 0 memoryClockRate (GHz) 1.0195
pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.70GiB
2017-05-04 11:52:56.023291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-05-04 11:52:56.023297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-05-04 11:52:56.023315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 860M, pci bus id: 0000:01:00.0)
2017-05-04 11:52:56.079684: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-05-04 11:52:56.079726: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 8 visible devices
2017-05-04 11:52:56.080847: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x59d7430 executing computations on platform Host. Devices:
2017-05-04 11:52:56.080872: I tensorflow/compiler/xla/service/service.cc:191] StreamExecutor device (0): <undefined>, <undefined>
2017-05-04 11:52:56.081013: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-05-04 11:52:56.081024: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 8 visible devices
2017-05-04 11:52:56.081499: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x5a27830 executing computations on platform CUDA. Devices:
2017-05-04 11:52:56.081532: I tensorflow/compiler/xla/service/service.cc:191] StreamExecutor device (0): GeForce GTX 860M, Compute Capability 5.0
Prediction: dandelion, Probability: 0.304545
Prediction: daisy, Probability: 0.228195
Prediction: tulips, Probability: 0.201852
Prediction: sunflowers, Probability: 0.201204
Prediction: roses, Probability: 0.0642046
RUN TIME: 0.864599943161
@kwotsin Why not start the timer just before fetching the prediction?
The pretrained weights are now available (README).
Hmm, I'm not very sure how inference time should be measured, so I simply timed from start to end. I was thinking that to predict an image I would probably need to construct the graph beforehand, which also takes some time, so I included the graph construction time as well.
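The two costs can simply be timed independently, so the one-time graph construction is reported separately from the per-image forward pass. A pure-Python sketch, with `build_graph` and `run_inference` as hypothetical stand-ins for the real graph construction and `sess.run`:

```python
import time

def build_graph():
    # hypothetical stand-in for building the graph and restoring the checkpoint
    time.sleep(0.05)

def run_inference():
    # hypothetical stand-in for sess.run(probabilities)
    time.sleep(0.01)

t0 = time.time()
build_graph()
build_time = time.time() - t0

t0 = time.time()
run_inference()
infer_time = time.time() - t0

print("graph construction: %.3f s, inference: %.3f s" % (build_time, infer_time))
```

In a serving setting the graph is built once and reused across requests, so only the second number reflects per-image latency.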
I tried your time_benchmark.py. I got 0.063 +/- 0.006 sec forward on my GTX 1080 8G. How did you make it that fast??? @Zehaos
@SunAnLan time_benchmark.py#L36 Use gpu instead...
My fault! Haha @Zehaos
Hi, when I use engine: CAFFE, it's really slow. With 8 Titan XPs and batch size 1024, it takes 10s per iteration.
Other models like Inception-ResNet-v2 take just 1s per iteration.