
Comments (29)

Zehaos commented on May 24, 2024

It is a unit test, used for checking functionality. I don't know whether it is a standard way to test speed.

from mobilenet.

SunAnLan commented on May 24, 2024

I see. It's quite slow, and I didn't find where you test the speed. Could you please show me that part of the code? Hoping for your help @Zehaos


Zehaos commented on May 24, 2024

I see~ You can use the Python 'time' package, or timeline, which seems to be a more powerful tool, but I don't know much about it yet.


kwotsin commented on May 24, 2024

@SunAnLan It probably depends on your batch size and whether you're using a CPU or GPU. For me, this implementation trains at least 3x faster than Inception v3.


SunAnLan commented on May 24, 2024

I see~ Thank you very much @Zehaos


shicai commented on May 24, 2024

Interesting...


Zehaos commented on May 24, 2024

@kwotsin, could you share your hardware and the inference time? Thanks.


SunAnLan commented on May 24, 2024

Did you try running the forward pass? For me it's quite slow. @kwotsin


Zehaos commented on May 24, 2024

@SunAnLan

You can measure inference time in the testForward function:

import time

start_time = time.time()
output = sess.run(logits)  # forward pass only
duration = time.time() - start_time


shicai commented on May 24, 2024

@Zehaos What is the total training time for this model?
How many hours per epoch?
For me, it is very, very slow.


SunAnLan commented on May 24, 2024

@Zehaos Yeah, that's exactly what I am doing now, and with that I got a forward speed of about 0.09s per image on a GTX 1080 8G. I don't think that's particularly fast.


Zehaos commented on May 24, 2024

@shicai Using 2 GTX 1080s in asynchronous mode, it took 14d 14h 20min in total, about 2.8h per epoch.
Have you ever trained another network on ImageNet before? I have no sense of typical training times.


Zehaos commented on May 24, 2024

@SunAnLan You'd better run it for several iterations and then compute the average.
Looking forward to your result!

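The averaging approach suggested above can be sketched as follows. This is a minimal sketch, not the repository's benchmark script; `run_forward` is a hypothetical stand-in for a call like `sess.run(logits)`:

```python
import time

def benchmark(run_forward, warmup=10, iters=100):
    # Return (mean, std) of per-call wall-clock time in seconds.
    # Warm-up runs keep one-time costs (graph setup, CUDA kernel
    # compilation, caches) out of the measurement.
    for _ in range(warmup):
        run_forward()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        run_forward()
        times.append(time.perf_counter() - start)
    mean = sum(times) / len(times)
    std = (sum((t - mean) ** 2 for t in times) / len(times)) ** 0.5
    return mean, std
```

`time.perf_counter()` is used instead of `time.time()` because it is monotonic and has higher resolution, which matters for per-image timings in the millisecond range.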

Zehaos commented on May 24, 2024

@SunAnLan I got 0.09s using a fully loaded GTX 1060... Maybe this is not the right way to benchmark.


Zehaos commented on May 24, 2024

I got 0.004 +/- 0.000 sec per image, using time_benchmark.py.


SunAnLan commented on May 24, 2024

I also got 0.09s per image now, averaged over 100 iterations @Zehaos


Zehaos commented on May 24, 2024

@SunAnLan Use time_benchmark.py instead.


SunAnLan commented on May 24, 2024

Got it! @Zehaos


austingg commented on May 24, 2024

@Zehaos @shicai It is quite slow.
I haven't used TensorFlow for ImageNet training; however, each epoch takes about 35 min when training ResNet-18 with MXNet (two GTX 1080s). As for MobileNet, it takes about 2h with MXNet, since currently the separable conv2d is implemented via grouped convolution.


Zehaos commented on May 24, 2024

@austingg @shicai I have done a benchmark on backward speed (README); it seems that MobileNet's backward pass is quite slow. In GPU mode, backward time is almost 4 times the forward time, whereas 2 times is common for other networks. CNN-Benchmarks

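The forward/backward ratio can be estimated the same way as the forward timing. A sketch, where `forward` and `train_step` are hypothetical stand-ins for calls like `sess.run(logits)` and `sess.run(train_op)`:

```python
import time

def avg_time(fn, iters=50):
    # Average wall-clock time of fn over several iterations.
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# fwd = avg_time(forward)       # forward pass only
# step = avg_time(train_step)   # forward + backward + weight update
# backward_time = step - fwd    # reportedly ~4x the forward time here
```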

kwotsin commented on May 24, 2024

@Zehaos I ran my model training on an NVIDIA GTX 860M, and my inference time is roughly 0.85s for an image of size 299x299, after freezing my graph.


Zehaos commented on May 24, 2024

@kwotsin Could you show me your code?


kwotsin commented on May 24, 2024

@Zehaos No problem. Here is the gist: https://gist.github.com/kwotsin/292eb12600be02b75bf69ff8010d07ce
Note: I did not train on ImageNet due to limited resources (no storage, and my GPU isn't powerful enough). I only trained on the flowers dataset, which is small enough for me to handle.

I also included displaying the image, but the run time is computed before the image is shown. By the way, do you have a rough timeline for when your pretrained ImageNet model for MobileNet will be out? Looking forward to it! :D

Here's the output:

2017-05-04 11:52:56.022869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-05-04 11:52:56.023266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 860M
major: 5 minor: 0 memoryClockRate (GHz) 1.0195
pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.70GiB
2017-05-04 11:52:56.023291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-05-04 11:52:56.023297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-05-04 11:52:56.023315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 860M, pci bus id: 0000:01:00.0)
2017-05-04 11:52:56.079684: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-05-04 11:52:56.079726: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 8 visible devices
2017-05-04 11:52:56.080847: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x59d7430 executing computations on platform Host. Devices:
2017-05-04 11:52:56.080872: I tensorflow/compiler/xla/service/service.cc:191]   StreamExecutor device (0): <undefined>, <undefined>
2017-05-04 11:52:56.081013: I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
2017-05-04 11:52:56.081024: I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 8 visible devices
2017-05-04 11:52:56.081499: I tensorflow/compiler/xla/service/service.cc:183] XLA service 0x5a27830 executing computations on platform CUDA. Devices:
2017-05-04 11:52:56.081532: I tensorflow/compiler/xla/service/service.cc:191]   StreamExecutor device (0): GeForce GTX 860M, Compute Capability 5.0
Prediction: dandelion, Probability: 0.304545 

Prediction: daisy, Probability: 0.228195 

Prediction: tulips, Probability: 0.201852 

Prediction: sunflowers, Probability: 0.201204 

Prediction: roses, Probability: 0.0642046 

RUN TIME: 0.864599943161


Zehaos commented on May 24, 2024

@kwotsin Why not start the timer right before fetching the prediction?
The pretrained weights are now available (README).


kwotsin commented on May 24, 2024

Hmm, I'm not very sure how inference time should be measured, so I simply timed from start to end. I was thinking that if I were to predict an image, I'd probably need to construct the graph beforehand too, which would take some time, so I included the graph-construction time as well.

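One way to report the two costs separately is a small timing helper. This is a sketch; `build_graph` in the comments is a hypothetical placeholder for the graph-construction step in the gist:

```python
import time

def timed(label, fn):
    # Time a single call, print the duration, and return the call's result.
    start = time.perf_counter()
    result = fn()
    print("%s: %.4f s" % (label, time.perf_counter() - start))
    return result

# One-time cost, paid once per process:
#   graph = timed("graph construction", build_graph)
# Per-image cost, the number usually quoted as inference time:
#   probs = timed("inference only", lambda: sess.run(output))
```

Reporting both numbers makes results comparable: the 0.85s above bundles graph construction with the per-image forward pass, while the earlier per-image figures do not.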

SunAnLan commented on May 24, 2024

I tried your time_benchmark.py. I got 0.063 +/- 0.006 sec for the forward pass on my GTX 1080 8G. How did you make it that fast??? @Zehaos


Zehaos commented on May 24, 2024

@SunAnLan time_benchmark.py#L36 Use the GPU instead...


SunAnLan commented on May 24, 2024

My fault!... haha @Zehaos


Satisfie commented on May 24, 2024

Hi, when I use engine: CAFFE it's really slow. With 8 Titan Xp GPUs and batchsize=1024, it takes 10s per iteration.
Other models like Inception-ResNet-v2 take just 1s per iteration.

