Comments (9)
Hi @liviolima80 we're seeing roughly 5 times faster than upstream Caffe across various tests on Cortex-A72 and Cortex-A53 using the NEON backend. I'm afraid we don't have benchmarking running on Cortex-A9, but realistically a smaller speedup would be expected.
from armnn.
Do you have a public benchmark comparing against the official Caffe library?
Hi @liviolima80 I have had a look and we have no public benchmark data at this time. We're working hard on the machine learning part of https://developer.arm.com/ and I will pass on your request to the relevant teams.
OK @MatthewARM ,
in the meantime, is there any option to run armnn in multi-threaded mode so it can use multiple cores?
Hi @liviolima80 multi-threading is high up on our roadmap. In the meantime, you can try calling the methods of arm_compute::Scheduler yourself. The documentation for that is here: https://arm-software.github.io/ComputeLibrary/v18.03/classarm__compute_1_1_scheduler.xhtml
Hi @MatthewARM ,
I tried to take a look at the documentation and the code, but it is quite difficult to understand how to use it. Is it possible to have a clear, easy-to-follow example of how to use the scheduler?
Regards
Please have a look at this question
Hi @AnthonyARM,
thank you for the link. I tried changing the number of threads in the scheduler settings, but it seems the best performance is obtained with the default configuration of 4 threads, which matches the number of cores in the Cortex-A9 architecture I'm using.
I'm looking forward to seeing whether future releases of armnn bring further improvements.
I'm trying to compare the processing time of a convolutional network run with the Caffe C++ library versus the ArmnnCaffeParser built on the ARM ComputeLibrary. These are the details:
target architecture: Freescale IMX6Q
input image: 256x256 pixel and 3 channels
network architecture:
conv2d 3x3 - 32 filters + RELU
conv2d 3x3 - 32 filters + RELU
max pool 2x2
conv2d 3x3 - 64 filters + RELU
conv2d 3x3 - 64 filters + RELU
max pool 2x2
conv2d 3x3 - 128 filters + RELU
max pool 2x2
conv2d 3x3 - 256 filters + RELU
max pool 2x2
conv2d 3x3 - 256 filters + RELU
max pool 2x2
fully connected layer with 48 outputs
relu
fully connected layer with 3 outputs
softmax
Processing time:
- Caffe C++ library (without ComputeLibrary optimization): 4.5 seconds
- ArmnnCaffeParser (with ComputeLibrary optimization): 2.2 seconds
I'm a little disappointed, since I expected a speedup of more than a factor of 2.
Do you think I'm doing something wrong, or is this improvement factor in line with your benchmarks?
Thanks
How did you calculate the processing time?
We are building a Caffe2 parser for armnn and want to compare it with the default Caffe2.