Comments (4)

damianmoore commented on May 14, 2024

Might be solvable by using the TensorFlow Docker base image rather than installing TensorFlow from PyPI with pipenv.

from photonix.

damianmoore commented on May 14, 2024

This happens because the version of TensorFlow on PyPI is compiled to use CPU instructions such as AVX, AVX2, SSE4.1, SSE4.2 and FMA, which my Scaleway bare-metal server and HP ProLiant microserver do not support. I'm assuming the TensorFlow Docker images are compiled the same way, so switching to them would not help.
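To check whether a given Linux machine is affected, one option is to compare the flags listed in `/proc/cpuinfo` against the extensions named above. This is only a minimal sketch: the helper names are made up for illustration, and the set of required flags is taken from this comment rather than from TensorFlow's build configuration.

```python
from pathlib import Path

# Flag names as they appear in /proc/cpuinfo on x86 Linux.
# Assumption: this list mirrors the extensions named in the comment above.
REQUIRED_FLAGS = {"avx", "avx2", "sse4_1", "sse4_2", "fma"}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("Missing flags:", missing_flags(cpuinfo.read_text()) or "none")
```

An empty result suggests the optimised build should load; any missing flag would point to the kind of failure described here.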

I'm experimenting with compiling my own wheel package that does not need these CPU extensions. If there are notable performance issues, I'll look at installing different packages depending on the current CPU once our Docker image has loaded.

damianmoore commented on May 14, 2024

A TensorFlow build that runs on my HP ProLiant microserver is available here: https://github.com/damianmoore/tensorflow-builder/releases . I'm running benchmarks to determine the performance impact compared with the more optimised build on PyPI.

damianmoore commented on May 14, 2024

These are some quick benchmarks of the PyPI version of TensorFlow versus my own build from the comment above (no CPU optimisations). As expected, the unoptimised build is slower, but not by very much. These were measured using the Object Detection model (which uses this pre-trained model) on a Dell XPS 13 2017 (9370 i7-8550U).

I ran 3 object detection predictions with each build, and the test code was from this function. Each run included a roughly constant amount of overhead (collecting tests etc.) that can be subtracted from all results.

                                    Run 1   Run 2   Run 3   Mean
PyPI build:                         62.74   61.25   61.57   61.85
Custom build (unoptimised):         69.37   70.72   69.66   69.92
Testing overhead (to subtract):     15.96   15.12   15.97   15.68

This shows that the custom build, which works on all the tested machines, takes 13.04% longer than the optimised one on PyPI when comparing the mean totals (overhead included). Put another way, the PyPI build completes in roughly 88.5% of the time of the custom build.
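The percentages can be reproduced from the table. A minimal sketch, assuming all timings share the same unit: the 13.04% figure appears to correspond to the raw means, and subtracting the measured testing overhead from both builds gives a larger relative gap (roughly 17%). The function name here is made up for illustration.

```python
from statistics import mean

# Timings copied from the table above (three runs each).
pypi = [62.74, 61.25, 61.57]
custom = [69.37, 70.72, 69.66]
overhead = [15.96, 15.12, 15.97]


def slowdown_percent(baseline, candidate, overhead=0.0):
    """Percentage by which candidate is slower than baseline,
    after subtracting a fixed overhead from both."""
    return ((candidate - overhead) / (baseline - overhead) - 1.0) * 100.0


# Comparison on the raw means (overhead still included).
raw = slowdown_percent(mean(pypi), mean(custom))
# Comparison with the mean testing overhead removed from both builds.
net = slowdown_percent(mean(pypi), mean(custom), mean(overhead))

print(f"raw means: {raw:.2f}% slower")
print(f"net of overhead: {net:.2f}% slower")
```

Either way the conclusion is the same in kind: the unoptimised build is slower, but by a modest margin on this workload.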

This seems like a small difference in speed, so it would be acceptable to use the custom (no CPU extensions) build everywhere. When we have time, we can produce different TensorFlow builds that are downloaded depending on which CPU flags are detected.
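Selecting a build by detected CPU flags could look something like the sketch below. The build labels and their flag requirements are hypothetical, chosen purely for illustration; no such set of builds exists yet. The idea is to list builds from most to least optimised and pick the first one whose requirements the CPU satisfies.

```python
# Hypothetical build labels, ordered from most to least optimised.
# Each entry pairs a label with the CPU flags that build would require.
BUILDS = [
    ("avx2-fma", {"avx", "avx2", "fma", "sse4_1", "sse4_2"}),
    ("sse4", {"sse4_1", "sse4_2"}),
    ("generic", set()),  # no CPU extensions needed; runs everywhere
]


def choose_build(cpu_flags, builds=BUILDS):
    """Return the label of the first build whose required flags are all present."""
    flags = set(cpu_flags)
    for name, required in builds:
        if required <= flags:
            return name
    raise ValueError("no compatible build found")


# Example: a CPU with SSE4 but no AVX falls back to the middle tier.
print(choose_build({"fpu", "sse", "sse2", "sse4_1", "sse4_2"}))
```

Keeping a generic entry with no requirements at the end of the list means the lookup always succeeds on any machine.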
