Noticed after running the image that fixed locking issues on the live demo site

Tensorflow "Illegal instruction" on some machines (photonix, closed)

damianmoore commented on May 14, 2024

Tensorflow "Illegal instruction" on some machines

from photonix.

Comments (4)

damianmoore commented on May 14, 2024

Might be solvable by using the TensorFlow Docker base image rather than installing TensorFlow from PyPI with pipenv.

damianmoore commented on May 14, 2024

This happens because the version of TensorFlow on PyPI is compiled to use CPU instructions such as AVX, AVX2, SSE4.1, SSE4.2 and FMA, which my Scaleway bare-metal server and HP ProLiant microserver do not support. I'm assuming the TensorFlow Docker images are compiled in the same way, so switching to those would not help.

I'm experimenting with compiling my own wheel package that does not need these CPU extensions. If there are notable performance issues then I'll look at installing different packages depending on the current CPU once our Docker image has started.
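As a rough illustration of the "depending on current CPU" idea, the sketch below checks which of the extensions named above a machine advertises. It assumes Linux on x86, where /proc/cpuinfo has a "flags" line using the kernel's spellings (sse4_1, sse4_2 and so on). The function names and the "optimised"/"generic" labels are hypothetical and are not part of photonix or tensorflow-builder.

```python
# Sketch only: report which CPU extensions (from the list in the comment
# above) are missing on this machine. Assumes Linux/x86 flag names.

REQUIRED_FLAGS = {"avx", "avx2", "sse4_1", "sse4_2", "fma"}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found (e.g. non-x86 or non-Linux)


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read flags from the given file; empty set if it cannot be read."""
    try:
        with open(path) as handle:
            return parse_cpu_flags(handle.read())
    except OSError:
        return set()


def missing_flags(cpu_flags, required=REQUIRED_FLAGS):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - set(cpu_flags))


def choose_build(cpu_flags):
    """Pick a hypothetical build label based on the detected flags."""
    return "generic" if missing_flags(cpu_flags) else "optimised"


if __name__ == "__main__":
    flags = read_cpu_flags()
    print("missing extensions:", missing_flags(flags))
    print("suggested build:", choose_build(flags))
```

If the flags cannot be read at all, the sketch falls back to the "generic" label, which matches the goal of having a build that runs everywhere.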

damianmoore commented on May 14, 2024

A TensorFlow build that runs on my HP ProLiant microserver is here: https://github.com/damianmoore/tensorflow-builder/releases . I'm running benchmarks to determine the performance impact compared with the more optimised build on PyPI.

damianmoore commented on May 14, 2024

These are some quick benchmarks of the PyPI version of TensorFlow versus my own build from the comment above (no CPU optimisations). As expected, the unoptimised build is slower, but not by very much. These were measured using the Object Detection model (which uses this pre-trained model) on a Dell XPS 13 2017 (9370 i7-8550U).

I ran 3 object detection predictions with each build, and the test code was from this function. Every run included a similar amount of overhead (collecting tests etc.), which can be subtracted from all results.

                                    Run 1   Run 2   Run 3   Mean
PyPI build:                         62.74   61.25   61.57   61.85
Custom build (unoptimised):         69.37   70.72   69.66   69.92
Testing overhead (to subtract):     15.96   15.12   15.97   15.68

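This shows the custom build that works on all the tested machines takes 13.04% longer than the optimised one on PyPI. Alternatively, you could say the PyPI build completes in about 88.5% of the time of the custom build. These percentages compare the raw means, with the testing overhead still included; subtracting the overhead gives 54.24 versus 46.17, a gap of about 17.5%.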
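As a check on the arithmetic, this sketch recomputes the slowdown from the table, once on the raw means and once with the mean testing overhead subtracted from both builds. The numbers are copied from the table; the variable and function names are illustrative only.

```python
from statistics import mean

# Timings copied from the benchmark table (three runs each).
pypi_runs = [62.74, 61.25, 61.57]
custom_runs = [69.37, 70.72, 69.66]
overhead_runs = [15.96, 15.12, 15.97]


def slowdown_percent(fast, slow, overhead=0.0):
    """Percentage by which `slow` exceeds `fast` after removing overhead."""
    return ((slow - overhead) / (fast - overhead) - 1.0) * 100.0


pypi_mean = mean(pypi_runs)
custom_mean = mean(custom_runs)
overhead_mean = mean(overhead_runs)

# Raw comparison: overhead is still included in both means.
raw_slowdown = slowdown_percent(pypi_mean, custom_mean)

# Net comparison: the shared overhead is subtracted from both means.
net_slowdown = slowdown_percent(pypi_mean, custom_mean, overhead_mean)

print(f"raw means:       custom is {raw_slowdown:.2f}% slower")
print(f"net of overhead: custom is {net_slowdown:.2f}% slower")
```

The raw figure comes out at roughly 13%, and the net figure at roughly 17.5%, since the fixed overhead dilutes the difference between the two builds.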

This seems like a small enough difference in speed that it would be acceptable to use the custom (no CPU extensions) build everywhere. When we have time, we can produce different TensorFlow builds that are downloaded depending on which CPU flags are detected.
