Numpy and scipy need BLAS and LAPACK libraries for fast matrix and vector computation.
OSX provides a standard library for these called Accelerate, which is present on every OSX machine at /System/Library/Frameworks/Accelerate.framework.
Normally we link numpy and scipy against the Accelerate framework on OSX, on the basis that it is fast, usually reliable, and we don't have to build it ourselves.
At the moment (June 2014) we have two problems with Accelerate:
We are getting a segfault in recent OSX Accelerate with float32 values not aligned to 32-byte boundaries. Accelerate also doesn't play well with multiprocessing in Pythons before 3.4.
The obvious alternatives are ATLAS and OpenBLAS. Of these two, ATLAS is the more stable: building numpy with OpenBLAS in June 2014 was giving me a test error.
Now the details:
A binary installer should work for as many Pythons as possible. In particular it should work for the Python.org installer, which is the standard default.
- Python.org provides a dual i386 / x86_64 build.
- It advertises itself as such in its platform information: `python -c "import distutils.util; print(distutils.util.get_platform())"` gives `macosx-10.6-intel`, where `intel` means a combined i386 / x86_64 binary.
- A compatible binary installer, such as a wheel, should therefore also be a combined i386 / x86_64 build.
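As a quick check of what any given Python expects, `sysconfig.get_platform()` (available since Python 2.7, and returning the same string as the `distutils` call above) gives the platform string, and a wheel's platform tag is that string with `-` and `.` replaced by `_`. A small sketch of mine, not from the original build:

```shell
# Platform string for the running Python (macosx-10.6-intel on the
# Python.org OSX build; something like linux-x86_64 elsewhere)
python3 -c "import sysconfig; print(sysconfig.get_platform())"

# The matching wheel platform tag replaces '-' and '.' with '_'
# (macosx-10.6-intel -> macosx_10_6_intel)
python3 -c "import sysconfig; print(sysconfig.get_platform().replace('-', '_').replace('.', '_'))"
```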
I need gcc and gfortran compilers to build ATLAS and scipy.
I chose gcc / gfortran 4.8.2 from (at the time) https://gcc.gnu.org/wiki/GFortranBinaries#MacOS
I've put the binary I used in this repo at https://github.com/matthew-brett/numpy-atlas-binaries/tree/master/archives to make sure the builds are reproducible.
Another option I could have used was the default Apple gcc 4.2, which uses clang as a backend. clang still seems to be slower than modern gcc; see http://openbenchmarking.org/result/1204215-SU-LLVMCLANG23. With clang, I'd still need a gfortran compiler, maybe the default suggested on the scipy website, hosted on CRAN. That build is currently very old: version 4.2.3.
The clang / CRAN gfortran combination has the advantage that these compilers do dual-architecture compilation. That is, if you pass `-arch i386 -arch x86_64`, they compile 32-bit and 64-bit binaries and fuse them, so that either a 32-bit or a 64-bit program can link against the result.
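On OSX, `lipo -info` or `file` will report the architectures in such a fused ("universal") binary. The fusion is also visible in the file header: universal binaries start with the Mach-O fat magic number `0xcafebabe`. A portable sketch of that check (the `is_fat` helper and the synthetic file are illustrations of mine, not part of the build):

```shell
# Succeed if the file's first four bytes are the Mach-O universal
# ("fat") magic number 0xcafebabe.
is_fat () {
    [ "$(od -An -tx1 -N4 "$1" | tr -d ' \n')" = "cafebabe" ]
}

# Demo with a synthetic header (octal escapes for ca fe ba be, then an
# architecture count of 2); a real fused dylib starts the same way.
printf '\312\376\272\276\000\000\000\002' > fake_fat.bin
is_fat fake_fat.bin && echo "universal binary"
```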
Instead, I bravely went for modern gcc, knowing that I would have to take care of the i386 / x86_64 dual architecture problem by hand.
The ATLAS build is fairly straightforward once you know what you're doing, but it's impractical to farm out to virtual machines because:
- it relies on having nearly complete use of the machine CPUs in order to test timing of candidate routines
- anything but a build with known architecture ("architectural defaults" in ATLAS parlance) takes about 6 hours even on a fast machine.
I've therefore built ATLAS on my laptop with CPU throttling turned off as far as I could manage, and with no other significant processes running.
I build 32 and 64 bit ATLAS binaries separately. The resulting built ATLAS libraries are in the repo archives directory. The scripts I used to build them are here: 32 bit ; 64 bit.
These archives are linked against gfortran libs specific to this build of the compiler.
The default ATLAS build produces static libraries only. I then built dynamic libs from the static libs using this script.
The outline below is just a sketch; the full build process is in the waf `wscript` in the repository.
- For each architecture (i386, x86_64):
  - copy the ATLAS directory for that architecture into the build directory and make dynamic ATLAS libs
  - copy the numpy / scipy sources to a new directory
  - if building scipy, build a numpy with the given back-compatibility tag to build against
  - use the `ATLAS` environment variable to point numpy / scipy at the ATLAS binaries of matching architecture
  - compile with crafted compile / link flags to remove the default `-arch` flags and add flags for the specific architecture; this gives an architecture-specific wheel
  - use `delocate-wheel` (from `delocate`) to copy the required dynamic libraries into the built wheel
- Fuse the architecture-specific wheels into one combined-architecture wheel with the `lipo -create` command, via `delocate-fuse`.
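The per-architecture loop above can be sketched in shell-flavored pseudocode. The paths, wheel names, and the `pip wheel` call are placeholders of mine; the real build drives these steps through waf with carefully constructed flags:

```
for arch in i386 x86_64; do
    # point numpy / scipy at the matching ATLAS build
    export ATLAS=$PWD/atlas-$arch
    # replace the default dual-arch flags with a single architecture
    export CFLAGS="-arch $arch" LDFLAGS="-arch $arch"
    pip wheel numpy -w wheels-$arch
    # pull the ATLAS / gfortran dylibs into the wheel
    delocate-wheel -w wheels-$arch/fixed wheels-$arch/numpy-*.whl
done
# fuse the two single-arch wheels (lipo -create under the hood)
delocate-fuse wheels-i386/fixed/numpy-*.whl \
    wheels-x86_64/fixed/numpy-*.whl -w wheelhouse
```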
I run this process for numpy / scipy using a command like this:
workon py27 # use python virtualenv
python waf distclean configure build cp_wheels
where `py27` is a virtualenv for Python 2.7. Repeat using virtualenvs for Pythons 3.3 and 3.4. `cp_wheels` unconditionally copies the fused build wheels into a directory `~/wheelhouse-atlas`.
- Maybe iterate over Python.org versions in the build process rather than doing it by hand on the command line, as above.
- Work out some way of uploading the build products from the travis-ci builders.