Comments (13)
Can you provide a patch? We don't have an aarch64 development environment, so I am not quite sure what to do. Is neon part of the standard aarch64 ISA, or is it an extension that requires special compiler flags?
from fftw3.
It is beyond me. I have googled this problem, and I think the Ubuntu community is working on it.
matteo-frigo: yes NEON is part of the standard ISA (with double the registers, you'll like it!!!).
I just found this issue; I'm currently working around it by setting NEON_CFLAGS=-D__ARM_NEON__
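A small probe can show which NEON feature macros a given compiler predefines, which is what the NEON_CFLAGS workaround above is forcing by hand. This is only a sketch: the helper names are hypothetical, and the idea that the build was testing `__ARM_NEON__` while the aarch64 compiler only predefined `__ARM_NEON` is an assumption, not something confirmed in this thread.

```c
#include <stdio.h>

/* Hypothetical helpers: each reports whether the compiler predefines
 * a given macro for the current target and flags. */

/* 1 when compiling for 64-bit ARM. */
static int is_aarch64(void) {
#if defined(__aarch64__)
    return 1;
#else
    return 0;
#endif
}

/* 1 when the ACLE-style macro __ARM_NEON is predefined. */
static int has_acle_neon_macro(void) {
#if defined(__ARM_NEON)
    return 1;
#else
    return 0;
#endif
}

/* 1 when the older spelling __ARM_NEON__ is predefined (or forced
 * on the command line with -D__ARM_NEON__, as in the workaround). */
static int has_legacy_neon_macro(void) {
#if defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of what the compiler advertises. */
void report_neon_macros(void) {
    printf("aarch64=%d __ARM_NEON=%d __ARM_NEON__=%d\n",
           is_aarch64(), has_acle_neon_macro(), has_legacy_neon_macro());
}
```

Calling `report_neon_macros()` from a test program built with the same compiler and flags as FFTW shows whether the macro has to be forced manually on that toolchain.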
I don't know much about ARMv8, but there does seem to be some level of support in gcc 4.9.2. Taken from the manpage:

    -mfpu=name
        This specifies what floating-point hardware (or hardware emulation)
        is available on the target. Permissible names are: vfp, vfpv3,
        vfpv3-fp16, vfpv3-d16, vfpv3-d16-fp16, vfpv3xd, vfpv3xd-fp16, neon,
        neon-fp16, vfpv4, vfpv4-d16, fpv4-sp-d16, neon-vfpv4, fp-armv8,
        neon-fp-armv8, and crypto-neon-fp-armv8.

        If -msoft-float is specified this specifies the format of
        floating-point values.

        If the selected floating-point hardware includes the NEON extension
        (e.g. -mfpu=neon), note that floating-point operations are not
        generated by GCC's auto-vectorization pass unless
        -funsafe-math-optimizations is also specified. This is because
        NEON hardware does not fully implement the IEEE 754 standard for
        floating-point arithmetic (in particular denormal values are
        treated as zero), so the use of NEON instructions may lead to a
        loss of precision.

Can someone please explain why this is not sufficient?
On Mon, Jan 12, 2015 at 9:53 AM, Jérôme Carretero wrote:
> matteo-frigo: yes NEON is part of the standard ISA (with double the registers, you'll like it!!!).
> I just found this issue, I'm currently working it around by setting NEON_CFLAGS=-DFAKE_NEON_FLAGS
Actually I was blocked further when trying to build (using an Android aarch64-linux-android sysroot); can someone build for 64-bit ARM at all?
Figured out that there is no simd-support code for aarch64... which explains the issue. I guess that should be contributed ;)
I closed it by mistake. @zougloub thanks for finding the root cause.
I've added support for double-precision NEON (available in aarch64) in my fork, https://github.com/rdolbeau/fftw3. It should support both the SP and DP versions on aarch64. Only tested in Qemu so far. Any report welcome.
Update: it passes 'make check' on the X-Gene 1.
@rdolbeau, wow, looks nice. @matteo-frigo, can you take a look?
(Dealing with MIT's Technology Licensing Office may be a hassle for incorporating nontrivial patches, however, unless you are willing to make a copyright assignment to MIT. The problem is that MIT still sells non-GPL licenses for FFTW.)
I've also added a cycle counter in my fork for better planning (using the virtual counter register, see the commit message for details; thanks to ARM for the information). Any report/comment/criticism welcome.
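For readers unfamiliar with the virtual counter register, here is a minimal sketch of reading it from user space. It is not taken from the fork: the name `read_ticks` is hypothetical, the inline assembly assumes a GCC- or Clang-compatible compiler, and a `clock_gettime` fallback is included only so the snippet also builds on other targets. Note that `CNTVCT_EL0` ticks at a fixed timer frequency, so it is a time base rather than a true CPU cycle count.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns a monotonically increasing tick value.
 * On aarch64 it reads the virtual counter register CNTVCT_EL0; on any
 * other target it falls back to a monotonic clock in nanoseconds. */
static inline uint64_t read_ticks(void) {
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t v;
    /* isb keeps the counter read from being speculated ahead of
     * earlier instructions. */
    __asm__ __volatile__("isb\n\tmrs %0, cntvct_el0"
                         : "=r"(v)
                         :
                         : "memory");
    return v;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

Taking the difference between two `read_ticks()` calls around a candidate transform gives the kind of relative timing a planner can compare across algorithms.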
Aarch64 is supported in the current trunk.
Does this mean that FFTW releases have support for double precision FFT operations using optimized NEON instructions? How can I compile it to do so? @rdolbeau
When compiling with --enable-neon you get the error message: configure: error: NEON requires single precision.
ARMv7 (32-bit, such as A7, A9, A15, ...) does not have hardware support for double-precision NEON, so configure gives the message above.
AArch64 (64-bit, such as A53, A72, ...) is required to get double-precision NEON support with --enable-neon.
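To illustrate the difference, the sketch below uses the two-lane double-precision vector type `float64x2_t`, which `arm_neon.h` provides on AArch64 only. The function `add_f64` is a hypothetical example, not FFTW code; on non-AArch64 targets the NEON path is compiled out and the scalar loop does all the work.

```c
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical example: out[i] = a[i] + b[i] for n doubles. */
void add_f64(const double *a, const double *b, double *out, size_t n) {
    size_t i = 0;
#if defined(__aarch64__)
    /* Process two doubles per iteration with 128-bit NEON registers. */
    for (; i + 2 <= n; i += 2) {
        float64x2_t va = vld1q_f64(a + i);
        float64x2_t vb = vld1q_f64(b + i);
        vst1q_f64(out + i, vaddq_f64(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop on targets without DP NEON. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The same source builds on 32-bit ARM, but only through the scalar loop, which mirrors why configure refuses the double-precision NEON build there.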