fftw / fftw3
DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
License: GNU General Public License v2.0
It looks like the script that generates fftw3q.f03 failed to produce an entry for the function fftwq_alloc_real on line 567.
! Unable to generate Fortran interface for fftwq_alloc_real
should probably be
type(C_PTR) function fftwq_alloc_real(n) bind(C, name='fftwq_alloc_real')
import
integer(C_SIZE_T), value :: n
end function fftwq_alloc_real
I'm opening a new issue here since the other was closed and I am unable to reopen it.
As stated in the manual, given the nature of FFTW_ESTIMATE you would expect a deterministic solution (i.e. bit reproducibility). However, it seems that this is not always the case, as observed in some cases when comparing a plan_many with howmany=1 against a plan_many with howmany>1. Is it expected that the same plan is not chosen when howmany varies, and that bit reproducibility is therefore not supported in these cases?
I submit the following in-place R2C/C2R example using fftw_plan_many_dft_r2c/c2r with the plan flag FFTW_ESTIMATE (although I have experimented with FFTW_NO_SIMD and FFTW_UNALIGNED, and see similar failures). There are many particular combinations of howmany and N which produce a "failure."
I transform a "reference" 1D array of length N and compare it to the results of Ny transforms on a "test" set of identical 1D arrays (i.e., each of the Ny "rows" is initially identical to the reference row, laid out in memory as a 2D array). For the reference array I plan using howmany=1, and for the Ny test arrays I plan using howmany=Ny. For some values of N and Ny the transformed test arrays are not identical to the transformed reference array. Looking further, I see that the plans returned during these "failures" are not equal.
I won't paste the full output, which shows the results of the checks, but they differ in the LSB. Instead I paste the plans generated, allowing for comparison.
Here is the result for Ny=2, N=2522 (although, as you say, results may vary compiler to compiler), where the mismatches occur after the backward (c2r) transform only. (I have seen cases where r2c failed as well):
Reference Plan:
(rdft2-ct-dif/2
(hc2c-direct-2/4/0 "hc2cbdftv_2_avx"
(rdft2-hc2r-direct-2 "r2cb_2")
(rdft2-nop))
(dft-ct-dif/13
(dftw-generic-dif-13-97
(dft-direct-13-x97 "n1bv_13_avx"))
(dft-buffered-97-x13/13-5
(dft-vrank>=1-x13/1
(dft-rader-97/is=2/os=2
(dft-ct-dit/16
(dftw-direct-16/16 "t3bv_16_avx")
(dft-direct-6-x16 "n1_6"))
(dft-ct-dit/8
(dftw-direct-8/12 "t3fv_8_avx")
(dft-directbuf/14-12-x8 "n2fv_12_avx"))
(dft-ct-dit/8
(dftw-direct-8/12 "t3fv_8_avx")
(dft-directbuf/14-12-x8 "n2fv_12_avx"))))
(dft-r2hc-1
(rdft-rank0-tiledbuf/2-x13-x97))
(dft-nop))))
Test Plan:
(rdft2-ct-dif/2
(hc2c-direct-2/4/0-x2 "hc2cbdftv_2_avx"
(rdft2-hc2r-direct-2 "r2cb_2")
(rdft2-nop))
(dft-buffered-1261-x2/2-1
(dft-vrank>=1-x2/1
(dft-ct-dif/13
(dftw-generic-dif-13-97
(dft-direct-13-x97 "n1bv_13_avx"))
(dft-vrank>=1-x13/1
(dft-rader-97/is=2/os=26
(dft-ct-dit/8
(dftw-direct-8/28 "t1buv_8_avx")
(dft-direct-12-x8 "n1_12"))
(dft-ct-dit/8
(dftw-direct-8/12 "t3fv_8_avx")
(dft-directbuf/14-12-x8 "n2fv_12_avx"))
(dft-ct-dit/8
(dftw-direct-8/12 "t3fv_8_avx")
(dft-directbuf/14-12-x8 "n2fv_12_avx"))))))
(dft-r2hc-1
(rdft-rank0-iter-ci/2522-x2))
(dft-nop)))
Notice that the plans are similar, but not identical -- likely accounting for the slight difference in the transformed values (which are of the order 1e-10, but again, only for some of the elements). Is there something about the plan_many that itself doesn't guarantee bit reproducibility, seemingly variable with howmany?
A test code follows (sorry if it's a bit long, I have a lot of checks in there):
#include <stdio.h>
#include <fftw3.h>
int main(void) {
double *ref, *test;
void *in, *out;
fftw_plan fpref, fptest, bpref, bptest;
int i, j, Npass, Nfail, Ndims, N, Nx, Ny, stride, rdist, cdist, inembed, onembed;
Ndims = 1;
stride = 1;
inembed = onembed = 0;
Ny = 2;
N = 2522;
Nx = (N/2+1)*2;
rdist = Nx;
cdist = N/2+1;
// Allocate reference and test arrays
ref = fftw_malloc(Nx*sizeof(double));
test = fftw_malloc(Nx*Ny*sizeof(double));
// Plan reference (Nx)
in = out = ref;
fpref = fftw_plan_many_dft_r2c(Ndims, &N, 1, in, &inembed, stride, rdist, out, &onembed, stride, cdist, FFTW_ESTIMATE);
bpref = fftw_plan_many_dft_c2r(Ndims, &N, 1, out, &onembed, stride, cdist, in, &inembed, stride, rdist, FFTW_ESTIMATE);
// Plan test (Nx*Ny)
in = out = test;
fptest = fftw_plan_many_dft_r2c(Ndims, &N, Ny, in, &inembed, stride, rdist, out, &onembed, stride, cdist, FFTW_ESTIMATE);
bptest = fftw_plan_many_dft_c2r(Ndims, &N, Ny, out, &onembed, stride, cdist, in, &inembed, stride, rdist, FFTW_ESTIMATE);
// printf("Filling ref array\n");
for (j=0; j<N; ++j){ ref[j] = (double) (j+1); }
// printf("Filling test array\n");
for (i=0; i<Ny; ++i){ for (j=0; j<N; ++j){ test[i*Nx+j] = ref[j]; } }
// printf("Executing forward plans\n");
fftw_execute(fpref);
fftw_execute(fptest);
// printf("Evaluating arrays after forward transform\n");
Nfail = 0;
Npass = 0;
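/* The comparison loop was truncated in the original post; reconstructed here as an exact element-wise check against the reference. */
for (i=0; i<Ny; ++i){ for (j=0; j<Nx; ++j){ if (test[i*Nx+j] == ref[j]) ++Npass; else ++Nfail; } }
printf("FWD: N = %d, Nx = %d, Ny = %d, Npass = %d, Nfail = %d\n", N, Nx, Ny, Npass, Nfail);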
if ( Nfail > 0 ){
printf("Reference Plan:\n");
fftw_print_plan(fpref);
printf("\nTest Plan:\n");
fftw_print_plan(fptest);
printf("\n\n");
}
// printf("Executing backward plans\n");
fftw_execute(bpref);
fftw_execute(bptest);
// printf("Evaluating arrays after backward transform\n");
Nfail = 0;
Npass = 0;
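/* The comparison loop was truncated in the original post; reconstructed here as an exact element-wise check against the reference. */
for (i=0; i<Ny; ++i){ for (j=0; j<N; ++j){ if (test[i*Nx+j] == ref[j]) ++Npass; else ++Nfail; } }
printf("BWD: N = %d, Nx = %d, Ny = %d, Npass = %d, Nfail = %d\n", N, Nx, Ny, Npass, Nfail);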
if ( Nfail > 0 ){
printf("Reference Plan:\n");
fftw_print_plan(bpref);
printf("\nTest Plan:\n");
fftw_print_plan(bptest);
printf("\n\n");
}
// printf("Freeing resources\n");
fftw_destroy_plan(fpref);
fftw_destroy_plan(fptest);
fftw_destroy_plan(bpref);
fftw_destroy_plan(bptest);
fftw_free(ref);
fftw_free(test);
return 0;
}
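The report above says the mismatches sit in the least significant bit. A minimal sketch of how that could be quantified is to measure the distance between two doubles in units in the last place (ULPs) instead of testing plain equality. The helper names ulp_distance and count_bit_mismatches are hypothetical and not part of FFTW; the sketch assumes IEEE 754 64-bit doubles and finite values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a double's bit pattern to an unsigned integer whose ordering matches
   the ordering of the doubles themselves (negatives sort below positives). */
static uint64_t ordered_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);   /* well-defined way to read the raw bits */
    return (u >> 63) ? ~u : (u | 0x8000000000000000ULL);
}

/* Number of representable-double steps from a to b; 0 only if the two
   values are bit-identical. NaNs are not treated specially. */
uint64_t ulp_distance(double a, double b)
{
    uint64_t ua = ordered_bits(a);
    uint64_t ub = ordered_bits(b);
    return ua > ub ? ua - ub : ub - ua;
}

/* Count the elements of test[] that are not bit-identical to ref[]. */
size_t count_bit_mismatches(const double *ref, const double *test, size_t n)
{
    size_t i, mismatches = 0;
    for (i = 0; i < n; ++i) {
        if (ulp_distance(ref[i], test[i]) != 0)
            ++mismatches;
    }
    return mismatches;
}
```

Applied to one transformed test row and the transformed reference row, a distance of 1 for the failing elements would confirm an LSB-only difference of the kind two different plans can produce.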
http://www.fftw.org/fftw3_doc/Thread-safety.html mentions:
"We do not think this should be an important restriction"
I hope I can convince you otherwise.
fftw3 is used by a variety of audio plugins.
Those plugins are loaded into the host's memory-space (usually an audio workstation). The host has limited control of what the plugin does internally, and the plugins do not know about each other.
There is no way to ensure that two independent plugins which are linked against libfftw do not run the shared planner simultaneously. Nor is there a possibility to control this on host application level.
When two independent plugins create fftw plans the application usually segfaults or similar undesired effects manifest.
Possible solutions for this include:
The issue at hand is not limited to audio applications; there are likely other applications with similar problems out there (gnu-octave comes to mind, but I don't know for certain).
As the Thread-safety page mentions, it's as simple as:
"wrap a semaphore lock around any calls to the planner"
Is there some good reason why libfftw does not do this by default?
Existing applications should not be affected by this (they're not supposed to call the planner from different threads), but that change would make all the difference for multi-threaded plugin hosts.
I suppose it could be a bit of work to wrap all planner entry points with a semaphore, yet there may be a neat, simple solution using #define.
I'll be happy to look into this, but before going that way, I'd like to ask if such a change would be accepted by fftw or if there is an even better solution planned for future version that will make fftw's planner thread-safe.
yours truly,
robin - for the linux-audio community and for himself
Notable audio plugins using fftw3:
http://calf.sourceforge.net/
http://factorial.hu/plugins/lv2/ir
http://guitarix.sourceforge.net/
http://breakfastquay.com/rubberband/
http://plugin.org.uk/
http://zynaddsubfx.sourceforge.net/
https://github.com/x42/meters.lv2
...
Notable affected plugin hosts:
http://ardour.org/
http://qtractor.sourceforge.net/
https://github.com/falkTX/Carla/
...
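The "wrap a lock around any calls to the planner" idea quoted above can be sketched roughly as follows. This is a sketch under stated assumptions, not FFTW code: planner_fn, locked_plan and demo_planner are hypothetical names, and demo_planner merely stands in for a real planner entry point such as fftw_plan_dft_1d.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical signature standing in for any non-thread-safe planner call. */
typedef void *(*planner_fn)(void *args);

/* One process-wide lock shared by every planner call routed through
   locked_plan(). */
static pthread_mutex_t planner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize planner access: only one thread at a time runs make_plan. */
void *locked_plan(planner_fn make_plan, void *args)
{
    void *plan;
    pthread_mutex_lock(&planner_lock);
    plan = make_plan(args);
    pthread_mutex_unlock(&planner_lock);
    return plan;
}

/* Demo stand-in for a planner: counts how often it was invoked and
   returns its argument as the "plan". */
void *demo_planner(void *args)
{
    int *calls = args;
    ++*calls;
    return calls;
}
```

A lock like this only helps when every caller goes through the same mutex. Independently built plugins would each carry their own copy of planner_lock, which is why the request above asks for the lock to live inside libfftw itself.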
The portland pgc++ compiler is not able to link to FFTW-3.3.4 since the compiler pretends to be gcc-4.8, although it doesn’t support __float128. Thus, it chokes on line 373 of fftw3.h.
Obviously this is not a bug in FFTW, and we'll file it with Portland group, but since older compilers will still be around and it is trivial to fix in the header you might want to include this in the next release.
The PGI compilers can always be identified with "defined(__PGI)", so I would suggest modifying the define starting on line 361 of fftw3.h to
&& !(defined(__ICC) || defined(__INTEL_COMPILER) || defined(__PGI))
&& (defined(__i386__) || defined(__x86_64__) || defined(__ia64__))
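As a rough illustration of the suggested guard, the sketch below evaluates the same kind of compiler check and exposes the result. DEMO_HAVE_FLOAT128 and demo_have_float128 are hypothetical names, not part of fftw3.h, and the GCC version test on the first line is an assumption based on the gcc-4.8 context of this report.

```c
/* Sketch only: mirrors the shape of the compiler check discussed above.
   DEMO_HAVE_FLOAT128 is a hypothetical macro, not something fftw3.h defines. */
#if (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 4)) \
 && !(defined(__ICC) || defined(__INTEL_COMPILER) || defined(__PGI)) \
 && (defined(__i386__) || defined(__x86_64__) || defined(__ia64__))
#  define DEMO_HAVE_FLOAT128 1   /* compiler is expected to know __float128 */
#else
#  define DEMO_HAVE_FLOAT128 0   /* skip quad-precision declarations */
#endif

/* Report at run time which branch the preprocessor took. */
int demo_have_float128(void)
{
    return DEMO_HAVE_FLOAT128;
}
```

With __PGI in the exclusion list, a compiler that advertises a recent GCC version but defines __PGI would take the second branch and never see a __float128 declaration.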
Dear FFTW authors,
We are trying to compile code that uses fftw and CUDA using
Cuda 6.0.1 on a 64Bit Debian/Wheezy. Compiling the minimum example
// file: test.cu
#include <fftw3.h>
int main()
{
fftwf_complex a;
return 0;
}
with
$nvcc test.cu -lfftw3 -I/usr/local/cuda/include
results in several error lines :
/usr/include/fftw3.h(371): error: identifier "__float128" is undefined
The issue is probably due to an incompatibility of nvcc and gcc regarding quadmath. For the intel compiler this is handled in line 359ff. of fftw3.h. Compiling the code and providing the intel flag manually
$nvcc test.cu -lfftw3 -I/usr/local/cuda/include -D__INTEL_COMPILER
works.
Cheers,
Marvin and Tobias
Hi guys, at bukka/php-fftw we have two questions.
1)
In PHP we can clone an object. The problem is how to clone it. I was thinking about exporting wisdom from the first object and importing that wisdom into the second object. Could that solve the problem?
I'm taking this idea from here:
http://www.fftw.org/fftw3_doc/Wisdom-Export.html#Wisdom-Export
http://www.fftw.org/fftw3_doc/Wisdom-Import.html#Wisdom-Import
2)
What's the best place to ask questions about the FFTW library? I will probably have more questions :)
Hello,
I'm trying to cross-compile from Debian 8 to AArch64 with NEON using aarch64-linux-gnu-gcc. Without NEON I have no problems, but with the --enable-neon flag set it fails.
Here is my build-log:
fftw-build.txt
I really don't know what I'm doing wrong.
I'm looking at the documentation at http://www.fftw.org/doc/Multi_002dDimensional-DFTs-of-Real-Data.html#Multi_002dDimensional-DFTs-of-Real-Data
This page seems to say that, given a WxH real data array, the r2c transformation will produce a (W/2+1)xH complex array. Unfortunately, it seems the actual output is (W/2+1)x(H/2) complex values.
The following code can be used to verify this: https://gist.github.com/ubitux/5442675
Output of this code is: https://gist.github.com/ubitux/5442713
If you attempt to change the number of displayed rows in the print_fft_block() function (to H as the doc seems to say, or even H/2+1), it will cause various invalid reads.
Assuming the code is correct, I believe something is wrong in the documentation, or at least not clear.
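For reference, the linked manual page describes the r2c output of an n0 x n1 x ... x n(d-1) real array as an n0 x n1 x ... x (n(d-1)/2 + 1) complex array in row-major order, so only the last (fastest-varying) dimension is cut roughly in half. A minimal sketch of that size rule follows; r2c_complex_count is a hypothetical helper, not an FFTW function.

```c
#include <stddef.h>

/* Number of complex output elements for an r2c transform of a real array
   with dimensions n[0] x n[1] x ... x n[rank-1] (row-major, rank >= 1).
   Only the last dimension is reduced, to n[rank-1]/2 + 1. */
size_t r2c_complex_count(int rank, const int *n)
{
    size_t count = 1;
    int i;
    for (i = 0; i < rank - 1; ++i)
        count *= (size_t)n[i];
    return count * (size_t)(n[rank - 1] / 2 + 1);
}
```

For an image stored as H rows of W samples, the dimensions would be passed as (H, W), giving H rows of W/2+1 complex values. Swapping the two arguments changes which dimension is halved, which is one thing worth checking when the observed layout does not match the expected one.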
Hey folks,
I came across this strange behavior of the fftw_execute_dft_c2r
method. It should be reproduced by the following code:
#include <stdio.h>
#include <fftw3.h>
int
main()
{
double x1[256], x2[256];
fftw_complex z[129], ztest[129];
double intest[256];
int n = 256, i;
for (i = 0; i < n; i++) {
x1[i] = (i < 86 ? 0 : (i < 172 ? 1 : 0));
}
fftw_plan plan_r2c = fftw_plan_many_dft_r2c(1, &n, 1,
intest, NULL, 1, 0,
z, NULL, 1, 0,
FFTW_MEASURE | FFTW_UNALIGNED);
fftw_execute_dft_r2c(plan_r2c, x1, z);
for (i = 0; i < 129; i++) {
printf("%g + %gi\n", z[i][0], z[i][1]);
}
printf("\n\n\n");
fftw_plan plan_c2r = fftw_plan_many_dft_c2r(1, &n, 1,
ztest, NULL, 1, 0,
x2, NULL, 1, 0,
FFTW_MEASURE | FFTW_UNALIGNED);
fftw_execute_dft_c2r(plan_c2r, z, x2);
for (i = 0; i < 129; i++) {
printf("%g + %gi\n", z[i][0], z[i][1]);
}
}
As I understand it, when a plan is properly initialized (as a non-in-place transform), the input should not be modified upon execution. This is true for r2c and the normal dft, but not for the c2r case. The plan creation doesn't seem to differ (compared to r2c and the normal dft), so I am wondering if this could be a bug?
PS: This was reproduced by me on a Mac (OSX 10.9.2 with FFTW3) and on a Linux/Ubuntu machine independently.
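The FFTW manual's section on planner flags notes that c2r transforms may overwrite their input by default, even out of place, unless FFTW_PRESERVE_INPUT is requested. Where that flag is not an option, one workaround is to run the transform on a scratch copy of the spectrum. The sketch below illustrates only the copying pattern: demo_complex, demo_c2r_fn, run_on_copy and demo_clobbering_transform are hypothetical names, and the callback stands in for a call such as fftw_execute_dft_c2r.

```c
#include <stdlib.h>
#include <string.h>

typedef double demo_complex[2];   /* {re, im}, same layout as fftw_complex */

/* Hypothetical callback standing in for a c2r execute call that is
   allowed to overwrite its input array. */
typedef void (*demo_c2r_fn)(demo_complex *in, double *out, void *ctx);

/* Run a possibly input-destroying transform on a scratch copy so the
   caller's spectrum stays untouched. Returns 0 on success, -1 if the
   scratch buffer cannot be allocated. */
int run_on_copy(demo_c2r_fn transform, void *ctx,
                demo_complex *in, size_t n_complex, double *out)
{
    demo_complex *scratch = malloc(n_complex * sizeof *scratch);
    if (scratch == NULL)
        return -1;
    memcpy(scratch, in, n_complex * sizeof *scratch);
    transform(scratch, out, ctx);
    free(scratch);
    return 0;
}

/* Demo stand-in that clobbers its input, as a c2r plan may do. */
void demo_clobbering_transform(demo_complex *in, double *out, void *ctx)
{
    (void)ctx;
    out[0] = in[0][0];
    in[0][0] = 0.0;
}
```

With real FFTW plans the scratch buffer would normally come from fftw_malloc so that it meets the alignment the plan was created for. Plain malloc is used here only to keep the sketch free of dependencies.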
Can a header including this class be released in public domain, while its implementation is released with GPL 2 or later?
class PlanFloat_1dR2C
{
public:
typedef std::complex<float> OutputType;
typedef float InputType;
static size_t sizeOut(size_t size_in)
{return size_in/2+1;}
PlanFloat_1dR2C(InputType* buffer_in
,OutputType* buffer_out, size_t n_elem);
void execute();
~PlanFloat_1dR2C();
private:
void* plan;
};
As you may notice, it is a wrapper around an interface similar to FFTW3 (GPL), but since the header itself does not explicitly refer to that library, the implementation may lie in another library. Or does the copyright of FFTW3 also cover the create-execute-destroy idiom?
I'm getting weird compiler errors when trying to run the bootstrap.sh file. I've tried gcc 4.4.7, 4.8.2, and 5.2.0 and they all give me strange errors when running make. Here's one example:
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -I../simd -std=c99 -MT timer.lo -MD -MP -MF .deps/timer.Tpo -c timer.c -o timer.o
In file included from timer.c:29:0:
cycle.h: In function ‘getticks’:
cycle.h:226:6: error: ‘asm’ undeclared (first use in this function)
asm volatile("rdtsc" : "=a" (a), "=d" (d));
^
cycle.h:226:6: note: each undeclared identifier is reported only once for each function it appears in
cycle.h:226:10: error: expected ‘;’ before ‘volatile’
asm volatile("rdtsc" : "=a" (a), "=d" (d));
^
make[2]: Leaving directory `/scratch/mkg52/_MONSOON_SOFTWARE/fftw3/kernel'
make[2]: *** [timer.lo] Error 1
make[1]: Leaving directory `/scratch/mkg52/_MONSOON_SOFTWARE/fftw3'
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
The config.log has a bunch of errors in it as well. Most of them look like this: conftest.c:17:7: error: 'thisisanerror' undeclared (first use in this function).
I've gotten fftw-3.3.4 to compile without any problems, but this current version is not working at all. Does anyone have any ideas?
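The compile line in the log passes -std=c99, and in strict ISO modes GCC does not recognize the plain asm keyword, only the reserved spelling __asm__. A minimal sketch of a cycle-counter read that compiles in either mode is shown below. demo_ticks and demo_getticks are hypothetical names rather than the contents of cycle.h, and the clock() fallback for other targets is an assumption made for illustration.

```c
#include <time.h>

typedef unsigned long long demo_ticks;

/* Read a tick counter. On x86 with a GNU-compatible compiler this uses
   rdtsc through __asm__, which is accepted under -std=c99 as well as
   -std=gnu99; elsewhere it falls back to the portable clock(). */
demo_ticks demo_getticks(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((demo_ticks)hi << 32) | lo;
#else
    return (demo_ticks)clock();
#endif
}
```

Building with -std=gnu99 instead of -std=c99 is the other common way to make the plain asm spelling acceptable to GCC.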
In OSX 10.9.2 with all the dependencies: autoconf, autolib, ocaml, ...
Downloading the repo and running:
./bootstrap.sh
make
Produces an error:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: .libs/libsimd_sse2_nonportable.a(libsimd_sse2_nonportable_la-sse2-nonportable.o) has no symbols
libtool: link: ranlib .libs/libsimd_sse2_nonportable.a
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: .libs/libsimd_sse2_nonportable.a(libsimd_sse2_nonportable_la-sse2-nonportable.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: warning for library: .libs/libsimd_sse2_nonportable.a the table of contents is empty (no object file members in the library define global symbols)
libtool: link: ( cd ".libs" && rm -f "libsimd_sse2_nonportable.la" && ln -s "../libsimd_sse2_nonportable.la" "libsimd_sse2_nonportable.la" )
Making all in dft
Making all in scalar
Making all in codelets
(cat ../../../COPYRIGHT ../../../support/codelet_prelude.dft; sh ../../../support/twovers.sh ../../../genfft/gen_notw.native -compact -variables 4 -pipeline-latency 4 -n 2 -name n1_2 -include "n.h") | sed -e s/@DATE@/"`date`"/ | indent -kr -cs -i5 -l800 -fca -nfc1 -sc -sob -cli4 -TR -Tplanner -TV >n1_2.c
indent: Command line: unknown parameter "-kr"
make[4]: *** [n1_2.c] Error 1
make[3]: *** [all-recursive] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Hi.
I'm trying to use snd2fftw (http://snd2fftw.sourceforge.net/) to apply a DFT to an audio file, but I receive the error: Fail to initialize FFTW plan.
The author no longer supports the software. What could be causing this?
I'm using FFTW 3.3.3, and I'm getting this error on some of the clusters on OSG:
alloc.c:269: assertion failed: p
which corresponds to the #ifdef MIN_ALIGNMENT block:
void *X(malloc_plain)(size_t n)
{
void *p;
if (n == 0)
n = 1;
p = X(kernel_malloc)(n);
CK(p);
#ifdef MIN_ALIGNMENT
A((((uintptr_t)p) % MIN_ALIGNMENT) == 0);
#endif
return p;
}
These are my build options:
./configure --enable-single --enable-sse2 --enable-avx --enable-threads --prefix=/home/yutong/fftw_install/
Any ideas?
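The quoted function performs two separate checks: CK(p) on the returned pointer and, under MIN_ALIGNMENT, an assertion on its alignment. Assuming CK reports the text of its argument expression, the message "assertion failed: p" would point at the first check (a null pointer from the allocator) rather than the alignment test. The sketch below separates the two conditions; check_alloc and the alloc_status values are hypothetical names, not FFTW code.

```c
#include <stdint.h>
#include <stdlib.h>

enum alloc_status { ALLOC_OK, ALLOC_NULL, ALLOC_MISALIGNED };

/* Classify an allocation result: a null pointer means the allocator
   failed; otherwise the address must be a multiple of min_alignment
   (min_alignment > 0). */
enum alloc_status check_alloc(const void *p, size_t min_alignment)
{
    if (p == NULL)
        return ALLOC_NULL;
    if (((uintptr_t)p % min_alignment) != 0)
        return ALLOC_MISALIGNED;
    return ALLOC_OK;
}
```

If the null case is the one being hit, the requested size and the memory limits on the affected cluster nodes would be the first things to look at.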
https://github.com/FFTW/fftw3/blob/master/tools/fftw-wisdom.c#L43
When running fftw-wisdom -v it outputs text like this:
PLANNING PROBLEM: cif1024
PLANNING PROBLEM: cib1024
PLANNING PROBLEM: cif2048
PLANNING PROBLEM: cib2048
I kept trying to work out what the problem was, and was annoyed that it wasn't giving more details on this error message. It turns out that despite the SHOUTING it is actually not an error, and everything is fine. I suggest changing the text to be far less ambiguous, for example by removing the word 'problem' and changing PLANNING to Planning.
$ ./configure --enable-float --enable-fma --enable-vsx && make -j96 && make check
Executing "/tmp/fftw-3.3.5/tests/bench --verbose=1 --verify 'ok5o11x4o10x6o10x11o11_4' --verify 'ik5o11x4o10x6o10x11o11_4' --verify '//obr9x5x24' --verify '//ofr9x5x24' --verify 'obr9x5x24' --verify 'ibr9x5x24' --verify 'ofr9x5x24' --verify 'ifr9x5x24' --verify '//obc9x5x24' --verify '//ibc9x5x24' --verify '//ofc9x5x24' --verify '//ifc9x5x24' --verify 'obc9x5x24' --verify 'ibc9x5x24' --verify 'ofc9x5x24' --verify 'ifc9x5x24' --verify 'ok12hx4hx13o01x13e10' --verify 'ik12hx4hx13o01x13e10' --verify 'obrd4x2x4x10v8' --verify 'ibrd4x2x4x10v8' --verify 'ofrd4x2x4x10v8' --verify 'ifrd4x2x4x10v8' --verify '//obcd4x2x4x10v8' --verify '//ibcd4x2x4x10v8' --verify '//ofcd4x2x4x10v8' --verify '//ifcd4x2x4x10v8' --verify 'obcd4x2x4x10v8' --verify 'ibcd4x2x4x10v8' --verify 'ofcd4x2x4x10v8' --verify 'ifcd4x2x4x10v8' --verify 'okd11088o11' --verify 'ikd11088o11' --verify 'obr8x4x8x4_6' --verify 'ibr8x4x8x4_6' --verify 'ofr8x4x8x4_6' --verify 'ifr8x4x8x4_6' --verify '//obc8x4x8x4_6' --verify '//ibc8x4x8x4_6' --verify '//ofc8x4x8x4_6' --verify '//ifc8x4x8x4_6' --verify 'obc8x4x8x4_6' --verify 'ibc8x4x8x4_6' --verify 'ofc8x4x8x4_6' --verify 'ifc8x4x8x4_6'"
ok5o11x4o10x6o10x11o11_4 2.13067e-07 3.24252e-06 2.67794e-07
ik5o11x4o10x6o10x11o11_4 2.16295e-07 3.14798e-06 2.42186e-07
//obr9x5x24 1.76873e-07 5.80388e-07 1.90459e-07
//ofr9x5x24 2.16065e-07 5.80388e-07 1.864e-07
Found relative error 3.978176e+13 (impulse 1)
[...]
Reproduced on Power8 with gcc 5.2 and 5.3; with gcc 4.8.4 and 4.9.1 the tests pass.
Grrr.
Basically the title. Does the plan generation for a DFT depend on the actual content of the input array, or just on the size and on whether the input and output are co-located?
I'm working in an environment where I have many worker threads doing parallel DFTs, and moving the planning out of the threads (that was a fun bug to track down) means I don't have ready access to an example instance of the data to pass into the planner. Can I just use an empty array of the same size as the data?
I understand that the position of those working on FFTW is to minimize library dependencies (at least with regards to pthreads and fftw3*_threads), but when building FFTW with MPI enabled, it may be desirable to include -lfftw3*_mpi in the pkg-config libraries portion. I'm not sure if this may cause problems on other systems, but at least in Fedora, the fftw3*.pc files will be installed in their respective MPI PKG_CONFIG_PATH (i.e. /usr/lib64/mpich/lib/pkgconfig or similar) and will not conflict with those non-MPI pc files when the user loads the corresponding MPI module (i.e. module load mpi/mpich-x86_64) for configuring and building with MPI.
In any case, an easy way to conditionally include -lfftw3*_mpi is to add, in the configure.ac file under the if test "$enable_mpi" = "yes"; then test, the lines
LIBFFTW3MPI=-lfftw3${PREC_SUFFIX}_mpi
AC_SUBST(LIBFFTW3MPI)
and change the Libs: line in fftw.pc.in to
Libs: -L${libdir} @LIBFFTW3MPI@ -lfftw3@PREC_SUFFIX@ @LIBQUADMATH@
if (/bin/sh /usr/src/fftw-3.3.3/fftw-single-3.3.3/missing --run makeinfo --version) >/dev/null 2>&1; then
for f in fftw3.info fftw3.info-[0-9] fftw3.info-[0-9][0-9] fftw3.i[0-9] fftw3.i[0-9][0-9]; do
if test -f $f; then mv $f $backupdir; restore=mv; else :; fi;
done;
else :; fi &&
cd "$am__cwd";
if /bin/sh /usr/src/fftw-3.3.3/fftw-single-3.3.3/missing --run makeinfo -I . -o fftw3.info fftw3.texi;
then
rc=0;
CDPATH="${ZSH_VERSION+.}:" && cd .;
else
rc=$?;
CDPATH="${ZSH_VERSION+.}:" && cd . &&
$restore $backupdir/* `echo "./fftw3.info" | sed 's|[^/]*$||'`;
fi;
rm -rf $backupdir; exit $rc
fftw3.texi:159: misplaced {
fftw3.texi:159: misplaced }
fftw3.texi:160: misplaced {
fftw3.texi:160: misplaced }
./intro.texi:16: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./intro.texi:16: superfluous argument to @end tex: algorithms for all lengths, including (possibly involving @onlogn)
./tutorial.texi:100: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./tutorial.texi:100: superfluous argument to @end tex: algorithm). (possibly involving @onlogn)
./tutorial.texi:358: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./tutorial.texi:358: superfluous argument to @end tex: algorithm is used even for prime sizes. (possibly involving @onlogn)
./tutorial.texi:417: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./tutorial.texi:417: superfluous argument to @end tex: (in row-major order). (possibly involving @ndims)
./tutorial.texi:418: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./tutorial.texi:418: superfluous argument to @end tex: array of (possibly involving @ndimshalf)
./tutorial.texi:425: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./tutorial.texi:425: superfluous argument to @end tex: and the complex (possibly involving @ndims)
./tutorial.texi:426: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./tutorial.texi:426: superfluous argument to @end tex: . (possibly involving @ndimshalf)
./tutorial.texi:566: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./tutorial.texi:566: superfluous argument to @end tex: algorithm is used even for prime sizes. (possibly involving @onlogn)
./other.texi:97: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./other.texi:97: superfluous argument to @end tex: . Now, we specify a location in the array by a (possibly involving @ndims)
./other.texi:209: superfluous argument to @end tex: rank-3 array: (possibly involving @threedims)
./reference.texi:384: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./reference.texi:384: superfluous argument to @end tex: performance even for prime sizes). It is possible to customize FFTW (possibly involving @onlogn)
./reference.texi:632: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./reference.texi:632: superfluous argument to @end tex: performance even for prime sizes). (It is possible to customize FFTW (possibly involving @onlogn)
./reference.texi:711: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./reference.texi:711: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./reference.texi:711: superfluous argument to @end tex: , the complex data is an n[0] x n[1] x n[2] x ... x (n[d-1]/2 + 1) (possibly involving @ndims)
./reference.texi:711: superfluous argument to @end tex: array of (possibly involving @ndimshalf)
./reference.texi:721: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./reference.texi:721: superfluous argument to @end tex: in row-major order. (possibly involving @ndims)
./reference.texi:841: warning: @ifinfo should only appear at a line beginning (possibly involving @onlogn)
./reference.texi:841: superfluous argument to @end tex: performance even for prime sizes). (It is possible to customize FFTW (possibly involving @onlogn)
./reference.texi:2376: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./reference.texi:2376: superfluous argument to @end tex: multi-dimensional real-input DFT, the full (logical) complex output array (possibly involving @ndims)
./mpi.texi:262: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:262: superfluous argument to @end tex: complex DFT, distributed over 4 (possibly involving @twodims)
./mpi.texi:263: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:263: superfluous argument to @end tex: slice of the data. (possibly involving @twodims)
./mpi.texi:278: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:278: superfluous argument to @end tex: array on three processes, you can (possibly involving @twodims)
./mpi.texi:318: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:318: superfluous argument to @end tex: complex-DFT example, above, we would find (possibly involving @twodims)
./mpi.texi:472: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:472: superfluous argument to @end tex: . As always, it is distributed along (possibly involving @ndims)
./mpi.texi:473: warning: @ifinfo should only appear at a line beginning (possibly involving @dimk)
./mpi.texi:473: superfluous argument to @end tex: . Now, if we compute its DFT with the (possibly involving @dimk)
./mpi.texi:475: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimstrans)
./mpi.texi:475: superfluous argument to @end tex: , (possibly involving @ndimstrans)
./mpi.texi:476: warning: @ifinfo should only appear at a line beginning (possibly involving @dimk)
./mpi.texi:476: superfluous argument to @end tex: dimension. Conversely, if we take the (possibly involving @dimk)
./mpi.texi:477: superfluous argument to @end tex: data and transform it with the (possibly involving @ndimstrans)
./mpi.texi:479: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:479: superfluous argument to @end tex: array. (possibly involving @ndims)
./mpi.texi:483: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimstrans)
./mpi.texi:483: superfluous argument to @end tex: (the (possibly involving @ndimstrans)
./mpi.texi:586: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:586: superfluous argument to @end tex: real (possibly involving @ndims)
./mpi.texi:587: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./mpi.texi:587: superfluous argument to @end tex: complex data: the last dimension of the (possibly involving @ndimshalf)
./mpi.texi:590: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:590: superfluous argument to @end tex: dimensions of the real data. (possibly involving @ndims)
./mpi.texi:594: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:594: superfluous argument to @end tex: , it is (possibly involving @ndims)
./mpi.texi:595: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimspad)
./mpi.texi:595: superfluous argument to @end tex: array, where the last (possibly involving @ndimspad)
./mpi.texi:608: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./mpi.texi:616: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./mpi.texi:616: superfluous argument to @end tex: real data [padded to L x M x 2(N/2+1) (possibly involving @threedims)
./mpi.texi:616: superfluous argument to @end tex: ], (possibly involving @threedims)
./mpi.texi:617: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./mpi.texi:617: superfluous argument to @end tex: complex data. Similar to the (possibly involving @threedims)
./mpi.texi:665: superfluous argument to @end tex: in row-major order, so its (possibly involving @threedims)
./mpi.texi:674: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./mpi.texi:674: superfluous argument to @end tex: r2c (possibly involving @threedims)
./mpi.texi:676: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./mpi.texi:676: superfluous argument to @end tex: real array (possibly involving @threedims)
./mpi.texi:678: superfluous argument to @end tex: complex array distributed over the @code{M} (possibly involving @threedims)
./mpi.texi:699: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:699: superfluous argument to @end tex: that is (possibly involving @twodims)
./mpi.texi:868: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:868: superfluous argument to @end tex: transpose on @code{P} processes, (possibly involving @twodims)
./mpi.texi:1295: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:1295: superfluous argument to @end tex: array that is stored on the local (possibly involving @ndims)
./mpi.texi:1299: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimstrans)
./mpi.texi:1299: warning: @xref should not appear in @end
./mpi.texi:1299: @xref missing close brace
./mpi.texi:1299: superfluous argument to @end tex: transposed output. @xref{Transposed
} (possibly involving @ndimstrans)
./mpi.texi:1300: misplaced }
./mpi.texi:1414: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:1414: superfluous argument to @end tex: input data and the first dimension (possibly involving @ndims)
./mpi.texi:1415: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimstrans)
./mpi.texi:1415: superfluous argument to @end tex: transposed data (at intermediate (possibly involving @ndimstrans)
./mpi.texi:1451: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:1451: superfluous argument to @end tex: transform is (possibly involving @ndims)
./mpi.texi:1452: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimstrans)
./mpi.texi:1452: warning: @xref should not appear in @end
./mpi.texi:1452: superfluous argument to @end tex: . @xref{Transposed distributions}. (possibly involving @ndimstrans)
./mpi.texi:1498: warning: @ifinfo should only appear at a line beginning (possibly involving @ndims)
./mpi.texi:1498: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimshalf)
./mpi.texi:1498: superfluous argument to @end tex: real data to/from n[0] x n[1] x n[2] x ... x (n[d-1]/2 + 1) (possibly involving @ndims)
./mpi.texi:1498: superfluous argument to @end tex: complex (possibly involving @ndimshalf)
./mpi.texi:1501: warning: @ifinfo should only appear at a line beginning (possibly involving @ndimspad)
./mpi.texi:1699: superfluous argument to @end tex: complex DFT in-place. (This assumes you have already (possibly involving @twodims)
./mpi.texi:1738: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:1738: superfluous argument to @end tex: Fortran array is viewed by FFTW in C as a (possibly involving @twodims)
./mpi.texi:1739: superfluous argument to @end tex: array. This means that the array was distributed over (possibly involving @twodims)
./mpi.texi:1741: superfluous argument to @end tex: array in Fortran. (You must @emph{not} use an (possibly involving @twodims)
./mpi.texi:1742: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:1742: superfluous argument to @end tex: array, (possibly involving @twodims)
./mpi.texi:1752: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./mpi.texi:1752: superfluous argument to @end tex: array, associated with the @emph{same} (possibly involving @twodims)
./modern-fortran.texi:177: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:177: superfluous argument to @end tex: ) arrays: (possibly involving @threedims)
./modern-fortran.texi:190: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:190: superfluous argument to @end tex: array. (possibly involving @threedims)
./modern-fortran.texi:209: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:209: superfluous argument to @end tex: real input (possibly involving @threedims)
./modern-fortran.texi:210: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:210: superfluous argument to @end tex: complex output). In Fortran, because (possibly involving @threedims)
./modern-fortran.texi:213: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:213: superfluous argument to @end tex: real input in Fortran: (possibly involving @threedims)
./modern-fortran.texi:232: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:232: superfluous argument to @end tex: array, even though only (possibly involving @threedims)
./modern-fortran.texi:233: superfluous argument to @end tex: of it is actually used. In this example, we will (possibly involving @threedims)
./modern-fortran.texi:471: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:471: superfluous argument to @EnD tex: array. (possibly involving @threedims)
./modern-fortran.texi:484: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:484: superfluous argument to @EnD tex: array. (Alternatively, you can (possibly involving @threedims)
./modern-fortran.texi:495: warning: @ifinfo should only appear at a line beginning (possibly involving @twodims)
./modern-fortran.texi:495: superfluous argument to @EnD tex: 2d real array: (possibly involving @twodims)
./modern-fortran.texi:506: warning: @ifinfo should only appear at a line beginning (possibly involving @threedims)
./modern-fortran.texi:506: superfluous argument to @EnD tex: 3d complex array: (possibly involving @threedims)
make[3]: *** [fftw3.info] Error 1
make[3]: Leaving directory `/usr/src/fftw-3.3.3/fftw-single-3.3.3/doc'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/usr/src/fftw-3.3.3/fftw-single-3.3.3/doc'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/fftw-3.3.3/fftw-single-3.3.3'
make: *** [all] Error 2
I have a problem when I compile the current version (cd2b27d) with --enable-avx (./configure --enable-shared --enable-sse2 --enable-avx --enable-avx2 --enable-fma --enable-maintainer-mode):
In file included from ../../../dft/simd/n1f.h:21:0,
from ../common/n1fv_3.c:35,
from n1fv_3.c:3:
../../../simd-support/simd-avx.h:254:27: error: incompatible type for argument 2 of 'ST'
#define VFMAI(b, c) SUFF(_mm_addsub256_p)(c,FLIP_RI(b))
The manpage gives this example:
fftw-wisdom -n cof1024 cob1024 -o wisdom
which fails with:
bench: problem.c:96: assertion failed: isdigit(*s)
The correct invocation seems to be:
fftw-wisdom -n -o wisdom cof1024 cob1024
With this argument order the example works (if one applies f230f8c).
This FFTW test crashes on ARM. I am running on a Chromebook with an Ubuntu 12.04 chroot. The GCC version is 4.8.2.
make[3]: Entering directory `/home/viral/julia/deps/fftw-3.3.4-single/tests'
perl -w ./check.pl -r -c=30 -v `pwd`/bench
Executing "/home/viral/julia/deps/fftw-3.3.4-single/tests/bench --verbose=1 --verify 'ok88o01_152' --verify 'ik88o01_152' --verify 'obr5x14x2x11_5' --verify 'ibr5x14x2x11_5' --verify 'ofr5x14x2x11_5' --verify 'ifr5x14x2x11_5' --verify '//obc5x14x2x11_5' --verify '//ibc5x14x2x11_5' --verify '//ofc5x14x2x11_5' --verify '//ifc5x14x2x11_5' --verify 'obc5x14x2x11_5' --verify 'ibc5x14x2x11_5' --verify 'ofc5x14x2x11_5' --verify 'ifc5x14x2x11_5' --verify 'ok8e01_29' --verify 'ik8e01_29' --verify 'obr4x2v15' --verify 'ibr4x2v15' --verify 'ofr4x2v15' --verify 'ifr4x2v15' --verify '//obc4x2v15' --verify '//ibc4x2v15' --verify '//ofc4x2v15' --verify '//ifc4x2v15' --verify 'obc4x2v15' --verify 'ibc4x2v15' --verify 'ofc4x2v15' --verify 'ifc4x2v15' --verify 'ok8e10x7bx11e00x10e10' --verify 'ik8e10x7bx11e00x10e10' --verify '//obr40x11' --verify '//ofr40x11' --verify 'obr40x11' --verify 'ibr40x11' --verify 'ofr40x11' --verify 'ifr40x11' --verify '//obc40x11' --verify '//ibc40x11' --verify '//ofc40x11' --verify '//ifc40x11' --verify 'obc40x11' --verify 'ibc40x11' --verify 'ofc40x11' --verify 'ifc40x11'"
Segmentation fault (core dumped)
Ubuntu.
v3.3.4 release.
I compiled an example Fortran file from the internet.
This is the error report:
Undefined symbols for architecture x86_64:
"dfftw_destroy_plan", referenced from:
MAIN_ in ccgelF5d.o
"dfftw_execute", referenced from:
MAIN_ in ccgelF5d.o
"dfftw_plan_dft_2d", referenced from:
MAIN_ in ccgelF5d.o
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
[Finished in 0.0s with exit code 1]
[cmd: ['/usr/local/bin/gfortran', '/Users/hyuna917/Dropbox/Kono Lab/Computer/数値計算演習/fortran/fftw.f', '-o', '/Users/hyuna917/Dropbox/Kono Lab/Computer/数値計算演習/fortran/fftw', '-I/usr/local/include', '-L/usr/local/lib', '-lfftw3', '-lm']]
[dir: /Users/hyuna917/Dropbox/Kono Lab/Computer/数値計算演習/fortran]
[path: /usr/bin:/bin:/usr/sbin:/sbin]
I compiled with: gfortran fftw.f -o fftw -I/usr/local/include -L/usr/local/lib -lfftw3 -lm
Why did this error occur?
When the 3.3.4 configure script is executed on OS X with MACOSX_DEPLOYMENT_TARGET=10.10 in its environment, it incorrectly selects -flat_namespace -undefined suppress for linking, instead of -undefined dynamic_lookup. This is due to a bug in Libtool 2.4.2 and earlier.
We've patched the script in MacPorts; the fix itself is trivial. The permanent solution is to regenerate the script using the just-released Libtool 2.4.3.
What is the current level of support for bit reproducibility in FFTW3?
This is a test issue.
Hi -
I need to use FFTW 3.3.4 with the PGI compiler 16.1. FFTW does build with the following flags:
$ CC=pgcc CFLAGS="-O2 -fPIC" F77=pgfortran FFLAGS="-O2" ./configure --enable-avx --enable-openmp --enable-shared --prefix=/home/steinba/software/fftw/3.3.4/pgi161-nompi
FFTW builds, but the checks fail, and I see that the bench utility runs into a segmentation fault when called like this:
$ ./bench -s 64
Segmentation fault (core dumped)
The code compiles fine with PGI 15.9, and the bench utility runs without problems.
Any idea?
Thanks,
P
$ lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release: 6.4
Codename: Santiago
$ uname -a
Linux tauruslogin4 2.6.32-504.3.3.el6.x86_64 #1 SMP Fri Dec 12 16:05:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
The instructions for building from git are ambiguous, resulting in people reproducing #9.
Please change the wording to something like:
"If you are using the git repository, install ocaml, autoconf,
automake, indent and libtool, and execute the bootstrap.sh script instead of running configure directly."
make[4]: Entering directory `/home/xxx/src/fftw3/dft/scalar/codelets'
make[4]: *** No rule to make target `n1_3.c', needed by `all'. Stop.
I am using the git repository.
Kali 2.0 64 / gcc-4.8.2
/usr/bin/ld: /usr/local/lib/libfftw3f.a(mapflags.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libfftw3f.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
Makefile:396: recipe for target 'libosmodsp.la' failed
make[2]: *** [libosmodsp.la] Error 1
make[2]: Leaving directory '/root/libosmo-dsp/src'
Makefile:475: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/root/libosmo-dsp'
Makefile:360: recipe for target 'all' failed
make: *** [all] Error 2
It appears that gcc-4.8.0 generates incorrect code when compiling file dft/scalar/codelets/q1_6.c on amd64 in long-double precision. gcc-4.8.1 and newer generate correct code.
To reproduce the bug:
./configure --enable-long-double CC=/tmp/local/bin/x86_64-unknown-linux-gnu-gcc-4.8.0
make
tests/bench -oparanoid --verify 'i72'
If the output starts with "Found relative error 3.333333e-01 (impulse 1)" then you encountered the bug; otherwise you are ok.
This is not a fftw bug, but a gcc bug that was fixed a long time ago, so I will close this issue. The solution for gcc-4.8.0 users (if any) is to upgrade to 4.8.1 or later.
The build, in both 3.3.3 and 3.3.4, attempts to compile the contents of the sse2 directory even when configure was explicitly asked to disable sse2:
--enable-shared --enable-threads --disable-fortran --disable-openmp --enable-float --enable-sse --disable-sse2
When the -march argument is set to a CPU that has no SSE2 instructions (such as "athlon-xp"), some compilers, such as clang, fail:
cc -DHAVE_CONFIG_H -I. -I../../.. -I../../../kernel -I../../../dft -I../../../dft/simd -I../../../simd-support -msse -O2 -pipe -march=athlon-xp -fstack-protector -fno-strict-aliasing -MT n1fv_2.lo -MD -MP -MF .deps/n1fv_2.Tpo -c n1fv_2.c -fPIC -DPIC -o .libs/n1fv_2.o
fatal error: error in backend: Do not know how to split the result of this operator!
cc: error: clang frontend command failed with exit code 70 (use -v to see invocation)
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
Target: i386-unknown-freebsd10.1
Looking into the configure script, I see the following line:
if test "$have_sse" = "yes"; then have_sse2=yes; fi
Huh?
Hello!
There has been news that IBM would (help) optimize FFTW3 for their new hardware architecture. How is it going? Would the AltiVec code for older POWER hardware work on this new architecture?
TIA,
Fabricio
My distribution recently ran a manpage-versus-help tool over all the packages. fftw came out almost clean:
"""
In man of fftw-wisdom and fftw{f,l,q}-wisdom is missing (compared to help):
-T --threads
"""
Hi guys, there's some info about plans here:
http://www.fftw.org/fftw3_doc/Real_002dto_002dReal-Transform-Kinds.html#Real_002dto_002dReal-Transform-Kinds
The question is: is there a function to get the N and n values? If not, could we create one?
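For illustration, the relation between the physical array length n and the logical DFT size N listed on that manual page can be written as a small helper. This is only a sketch: the enum `r2r_kind` and the function `r2r_logical_size` are hypothetical names standing in for `fftw_r2r_kind`, so that the example compiles without fftw3.h. It is not an existing FFTW function.

```c
#include <assert.h>

/* Hypothetical stand-in for fftw_r2r_kind (assumption: names are ours,
   chosen so the sketch compiles without fftw3.h). */
typedef enum {
    KIND_R2HC, KIND_HC2R, KIND_DHT,
    KIND_REDFT00, KIND_REDFT01, KIND_REDFT10, KIND_REDFT11,
    KIND_RODFT00, KIND_RODFT01, KIND_RODFT10, KIND_RODFT11
} r2r_kind;

/* Logical DFT size N for a physical array length n, following the
   "Logical N = ..." notes in the manual. Returns -1 for an unknown kind. */
static int r2r_logical_size(r2r_kind kind, int n)
{
    switch (kind) {
    case KIND_R2HC:
    case KIND_HC2R:
    case KIND_DHT:
        return n;               /* N = n */
    case KIND_REDFT00:
        return 2 * (n - 1);     /* N = 2(n-1) */
    case KIND_RODFT00:
        return 2 * (n + 1);     /* N = 2(n+1) */
    case KIND_REDFT01:
    case KIND_REDFT10:
    case KIND_REDFT11:
    case KIND_RODFT01:
    case KIND_RODFT10:
    case KIND_RODFT11:
        return 2 * n;           /* N = 2n */
    }
    return -1;
}
```

A binding could call such a helper with the kind and n it already passed to the planner, for example to compute the normalization factor of a round trip.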
Hello all!
I use MPI FFTW3 (3.3.4) in my project and found that it fails for certain numbers of processes.
After some investigation I arrived at the following test code:
#include <stdio.h>
#include <mpi.h>
#include <fftw3-mpi.h>

int main(int argc, char **argv){
    const ptrdiff_t N[2]={448,352};  /* transform dimensions */
    ptrdiff_t Nz=354;                /* howmany parameter */
    fftw_plan plan;
    fftw_complex *data;
    ptrdiff_t alloc_local, local_n0, local_0_start, block;
    int me,np;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();
    MPI_Comm_rank(MPI_COMM_WORLD, &me);
    MPI_Comm_size(MPI_COMM_WORLD,&np);
    block=FFTW_MPI_DEFAULT_BLOCK;
    alloc_local = fftw_mpi_local_size_many(2,N,Nz,block,MPI_COMM_WORLD,&local_n0, &local_0_start);
    data = fftw_alloc_complex(alloc_local);
    if (me==0){
        printf("Test is running at %d processes \n",np);
        printf("Before plan calculation\n" );
    }
    /* the failure is observed during this planner call */
    plan = fftw_mpi_plan_many_dft(2,N,Nz,block,block, data, data, MPI_COMM_WORLD, FFTW_FORWARD, FFTW_PATIENT);
    if (me==0){
        printf("After plan calculation\n" );
    }
    /* release resources before shutting down MPI */
    fftw_destroy_plan(plan);
    fftw_free(data);
    MPI_Finalize();
    return 0;
}
This is a 2D Fourier transform over interleaved data. The transform dimensions are 448 by 352 and the howmany parameter is 354.
This code works fine when run on 2, 4, 7, 8, 14, 28, 32, 64 and 448 processes.
But for 56, 112 and 224 processes I get a segmentation fault during the plan calculation.
I checked it on the following configurations:
The code behavior is the same.
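To see how the two dimensions are split across ranks for a given process count, the block distribution can be tabulated by hand. The sketch below assumes that FFTW_MPI_DEFAULT_BLOCK corresponds to a block size of ceil(n / nprocs); the helper names are hypothetical and nothing here calls FFTW. It only shows which ranks end up with no rows and does not by itself explain the crash.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed default block size: ceil(n / nprocs). */
static ptrdiff_t default_block(ptrdiff_t n, int nprocs)
{
    return (n + nprocs - 1) / nprocs;
}

/* Number of rows owned by `rank` under that block distribution. */
static ptrdiff_t local_rows(ptrdiff_t n, int nprocs, int rank)
{
    ptrdiff_t block = default_block(n, nprocs);
    ptrdiff_t start = block * rank;
    if (start >= n)
        return 0;                       /* this rank owns nothing */
    return (n - start < block) ? n - start : block;
}

/* Count of ranks that own no rows at all. */
static int idle_ranks(ptrdiff_t n, int nprocs)
{
    int r, idle = 0;
    for (r = 0; r < nprocs; ++r)
        if (local_rows(n, nprocs, r) == 0)
            ++idle;
    return idle;
}
```

Under this assumption, 56 processes divide the 448-row dimension evenly (8 rows each), while the 352-row dimension gets blocks of 7, which leaves the last 5 ranks without data in the transposed layout.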
In mpi/transpose_alltoall.c, lines 65 and 78, could it happen that I and O are the same buffer? If so, then on MPICH2 (e.g., XC30) the code would trigger an error unless MPICH_NO_BUFFER_ALIAS_CHECK is set.
Is there a way to disable, e.g., AVX, during plan creation, i.e., have the planner ignore specific SIMD codelets during plan generation?
FFTW uses the normal libtool mechanism for versioning shared libraries. Unfortunately, we forgot to update the libtool version string when shipping fftw-3.3.3, and consequently the 3.3.3 shared library has the same version number as fftw-3.3.2. This may cause difficulties to people upgrading FFTW while the FFTW shared library is in use, and may cause confusion about which version is installed.
The purpose of this "issue" is to document the problem. No fix is planned for 3.3.3. Future releases will (hopefully) have the correct version number.
I'm trying to save MPI wisdom, and I'm getting strange behavior.
Since my understanding is that fftw wisdom accumulates, I figured the easiest thing I can do is create an empty file, then read wisdom from there, run my code, save wisdom in the file, and then the next time I would run my code, it would no longer spend any time with plans.
However, that's not what happens.
In my tests I am using an MPI code on my 8 core machine, and I am doing an inverse FFT for 3 interleaved 256x256x256 arrays.
After the first run with "FFTW_MEASURE" the wisdom file is big (64 lines), and it takes ~100 seconds for this first run.
Second run takes 5 seconds, and the wisdom file is smaller (18 lines).
Third run takes ~100 seconds, and the wisdom file remains the same (I guess, I didn't check all the codes individually).
All subsequent runs are of ~100 seconds.
If I change my code so that it no longer overwrites the wisdom file after the first run, all subsequent runs are fast, so I can work around this issue.
Also, for this particular size of the transform, FFTW_ESTIMATE seems to be just as fast as FFTW_MEASURE (if I don't overwrite the wisdom file), so it's not a big deal anyway.
However, when I'll be running my production jobs, the transforms will be a lot bigger, and I'd like to understand what's happening.
By the way, I'm using the recommended way of saving the files from http://www.fftw.org/doc/FFTW-MPI-Wisdom.html (more on that in the next issue though).
I get sigsegv when using fftw compiled using
g++ (tdm-1) 4.7.1
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
on 32-bit windows.
Is this related to these release notes?
Removed an archaic stack-alignment hack that was failing with gcc-4.7/i386. Added stack-alignment hack necessary for gcc on Windows/i386. We will regret this in ten years (see previous change).
How should the library be compiled on this platform?
Hi guys, we are creating a PHP binding for fftw3 ( https://github.com/bukka/php-fftw ), and I have some questions about the meaning of some constants:
I also don't know what these constants are. I think they are plan flags, but I'm not sure what each one does:
/* undocumented beyond-guru flags */
GCC does not support -mfpu flag for AArch64.
Hi,
I want to use the fftw3 library in a C++ project, so I call functions such as "fftwf_plan_dft_r2c" and "fftwf_plan_dft_c2r". Why do I get errors like "undefined reference to `fftwf_plan_dft_r2c'"?
[Bug report from Damon Farnsworth from Cray:]
While running the mpi checks on the stock fftw 3.3.4 (I also tested against version 3.3 and saw the same failure) with
your bench tester I came across a failure (relative error) for certain problems. It seems to be isolated (as far as
I’ve seen) to complex problems (single or double precision; inplace or out-of-place; forward or backward) of sizes Nx1
where N is roughly 25 or greater, although some values of N will pass. This seems to only happen when the number of
ranks is greater than one. I give a couple of examples below.
Here’s a successful test, one mpi rank:
aprun -n1 ./mpi-bench.double.static.exe -v2 --verify obc98x1
planner time: 0.005592 s
(mpi-dft-serial
(dft-ct-dit/7
(dftw-direct-7/24 "t1bv_7_avx")
(dft-direct-14-x7 "n1bv_14_avx")))
flops: 318 add, 162 mul, 156 fma
estimated cost: 792.000000, pcost = 0.000000
obc98x1 4.05599e-16 5.38317e-16 9.21329e-16
Here’s the same test but with two mpi ranks:
aprun -n2 ./mpi-bench.double.static.exe -v2 --verify obc98x1
planner time: 0.013054 s
(mpi-dft-rank1/2/last
(mpi-dft-rank1-bigvec/contig
(mpi-transpose-pairwise
(null)
(rdft-transpose-cut-2x25-x2
(rdft-vrank>=1-x2/1
(rdft-rank0-iter-ci/2-x23))
(rdft-rank0-ip-sq/2-x2-x2))
(null)
(null))
(dft-direct-2-x25 "n2bv_2_sse2")
(mpi-transpose-pairwise
(rdft-transpose-cut-25x2-x2
(rdft-rank0-ip-sq/2-x2-x2)
(rdft-rank0-iter-co/2-x23-x2))
(rdft-nop)
(rdft-nop)
(null)))
(dft-ct-dit/7
(dftw-direct-7/12 "t1buv_7_sse2")
(dft-indirect-before
(dft-direct-7-x7 "n1bv_7_avx")
(dft-r2hc-1
(rdft-rank0-ip-sq/2-x7-x7))))
(mpi-transpose-pairwise
(rdft-rank0-iter-co/2-x25-x2)
(rdft-nop)
(rdft-nop)
(null)))
flops: 430 add, 232 mul, 192 fma
estimated cost: 1933.283180, pcost = 8989.000000
Found relative error 1.030928e-02 (impulse 1)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.030928e-02 (impulse)
Found relative error 1.261234e-01 (time shift)
Found relative error 1.225574e-01 (time shift)
Found relative error 1.386730e-01 (time shift)
Found relative error 1.259797e-01 (time shift)
Found relative error 9.564516e-02 (time shift)
Found relative error 1.640549e-01 (time shift)
Found relative error 1.029969e-01 (time shift)
Found relative error 1.051039e-01 (time shift)
Found relative error 1.261746e-01 (time shift)
Found relative error 1.214751e-01 (time shift)
Found relative error 4.337277e-02 (freq shift)
Found relative error 4.390274e-02 (freq shift)
Found relative error 4.652282e-02 (freq shift)
Found relative error 5.435396e-02 (freq shift)
Found relative error 4.417120e-02 (freq shift)
Found relative error 4.410327e-02 (freq shift)
Found relative error 4.349692e-02 (freq shift)
Found relative error 4.195425e-02 (freq shift)
Found relative error 4.465815e-02 (freq shift)
Found relative error 3.965394e-02 (freq shift)
obc98x1 3.2374e-16 0.0103093 0.164055
I've just submitted issue #33, and I have a related bug report.
In short, I'm trying to save mpi wisdom, and I call fftw_mpi_gather_wisdom(), as per http://www.fftw.org/doc/FFTW-MPI-Wisdom.html.
However, I'm only using single precision in my code, and every other FFTW function I call is prefixed by fftwf. For mpi_gather_wisdom, if I try to prefix it with fftwf, I get an invalid communicator error.
Any ideas?
I have a reproducible issue with performing a 1D FFT/IFFT along the outer dimension of a 3D matrix when FFTW_MEASURE or FFTW_PATIENT is used. FFTW_ESTIMATE works fine and produces the correct results. Transforms along the other two dimensions (inner and middle) also work fine, regardless of which of the three flags is used.
I have provided sample test code to reproduce the issue in this repo. The transform in question is on line 106 of the fftw_code.c file.
There is also an IPython notebook which verifies the output data. Please let me know if I can clarify anything or provide more information.
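For readers unfamiliar with this kind of transform, one common way to describe a batch of 1D transforms along the outer dimension is through the (n, howmany, stride, dist) parameters of FFTW's advanced interface, where element j of transform k sits at offset j*stride + k*dist. The sketch below assumes a row-major n0 x n1 x n2 array; the struct and helper names are hypothetical, and the linked test code may set things up differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a batch of 1D transforms, mirroring the
   n / howmany / stride / dist arguments of the advanced interface. */
typedef struct {
    int n;        /* length of each 1D transform */
    int howmany;  /* number of transforms in the batch */
    int stride;   /* distance between consecutive elements of one transform */
    int dist;     /* distance between first elements of consecutive transforms */
} batch_layout;

/* Layout for transforming along the outer (slowest) dimension of a
   row-major n0 x n1 x n2 array (assumption: contiguous storage). */
static batch_layout outer_dim_layout(int n0, int n1, int n2)
{
    batch_layout b;
    b.n = n0;
    b.howmany = n1 * n2;   /* one transform per (i1, i2) pair */
    b.stride = n1 * n2;    /* stepping i0 skips a whole n1 x n2 plane */
    b.dist = 1;            /* neighbouring transforms start one element apart */
    return b;
}

/* Offset of element j of transform k: j*stride + k*dist. */
static size_t batch_offset(const batch_layout *b, int j, int k)
{
    return (size_t)j * (size_t)b->stride + (size_t)k * (size_t)b->dist;
}
```

With this layout, element [i0][i1][i2] of the array is element j = i0 of transform k = i1*n2 + i2, which makes it easy to check the plan parameters against plain array indexing.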