Comments (7)
I tested compiling with different -march settings using GCC 13.1 (the compiler I had readily available on that AMD system):
I can indeed reproduce the problem with -march=knl and -march=knm. However, I do not see it with -march=skylake-avx512 or (of course) -march=znver4.
@maxim-masterov could you please tell us what you set for -march in your case? Could you try -march=skylake-avx512 with your GCC 11.3?
from elpa.
Hi @maxim-masterov,
on which AMD CPU are you trying to compile ELPA, and which compiler do you use?
We do not see this issue on AMD EPYC 9654 with the Intel compiler.
Hi @terminationshock,
The CPU is an AMD EPYC 9654 96-Core Processor and the compiler I use is GCC-11.3.0.
The error occurs because GCC 11 doesn't support the zen4 architecture. As a result, the check for AVX512 instructions in the configure.ac script produces the following output:
checking whether we compile for Xeon... no
checking whether we compile for Xeon PHI... yes
This leads to execution of this branch in the configure.ac script:
Lines 2430 to 2435 in c394aed
which defines the HAVE_AVX512_XEON_PHI variable. As a result, the following part of the source code gets compiled:
elpa/src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c
Lines 2574 to 2575 in c394aed
The _XOR_EPI macro is undefined (or I haven't found where it is defined). So the above snippet is compiled as a cast of an integer to a packed double instead of an invocation of the XOR operation.
I think something changed between the two releases 2020.05.001 and 2020.11.001.
In 2020.05.001 the XOR operation looks correct:
elpa/src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c
Lines 1729 to 1732 in 907b033
whereas in 2020.11.001 the _SIMD prefix is removed:
elpa/src/elpa2/kernels/complex_128bit_256bit_512bit_BLOCK_template.c
Lines 2573 to 2576 in e676706
I used -march=znver2. I can try -march=skylake-avx512, but I don't have access to Genoa CPUs at the moment. I should get it back next week and will report as soon as I have results.
Although -march=skylake-avx512 will (I'm sure) help with building the code, it won't resolve the problem with the undefined macro, right?
Yes, we will look at the macro anyway.
However, I am a bit puzzled that configure detects AVX-512 instructions at all with -march=znver2. The GCC 11 man page does not list AVX-512 for this flag:
znver2
AMD Family 17h core based CPUs with x86-64 instruction set support. (This supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCLMUL, CX16,
MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID, WBNOINVD, and 64-bit instruction set extensions.)
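One way to check what the compiler itself enables for a given -march is to look at its predefined feature macros. The sketch below relies on __AVX512F__ and __AVX512ER__, which GCC defines when the corresponding instruction sets are enabled; the function names are hypothetical, and this is not how ELPA's configure script performs its check:

```c
#include <assert.h>
#include <stdio.h>

/* 1 if the compiler targets AVX-512 Foundation with the current flags. */
static int targets_avx512f(void)
{
#ifdef __AVX512F__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler targets AVX-512ER, an extension found on Xeon Phi
 * (Knights Landing / Knights Mill) but not on Skylake-X or Zen 4. */
static int targets_avx512er(void)
{
#ifdef __AVX512ER__
    return 1;
#else
    return 0;
#endif
}

/* Print which AVX-512 variants the current compiler flags enable. */
static void report_avx512_targets(void)
{
    int f = targets_avx512f();
    int er = targets_avx512er();

    assert(!er || f); /* ER is an extension on top of the foundation set */
    printf("AVX-512F  (foundation)   : %s\n", f ? "enabled" : "not enabled");
    printf("AVX-512ER (Xeon Phi only): %s\n", er ? "enabled" : "not enabled");
}
```

Building this with -march=znver2 and calling report_avx512_targets() from main should report neither set as enabled if the man page excerpt above is accurate. The same information is available without a program from gcc -march=znver2 -dM -E - < /dev/null, which dumps all predefined macros.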
@terminationshock Just to confirm, -march=skylake-avx512 did indeed allow the build to finish.