Giter VIP home page Giter VIP logo

Comments (7)

fdintino avatar fdintino commented on September 26, 2024

One small correction to the PR description: it looks as though the runners are actually advertising their SIMD support correctly, and AOM is using the correct SIMD optimizations for the hardware (I was mistaken about that). The -mavx2 compile flag is added to aom regardless of the host machine so long as the compiler supports it. The "Illegal instruction" errors appear to be happening in avx2-optimized glibc functions like round, not in the actual AV1 encoding.

from libavif.

jzern avatar jzern commented on September 26, 2024

One small correction to the PR description: it looks as though the runners are actually advertising their SIMD support correctly, and AOM is using the correct SIMD optimizations for the hardware (I was mistaken about that). The -mavx2 compile flag is added to aom regardless of the host machine so long as the compiler supports it.

This flag should only be used with files containing avx2 code. The availability of avx2 is detected at runtime by libaom. Do you see -mavx2 being universally added or is the round() crash in a libaom avx2 module?

from libavif.

fdintino avatar fdintino commented on September 26, 2024

This is the line that usually triggers the illegal instruction error:

https://aomedia.googlesource.com/aom/+/d0f3147e5242edf52b896a65b23af47006e9e84f/av1/encoder/ratectrl.c#2506

rc->avg_frame_bandwidth =
  (int)round(oxcf->rc_cfg.target_bandwidth / cpi->framerate);

Its pretty prosaic, not part of a libaom avx2 module. Just incidentally because the round function has an avx2-optimized version in glibc.

To drive home the fact that aom's own optimizations aren't the cause, here is a minimal script that will cause the illegal instruction error on one of these macos runners:

#include <math.h>
#include <stdio.h>

int main(int argc, char * argv[]) {
  int x = 2, y = 1;
  printf("%u\n", (int)round(x / y));
}

compile that with clang -o out -O0 -g -mavx2 src.c , then run lldb with the commands:

lldb --file ~/out
(lldb) disassemble -n main

And you'll see a bunch of AVX2-specific instructions.

The HAVE_AVX2 flag in AOM, which controls the addition of the -mavx2 CFLAG, doesn't check the host CPU. It just checks whether the compiler supports the flag. And clang supports that flag regardless of the host's capabilities.

from libavif.

wantehchang avatar wantehchang commented on September 26, 2024

In my experiment, I saw no evidence of libaom's non-avx2 source files being compiled with -mavx2. Here is the compiler command line for ratectrl.c:

[134/370] /Applications/Xcode_14.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -I../ -I. -I../apps -I../common -I../examples -I../stats -I../third_party/libyuv/include -O3 -DNDEBUG -std=c99 -Wall -Wdisabled-optimization -Wextra -Wextra-semi -Wextra-semi-stmt -Wfloat-conversion -Wformat=2 -Wimplicit-function-declaration -Wpointer-arith -Wshadow -Wshorten-64-to-32 -Wsign-compare -Wstring-conversion -Wtype-limits -Wuninitialized -Wunreachable-code-aggressive -Wunused -Wvla -Wundef -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -isysroot /Applications/Xcode_14.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk -mmacosx-version-min=12.7 -fPIC -MD -MT CMakeFiles/aom_av1_encoder.dir/av1/encoder/ratectrl.c.o -MF CMakeFiles/aom_av1_encoder.dir/av1/encoder/ratectrl.c.o.d -o CMakeFiles/aom_av1_encoder.dir/av1/encoder/ratectrl.c.o -c ../av1/encoder/ratectrl.c

Note: glibc is an implementation of the Standard C Library commonly used on Linux. glibc is unlikely to be used in macOS. I guess you're using "glibc" loosely to refer to the Standard C Library (libc).

from libavif.

wantehchang avatar wantehchang commented on September 26, 2024

When I look at this failed CI run:
https://github.com/AOMediaCodec/libavif/actions/runs/6895552528/job/18765325818#step:23:147

I would conclude it was SVT rather than AOM that caused the ILLEGAL errors. Did I miss something?

from libavif.

wantehchang avatar wantehchang commented on September 26, 2024

Here is an evidence that the ILLEGAL errors might be caused by SVT.

In my experiment, I see the following compiler command line for a typical file in SVT:

[ 0%] Building C object Source/Lib/Common/C_DEFAULT/CMakeFiles/COMMON_C_DEFAULT.dir/EbBlend_a64_mask_c.c.o
cd /Users/runner/work/libavif/libavif/ext/SVT-AV1/Build/linux/Release/Source/Lib/Common/C_DEFAULT && /Applications/Xcode_14.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -DARCH_X86_64=1 -DEN_AVX512_SUPPORT=0 -DEXCLUDE_HASH=0 -DHAVE_BUILTIN_EXPECT=1 -DHAVE_VALGRIND_H=0 -DREPRODUCIBLE_BUILDS=0 -DSAFECLIB_STR_NULL_SLACK=1 -I/Users/runner/work/libavif/libavif/ext/SVT-AV1/. -I/Users/runner/work/libavif/libavif/ext/SVT-AV1/Source/API -I/Users/runner/work/libavif/libavif/ext/SVT-AV1/Source/Lib/Common/Codec -I/Users/runner/work/libavif/libavif/ext/SVT-AV1/Source/Lib/Common/C_DEFAULT -Wall -Wextra -Wformat -Wformat-security -march=native -fstack-protector-strong -mno-avx -O3 -DNDEBUG -isysroot /Applications/Xcode_14.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk -mmacosx-version-min=12.7 -fPIC -fvisibility=hidden -std=gnu99 -o CMakeFiles/COMMON_C_DEFAULT.dir/EbBlend_a64_mask_c.c.o -c /Users/runner/work/libavif/libavif/ext/SVT-AV1/Source/Lib/Common/C_DEFAULT/EbBlend_a64_mask_c.c

Note the use of -march=native -mno-avx. -march=native is problematic when we cache built code for use on other machines.

from libavif.

fdintino avatar fdintino commented on September 26, 2024

Note: glibc is an implementation of the Standard C Library commonly used on Linux. glibc is unlikely to be used in macOS. I guess you're using "glibc" loosely to refer to the Standard C Library (libc).

Yes I plainly meant libc, I'm aware that the XNU libc isn't gnu. I'm glad you were able to puzzle your way through my mistake.

Your diagnosis about SVT and cached builds with mismatched cpu capabilities is correct. I was mistaken, and was conflating this issue with a different one I was running into—one that was my own doing, actually.

I had arrived at my original conclusion by making a debug build, triggering the error condition, and then using tmate to shell into the runner. Doing that here (without any of my mucking about with the build) you get, for avifcodectest:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x000000010041f59c avifcodectest`get_memory_usage_and_scale(amount=25639199, usage=0x00007ff7bfefc058, scale="M&Q\xb1\U00000001\U00000001") at EbMalloc.c:195:50
   194      *scale = scales[--i];
-> 195      *usage = (double)amount / (double)((size_t)1 << (i * 10));
   196  }
Target 0: (avifcodectest) stopped.
(lldb) disassemble -l

avifcodectest`get_memory_usage_and_scale:
    0x10041f58e <+174>: addl   %ecx, %ecx
    ...
->  0x10041f59c <+188>: shlxq  %rcx, %rax, %rax

And the avifchangesettingtest:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x000000010041c8b3 avifchangesettingtest`svt_aom_whole_frame_rect(frm_size=0x000000010a705060, sub_x=1, sub_y=1, is_uv=0) at EbRestoration.c:111:19
Target 0: (avifchangesettingtest) stopped.
(lldb) disassemble -l

   104  Av1PixelRect svt_aom_whole_frame_rect(FrameSize *frm_size, int32_t sub_x, int32_t sub_y, int32_t is_uv) {
   105      Av1PixelRect rect;
   ...
** 111      rect.bottom = ROUND_POWER_OF_TWO(frm_size->frame_height, ss_y);

avifchangesettingtest`svt_aom_whole_frame_rect:
    0x10041c8a1 <+97>:  movq   -0x18(%rbp), %rax
    ...
->  0x10041c8b3 <+115>: shlxl  %ecx, %edx, %esi
    0x10041c8b8 <+120>: sarl   %esi

from libavif.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.