
Comments (13)

kleisauke commented on June 14, 2024

... it looks like a bug in the SSE2 path of reduceh, I was able to reproduce this using:

$ VIPS_VECTOR="-4097" vips reduceh zebra.jpg x.jpg 1.5

reducev seems fine, I'll investigate.

As a possible workaround, you could force the specific Highway paths with these environment variables:

# ~HWY_SSE4
$ export VIPS_VECTOR="-2049"

# ~HWY_AVX2
$ export VIPS_VECTOR="-513"

# ~HWY_AVX3
$ export VIPS_VECTOR="-257"

# ~HWY_AVX3_DL
$ export VIPS_VECTOR="-128"

# ~HWY_AVX3_ZEN4
$ export VIPS_VECTOR="-64"

Or just disable these paths entirely using:

$ export VIPS_NOVECTOR=1

(this comes at the expense of performance)
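The `~HWY_…` comments above indicate that each value is the bitwise complement of a single Highway target bit. A minimal sketch of that arithmetic follows. `complement_of` is a hypothetical helper, the SSSE3 bit position is the one given elsewhere in this thread, and the SSE4, AVX2 and AVX3 positions are only inferred from the -2049, -513 and -257 values, so treat them as assumptions.

```cpp
#include <cstdint>

// HWY_SSSE3 is given in this thread as 1LL << 12. The other positions are
// inferred from the VIPS_VECTOR values above and are assumptions.
constexpr int64_t kSSSE3 = 1LL << 12;
constexpr int64_t kSSE4 = 1LL << 11;  // inferred from -2049
constexpr int64_t kAVX2 = 1LL << 9;   // inferred from -513
constexpr int64_t kAVX3 = 1LL << 8;   // inferred from -257

// Hypothetical helper: the bitwise complement of one target bit, i.e. the
// number written as "~HWY_..." in the comments above.
constexpr int64_t complement_of(int64_t target_bit) {
  return ~target_bit;
}

// Checked at compile time against the values quoted in the thread.
static_assert(complement_of(kSSSE3) == -4097, "~(1 << 12)");
static_assert(complement_of(kSSE4) == -2049, "~(1 << 11)");
static_assert(complement_of(kAVX2) == -513, "~(1 << 9)");
static_assert(complement_of(kAVX3) == -257, "~(1 << 8)");
```

Because the checks are `static_assert`s, the file fails to compile if any of the inferred bit positions disagrees with the quoted value.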

from imgproxy.

DarthSim commented on June 14, 2024

Opened a PR to libvips that fixes the issue: libvips/libvips#3763. A fixed build of imgproxy will be available today.


DarthSim commented on June 14, 2024

Hi @ymotton!

This is interesting. I believe this is related to Highway, which libvips has used since 8.15. Still, it's pretty weird to see it malfunction like this.

Could you tell me more about the environment where you run imgproxy? If you could also show the output of lscpu, this would be helpful.

/cc @kleisauke you may want to take a look at this


kleisauke commented on June 14, 2024

It appears that the dependencies of imgproxy were compiled with S-SSE3 as the minimum requirement. However, the function x86::DetectTargets() reports only SSE2, which is always present on x64. Hence, this discrepancy would result in the following warning:

WARNING: CPU supports 0x6000000000004000, software requires 0x4000000000005000
The following program reproduces that message:
#include <cstdio>
#include <cstdint>

#define HWY_EMU128 (1LL << 61)
#define HWY_SCALAR (1LL << 62)

#define HWY_SSE2 (1LL << 14)
#define HWY_SSSE3 (1LL << 12)  // S-SSE3

int main() {
  const uint64_t bits_u = static_cast<uint64_t>(HWY_SCALAR | HWY_EMU128 | HWY_SSE2);
  const uint64_t enabled = static_cast<uint64_t>(HWY_SCALAR | HWY_SSSE3 | HWY_SSE2);
  fprintf(stderr,
          "WARNING: CPU supports 0x%08x%08x, software requires 0x%08x%08x\n",
          static_cast<uint32_t>(bits_u >> 32),
          static_cast<uint32_t>(bits_u & 0xFFFFFFFF),
          static_cast<uint32_t>(enabled >> 32),
          static_cast<uint32_t>(enabled & 0xFFFFFFFF));
  // WARNING: CPU supports 0x6000000000004000, software requires 0x4000000000005000

  return 0;
}

I suspect that your CPU doesn't support S-SSE3, which Highway always assumes to be available as it was compiled with -mssse3.
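To make the mismatch in the warning concrete, here is a minimal sketch that computes which required target bits the CPU did not report. It reuses the bit positions from the snippet above. `missing_targets` and `describe` are hypothetical helpers written for this illustration and are not part of Highway or libvips.

```cpp
#include <cstdint>
#include <string>

// Target bits as defined in the snippet above.
constexpr uint64_t kEMU128 = 1ULL << 61;
constexpr uint64_t kSCALAR = 1ULL << 62;
constexpr uint64_t kSSE2 = 1ULL << 14;
constexpr uint64_t kSSSE3 = 1ULL << 12;

// Hypothetical helper: bits the software requires but the CPU did not report.
constexpr uint64_t missing_targets(uint64_t supported, uint64_t required) {
  return required & ~supported;
}

// Hypothetical helper: names the bits known to this sketch, space-separated.
std::string describe(uint64_t bits) {
  std::string out;
  auto add = [&](uint64_t bit, const char* name) {
    if (bits & bit) {
      if (!out.empty()) out += ' ';
      out += name;
    }
  };
  add(kSCALAR, "SCALAR");
  add(kEMU128, "EMU128");
  add(kSSE2, "SSE2");
  add(kSSSE3, "SSSE3");
  return out;
}
```

With the two values from the warning, `missing_targets(0x6000000000004000, 0x4000000000005000)` yields `0x1000`, which `describe` names as `SSSE3`.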


DarthSim commented on June 14, 2024

Yeah, we do build all the deps with -mssse3, but we have been doing this for a long time and it hasn't caused any problems (if the CPU didn't support SSSE3, imgproxy wouldn't be able to run at all).

Could this be happening because SSSE3 is disabled in vips?


ymotton commented on June 14, 2024

Could you tell me more about the environment where you run imgproxy? If you could also show the output of lscpu, this would be helpful.

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 15
Model: 6
Model name: Common KVM processor
Stepping: 1
CPU MHz: 3500.006
BogoMIPS: 7000.01
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 256 KiB
L1i cache: 256 KiB
L2 cache: 2 MiB
L3 cache: 16 MiB
NUMA node0 CPU(s): 0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl cpuid extd_apicid tsc_known_freq pni cx16 x2apic hypervisor cmp_legacy 3dnowprefetch vmmcall

It's running on a QEMU VM hosted on a Ryzen R9 3950X Proxmox host. Most likely the VM is not inheriting the host CPU flags, so SSSE3 is not available. I'll investigate this further.
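One way to confirm this from the lscpu output is to look for the relevant tokens in the Flags line. The sketch below is a hypothetical helper, not part of imgproxy or libvips, that checks for a whole-word flag. On Linux, `pni` is the name for SSE3 and `ssse3` is the name for SSSE3.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true if `name` appears as a whole whitespace-separated
// token in an lscpu-style flags string (substring matches do not count).
bool has_flag(const std::string& flags, const std::string& name) {
  std::istringstream in(flags);
  std::string token;
  while (in >> token) {
    if (token == name) return true;
  }
  return false;
}
```

Applied to the Flags line above, `has_flag(flags, "sse2")` and `has_flag(flags, "pni")` are true, while `has_flag(flags, "ssse3")` is false, which matches the SSE2-only result reported by the dispatcher.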


DarthSim commented on June 14, 2024

@ymotton Thanks for the info!

Could this be happening because SSSE3 is disabled in vips?

I played with QEMU a bit, and yep, this is the case. @kleisauke Is there a reason for disabling SSSE3 in vips? Seems like this causes problems.

@ymotton If I make a test build for you, would you be able to run it in your environment to ensure it works smoothly?


kleisauke commented on June 14, 2024

Is there a reason for disabling SSSE3 in vips? Seems like this causes problems.

Highway's S-SSE3 target was disabled in libvips for the same reason as mentioned in PR libjxl/libjxl#2627.

According to https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam (see "Other Settings"), 99.18% of desktop hardware has SSE4 (and 100% has SSE2), so it is a bit unnecessary to have the highway target for SSSE3 enabled since it will only bring a small speedup to < 1% of people while the binary size bloat is for everyone.

It's supposed to automatically use the SSE2 paths if SSE4 is not supported by the CPU, but I never tested it when compiling Highway with -mssse3. Perhaps that causes issues with dynamic dispatch?
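The intended fallback can be pictured as intersecting the set of compiled targets with the set the CPU reports, then taking the best remaining one. The sketch below illustrates that idea only and is not Highway's actual dispatch code. `choose_target` is a hypothetical name, the SSE4 bit position is inferred from the -2049 value quoted in this thread, and the rule that a lower bit means a more capable target is an assumption of the sketch.

```cpp
#include <cstdint>

constexpr uint64_t kSSE4 = 1ULL << 11;  // inferred from -2049; an assumption
constexpr uint64_t kSSE2 = 1ULL << 14;  // as defined earlier in this thread

// Sketch: pick the best target that is both compiled in and reported by the
// CPU. Assumption: a lower bit means a more capable target.
// Returns 0 when the two sets do not intersect.
constexpr uint64_t choose_target(uint64_t compiled, uint64_t supported) {
  const uint64_t usable = compiled & supported;
  return usable & (~usable + 1);  // isolate the lowest set bit
}
```

Under these assumptions, a build with SSE4 and SSE2 targets running on a CPU that reports only SSE2 gives `choose_target(kSSE4 | kSSE2, kSSE2) == kSSE2`, which is the SSE2 fallback described above.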


ymotton commented on June 14, 2024

@ymotton If I make a test build for you, would you be able to run it in your environment to ensure it works smoothly?

Yes


ymotton commented on June 14, 2024

... it looks like a bug in the SSE2 path of reduceh, I was able to reproduce this using:

$ VIPS_VECTOR="-4097" vips reduceh zebra.jpg x.jpg 1.5

reducev seems fine, I'll investigate.

As a possible workaround, you could force the specific Highway paths with these environment variables:

# ~HWY_SSE4
$ export VIPS_VECTOR="-2049"

# ~HWY_AVX2
$ export VIPS_VECTOR="-513"

# ~HWY_AVX3
$ export VIPS_VECTOR="-257"

# ~HWY_AVX3_DL
$ export VIPS_VECTOR="-128"

# ~HWY_AVX3_ZEN4
$ export VIPS_VECTOR="-64"

Or just disable these paths entirely using:

$ export VIPS_NOVECTOR=1

(this comes at the expense of performance)

In my specific case, the hypervisor allows me to set a number of different cpu-types.
The one I was running on was kvm64. I assume that running with the 'host' cpu-type will also solve the issue.


DarthSim commented on June 14, 2024

it looks like a bug in the SSE2 path of reduceh

Yeah, started to suspect this myself. The warning is not a big deal after all.

In my specific case, the hypervisor allows me to set a number of different cpu-types.
The one I was running on was kvm64. I assume that running with the 'host' cpu-type will also solve the issue.

This will be the best solution: your CPU supports instruction sets up to AVX2, and it would be a shame not to use them.


DarthSim commented on June 14, 2024

I patched vips in the latest build, so the issue is fixed. However, I strongly recommend you use a proper CPU type in QEMU so the CPU dispatcher can detect CPU flags.


DarthSim commented on June 14, 2024

The PR was also merged, so I guess I can close this.
