
portable-simd's Introduction

The Rust standard library's portable SIMD API


Code repository for the Portable SIMD Project Group. Please refer to CONTRIBUTING.md for our contributing guidelines.

The docs for this crate are published from the main branch. You can read them here.

If you have questions about SIMD, we have begun writing a guide. We can also be found on Zulip.

If you are interested in support for a specific architecture, you may want stdarch instead.

Hello World

Now we're gonna dip our toes into this world with a small SIMD "Hello, World!" example. Make sure your compiler is up to date and using nightly. We can do that by running

rustup update -- nightly

or by making nightly the default with rustup default nightly, or else by running cargo +nightly {build,test,run}. After updating, run

cargo new hellosimd

to create a new crate. Finally write this in src/main.rs:

#![feature(portable_simd)]
use std::simd::f32x4;
fn main() {
    let a = f32x4::splat(10.0);
    let b = f32x4::from_array([1.0, 2.0, 3.0, 4.0]);
    println!("{:?}", a + b);
}

Explanation: We construct our SIMD vectors with methods like splat or from_array. Next, we can use operators like + on them, and the appropriate SIMD instructions will be carried out. When you run cargo run, you should get [11.0, 12.0, 13.0, 14.0].
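For readers without a nightly toolchain, here is a minimal scalar sketch of what the example computes. The helper names splat and add_lanes are hypothetical stand-ins written for this illustration, not the std::simd API; they only model the lane-by-lane meaning of f32x4::splat and the + operator using plain arrays.

```rust
/// Scalar stand-in for `f32x4::splat`: every lane gets the same value.
/// (Hypothetical helper for illustration, not part of `std::simd`.)
fn splat(value: f32) -> [f32; 4] {
    [value; 4]
}

/// Scalar stand-in for `a + b` on two four-lane vectors:
/// lane `i` of the result is `a[i] + b[i]`.
fn add_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = a[i] + b[i];
    }
    out
}
```

Calling add_lanes(splat(10.0), [1.0, 2.0, 3.0, 4.0]) gives the same four values as the SIMD program above. The difference is that the real API can carry out the addition with a single vector instruction instead of a loop.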

Supported vectors

Currently, vectors may have up to 64 elements, but aliases are provided only up to 512-bit vectors.

The number of lanes in a vector depends on the size of its element type. For example, a 128-bit vector holds four f32 lanes or two f64 lanes.
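The lane arithmetic can be checked with a few lines of stable Rust. This is only a sketch; the function name lanes is made up for this illustration and is not part of the crate.

```rust
use std::mem::size_of;

/// Number of lanes of element type `T` that fit in a vector that is
/// `vector_bits` bits wide. (Hypothetical helper for illustration.)
fn lanes<T>(vector_bits: usize) -> usize {
    vector_bits / (size_of::<T>() * 8)
}
```

For instance, lanes::<f32>(128) is 4 and lanes::<f64>(128) is 2, matching the example above, while lanes::<u8>(512) is 64, which lines up with the 64-element limit mentioned earlier.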

The supported element types are as follows:

  • Floating Point: f32, f64
  • Signed Integers: i8, i16, i32, i64, isize (i128 excluded)
  • Unsigned Integers: u8, u16, u32, u64, usize (u128 excluded)
  • Pointers: *const T and *mut T (zero-sized metadata only)
  • Masks: 8-bit, 16-bit, 32-bit, 64-bit, and usize-sized masks

Floating point, signed integers, unsigned integers, and pointers are the primitive types you're already used to. The mask types have elements that are "truthy" values, like bool, but have an unspecified layout because different architectures prefer different layouts for mask types.

portable-simd's People

Contributors

bjorn3, calebzulawski, cuishuang, dpaoliello, dtolnay, est31, heiher, howjmay, joboet, kodraus, konradhoeffner, lokathor, mark-simulacrum, miguelraz, mulimoen, nvzqz, oli-obk, petrochenkov, pro465, programmerjake, ralfjung, sp00ph, sstangl, t-o-r-u-s, taiki-e, tako8ki, thomcc, urgau, weihanglo, workingjubilee


portable-simd's Issues

Support non-power-of-two vector lengths.

In rust-lang/rust#80652 I disabled non-power-of-two vector lengths to deconflict stdsimd and cg_clif development. Power-of-two vectors are typically sufficient, but there was at least one crate using length-3 vectors (yoanlcq/vek#66). It would be nice to reenable these at some point, which would require support from cranelift. Perhaps a feature request should be filed with cranelift?

cc @bjorn3

ICE with unusual vector sizes

When trying to make a vector with only 3 lanes, the following occurs:

error: internal compiler error: compiler/rustc_codegen_llvm/src/context.rs:448:55: unknown intrinsic 'llvm.ceil.v3f64'

A couple options:

  • Only support power-of-two vector lengths. This is probably fine (at least for the MVP) but would probably require a new attribute for restricting the const generic implementation.
  • Patch rustc to use the next power of two for ceil etc. Most other ops seem to work fine with unusual vector sizes.
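To make the second option concrete, here is a scalar sketch of the padding idea, written with plain arrays rather than compiler internals. The function name ceil_via_padding is hypothetical, and zero is an assumed choice of padding value; an actual rustc patch would work on LLVM vector types instead.

```rust
/// Applies `ceil` to a 3-lane vector by widening it to 4 lanes (the next
/// power of two), operating on the padded vector, and dropping the padding.
/// (Hypothetical helper for illustration only.)
fn ceil_via_padding(v: [f64; 3]) -> [f64; 3] {
    const PADDED: usize = 3usize.next_power_of_two(); // evaluates to 4

    // Copy the three real lanes into the front; the extra lane stays 0.0.
    let mut wide = [0.0f64; PADDED];
    wide[..3].copy_from_slice(&v);

    // This is the step that would use a supported 4-lane operation.
    let wide = wide.map(f64::ceil);

    // Discard the padding lane again.
    [wide[0], wide[1], wide[2]]
}
```

The padding lane never reaches the caller, so its value only matters if the operation could trap or raise floating-point exceptions on it.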

Replace bitxor in float Neg impl with an intrinsic

Issue #31 replaces/replaced a non-IEEE 754-compliant implementation of negate. While the replacement is more or less optimal on some platforms, on ARM it actually constructs a mask and does the XOR, even though there's a relevant instruction it should use.

In general this is all pretty hairy, and ideally rustc would just give us an intrinsic that negates the float in the optimal (but correct) way for the platform. This isn't urgent (the slowdown is minor) but it isn't something we should just forget.
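For reference, this is roughly what a sign-bit XOR negate does per lane, sketched in scalar stable Rust. The names neg_by_xor and neg_lanes are hypothetical, and this is not the crate's actual Neg impl.

```rust
/// Negates an `f32` by flipping only its sign bit (bit 31).
/// (Hypothetical helper for illustration.)
fn neg_by_xor(x: f32) -> f32 {
    f32::from_bits(x.to_bits() ^ 0x8000_0000)
}

/// Applies the sign-bit flip to every lane of a four-lane vector.
fn neg_lanes(v: [f32; 4]) -> [f32; 4] {
    v.map(neg_by_xor)
}
```

Flipping the sign bit turns 0.0 into -0.0 and changes only the sign of a NaN. A subtraction-based negate such as 0.0 - x behaves differently, because 0.0 - 0.0 is +0.0.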

Impl special functions for SIMD

Need all of:

  • div_euclid/rem_euclid
  • clamp
  • max/min
  • rotate_left/rotate_right
  • swap_bytes/reverse_bits
  • saturating_add/saturating_sub
  • saturating_neg/saturating_abs
  • saturating_mul
  • wrapping_add/wrapping_sub/wrapping_mul/wrapping_pow
  • wrapping_div/wrapping_rem/wrapping_div_euclid/wrapping_rem_euclid
  • wrapping_neg/wrapping_abs
  • overflowing_add/overflowing_sub
  • overflowing_mul
  • overflowing_div/overflowing_div_euclid
  • overflowing_rem/overflowing_rem_euclid
  • overflowing_neg/overflowing_abs
  • overflowing_shl/overflowing_shr
  • from_be/from_le/to_be/to_le
  • to_be_bytes/to_le_bytes/from_be_bytes/from_le_bytes
  • {to,from}_ne_bytes

for integers:

  • leading_zeros/trailing_zeros
  • leading_ones/trailing_ones
  • count_ones/count_zeros
  • pow
  • overflowing_pow
  • saturating_pow
  • wrapping_shl/wrapping_shr

for floats:

  • trig./hyperbolic functions: #6
  • recip
  • mul_add
  • powi/powf
  • to_int_unchecked
  • to_degrees/to_radians
  • sqrt
  • cbrt
  • hypot
  • exp/exp2/ln/log/log2/log10
  • exp_m1/ln_1p

for signed integers and floats:

  • abs
  • signum
  • copysign
  • is_positive/is_negative

See also #109

Writing a guide to SIMD

We are going to want to produce a guide on SIMD in Rust that doesn't just cover the API but also covers a standard vocabulary about this, so as to hopefully standardize on a vendor-ambivalent vocabulary. We are also going to want to link to vendors, of course, but we want a way to discuss our own portable types that meshes reasonably well with Rust.

It's important to meet people where they are at here. I believe that people mostly either:

  1. know much of what there is to say about SIMD but may mostly have experience with particular vendor intrinsics or even a more abstract model like GPGPU... for them, we want an option to basically read the "front and back pages" because they may be used to certain terms that are not reflected in our API
  2. have an idea of what SIMD is, but mostly as an awareness of e.g. autovectorization; they may know some basics but don't have experience with things like (GP)GPU or intrinsic function programming
  3. have no idea whatsoever, and may not even know why they might care!
  • Standard vocabulary will likely include these terms or alternates to them:

    • SIMD
    • vector
    • vector register
    • scalar (by contrast)
    • vectorize / autovectorization
    • "lane"?
    • "field"?
    • "element"?
    • "intrinsics"?
  • We will likely need to also cover subtle differences (or the absence thereof)

    • intrinsics vs. instructions
    • instructions vs. operations
    • lane vs. field vs. element
    • scalar vs. vector
  • Differences between CPU SIMD and GPU SIMT

  • Differences between SIMD vectors and SPMD threads

  • Differences between "wide vectors" (x86 SSE/AVX/AVX512, Arm Neon, etc.) and "vector architectures" (Cray, RISC-V V, Arm SVE)

  • How to get the most out of Rust SIMD code

    • feature levels, including is_x86_feature_detected!
    • #[target_feature]
    • #[cfg(target_feature)]
    • how to do "multiversioning"
    • multiversioning, why or why not?
    • when do you use std::arch instead?

And we might want to talk briefly about (abstract) tensors vs. (SIMD) vectors due to matrix multiplication being related to this.

Some floating point test failures on Raspberry Pi 4 with Neon

Probably because of #39 🙂

I've been testing stdsimd a bit on a Raspberry Pi 4 and ran into a couple float test failures with +neon:

cargo test --target armv7-unknown-linux-gnueabihf

succeeds, but:

RUSTFLAGS="-C target-feature=+neon" cargo test --target armv7-unknown-linux-gnueabihf

fails with:

failures:

---- ops_impl::f32::f32x16::fract_odd_floats stdout ----
thread 'ops_impl::f32::f32x16::fract_odd_floats' panicked at 'assertion failed: `(left == right)`
  left: `[0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), NaN (7fc00000), NaN (7fc00000), 0.000000000000000000000000000000000000011754944 (800000), -0.000000000000000000000000000000000000011754944 (80800000), 0.00000011920929 (34000000), -0.00000011920929 (b4000000), NaN (7fc00000), NaN (7fc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`,
 right: `[0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), 0.0 (0), NaN (7fc00000), NaN (7fc00000), 0.000000000000000000000000000000000000011754944 (800000), -0.000000000000000000000000000000000000011754944 (80800000), 0.00000011920929 (34000000), -0.00000011920929 (b4000000), NaN (7fc00000), NaN (ffc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`', crates/core_simd/tests/ops_impl/f32.rs:6:1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- ops_impl::f32::f32x2::fract_odd_floats stdout ----
thread 'ops_impl::f32::f32x2::fract_odd_floats' panicked at 'assertion failed: `(left == right)`
  left: `[NaN (7fc00000), NaN (7fc00000)]`,
 right: `[NaN (7fc00000), NaN (ffc00000)]`', crates/core_simd/tests/ops_impl/f32.rs:3:1

---- ops_impl::f32::f32x4::fract_odd_floats stdout ----
thread 'ops_impl::f32::f32x4::fract_odd_floats' panicked at 'assertion failed: `(left == right)`
  left: `[NaN (7fc00000), NaN (7fc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`,
 right: `[NaN (7fc00000), NaN (ffc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`', crates/core_simd/tests/ops_impl/f32.rs:4:1

---- ops_impl::f32::f32x8::fract_odd_floats stdout ----
thread 'ops_impl::f32::f32x8::fract_odd_floats' panicked at 'assertion failed: `(left == right)`
  left: `[0.000000000000000000000000000000000000011754944 (800000), -0.000000000000000000000000000000000000011754944 (80800000), 0.00000011920929 (34000000), -0.00000011920929 (b4000000), NaN (7fc00000), NaN (7fc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`,
 right: `[0.000000000000000000000000000000000000011754944 (800000), -0.000000000000000000000000000000000000011754944 (80800000), 0.00000011920929 (34000000), -0.00000011920929 (b4000000), NaN (7fc00000), NaN (ffc00000), 0.33333206 (3eaaaa80), -0.33333206 (beaaaa80)]`', crates/core_simd/tests/ops_impl/f32.rs:5:1


failures:
    ops_impl::f32::f32x16::fract_odd_floats
    ops_impl::f32::f32x2::fract_odd_floats
    ops_impl::f32::f32x4::fract_odd_floats
    ops_impl::f32::f32x8::fract_odd_floats

test result: FAILED. 2242 passed; 4 failed; 0 ignored; 0 measured; 0 filtered out
cat /proc/cpuinfo

processor       : 0
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 108.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd08
CPU revision    : 3

processor       : 1
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 108.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd08
CPU revision    : 3

processor       : 2
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 108.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd08
CPU revision    : 3

processor       : 3
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 108.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd08
CPU revision    : 3

Hardware        : BCM2711
Revision        : c03112
Serial          : 10000000421d9bd8
rustc --version

rustc 1.49.0-nightly (e160e5cb8 2020-10-14)
uname -a

Linux raspberrypi 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l GNU/Linux

I haven't investigated this much yet, but just thought I'd drop it here as a start.
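In every failure above, the only lane that differs is a NaN whose sign bit is set on one side and clear on the other (7fc00000 vs. ffc00000). Whether the tests should accept that is a separate question. As a sketch only, a comparison that ignores NaN sign and payload could look like this; the names lanes_match and vectors_match are hypothetical and not part of the test suite.

```rust
/// Treats two lanes as equal if they have identical bits, or if both are NaN
/// (ignoring the NaN's sign bit and payload).
/// (Hypothetical helper for illustration.)
fn lanes_match(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits() || (a.is_nan() && b.is_nan())
}

/// Compares two vectors lane by lane using `lanes_match`.
fn vectors_match(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(&x, &y)| lanes_match(x, y))
}
```

This comparison still distinguishes 0.0 from -0.0, since those have different bit patterns and neither is NaN.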

Use rust feature flags

There has been some discussion about a std_simd_arithmetic feature. However, I think it may make sense to just provide a feature flag for everything we consider "stable" (which includes arithmetic, among other things) and then separate flags for things that may be less stable (like the const generic shuffles)

Get Cranelifted

Per the conversation on Zulip in t-compiler and project-portable-simd, we are going to find Cranelift landing a direct challenge to our landing in std, because the last one in (the "rotten egg") has to implement support for the other. While we could "race" them, it seems integration might require a total redesign anyway. So, collaboration! But we need to

  1. Figure out what Cranelift needs from us
  2. Figure out what we need from Cranelift
  3. Figure out what additional consequences landing std::simd has for Cranelift support
  4. Figure out what additional consequences landing Cranelift support has for std::simd
  5. Solve all that

@bjorn3 said:

I hope most if not all of the operations used by portable simd could use (newly introduced) simd_* platform intrinsics that are architecture and vector size independent. This would allow easy emulation of them implemented once per operation.

This has led to some immediate questions:

  1. What exactly are "platform-intrinsics" supposed to be? Our current understanding is that they're LLVM ops... but Cranelift isn't LLVM, obviously. So...
  2. What changes about them when moving from LLVM to Cranelift?
  3. What does this do for SIMD FFI?
  4. Can we find a way around the SIMD FFI question?
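To illustrate the "implemented once per operation" idea from @bjorn3's comment, here is a scalar sketch of an emulation that is independent of both architecture and vector length. All names here are hypothetical; this is not how rustc or Cranelift define the simd_* intrinsics.

```rust
/// Emulates a unary lane-wise operation for any lane count `N` by applying a
/// scalar function to each lane. Written once, it covers every vector length.
/// (Hypothetical helper for illustration.)
fn emulate_unary<const N: usize>(v: [f32; N], op: impl Fn(f32) -> f32) -> [f32; N] {
    v.map(op)
}

/// Emulated stand-in for an absolute-value intrinsic.
fn emulated_fabs<const N: usize>(v: [f32; N]) -> [f32; N] {
    emulate_unary(v, f32::abs)
}

/// Emulated stand-in for a square-root intrinsic.
fn emulated_fsqrt<const N: usize>(v: [f32; N]) -> [f32; N] {
    emulate_unary(v, f32::sqrt)
}
```

A backend without native support for an operation could fall back to something of this shape, at the cost of doing the work one lane at a time.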

Unsound Div, Rem

The following should panic for integers:

  • neg(MIN)
  • div(MIN, -1)
  • rem(MIN, -1)
  • div(x, 0)
  • rem(x, 0)
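A scalar sketch of the intended behavior for the division and remainder cases, using the standard checked_div and checked_rem methods on i32. The names div_lanes and rem_lanes are hypothetical, and a real SIMD implementation would more likely test the inputs with vector comparisons than loop over lanes.

```rust
/// Lane-wise division that panics if any lane divides by zero or computes
/// `i32::MIN / -1`. (Hypothetical helper for illustration.)
fn div_lanes(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0; 4];
    for i in 0..4 {
        out[i] = a[i]
            .checked_div(b[i])
            .expect("division by zero or overflow in a lane");
    }
    out
}

/// Lane-wise remainder with the same panic conditions as `div_lanes`.
fn rem_lanes(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0; 4];
    for i in 0..4 {
        out[i] = a[i]
            .checked_rem(b[i])
            .expect("remainder by zero or overflow in a lane");
    }
    out
}
```

Each checked_* call returns None for exactly the inputs listed above, so a single bad lane is enough to make the whole vector operation panic.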

Bounding const generic bitmasks

You can do this just fine:

#[repr(simd)]
pub struct SimdU8<const LANES: usize>([u8; LANES]);

But to make a bitmask you need to do this:

#[repr(simd)]
pub struct SimdBitMask<const LANES: usize>([u8; (LANES + 7) / 8]) where [(); (LANES + 7) / 8]: Sized;

- @calebzulawski in (https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd/topic/Mask.20API/near/219272371)

As a reminder, because frankly I have lost track of this in a conversation at least once, a "bitmask" is a mask type that uses 1 bit per vector element (like a bitset) so that a single tightly-packed number defines a mask for an entire vector. Bitmasks can be more efficient, but in order to make an efficient bitmask, you need to know the size of the bitmasks you're supporting ahead of time. This is something we would prefer not to say, given that architectures actually in use on commodity hardware (AVX512) already support 64-bit bitmasks, and future implementations of RISC-V V will also use bitmasks with pretty interesting widths†, as I currently understand it.

This problem infects our attempt to design an "opaque", arch-agnostic mask type because we have to impose the same bound at that site as well. That makes it quite unpretty and non-ergonomic, even though we have some clear bounds that you would think we could prove to the compiler pretty effectively.

"Full masks" are comparatively simple: their elements are the size of the vector's elements, and so the mask is the size of a vector register. Which is a bit convenient, really. Caleb mentioned it might be nice to say that a mask uses an entire vector register and so is many bits wide regardless, in terms of layout, and hope the optimizer handles it intelligently.

† Hypothetically, unless I misunderstand the spec, it requires a mask register to fit inside VLEN, the size of an individual vector register, but that means it could allow a VLEN-sized mask to mask byte-wide structures over a "vector register group" using that 1 vector register... or 128 elements! @programmerjake am I reading this correctly?
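To make the bitmask layout described above concrete, here is a scalar sketch that packs one bit per lane into (LANES + 7) / 8 bytes. It uses a Vec<u8> so that it compiles on stable Rust without the const generic bound discussed in this issue. The function names and the least-significant-bit-first ordering are assumptions made for this illustration.

```rust
/// Packs one bool per lane into a bitmask, one bit per lane.
/// Lane `i` is stored in bit `i % 8` of byte `i / 8` (an assumed ordering).
/// (Hypothetical helper for illustration.)
fn to_bitmask(lanes: &[bool]) -> Vec<u8> {
    // Round up so that a partial final byte still gets storage.
    let mut bytes = vec![0u8; (lanes.len() + 7) / 8];
    for (i, &set) in lanes.iter().enumerate() {
        if set {
            bytes[i / 8] |= 1 << (i % 8);
        }
    }
    bytes
}

/// Reads lane `i` back out of a bitmask produced by `to_bitmask`.
fn test_lane(mask: &[u8], i: usize) -> bool {
    mask[i / 8] & (1 << (i % 8)) != 0
}
```

A 9-lane mask needs 2 bytes here, which is the same (LANES + 7) / 8 computation that forces the extra where clause in the const generic version.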

Tests to check for incompatibilities

Incompatibilities we're aware of

  • SSE2 changes a lot related to MMX and SSE, so test...
    • i586 vs. i686
    • i586 vs. x86_64
    • i686 vs. x86_64
  • ARMv7 vs. ARMv8 ("aarch64") Neon implementations can differ, even with compatibility modes, so test...
    • aarch64 vs thumbv7neon
    • aarch64 vs armv7 + neon

any() and all() methods for mask types

It appears that mask types don't currently have any() and all() methods that packed_simd and simd had.

Filing this to keep track of adding those methods.
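As a sketch of the intended semantics, modeling a mask as an array of bool: the free functions any and all below are hypothetical stand-ins for the requested methods, not the eventual API.

```rust
/// Returns true if at least one lane of the mask is set.
/// (Hypothetical helper for illustration.)
fn any<const N: usize>(mask: [bool; N]) -> bool {
    mask.iter().any(|&lane| lane)
}

/// Returns true only if every lane of the mask is set.
fn all<const N: usize>(mask: [bool; N]) -> bool {
    mask.iter().all(|&lane| lane)
}
```

For example, a mask of [false, true, false, false] satisfies any but not all.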

Implement SIMD-specific functions

An incomplete list (motivated by a reddit comment)

  • scatter/gather
  • nontemporal load/store (not really SIMD, but maybe. see nontemporal_store)
  • lookup tables (possibly related to #11)
  • masked load/stores
  • reductions/horizontal fns, such as horizontal add
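For readers unfamiliar with two of these, here is a scalar sketch of what a gather and a horizontal add compute. The names, the out-of-bounds behavior, and the wrapping arithmetic are all assumptions made for this illustration rather than decisions about the API.

```rust
/// Gather: lane `i` of the result is `source[indices[i]]`. Lanes whose index
/// is out of bounds keep the corresponding value from `default`.
/// (Hypothetical helper for illustration.)
fn gather<const N: usize>(source: &[i32], indices: [usize; N], default: [i32; N]) -> [i32; N] {
    let mut out = default;
    for i in 0..N {
        if let Some(&value) = source.get(indices[i]) {
            out[i] = value;
        }
    }
    out
}

/// Horizontal add: sums all lanes of one vector into a single scalar,
/// wrapping on overflow.
fn horizontal_add<const N: usize>(v: [i32; N]) -> i32 {
    v.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

Unlike the lane-wise operators, both functions move data across lanes or outside the vector.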

extern "platform-intrinsic" float functions often call libm

I tried this code (Godbolt):

Rust Code

#![no_std]
#![allow(non_camel_case_types)]
#![feature(repr_simd, platform_intrinsics)]

#[repr(simd)]
#[derive(Debug)]
pub struct f32x4(f32, f32, f32, f32);

extern "platform-intrinsic" {
    fn simd_fsqrt<T>(x: T) -> T;
    fn simd_fabs<T>(x: T) -> T;
    fn simd_fsin<T>(x: T) -> T;
    fn simd_fcos<T>(x: T) -> T;
    fn simd_ceil<T>(x: T) -> T;
    fn simd_fexp<T>(x: T) -> T;
    fn simd_fexp2<T>(x: T) -> T;
    fn simd_floor<T>(x: T) -> T;
    fn simd_fma<T>(x: T, y: T, z: T) -> T;
    fn simd_flog<T>(x: T) -> T;
    fn simd_flog10<T>(x: T) -> T;
    fn simd_flog2<T>(x: T) -> T;
    fn simd_fpow<T>(x: T, y: T) -> T;
    fn simd_fpowi<T>(x: T, y: i32) -> T;
    fn simd_trunc<T>(x: T) -> T;
    fn simd_round<T>(x: T) -> T;
}

impl f32x4 {
    // Rounding
    pub fn ceil(self) -> Self {
        unsafe { simd_ceil(self) }
    }
    pub fn floor(self) -> Self {
        unsafe { simd_floor(self) }
    }
    pub fn round(self) -> Self {
        unsafe { simd_round(self) }
    }
    pub fn trunc(self) -> Self {
        unsafe { simd_trunc(self) }
    }

    // Arithmetic
    pub fn mul_add(self, y: Self, z: Self) -> Self {
        unsafe { simd_fma(self, y, z) }
    }
    pub fn abs(self) -> Self {
        unsafe { simd_fabs(self) }
    }
    pub fn sqrt(self) -> Self {
        unsafe { simd_fsqrt(self) }
    }
    pub fn powi(self, exp: i32) -> Self {
        unsafe { simd_fpowi(self, exp) }
    }
    pub fn powf(self, exp: Self) -> Self {
        unsafe { simd_fpow(self, exp) }
    }

    // Calculus
    pub fn flog2(self) -> Self {
        unsafe { simd_flog2(self) }
    }
    pub fn flog10(self) -> Self {
        unsafe { simd_flog10(self) }
    }
    pub fn flog(self) -> Self {
        unsafe { simd_flog(self) }
    }
    pub fn fexp(self) -> Self {
        unsafe { simd_fexp(self) }
    }
    pub fn fexp2(self) -> Self {
        unsafe { simd_fexp2(self) }
    }

    // Trigonometry
    pub fn cos(self) -> Self {
        unsafe { simd_fcos(self) }
    }
    pub fn sin(self) -> Self {
        unsafe { simd_fsin(self) }
    }
}

I expected to see this happen: Compilations to "pure assembly".

Instead, this happened: Mostly compiled to calls to libm!

When sufficient vector features are enabled, these do compile to vectorized assembly instructions. However, the problem is that compilation without those features enabled means code that depends on libm... which is not allowed in core. We are going to have to either solve this or push our implementation of SimdF32 and SimdF64 mostly into std, not core.

Notable winners on x64: simd_fsqrt, simd_fabs become vector instructions just fine. I'm worried about them on x86_32 or Arm architectures, though.

Meta

rustc --version --verbose:

rustc 1.52.0-nightly (d1206f950 2021-02-15)
binary: rustc
commit-hash: d1206f950ffb76c76e1b74a19ae33c2b7d949454
commit-date: 2021-02-15
host: x86_64-unknown-linux-gnu
release: 1.52.0-nightly
LLVM version: 11.0.1
x86 Assembly

<&T as core::fmt::Debug>::fmt:
        movq    (%rdi), %rdi
        jmpq    *_ZN4core3fmt5float50_$LT$impl$u20$core..fmt..Debug$u20$for$u20$f32$GT$3fmt17hf2084266ae57b528E@GOTPCREL(%rip)

core::ptr::drop_in_place<&f32>:
        retq

example::f32x4::ceil:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    ceilf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::floor:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    floorf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::round:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    roundf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::trunc:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    truncf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::mul_add:
        movaps  (%rsi), %xmm0
        mulps   (%rdx), %xmm0
        movq    %rdi, %rax
        addps   (%rcx), %xmm0
        movaps  %xmm0, (%rdi)
        retq

.LCPI7_0:
        .long   0x7fffffff
        .long   0x7fffffff
        .long   0x7fffffff
        .long   0x7fffffff
example::f32x4::abs:
        movq    %rdi, %rax
        movaps  (%rsi), %xmm0
        andps   .LCPI7_0(%rip), %xmm0
        movaps  %xmm0, (%rdi)
        retq

.LCPI8_0:
        .long   0xbf000000
        .long   0xbf000000
        .long   0xbf000000
        .long   0xbf000000
.LCPI8_1:
        .long   0xc0400000
        .long   0xc0400000
        .long   0xc0400000
        .long   0xc0400000
.LCPI8_2:
        .long   0x7fffffff
        .long   0x7fffffff
        .long   0x7fffffff
        .long   0x7fffffff
.LCPI8_3:
        .long   0x00800000
        .long   0x00800000
        .long   0x00800000
        .long   0x00800000
example::f32x4::sqrt:
        movaps  (%rsi), %xmm0
        rsqrtps %xmm0, %xmm1
        movaps  %xmm0, %xmm2
        mulps   %xmm1, %xmm2
        movaps  .LCPI8_0(%rip), %xmm3
        mulps   %xmm2, %xmm3
        mulps   %xmm1, %xmm2
        addps   .LCPI8_1(%rip), %xmm2
        movq    %rdi, %rax
        mulps   %xmm3, %xmm2
        andps   .LCPI8_2(%rip), %xmm0
        movaps  .LCPI8_3(%rip), %xmm1
        cmpleps %xmm0, %xmm1
        andps   %xmm2, %xmm1
        movaps  %xmm1, (%rdi)
        retq

example::f32x4::powi:
        pushq   %rbp
        pushq   %r14
        pushq   %rbx
        subq    $48, %rsp
        movl    %edx, %ebp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    __powisf2@GOTPCREL(%rip), %rbx
        movl    %edx, %edi
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        movl    %ebp, %edi
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movl    %ebp, %edi
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        movl    %ebp, %edi
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $48, %rsp
        popq    %rbx
        popq    %r14
        popq    %rbp
        retq

example::f32x4::powf:
        pushq   %r14
        pushq   %rbx
        subq    $72, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 32(%rsp)
        movaps  (%rdx), %xmm1
        movaps  %xmm1, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        shufps  $255, %xmm1, %xmm1
        movq    powf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  32(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        movaps  16(%rsp), %xmm1
        movhlps %xmm1, %xmm1
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  32(%rsp), %xmm0
        movaps  16(%rsp), %xmm1
        callq   *%rbx
        movaps  %xmm0, 48(%rsp)
        movaps  32(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        movaps  16(%rsp), %xmm1
        shufps  $85, %xmm1, %xmm1
        callq   *%rbx
        movaps  48(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $72, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::flog2:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    log2f@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::flog10:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    log10f@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::flog:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    logf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::fexp:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    expf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::fexp2:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    exp2f@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::cos:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    cosf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

example::f32x4::sin:
        pushq   %r14
        pushq   %rbx
        subq    $56, %rsp
        movq    %rdi, %r14
        movaps  (%rsi), %xmm0
        movaps  %xmm0, 16(%rsp)
        shufps  $255, %xmm0, %xmm0
        movq    sinf@GOTPCREL(%rip), %rbx
        callq   *%rbx
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        movhlps %xmm0, %xmm0
        callq   *%rbx
        unpcklps        (%rsp), %xmm0
        movaps  %xmm0, (%rsp)
        movaps  16(%rsp), %xmm0
        callq   *%rbx
        movaps  %xmm0, 32(%rsp)
        movaps  16(%rsp), %xmm0
        shufps  $85, %xmm0, %xmm0
        callq   *%rbx
        movaps  32(%rsp), %xmm1
        unpcklps        %xmm0, %xmm1
        unpcklpd        (%rsp), %xmm1
        movaps  %xmm1, (%r14)
        movq    %r14, %rax
        addq    $56, %rsp
        popq    %rbx
        popq    %r14
        retq

<example::f32x4 as core::fmt::Debug>::fmt:
        pushq   %rbp
        pushq   %r15
        pushq   %r14
        pushq   %r13
        pushq   %r12
        pushq   %rbx
        subq    $40, %rsp
        movq    %rdi, %rbx
        leaq    4(%rdi), %r12
        leaq    8(%rdi), %r13
        leaq    12(%rdi), %r15
        leaq    .L__unnamed_1(%rip), %rdx
        leaq    16(%rsp), %r14
        movl    $5, %ecx
        movq    %r14, %rdi
        callq   *core::fmt::Formatter::debug_tuple@GOTPCREL(%rip)
        movq    %rbx, 8(%rsp)
        leaq    .L__unnamed_2(%rip), %rbx
        movq    core::fmt::builders::DebugTuple::field@GOTPCREL(%rip), %rbp
        leaq    8(%rsp), %rsi
        movq    %r14, %rdi
        movq    %rbx, %rdx
        callq   *%rbp
        movq    %r12, 8(%rsp)
        leaq    8(%rsp), %rsi
        movq    %r14, %rdi
        movq    %rbx, %rdx
        callq   *%rbp
        movq    %r13, 8(%rsp)
        leaq    8(%rsp), %rsi
        movq    %r14, %rdi
        movq    %rbx, %rdx
        callq   *%rbp
        movq    %r15, 8(%rsp)
        leaq    8(%rsp), %rsi
        movq    %r14, %rdi
        movq    %rbx, %rdx
        callq   *%rbp
        movq    %r14, %rdi
        callq   *core::fmt::builders::DebugTuple::finish@GOTPCREL(%rip)
        addq    $40, %rsp
        popq    %rbx
        popq    %r12
        popq    %r13
        popq    %r14
        popq    %r15
        popq    %rbp
        retq

.L__unnamed_1:
        .ascii  "f32x4"

.L__unnamed_2:
        .quad   core::ptr::drop_in_place<&f32>
        .quad   8
        .quad   8
        .quad   <&T as core::fmt::Debug>::fmt

AArch64 Assembly

<&T as core::fmt::Debug>::fmt:
        ldr     x0, [x0]
        b       _ZN4core3fmt5float50_$LT$impl$u20$core..fmt..Debug$u20$for$u20$f32$GT$3fmt17h68f66863527610f0E

core::ptr::drop_in_place<&f32>:
        ret

example::f32x4::ceil:
        ldr     q0, [x0]
        frintp  v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::floor:
        ldr     q0, [x0]
        frintm  v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::round:
        ldr     q0, [x0]
        frinta  v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::trunc:
        ldr     q0, [x0]
        frintz  v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::mul_add:
        ldr     q0, [x0]
        ldr     q1, [x1]
        ldr     q2, [x2]
        fmla    v2.4s, v1.4s, v0.4s
        str     q2, [x8]
        ret

example::f32x4::abs:
        ldr     q0, [x0]
        fabs    v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::sqrt:
        ldr     q0, [x0]
        fsqrt   v0.4s, v0.4s
        str     q0, [x8]
        ret

example::f32x4::powi:
        sub     sp, sp, #64
        str     x30, [sp, #32]
        stp     x20, x19, [sp, #48]
        ldr     q0, [x0]
        mov     w0, w1
        mov     w19, w1
        mov     x20, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      __powisf2
        str     d0, [sp]
        ldr     q0, [sp, #16]
        mov     w0, w19
        bl      __powisf2
        ldr     q1, [sp]
        mov     w0, w19
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      __powisf2
        ldr     q1, [sp]
        mov     w0, w19
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      __powisf2
        ldr     q1, [sp]
        ldr     x30, [sp, #32]
        mov     v1.s[3], v0.s[0]
        str     q1, [x20]
        ldp     x20, x19, [sp, #48]
        add     sp, sp, #64
        ret

example::f32x4::powf:
        sub     sp, sp, #64
        stp     x30, x19, [sp, #48]
        ldr     q0, [x0]
        ldr     q1, [x1]
        mov     x19, x8
        stp     q1, q0, [sp, #16]
        mov     s0, v0.s[1]
        mov     s1, v1.s[1]
        bl      powf
        str     d0, [sp]
        ldp     q1, q0, [sp, #16]
        bl      powf
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldp     q1, q0, [sp, #16]
        mov     s0, v0.s[2]
        mov     s1, v1.s[2]
        bl      powf
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        str     q1, [sp]
        ldp     q1, q0, [sp, #16]
        mov     s0, v0.s[3]
        mov     s1, v1.s[3]
        bl      powf
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #48]
        add     sp, sp, #64
        ret

example::f32x4::flog2:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      log2f
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      log2f
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      log2f
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      log2f
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::flog10:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      log10f
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      log10f
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      log10f
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      log10f
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::flog:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      logf
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      logf
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      logf
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      logf
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::fexp:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      expf
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      expf
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      expf
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      expf
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::fexp2:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      exp2f
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      exp2f
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      exp2f
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      exp2f
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::cos:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      cosf
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      cosf
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      cosf
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      cosf
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

example::f32x4::sin:
        sub     sp, sp, #48
        stp     x30, x19, [sp, #32]
        ldr     q0, [x0]
        mov     x19, x8
        str     q0, [sp, #16]
        mov     s0, v0.s[1]
        bl      sinf
        str     d0, [sp]
        ldr     q0, [sp, #16]
        bl      sinf
        ldr     q1, [sp]
        mov     v0.s[1], v1.s[0]
        str     q0, [sp]
        ldr     q0, [sp, #16]
        mov     s0, v0.s[2]
        bl      sinf
        ldr     q1, [sp]
        mov     v1.s[2], v0.s[0]
        ldr     q0, [sp, #16]
        str     q1, [sp]
        mov     s0, v0.s[3]
        bl      sinf
        ldr     q1, [sp]
        mov     v1.s[3], v0.s[0]
        str     q1, [x19]
        ldp     x30, x19, [sp, #32]
        add     sp, sp, #48
        ret

<example::f32x4 as core::fmt::Debug>::fmt:
        sub     sp, sp, #80
        str     x30, [sp, #32]
        stp     x22, x21, [sp, #48]
        stp     x20, x19, [sp, #64]
        mov     x9, x1
        adrp    x1, .L__unnamed_1
        mov     x19, x0
        add     x20, x0, #4
        add     x21, x0, #8
        add     x22, x0, #12
        add     x1, x1, :lo12:.L__unnamed_1
        add     x8, sp, #8
        mov     w2, #5
        mov     x0, x9
        bl      core::fmt::Formatter::debug_tuple
        str     x19, [sp, #40]
        adrp    x19, .L__unnamed_2
        add     x19, x19, :lo12:.L__unnamed_2
        add     x0, sp, #8
        add     x1, sp, #40
        mov     x2, x19
        bl      core::fmt::builders::DebugTuple::field
        add     x0, sp, #8
        add     x1, sp, #40
        mov     x2, x19
        str     x20, [sp, #40]
        bl      core::fmt::builders::DebugTuple::field
        add     x0, sp, #8
        add     x1, sp, #40
        mov     x2, x19
        str     x21, [sp, #40]
        bl      core::fmt::builders::DebugTuple::field
        add     x0, sp, #8
        add     x1, sp, #40
        mov     x2, x19
        str     x22, [sp, #40]
        bl      core::fmt::builders::DebugTuple::field
        add     x0, sp, #8
        bl      core::fmt::builders::DebugTuple::finish
        ldp     x20, x19, [sp, #64]
        ldp     x22, x21, [sp, #48]
        ldr     x30, [sp, #32]
        add     sp, sp, #80
        ret

.L__unnamed_1:
        .ascii  "f32x4"

.L__unnamed_2:
        .xword  core::ptr::drop_in_place<&f32>
        .xword  8
        .xword  8
        .xword  <&T as core::fmt::Debug>::fmt

Additional rounding functions

  • to_int (requires comparisons in #36)
  • round_to_int (initially proposed in #23)
  • trunc (removed in #47)
  • fract (removed in #47)
  • round (removed in #47)

round_to_int can't be directly implemented in LLVM, since llvm.lround doesn't work on vectors. I'm not sure this function is actually necessary since you can do x.round().to_int().
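To make the composition above concrete, here is a minimal scalar sketch that models a 4-lane vector as a plain `[f32; 4]` array, so it runs on stable Rust. The function name `round_to_int` and the array model are assumptions for illustration; this is not the crate's implementation, and `as i32` stands in for whatever conversion the vector `to_int` ends up using.

```rust
/// Scalar model of a 4-lane `round_to_int`: round each lane to the nearest
/// whole number (ties away from zero, as `f32::round` does), then convert
/// the result to `i32`. Illustration only; lanes are modelled as an array.
fn round_to_int(x: [f32; 4]) -> [i32; 4] {
    x.map(|lane| lane.round() as i32)
}

fn main() {
    let v = [0.5_f32, -1.5, 2.4, -2.6];
    println!("{:?}", round_to_int(v));
}
```

Note that `as i32` saturates out-of-range values and maps NaN to 0, which may or may not match what a vector conversion should do.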

Improve rustdoc views

The implementation is somewhat hard to read when viewed with rustdoc, to a degree that isn't really acceptable for external users. Some of this can be improved just by rearranging things.

Floats are problematic

It is well known that floating point code often merits extra caution due to the details of rounding math. But floating point numbers are "more interesting" to handle not only in software that interacts with them and with the floating point environment (all of which can affect our API significantly), but also in hardware implementations of both vector and scalar floating point units, so this applies to both our SIMD code and our scalar fallbacks. These aren't bugs per se ("it's not a bug, it's a feature"), but they are features requiring enhanced attention to detail. This is related to rust-lang/unsafe-code-guidelines#237 and rust-lang/rust#73328 as well.

Most domains involving SIMD code use floats extensively, so this library has a "front-row seat" to problems with floating point operations. Thus, in order for SIMD code to be reasonably portable, we should try to discover where these... unique... implementations lie and decide how to work around them, or at least inform our users of the Fun Facts we learn.

Arch-specific concerns remain for:

  • 32-bit x86
  • 32-bit ARM
  • MIPS
  • Wasm

32-bit x86

The x87 80-bit "long double" float registers can do interesting things to NaN, and in general if a floating point value ends up in them and experiences an operation, this can introduce extra precision that may lead to incorrect mathematical conclusions later on. Further, Rust no longer supports MMX code at all because its interaction with the x87 registers was just entirely too much trouble. Altogether, this is probably why the x86-64 System V ABI specifies usage of the XMM registers for handling floating point values, which do not have these problems.

32-bit ARM

There are so many different floating point implementations on 32-bit ARM that armclang includes a compiler flag for FPU architecture and another one for float ABI.

  • VFP AKA Vector Floating Point: VFP units that appear on ARMv7 seem to default to flushing denormals to zero unless the appropriate control register has the "FZ bit" set appropriately.
  • Neon AKA Advanced SIMD: Vector registers flush denormals to zero always. This is not true of aarch64's Enhanced Neon AKA Advanced SIMD v2.
  • Aarch32: Lest we imagine that ARMv8-A is completely free of problems, the "aarch32" execution mode has an unspecified default value for the FZ bit even if Neon v2 is available.

MIPS

NaNs sometimes do weird things on this platform, resulting in some of the packed_simd tests getting the very interesting number -3.0.

Wasm

Technically Wasm is IEEE754 compliant, but it is very "...technically!" here: it specifies a canonicalizing behavior on NaNs that is an unusual choice amongst architectures and may not be expected by a programmer who is used to, e.g., being able to rely on NaN bitfields being fairly stable. We will want to watch out for it when implementing our initial floating point test suite.
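As a rough illustration of what "NaN bitfields" means here, the sketch below builds a quiet NaN with a custom payload and checks whether a value is the canonical quiet NaN (quiet bit set, all other mantissa bits zero, sign ignored). The helper names `nan_with_payload` and `is_canonical_nan` are hypothetical, and whether a payload survives a given operation depends on the target, which is exactly the concern.

```rust
/// Builds a quiet f32 NaN carrying `payload` in its low 22 mantissa bits.
/// Hypothetical helper for illustration.
fn nan_with_payload(payload: u32) -> f32 {
    // Exponent all ones + quiet bit (0x7fc0_0000), then the payload bits.
    f32::from_bits(0x7fc0_0000 | (payload & 0x003f_ffff))
}

/// True if `x` is a NaN whose mantissa is only the quiet bit (sign ignored).
fn is_canonical_nan(x: f32) -> bool {
    x.is_nan() && (x.to_bits() & 0x7fff_ffff) == 0x7fc0_0000
}

fn main() {
    let nan = nan_with_payload(0x1234);
    // Push the NaN through an arithmetic operation and inspect the bits.
    // The printed payload is target-dependent, so no output is assumed here.
    let after = std::hint::black_box(nan) + std::hint::black_box(0.0);
    println!("before: {:#010x}", nan.to_bits());
    println!("after:  {:#010x}", after.to_bits());
    println!("canonical after add: {}", is_canonical_nan(after));
}
```

A test suite could run a check like this per operation and per target to record where payloads are preserved and where they are canonicalized.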

The Good News

We have reasonable confidence in float behavior in x86's XMM-and-later registers (so from SSE2 onwards), aarch64's Neon v2 registers, PowerPC, and z/Architecture (s390x). As far as we know, on these architectures ordinary binary32 and binary64 values are what they say they are, support basic operations in a reasonably consistent fashion, and nothing particularly weird occurs even with NaNs. So we are in a pretty good position for actually using most vector ISAs! It's just all the edge cases that pile up.

Main Takeaway

We want to extensively test even "simple" scalar operations on floating point numbers, especially casts and bitwhacking, or anything else that might possibly be affected by denormal numbers or NaN, so as to surface known and unknown quirks in floating point architectures and how LLVM handles them.
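One way such a test could look, as a minimal sketch: take a handful of edge-case inputs (signed zeros, the smallest subnormal, the smallest normal, infinity, NaN) and check that a "simple" operation, multiplying by 1.0, leaves the bits alone. The helper names are hypothetical, and the check assumes a target that does not flush subnormals to zero by default; on a flush-to-zero target it would be expected to fail for the subnormal input, which is the kind of quirk the test is meant to surface.

```rust
/// A few inputs that tend to expose floating point quirks.
fn edge_cases() -> [f32; 6] {
    [
        0.0,
        -0.0,
        f32::from_bits(1), // smallest positive subnormal
        f32::MIN_POSITIVE, // smallest positive normal
        f32::INFINITY,
        f32::NAN,
    ]
}

/// Checks that `x * 1.0` is bit-identical to `x` (or still NaN for NaN).
/// `black_box` keeps the multiply as a runtime operation.
fn survives_multiply_by_one(x: f32) -> bool {
    let y = std::hint::black_box(x) * std::hint::black_box(1.0_f32);
    if x.is_nan() {
        y.is_nan()
    } else {
        y.to_bits() == x.to_bits()
    }
}

fn main() {
    for x in edge_cases() {
        println!(
            "{:#010x} ({:?}): {}",
            x.to_bits(),
            x.classify(),
            survives_multiply_by_one(x)
        );
    }
}
```

The same table of inputs could then be fed to the vector operations and compared lane by lane against these scalar results.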

How can I use this "stdsimd" in my project?

When I search for stdsimd, there is a 2-year-old crate on crates.io, but that is not this repo. So, should I use this source code in my project, or use the 2-year-old crate?

Cautiously link simd_insert and simd_extract

My hidden talent is breaking APIs! It's awesome. I discovered problems in simd_insert and simd_extract that I documented in rust-lang/rust#77477! I should probably be the one to link those intrinsics since I now also know the most about the dangers of them, but in the event I can't get around to it, anyone who implements them should be aware of that bug's status! Isn't finding bugs fun?

Rounding and Type Change methods

Before we can begin implementing a few of the more advanced functions, we'll need a few more of the basic utility functions for manipulating values.

For floats:

  • round to nearest whole number (output is the same float type)
  • ceiling / round to infinity
  • floor / round to negative infinity
  • truncate / round to zero
  • fraction (remove the whole-number part)
  • round to integer (output is the same-bit signed integer type)

For signed ints:

  • round to float (output is the same-bit float type)
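For reference, the scalar behaviour that the same-type float methods in the list would presumably mirror lane by lane can be sketched with the existing `f32` methods. The helper names `rounding_table` and `int_to_f32` are made up for this example.

```rust
/// Returns [round, ceil, floor, trunc, fract] for one scalar lane.
fn rounding_table(x: f32) -> [f32; 5] {
    [x.round(), x.ceil(), x.floor(), x.trunc(), x.fract()]
}

/// Signed integer to the same-width float type. Integers with a magnitude
/// above 2^24 may not be exactly representable and get rounded.
fn int_to_f32(x: i32) -> f32 {
    x as f32
}

fn main() {
    for x in [-1.5_f32, 2.5] {
        println!("{x}: {:?}", rounding_table(x));
    }
    println!("{}", int_to_f32(-7));
}
```

Here `round` breaks ties away from zero and `fract` keeps the sign of its input, which are details worth pinning down before the vector versions are written.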

Improve wasm32 to_array performance

It's not clear why these are happening, they seem to be fixed in recent PRs to the mask API but not on purpose? This is being opened to make sure they get addressed eventually anyways.

Build fails on AMD CPU

I do not have any code. I simply added the dependency in Cargo.toml and tried to build the project.

[package]
name = "tasd"
version = "0.1.0"
authors = ["Ishan Jain"]
edition = "2018"

[dependencies]
stdsimd = "0.1.2"

I expected to see this happen:

It should have built successfully.

Instead, this happened:

I get a lot of errors when it tries to build the project.

Meta

rustc --version --verbose:

rustc 1.53.0-nightly (07e0e2ec2 2021-03-24)
binary: rustc
commit-hash: 07e0e2ec268c140e607e1ac7f49f145612d0f597
commit-date: 2021-03-24
host: x86_64-unknown-linux-gnu
release: 1.53.0-nightly
LLVM version: 12.0.0

crate version in Cargo.toml:

[dependencies]
stdsimd = "0.1.2"

Build Logs

   Compiling coresimd v0.1.2
error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
77 | simd_ty!(u8x2[u8]: u8, u8 | x0, x1);
   | ------------------------------------ in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
78 | simd_ty!(i8x2[i8]: i8, i8 | x0, x1);
   | ------------------------------------ in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
82 | simd_ty!(u8x4[u8]: u8, u8, u8, u8 | x0, x1, x2, x3);
   | ---------------------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
83 | simd_ty!(u16x2[u16]: u16, u16 | x0, x1);
   | ---------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
85 | simd_ty!(i8x4[i8]: i8, i8, i8, i8 | x0, x1, x2, x3);
   | ---------------------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
86 | simd_ty!(i16x2[i16]: i16, i16 | x0, x1);
   | ---------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |           #[derive(Copy, Clone, Debug, PartialEq)]
   |             ^^^^^^
...
90 | / simd_ty!(u8x8[u8]:
91 | |          u8, u8, u8, u8, u8, u8, u8, u8
92 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
   | |___________________________________________- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
93 | simd_ty!(u16x4[u16]: u16, u16, u16, u16 | x0, x1, x2, x3);
   | ---------------------------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
94 | simd_ty!(u32x2[u32]: u32, u32 | x0, x1);
   | ---------------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |         #[derive(Copy, Clone, Debug, PartialEq)]
   |           ^^^^^^
...
95 | simd_ty!(u64x1[u64]: u64 | x1);
   | ------------------------------- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
   |
10 |           #[derive(Copy, Clone, Debug, PartialEq)]
   |             ^^^^^^
...
97 | / simd_ty!(i8x8[i8]:
98 | |          i8, i8, i8, i8, i8, i8, i8, i8
99 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
   | |___________________________________________- in this macro invocation
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
100 | simd_ty!(i16x4[i16]: i16, i16, i16, i16 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
101 | simd_ty!(i32x2[i32]: i32, i32 | x0, x1);
    | ---------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
102 | simd_ty!(i64x1[i64]: i64 | x1);
    | ------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
104 | simd_ty!(f32x2[f32]: f32, f32 | x0, x1);
    | ---------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
108 | / simd_ty!(u8x16[u8]:
109 | |          u8, u8, u8, u8, u8, u8, u8, u8,
110 | |          u8, u8, u8, u8, u8, u8, u8, u8
111 | |          | x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15
112 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
113 | / simd_ty!(u16x8[u16]:
114 | |          u16, u16, u16, u16, u16, u16, u16, u16
115 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
    | |___________________________________________- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
116 | simd_ty!(u32x4[u32]: u32, u32, u32, u32 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
117 | simd_ty!(u64x2[u64]: u64, u64 | x0, x1);
    | ---------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
119 | / simd_ty!(i8x16[i8]:
120 | |          i8, i8, i8, i8, i8, i8, i8, i8,
121 | |          i8, i8, i8, i8, i8, i8, i8, i8
122 | |          | x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15
123 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
124 | / simd_ty!(i16x8[i16]:
125 | |          i16, i16, i16, i16, i16, i16, i16, i16
126 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
    | |___________________________________________- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
127 | simd_ty!(i32x4[i32]: i32, i32, i32, i32 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
128 | simd_ty!(i64x2[i64]: i64, i64 | x0, x1);
    | ---------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
130 | simd_ty!(f32x4[f32]: f32, f32, f32, f32 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
131 | simd_ty!(f64x2[f64]: f64, f64 | x0, x1);
    | ---------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:41:11
    |
41  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
133 | / simd_m_ty!(m8x16[i8]:
134 | |            i8, i8, i8, i8, i8, i8, i8, i8,
135 | |            i8, i8, i8, i8, i8, i8, i8, i8
136 | |            | x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15
137 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:41:11
    |
41  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
138 | / simd_m_ty!(m16x8[i16]:
139 | |            i16, i16, i16, i16, i16, i16, i16, i16
140 | |            | x0, x1, x2, x3, x4, x5, x6, x7);
    | |_____________________________________________- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:41:11
    |
41  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
141 | simd_m_ty!(m32x4[i32]: i32, i32, i32, i32 | x0, x1, x2, x3);
    | ------------------------------------------------------------ in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:41:11
    |
41  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
142 | simd_m_ty!(m64x2[i64]: i64, i64 | x0, x1);
    | ------------------------------------------ in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
146 | / simd_ty!(u8x32[u8]:
147 | |          u8, u8, u8, u8, u8, u8, u8, u8,
148 | |          u8, u8, u8, u8, u8, u8, u8, u8,
149 | |          u8, u8, u8, u8, u8, u8, u8, u8,
...   |
154 | |          x24, x25, x26, x27, x28, x29, x30, x31
155 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
156 | / simd_ty!(u16x16[u16]:
157 | |          u16, u16, u16, u16, u16, u16, u16, u16,
158 | |          u16, u16, u16, u16, u16, u16, u16, u16
159 | |          | x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15
160 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
161 | / simd_ty!(u32x8[u32]:
162 | |          u32, u32, u32, u32, u32, u32, u32, u32
163 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
    | |___________________________________________- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
164 | simd_ty!(u64x4[u64]: u64, u64, u64, u64 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
166 | / simd_ty!(i8x32[i8]:
167 | |          i8, i8, i8, i8, i8, i8, i8, i8,
168 | |          i8, i8, i8, i8, i8, i8, i8, i8,
169 | |          i8, i8, i8, i8, i8, i8, i8, i8,
...   |
174 | |          x24, x25, x26, x27, x28, x29, x30, x31
175 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
176 | / simd_ty!(i16x16[i16]:
177 | |          i16, i16, i16, i16, i16, i16, i16, i16,
178 | |          i16, i16, i16, i16, i16, i16, i16, i16
179 | |          | x0, x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15
180 | | );
    | |__- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |           #[derive(Copy, Clone, Debug, PartialEq)]
    |             ^^^^^^
...
181 | / simd_ty!(i32x8[i32]:
182 | |          i32, i32, i32, i32, i32, i32, i32, i32
183 | |          | x0, x1, x2, x3, x4, x5, x6, x7);
    | |___________________________________________- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find attribute `derive` in this scope
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/simd.rs:10:11
    |
10  |         #[derive(Copy, Clone, Debug, PartialEq)]
    |           ^^^^^^
...
184 | simd_ty!(i64x4[i64]: i64, i64, i64, i64 | x0, x1, x2, x3);
    | ---------------------------------------------------------- in this macro invocation
    |
    = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot find macro `asm` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/eflags.rs:33:5
   |
33 |     asm!("pushfq; popq $0" : "=r"(eflags) : : : "volatile");
   |     ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::asm

error: cannot find macro `asm` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/eflags.rs:64:5
   |
64 |     asm!("pushq $0; popfq" : : "r"(eflags) : "cc", "flags" : "volatile");
   |     ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::asm

error: cannot find attribute `derive` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/cpuid.rs:11:3
   |
11 | #[derive(Copy, Clone, Eq, Ord, PartialEq, PartialOrd)]
   |   ^^^^^^

error: cannot find macro `cfg` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/cpuid.rs:57:8
   |
57 |     if cfg!(target_arch = "x86") {
   |        ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::cfg

error: cannot find macro `asm` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/cpuid.rs:58:9
   |
58 |         asm!("cpuid"
   |         ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::asm

error: cannot find macro `asm` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/cpuid.rs:64:9
   |
64 |         asm!("cpuid\n"
   |         ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::asm

error: cannot find macro `asm` in this scope
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/xsave.rs:91:5
   |
91 |     asm!("xgetbv" : "={eax}"(eax), "={edx}"(edx) : "{ecx}"(xcr_no));
   |     ^^^
   |
   = note: consider importing this macro:
           coresimd::x86::asm

error[E0545]: `issue` must be a non-zero numeric string or "none"
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mod.rs:342:41
    |
342 | #[unstable(feature = "stdimd_internal", issue = "0")]
    |                                         ^^^^^^^^---
    |                                                 |
    |                                                 `issue` must not be "0", use "none" instead

error[E0545]: `issue` must be a non-zero numeric string or "none"
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mod.rs:395:41
    |
395 | #[unstable(feature = "stdimd_internal", issue = "0")]
    |                                         ^^^^^^^^---
    |                                                 |
    |                                                 `issue` must not be "0", use "none" instead

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2104:18
     |
2104 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2113:18
     |
2113 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2122:18
     |
2122 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2131:18
     |
2131 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2140:18
     |
2140 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2149:18
     |
2149 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2158:18
     |
2158 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2167:18
     |
2167 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2176:18
     |
2176 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2186:18
     |
2186 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2196:18
     |
2196 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2206:18
     |
2206 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2216:18
     |
2216 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2226:18
     |
2226 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2236:18
     |
2236 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2246:18
     |
2246 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2257:18
     |
2257 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2268:18
     |
2268 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2279:18
     |
2279 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2290:18
     |
2290 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2298:18
     |
2298 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2309:18
     |
2309 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2319:18
     |
2319 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2334:18
     |
2334 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2349:18
     |
2349 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2366:18
     |
2366 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2380:18
     |
2380 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2389:18
     |
2389 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2404:18
     |
2404 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2420:18
     |
2420 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2436:18
     |
2436 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2452:18
     |
2452 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2462:18
     |
2462 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2471:18
     |
2471 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2486:18
     |
2486 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2501:18
     |
2501 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2510:18
     |
2510 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2519:18
     |
2519 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2528:18
     |
2528 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2537:18
     |
2537 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse.rs:2550:18
     |
2550 | #[target_feature(enable = "sse,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2891:18
     |
2891 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2901:18
     |
2901 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2910:18
     |
2910 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2920:18
     |
2920 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2929:18
     |
2929 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2938:18
     |
2938 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2947:18
     |
2947 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2956:18
     |
2956 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2966:18
     |
2966 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2977:18
     |
2977 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
    --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/sse2.rs:2989:18
     |
2989 | #[target_feature(enable = "sse2,mmx")]
     |                  ^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:305:18
    |
305 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:314:18
    |
314 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:323:18
    |
323 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:332:18
    |
332 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:341:18
    |
341 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:356:18
    |
356 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:365:18
    |
365 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:375:18
    |
375 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:384:18
    |
384 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:393:18
    |
393 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:404:18
    |
404 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:416:18
    |
416 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:426:18
    |
426 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:437:18
    |
437 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:448:18
    |
448 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/ssse3.rs:459:18
    |
459 | #[target_feature(enable = "ssse3,mmx")]
    |                  ^^^^^^^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:20:18
   |
20 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:30:18
   |
30 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:38:18
   |
38 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:46:18
   |
46 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:54:18
   |
54 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:62:18
   |
62 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:70:18
   |
70 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:78:18
   |
78 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:86:18
   |
86 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
  --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:94:18
   |
94 | #[target_feature(enable = "mmx")]
   |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:102:18
    |
102 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:110:18
    |
110 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:118:18
    |
118 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:126:18
    |
126 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:134:18
    |
134 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:142:18
    |
142 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:150:18
    |
150 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:158:18
    |
158 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:166:18
    |
166 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:174:18
    |
174 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:182:18
    |
182 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:191:18
    |
191 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:200:18
    |
200 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:209:18
    |
209 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:218:18
    |
218 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:227:18
    |
227 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:236:18
    |
236 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:245:18
    |
245 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:254:18
    |
254 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:266:18
    |
266 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:278:18
    |
278 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:287:18
    |
287 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:296:18
    |
296 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:305:18
    |
305 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:314:18
    |
314 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:323:18
    |
323 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:332:18
    |
332 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:341:18
    |
341 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:350:18
    |
350 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:359:18
    |
359 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:367:18
    |
367 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:374:18
    |
374 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:381:18
    |
381 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:390:18
    |
390 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:397:18
    |
397 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:404:18
    |
404 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:412:18
    |
412 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:420:18
    |
420 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: the feature named `mmx` is not valid for this target
   --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/mmx.rs:427:18
    |
427 | #[target_feature(enable = "mmx")]
    |                  ^^^^^^^^^^^^^^ `mmx` is not valid for this target

error: unrecognized platform-specific intrinsic function: `x86_rdrand16_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/rdrand.rs:6:5
  |
6 |     fn x86_rdrand16_step() -> (u16, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unrecognized platform-specific intrinsic function: `x86_rdrand32_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/rdrand.rs:7:5
  |
7 |     fn x86_rdrand32_step() -> (u32, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unrecognized platform-specific intrinsic function: `x86_rdseed16_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/rdrand.rs:8:5
  |
8 |     fn x86_rdseed16_step() -> (u16, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unrecognized platform-specific intrinsic function: `x86_rdseed32_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86/rdrand.rs:9:5
  |
9 |     fn x86_rdseed32_step() -> (u32, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unrecognized platform-specific intrinsic function: `x86_rdrand64_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86_64/rdrand.rs:6:5
  |
6 |     fn x86_rdrand64_step() -> (u64, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unrecognized platform-specific intrinsic function: `x86_rdseed64_step`
 --> /home/ishan/.cargo/registry/src/github.com-1ecc6299db9ec823/coresimd-0.1.2/src/coresimd/x86_64/rdrand.rs:7:5
  |
7 |     fn x86_rdseed64_step() -> (u64, i32);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 169 previous errors

For more information about this error, try `rustc --explain E0545`.
error: could not compile `coresimd`

To learn more, run the command again with --verbose.

Remove From splatting implementations

After some discussion on Zulip, the general consensus seems to be that From<Scalar> for Vector is likely not an appropriate conversion, and can be removed as long as arithmetic ops auto-splat scalars.

Enable triagebot

I only read the docs briefly--not sure exactly which settings we want, but may be helpful!

Do The SIMD Shuffle

A "shuffle", in SIMD terms, takes a SIMD vector (or possibly two vectors) and a pattern of source lane indexes (usually as an immediate), and then produces a new SIMD vector where the output is the source lane values in the pattern given.

Example (pseudo-code):

let a = simd(5, 6, 7, 8);
assert_eq!(shuffle(a, [2, 2, 1, 0]), simd(7, 7, 6, 5));
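
The pseudo-code above can be modelled on plain arrays in stable Rust. This is only a scalar sketch of the semantics, not the crate's API: the `shuffle` function below is hypothetical, and it takes the index pattern as a runtime value, whereas a hardware shuffle would need it at compile time.

```rust
/// Scalar model of a one-input shuffle: output lane `i` takes input lane `idx[i]`.
/// Hypothetical helper for illustration; not part of any SIMD crate.
fn shuffle<T: Copy, const N: usize>(a: [T; N], idx: [usize; N]) -> [T; N] {
    // Indexing panics on an out-of-range lane index; a real API would
    // ideally reject that at compile time instead.
    std::array::from_fn(|i| a[idx[i]])
}

fn main() {
    let a = [5, 6, 7, 8];
    assert_eq!(shuffle(a, [2, 2, 1, 0]), [7, 7, 6, 5]);
}
```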

Shuffles are very important for particular SIMD tasks, but the requirement that the input be a compile time constant complicates the API:

  • Rust doesn't have a stable way for users to express this, so they never expect it.
  • Rather than just having a const-required arg (which totally works), a lot of people have said "you should use const generics for that!" (which actually doesn't work as well on current Rust).

Still, min_const_generics is targeted for stabilization by the end of the year, and most likely that'll be enough to do shuffle basics on stable.

Add pretty pictures to the beginners guide

Add pictures in the spirit of https://cheats.rs/#numeric-types-ref
"These are 128 bits of memory. They represent 4 different floats of 32 bits. If we put them in a SIMD vector, the vector width is 128 bits, and the lane size is 32 bits, with 4 lanes."

"Lane sizes can vary by instruction, and so can vector widths. This is a vector width of 512, but with int64s in the available 8 lanes. If they were int8s, we could have 64 lanes of them. Rust allows at a minimum a lane size of 8."

"These operations work on lanes vertically - ie, from one SIMD vector on another SIMD vector of the same vector size."
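
The arithmetic behind those captions is just vector width divided by lane width. A small sketch, where `lane_count` is a hypothetical helper introduced only for this illustration:

```rust
/// Number of lanes that fit in a vector of `vector_bits` bits when each lane
/// holds one `T`. Hypothetical helper, not part of any SIMD API.
const fn lane_count<T>(vector_bits: usize) -> usize {
    vector_bits / (std::mem::size_of::<T>() * 8)
}

fn main() {
    assert_eq!(lane_count::<f32>(128), 4); // four 32-bit floats in 128 bits
    assert_eq!(lane_count::<i64>(512), 8); // eight 64-bit ints in 512 bits
    assert_eq!(lane_count::<i8>(512), 64); // sixty-four 8-bit ints in 512 bits
}
```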

cc @workingjubilee

impl `core::ops` for `core::simd`

  • Add / AddAssign
  • BitAnd / BitAndAssign
  • BitOr / BitOrAssign
  • BitXor / BitXorAssign
  • Div / DivAssign
  • Index/IndexMut
  • Mul / MulAssign
  • Neg
  • Not
  • Rem / RemAssign
  • Shl / ShlAssign
  • Shr / ShrAssign
  • Sub / SubAssign

Most of these are self-explanatory.

  • Generally, vector op scalar should "auto-splat" the scalar value and then perform the op between the two vectors. This is just good quality of life / ergonomics.
  • I'd like if all types support bit ops (including Not), even though normally f32 / f64 don't support bit ops. I consider the lack of bit ops on the floating primitives to be a "not cool" part of core.
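
A rough sketch of both points, using a hypothetical array-backed `F32x4` stand-in rather than a real SIMD type: `vector + scalar` splats the scalar first, and a float bit op can be expressed by round-tripping through the integer bit pattern.

```rust
use std::ops::Add;

/// Hypothetical stand-in for a 4-lane f32 vector, backed by a plain array.
#[derive(Clone, Copy, Debug, PartialEq)]
struct F32x4([f32; 4]);

impl F32x4 {
    /// Broadcast one scalar to every lane.
    fn splat(x: f32) -> Self {
        F32x4([x; 4])
    }
}

/// Lane-wise vector + vector.
impl Add for F32x4 {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let mut out = self.0;
        for (o, r) in out.iter_mut().zip(rhs.0) {
            *o += r;
        }
        F32x4(out)
    }
}

/// Vector + scalar: "auto-splat" the scalar, then reuse the vector op.
impl Add<f32> for F32x4 {
    type Output = Self;
    fn add(self, rhs: f32) -> Self {
        self + F32x4::splat(rhs)
    }
}

/// Bitwise AND on floats, done on the underlying bit patterns.
fn f32_bitand(a: f32, b: f32) -> f32 {
    f32::from_bits(a.to_bits() & b.to_bits())
}

fn main() {
    let v = F32x4([1.0, 2.0, 3.0, 4.0]);
    assert_eq!(v + 10.0, F32x4([11.0, 12.0, 13.0, 14.0]));
    // Masking off the sign bit is a classic use of float bit ops: it yields abs().
    assert_eq!(f32_bitand(-1.5, f32::from_bits(0x7fff_ffff)), 1.5);
}
```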

Missing compiler intrinsics

I think it would be helpful to collect a list of all compiler intrinsics (the extern "platform-intrinsic" ones) that need to be added.

  • simd_neg
  • simd_overflowing_add etc
  • simd_trunc
  • simd_round
  • simd_lt_packed etc (all other packed intrinsics can be emulated with this one to start)
  • simd_sat_shl <-- only supported in recent LLVM
  • simd_abs
  • simd_sext (or some kind of simd_cast for integer -> mask)
  • simd_bitmask/simd_select_bitmask (these already exist, but they should be modified to accept arrays of u8)
  • simd_bswap

How to provide Lanes as const parameter?

I have a function that works on chunks of data. The chunks have a fixed size that I specify via const generics. The chunk size should be used as Lanes parameter for SimdF32::from_array; however, I fail to pass a const parameter to the from_array function.

use core_simd::{SimdF32, LanesAtMost64};

// Some function that needs a const parameter which also corresponds to SimdF32 LANES
pub fn from_array<const N: usize>(x: [f32; N]) -> SimdF32<N> {
    // How can I pass N to from_array?
    SimdF32::<N>::from_array(x)
}

#[test]
fn simd_array() {
    const LANES: usize = 8;
    let x = [0.; LANES];
    let x_simd = from_array::<LANES>(x);
    for (x_, y_) in x.iter().zip(x_simd.to_array()) {
        assert_eq!(x_, y_);
    }
}

Result (without warnings):

$ cargo +nightly test
   Compiling benches v0.0.1 (/home/hendrik/projects/rs_benches)

error[E0277]: the trait bound `SimdF32<N>: LanesAtMost64` is not satisfied
  --> src/test_simd.rs:4:51
   |
4  | pub fn from_array<const N: usize>(x: [f32; N]) -> SimdF32<N>
   |                                                   ^^^^^^^^^^ the trait `LanesAtMost64` is not implemented for `SimdF32<N>`
   |
  ::: /home/hendrik/.cargo/git/checkouts/stdsimd-26e23068d55c82a1/4e6d440/crates/core_simd/src/vector/float.rs:47:11
   |
47 |     Self: crate::LanesAtMost64;
   |           -------------------- required by this bound in `SimdF32`
   |
   = help: the following implementations were found:
             <SimdF32<16_usize> as LanesAtMost64>
             <SimdF32<1_usize> as LanesAtMost64>
             <SimdF32<2_usize> as LanesAtMost64>
             <SimdF32<32_usize> as LanesAtMost64>
           and 3 others

error[E0277]: the trait bound `SimdF32<N>: LanesAtMost64` is not satisfied
  --> src/test_simd.rs:4:51
   |
4  | pub fn from_array<const N: usize>(x: [f32; N]) -> SimdF32<N>
   |                                                   ^^^^^^^^^^ the trait `LanesAtMost64` is not implemented for `SimdF32<N>`
   |
  ::: /home/hendrik/.cargo/git/checkouts/stdsimd-26e23068d55c82a1/4e6d440/crates/core_simd/src/vector/float.rs:47:11
   |
47 |     Self: crate::LanesAtMost64;
   |           -------------------- required by this bound in `SimdF32`
   |
   = help: the following implementations were found:
             <SimdF32<16_usize> as LanesAtMost64>
             <SimdF32<1_usize> as LanesAtMost64>
             <SimdF32<2_usize> as LanesAtMost64>
             <SimdF32<32_usize> as LanesAtMost64>
           and 3 others

error: aborting due to previous error; 3 warnings emitted

For more information about this error, try `rustc --explain E0277`.
error: aborting due to previous error; 3 warnings emitted

For more information about this error, try `rustc --explain E0277`.
error: could not compile `benches`

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: build failed

And with the trait:

use core_simd::{SimdF32, LanesAtMost64};

// Some function that needs a const parameter which also corresponds to SimdF32 LANES
pub fn from_array<const N: usize>(x: [f32; N]) -> SimdF32<N> 
    where SimdF32<N>: LanesAtMost64
{
    SimdF32::<N>::from_array(x)
}

#[test]
fn simd_array() {
    const LANES: usize = 8;
    let x = [0.; LANES];
    let x_simd = from_array::<LANES>(x);
    for (x_, y_) in x.iter().zip(x_simd.to_array()) {
        assert_eq!(x_, y_);
    }
}

Results in an internal compiler error:

$ cargo +nightly test
   Compiling benches v0.0.1 (/home/hendrik/projects/rs_benches)
error: internal compiler error: compiler/rustc_middle/src/ich/impls_ty.rs:94:17: StableHasher: unexpected region '_#0r

thread 'rustc' panicked at 'Box<Any>', /rustc/c755ee4ce8cae6ea977d65a0288480940db721d9/library/std/src/panic.rs:59:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.53.0-nightly (c755ee4ce 2021-04-04) running on x86_64-unknown-linux-gnu

note: compiler flags: -C embed-bitcode=no -C debuginfo=2 -C incremental --crate-type lib

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [typeck] type-checking `vec::<impl at src/vec.rs:32:1: 58:2>::new`
#1 [typeck_item_bodies] type-checking all item bodies
end of query stack
error: internal compiler error: compiler/rustc_middle/src/ich/impls_ty.rs:94:17: StableHasher: unexpected region '_#0r

thread 'rustc' panicked at 'Box<Any>', /rustc/c755ee4ce8cae6ea977d65a0288480940db721d9/library/std/src/panic.rs:59:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.53.0-nightly (c755ee4ce 2021-04-04) running on x86_64-unknown-linux-gnu

note: compiler flags: -C embed-bitcode=no -C debuginfo=2 -C incremental

note: some of the compiler flags provided by cargo are hidden

query stack during panic:
#0 [typeck] type-checking `vec::<impl at src/vec.rs:32:1: 58:2>::new`
#1 [typeck_item_bodies] type-checking all item bodies
end of query stack
error: aborting due to previous error; 2 warnings emitted

error: aborting due to previous error; 2 warnings emitted

error: could not compile `benches`

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: build failed

What traits abstract over what?

We've discussed potentially implementing trait abstractions over SIMD vectors in the basic API, because it's pretty unwieldy to handle every single possible TxW combo. We've also mentioned that we will need to leave some traits in separate semi-official crates, e.g. a simd-complex. Either way, we need to think about what abstractions we want to offer where.

Discussion partly here: https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd/topic/Semi-official.20(non-std.3A.3Asimd).20crates

Dot Product

We should, eventually, support the dot product operations. They're part of SSE4.1, but the operation is very simple to express in Rust as a fallback if SSE4.1 is not available.
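
As a sketch of what such a fallback might look like, here is a scalar dot product over fixed-size arrays. The name and signature are assumptions made for illustration, not a proposed API:

```rust
/// Scalar fallback: multiply lane-wise, then sum the products horizontally.
/// Hypothetical helper; a real implementation would operate on SIMD vectors.
fn dot<const N: usize>(a: [f32; N], b: [f32; N]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

fn main() {
    // 1*5 + 2*6 + 3*7 + 4*8 = 70
    assert_eq!(dot([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]), 70.0);
}
```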

Comparison functions

Lanewise comparisons:

  • lanes_eq
  • lanes_ne
  • lanes_lt
  • lanes_le
  • lanes_gt
  • lanes_ge
  • is_positive/is_sign_positive
  • is_negative/is_sign_negative
  • is_finite
  • is_infinite
  • is_nan
  • is_normal

Other comparisons:

  • any_of/all_of
  • any_nan/all_nan

I'm open to different naming conventions of the lane comparisons, but I think it's best if they're not ambiguous with PartialEq etc. These functions may also be duplicated in traits if multiple mask widths end up being supported.
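
To show why the lanewise names should stay distinct from PartialEq, here is a scalar sketch on plain arrays. The function names follow the list above, but the signatures and the `[bool; N]` mask representation are assumptions for illustration only:

```rust
/// Lanewise equality: one bool per lane (a stand-in for a mask vector).
fn lanes_eq<T: PartialEq, const N: usize>(a: &[T; N], b: &[T; N]) -> [bool; N] {
    std::array::from_fn(|i| a[i] == b[i])
}

/// Lanewise less-than.
fn lanes_lt<T: PartialOrd, const N: usize>(a: &[T; N], b: &[T; N]) -> [bool; N] {
    std::array::from_fn(|i| a[i] < b[i])
}

/// True if at least one lane of the mask is set.
fn any_of<const N: usize>(mask: [bool; N]) -> bool {
    mask.iter().any(|&m| m)
}

/// True if every lane of the mask is set.
fn all_of<const N: usize>(mask: [bool; N]) -> bool {
    mask.iter().all(|&m| m)
}

fn main() {
    let a = [1.0_f32, f32::NAN, 3.0, 4.0];
    let b = [1.0_f32, f32::NAN, 5.0, 4.0];

    // Lanewise results keep per-lane information (NaN never equals NaN).
    assert_eq!(lanes_eq(&a, &b), [true, false, false, true]);
    assert_eq!(lanes_lt(&a, &b), [false, false, true, false]);

    // Reductions collapse a mask to a single bool.
    assert!(any_of(lanes_eq(&a, &b)));
    assert!(!all_of(lanes_eq(&a, &b)));

    // PartialEq, by contrast, answers one question about the whole value.
    assert!(a != b);
}
```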

Explore implementing SIMD types as arrays in the compiler

Instantiating SIMD types as tuple literals becomes absolutely hand-cramping when those structs become wider than 2~8 elements, and even 8 is getting a bit tiresome. CTFE is now powerful enough to significantly sugar some fairly nuanced array creation, e.g.:

const ARRAY: [i8; 32] = {
  let mut arr = [-128; 32];
  arr[27] = -62;
  arr[30] = -31;
  arr[31] = -15;
  arr
};

By providing, at the very least, conversions from the obvious arrays into SIMD types, we can make a lot of things easier on ourselves, so that this can just be used with some kind of method like i8x32::from_array(ARRAY).
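
A minimal sketch of that idea on stable Rust, assuming a hypothetical array-backed newtype `I8x32` in place of a real `#[repr(simd)]` type. The point is only that a `const fn from_array` lets the CTFE-built array flow straight into a constant vector:

```rust
/// Hypothetical stand-in for a 32-lane i8 vector; a real SIMD type would use
/// a SIMD layout rather than a transparent array wrapper.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(transparent)]
struct I8x32([i8; 32]);

impl I8x32 {
    /// Build the vector from an array; usable in const contexts.
    const fn from_array(a: [i8; 32]) -> Self {
        I8x32(a)
    }

    /// Get the lanes back out as an array.
    const fn to_array(self) -> [i8; 32] {
        self.0
    }
}

// The array is assembled at compile time, as in the example above.
const ARRAY: [i8; 32] = {
    let mut arr = [-128; 32];
    arr[27] = -62;
    arr[30] = -31;
    arr[31] = -15;
    arr
};

// No 32-element tuple literal needed.
const V: I8x32 = I8x32::from_array(ARRAY);

fn main() {
    let lanes = V.to_array();
    assert_eq!(lanes[0], -128);
    assert_eq!(lanes[27], -62);
    assert_eq!(lanes[31], -15);
}
```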

Because arrays are a much more natural fit, I would honestly like to explore "what compiler changes would be required to let #[repr(simd)] accept arrays straight-out?" but that may quickly become too complicated, so for now it is just a (very) nice-to-have and std::simd conversions can still exist. On the other hand, it might be pretty easy, and that might simplify the overall API for us immensely, so I should at least check.

Improve contributor docs to minimum

  • Draft a CONTRIBUTING.md
  • Encourage chunking tasks
  • Create an issue template?
  • Link to Zulip
  • Link to FAQ
  • Cross-link docs / adopt orphans where feasible

Initial Steps Outline

We have the following initial things to do, in approximately this order:

  • Get the CI set up with tests running on as many targets/arches as we can muster. Besides just i686 and x86_64 we also need:
    • i586, because it doesn't have sse enabled, and that makes trouble, so we wanna spot it right away.
    • something with avx-512 support. could be emulated if we have to. The tests don't have to execute super fast as long as the results are correct.
    • An ARM target (v7 + neon), and an AArch64 target (v8), because even though they're both "neon" there are some differences some of the time, and again we wanna be aware of that up front.
    • Every other weird target we can get the CI set for. Our claim is to be portable after all. stdarch is probably a good guide here because they're already testing on all sorts of stuff.
  • Get all our 64-bit, 128-bit, 256-bit, and 512-bit types into the crate.
    • Nothing too fancy, we just declare each struct with all the attributes it needs to compile with the right layout and all that.
    • I believe that the plan was to do this non-generically, so we'll have a big pile of types gettin' written out in this step.
    • Personally I'd like to have one type per module, with the modules as crate-private and then the content of each module re-exported from the crate root. This keeps things easy to work with on a day to day basis but also keeps the public API nice and flat. Remember that in actual usage anything in this crate is going to have core::simd or std::simd in front of it already, so we should keep a shallow public API as much as we can.
  • Get type change methods for every type.
    • From/Into with appropriate arrays (this can likely be a transmute internally)
    • From a single element ("splat"), and probably a bespoke splat method too, for API discoverability.
    • A plain old new method of course.
    • Methods for safe transmuting to/from the native SIMD format (when there is one).
  • Get formatting impls for every type.
    • This isn't just Debug, but also Binary, Octal, UpperHex, LowerHex, all that. Arguably even Display should be added.
    • It might seem silly to make this a whole point of its own, but trust me you're gonna want the flexibility when doing those unit tests.
    • Most of this can be done via macro_rules, you just change the SIMD type into an array, and then format the slice.
  • Get PartialEq for every type.
    • This does not give a SIMD output just because we're comparing SIMD types for equality, you just get a normal bool, so you actually don't want it for SIMD processing, but it is required for assert! to work, so again we need this for working on tests.
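
The formatting and PartialEq steps can be sketched with a small macro. Everything here is illustrative: `U8x4` is a hypothetical array-backed stand-in, and `impl_fmt_via_array` is an invented macro name. It follows the approach described above of viewing the vector as an array and formatting each lane with the same trait.

```rust
use std::fmt;

/// Hypothetical stand-in for a 4-lane u8 vector. Deriving PartialEq gives the
/// plain `bool` comparison that `assert_eq!` needs.
#[derive(Clone, Copy, Debug, PartialEq)]
struct U8x4([u8; 4]);

/// Implement each listed formatting trait by formatting every lane with that
/// same trait, separated by commas and wrapped in brackets.
macro_rules! impl_fmt_via_array {
    ($ty:ty: $($tr:ident),+) => {
        $(
            impl fmt::$tr for $ty {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                    f.write_str("[")?;
                    for (i, lane) in self.0.iter().enumerate() {
                        if i > 0 {
                            f.write_str(", ")?;
                        }
                        // Forward to the lane's own impl so flags still apply.
                        fmt::$tr::fmt(lane, f)?;
                    }
                    f.write_str("]")
                }
            }
        )+
    };
}

impl_fmt_via_array!(U8x4: Binary, Octal, LowerHex, UpperHex);

fn main() {
    let v = U8x4([10, 255, 0, 16]);
    assert_eq!(format!("{:x}", v), "[a, ff, 0, 10]");
    assert_eq!(format!("{:X}", v), "[A, FF, 0, 10]");
    assert_eq!(format!("{:b}", U8x4([1, 2, 3, 4])), "[1, 10, 11, 100]");
    // PartialEq yields a single bool, which is what tests rely on.
    assert_eq!(v, U8x4([10, 255, 0, 16]));
}
```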

And at that point, we'll be in a position to actually begin writing the "useful" parts of the crate.

Spurious failures with Chrome wasm tests

Currently in CI we have a periodic failure to connect to our headless Chrome on port 99 in order to do Wasm SIMD testing. Obviously this puts a bit of a crimp in our style with respect to such tests! It might be preferable to use an alternative wasm runtime, perhaps by using the wasmtime crate embedded in Rust, with wasm_simd enabled. @KodrAus has expressed an interest in addressing this (and thank you!), but I figured I'd actually open the issue so we can keep track of any interesting details we find about the failures until it's fixed.

Nontemporal (streaming) stores

Nontemporal store instructions (like vmovntpd in AVX, associated with intrinsics like _mm256_stream_pd), which allow writes to bypass cache, offer an important boost for some memory bandwidth-limited operations. They typically have alignment requirements. It would be useful if such memory semantics could be exposed through stdsimd.

Add "common" shuffles

The shuffle API in #62 can handle nearly any hardware shuffle, but it's a bit clumsy to use (and the API requires full const generics).
A few common cases of shuffles should be provided as a simpler API:

  • reverse
  • rotate
  • shift (like rotate but insert 0s)
  • align (or maybe rotate2?) see #78
  • interleave/deinterleave
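
To pin down the intended semantics of the cases listed above, here is a minimal sketch on plain arrays using const generics. The function names, the left-facing direction of rotate and shift, and the pair-of-arrays return shape for interleave/deinterleave are assumptions for illustration; the real API would operate on SIMD vectors and may choose different conventions.

```rust
use std::array::from_fn;

/// Reverses lane order.
fn reverse<T: Copy, const N: usize>(v: [T; N]) -> [T; N] {
    from_fn(|i| v[N - 1 - i])
}

/// Rotates lanes toward index 0 by `k`, wrapping around.
fn rotate_left<T: Copy, const N: usize>(v: [T; N], k: usize) -> [T; N] {
    from_fn(|i| v[(i + k) % N])
}

/// Like `rotate_left`, but vacated lanes are filled with the default (0).
fn shift_left<T: Copy + Default, const N: usize>(v: [T; N], k: usize) -> [T; N] {
    from_fn(|i| if i + k < N { v[i + k] } else { T::default() })
}

/// Interleaves `a` and `b` as a0, b0, a1, b1, ... and returns the result
/// split into its low and high halves.
fn interleave<T: Copy, const N: usize>(a: [T; N], b: [T; N]) -> ([T; N], [T; N]) {
    let pick = |j: usize| if j % 2 == 0 { a[j / 2] } else { b[j / 2] };
    (from_fn(|i| pick(i)), from_fn(|i| pick(N + i)))
}

/// Inverse of `interleave`: treats `a` followed by `b` as one sequence and
/// returns its even-indexed and odd-indexed elements.
fn deinterleave<T: Copy, const N: usize>(a: [T; N], b: [T; N]) -> ([T; N], [T; N]) {
    let cat = |j: usize| if j < N { a[j] } else { b[j - N] };
    (from_fn(|i| cat(2 * i)), from_fn(|i| cat(2 * i + 1)))
}
```

For example, under these conventions `rotate_left([1, 2, 3, 4], 1)` gives `[2, 3, 4, 1]` while `shift_left([1, 2, 3, 4], 1)` gives `[2, 3, 4, 0]`.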

Failing wasm tests

I tracked down the cause of the wasm test failures and it looks like it's due to the comparison ops added to tests in #80. It looks like wasmparser has been updated to support these opcodes so it's just a matter of the changes propagating to wasm-bindgen-test. I opened rustwasm/wasm-bindgen#2522 to track this.

I think we can just ignore the failures for now and they should start passing again once wasm-bindgen-test is updated.

cc @workingjubilee

Status of AVX-512?

Hello!

I want to ask whether this crate supports AVX-512 instructions. If not, are there plans to support it? Will this be the definitive crate for SIMD in Rust? I ask because I understand that the one in the official documentation is no longer supported.

Thanks
