Comments (4)

Lokathor avatar Lokathor commented on July 19, 2024

Also, the permute operations should probably be reviewed and clarified as part of this.

from safe_arch.

Lokathor avatar Lokathor commented on July 19, 2024

Format Proposal

  • swiz_{inputs}_{lane-size}_{source}_{data-type}
    • The inputs are a, b, and/or i
    • The lane sizes are f32, f64, or iX, optionally with a z at the end if the operation can zero any lane instead of picking a source for it.
    • The sources are "all" for "pick from all lanes", or "half" for "each half picks to just that half", optionally with h64 and l64 for higher 64 bits or lower 64 bits.
    • The data types are the normal SIMD types
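As a rough illustration of how the four parts compose, here is a minimal sketch. The helper `swiz_name` is hypothetical and is not part of safe_arch; it only joins the pieces of the proposed format with underscores.

```rust
/// Hypothetical helper (not a safe_arch API): builds a name following the
/// proposed `swiz_{inputs}_{lane-size}_{source}_{data-type}` format.
fn swiz_name(inputs: &str, lane_size: &str, source: &str, data_type: &str) -> String {
    // Each part is passed through as-is; nothing is validated here.
    format!("swiz_{}_{}_{}_{}", inputs, lane_size, source, data_type)
}
```

For example, `swiz_name("ab", "f32", "all", "m128")` gives `swiz_ab_f32_all_m128`. When an h64 or l64 qualifier applies, it is passed as part of the source, as in `swiz_name("ai", "i16", "h64half", "m256i")`.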

So here's each C intrinsic and the name it would have under this scheme:

| C Name | safe_arch Name |
|---|---|
| _mm_permutevar_ps | swiz_ab_f32_all_m128 |
| _mm256_permutevar_ps | swiz_ab_f32_half_m256 |
| _mm_permutevar_pd | swiz_ab_f64_all_m128d |
| _mm256_permutevar_pd | swiz_ab_f64_half_m256d |
| _mm256_permutevar8x32_ps | swiz_ab_i32_all_m256 |
| _mm256_permutevar8x32_epi32 | swiz_ab_i32_all_m256i |
| _mm_shuffle_epi8 | swiz_ab_i8_all_m128i |
| _mm256_shuffle_epi8 | swiz_ab_i8_half_m256i |
| _mm256_permute2f128_ps | swiz_abi_f128z_all_m256 |
| _mm256_permute2f128_pd | swiz_abi_f128z_all_m256d |
| _mm256_permute2f128_si256 | swiz_abi_f128z_all_m256i |
| _mm256_permute2x128_si256 | swiz_abi_i128z_all_m256i |
| _mm_shuffle_ps | swiz_abi_f32_all_m128 |
| _mm256_shuffle_ps | swiz_abi_f32_half_m256 |
| _mm_shuffle_pd | swiz_abi_f64_all_m128d |
| _mm256_shuffle_pd | swiz_abi_f64_half_m256d |
| _mm_permute_ps | swiz_ai_f32_all_m128 |
| _mm_shuffle_epi32 | swiz_ai_f32_all_m128i |
| _mm256_permute_ps | swiz_ai_f32_half_m256 |
| _mm_permute_pd | swiz_ai_f64_all_m128d |
| _mm256_permute4x64_pd | swiz_ai_f64_all_m256d |
| _mm256_permute_pd | swiz_ai_f64_half_m256d |
| _mm_shufflehi_epi16 | swiz_ai_i16_h64all_m128i |
| _mm256_shufflehi_epi16 | swiz_ai_i16_h64half_m256i |
| _mm_shufflelo_epi16 | swiz_ai_i16_l64all_m128i |
| _mm256_shufflelo_epi16 | swiz_ai_i16_l64half_m256i |
| _mm256_shuffle_epi32 | swiz_ai_i32_half_m256i |
| _mm256_permute4x64_epi64 | swiz_ai_i64_all_m256i |
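To make the "all" versus "half" distinction concrete, here is a scalar sketch of two of the 256-bit ops from the table. These are plain-array models written for illustration, not safe_arch code and not the real intrinsics; the function names are made up, and the lane selection follows my reading of Intel's documented behavior for `_mm256_permutevar8x32_ps` and `_mm256_permutevar_ps`.

```rust
/// Scalar model of an "all" source (as in `_mm256_permutevar8x32_ps`):
/// every output lane may pick from any of the 8 input lanes,
/// using the low 3 bits of its index.
fn model_all(a: [f32; 8], idx: [u32; 8]) -> [f32; 8] {
    let mut out = [0.0; 8];
    for i in 0..8 {
        out[i] = a[(idx[i] & 0b111) as usize];
    }
    out
}

/// Scalar model of a "half" source (as in `_mm256_permutevar_ps`):
/// each output lane picks only from its own 128-bit half,
/// using the low 2 bits of its index.
fn model_half(a: [f32; 8], idx: [u32; 8]) -> [f32; 8] {
    let mut out = [0.0; 8];
    for i in 0..8 {
        let base = i & 4; // 0 for the low half, 4 for the high half
        out[i] = a[base + (idx[i] & 0b11) as usize];
    }
    out
}
```

With `a = [0.0, 1.0, ..., 7.0]` and `idx = [7, 6, 5, 4, 3, 2, 1, 0]`, the "all" model reverses the whole vector, while the "half" model reverses each half in place, giving `[3.0, 2.0, 1.0, 0.0, 7.0, 6.0, 5.0, 4.0]`.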


Lokathor avatar Lokathor commented on July 19, 2024

Oh, also, we're using swiz for "swizzle" because Intel sometimes calls it "shuffle" and sometimes calls it "permute", and there's seemingly no logic to which one is used for each particular op/intrinsic:

  • It's not based on the number of inputs
  • It's not based on whether the inputs are immediates or not
  • It's not based on the minimum CPUID
  • It's not based on the destination register also being an input register or not

So we'll simply forsake both names and then pick a third name that doesn't have any existing baggage.


Lokathor avatar Lokathor commented on July 19, 2024

Oh dag, we also have to consider that some b values are the varying pattern and some b values are the second register to mix in.
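A scalar sketch of the two meanings of b, using one op of each kind from the table. Again, these are illustrative plain-array models with made-up names, not safe_arch functions; the selection rules follow my understanding of `_mm_permutevar_ps` (b is the pattern) and `_mm_shuffle_ps` (b is a second data register, and the pattern is the immediate).

```rust
/// Model of `_mm_permutevar_ps`: `b` is the varying pattern.
/// Each output lane picks a lane of `a` using the low 2 bits of `b`'s lane.
fn model_b_is_pattern(a: [f32; 4], b: [u32; 4]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = a[(b[i] & 0b11) as usize];
    }
    out
}

/// Model of `_mm_shuffle_ps`: `b` is a second register to mix in.
/// The low two output lanes pick from `a`, the high two pick from `b`,
/// each using a 2-bit field of the immediate.
fn model_b_is_data(a: [f32; 4], b: [f32; 4], imm: u8) -> [f32; 4] {
    // Extract the k-th 2-bit selector from the immediate.
    let sel = |k: u32| ((imm >> (2 * k)) & 0b11) as usize;
    [a[sel(0)], a[sel(1)], b[sel(2)], b[sel(3)]]
}
```

Both would be "ab"-style inputs under the naming scheme (the second with an extra i), even though b plays a different role in each, which is the ambiguity the comment above points out.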
