Comments (12)

workingjubilee commented on June 18, 2024

@Kerollmops No x86 intrinsic per se will be "added", so in a strict sense, the answer is simply No.

...but we will probably offer general APIs that do similar things. The result may be less terse: for example, we will quite likely offer safe transmutation functions that let you call to_ne_bytes, do the byte rotation (and then the interleaving) on your own, and convert back with from_ne_bytes, and hopefully LLVM will optimize that correctly. If it doesn't, there is honestly not a whole lot we can do, as we have fairly limited power over codegen on this end.
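A minimal sketch of the to_ne_bytes / from_ne_bytes round trip described above, applied to a plain u128 rather than a SIMD vector. The helper name `rotate_bytes_ne` is hypothetical and is not part of std or portable-simd; whether LLVM turns this into a single instruction is not guaranteed.

```rust
/// Rotates the 16 bytes of `x`, taken in native memory order, left by `n`
/// positions, so the byte at index `n` ends up at index 0.
/// `rotate_bytes_ne` is a hypothetical helper name used only for illustration.
pub fn rotate_bytes_ne(x: u128, n: usize) -> u128 {
    // View the integer as bytes in native endianness.
    let mut bytes = x.to_ne_bytes();
    // Permute the bytes; `% 16` keeps the rotation amount in range.
    bytes.rotate_left(n % 16);
    // Reassemble the integer from the permuted bytes.
    u128::from_ne_bytes(bytes)
}
```

Because the bytes are taken in native order, the numeric value of the result depends on the target's endianness; only the byte layout in memory is the same everywhere.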

A generalized byte permutation in a single function seems plausible but that's going to take Some Design, especially given the obstacles we already have w/r/t shuffle APIs.

Also that intrinsic is already supported in core::arch and this sort of request reinforces why we will allow people to cast into hardware types and use such intrinsics if they need that kind of optimization.
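As a rough sketch of the core::arch route, assuming the intrinsic under discussion is `_mm_alignr_epi8` (PALIGNR, SSSE3), which is what the Intel pseudocode quoted later in this thread describes. The function names `alignr4`, `alignr4_ssse3`, and `alignr4_fallback` are made up for this example, and the shift is fixed at 4 bytes to keep it short.

```rust
/// Portable model used when SSSE3 is unavailable: byte `i` of the result is
/// byte `i + 4` of the 32-byte value whose low half is `b` and high half is `a`.
fn alignr4_fallback(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        let j = i + 4;
        out[i] = if j < 16 { b[j] } else { a[j - 16] };
    }
    out
}

/// Hardware path (hypothetical wrapper name) using the core::arch intrinsic.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "ssse3")]
unsafe fn alignr4_ssse3(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    use core::arch::x86_64::{__m128i, _mm_alignr_epi8, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0u8; 16];
    unsafe {
        // Unaligned loads: array index 0 becomes the lowest byte of the register.
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // Concatenate a:b, shift right by 4 bytes, keep the low 128 bits.
        let r = _mm_alignr_epi8::<4>(va, vb);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    }
    out
}

/// Picks the hardware path when it is available at run time.
pub fn alignr4(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("ssse3") {
            // SAFETY: SSSE3 support was just verified at run time.
            return unsafe { alignr4_ssse3(a, b) };
        }
    }
    alignr4_fallback(a, b)
}
```

The run-time check plus fallback is one possible design; code that is compiled with SSSE3 enabled globally could call the intrinsic directly instead.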

from portable-simd.

thomcc commented on June 18, 2024

It's not bytewise, it's bitwise. to/from_ne_bytes doesn't really help.

Lokathor commented on June 18, 2024

Seems reasonable to put in. I'm not sure how people would want to define it for things other than 128-bit size, but I guess a general byte rotation might be fine.

bjorn3 commented on June 18, 2024

This is a highly specialized instruction that is only available on x86. This makes it a bad fit for stdsimd. Stdsimd is supposed to be roughly the biggest common denominator of all platforms supported by Rust. Of course LLVM is allowed to optimize a sequence of functions that behaves identically to that intrinsic to a single instruction.

Lokathor commented on June 18, 2024

Naw, it's got very clear semantics though, "rotate the value by N bytes", which makes it at worst a slightly odd shuffle. It's a reasonable helper method to have, I think.

bjorn3 commented on June 18, 2024

@Lokathor It isn't a byte rotate at all as far as I know. It concatenates blocks from both arguments, shifts a given amount and then takes the lower half of each block.

thomcc commented on June 18, 2024

Yeah, they're not really rotate. They're really useful where available though... I called it out a long time ago as the kind of instruction that would be useful to support but might be hard to describe semantically...

Lokathor commented on June 18, 2024

ah my mistake, i remember now, it's only a rotate if you pass the same register as both arguments.

the general two-arg form might be weird enough to be very low priority or even out of scope.
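To illustrate the point about passing the same register twice, here is a small scalar sketch of the two-argument form on byte arrays. It assumes the semantics "concatenate a (high) and b (low), shift right by n bytes, keep the low 16 bytes", and only models shifts of 0 to 16 bytes. `concat_shift_bytes` is a hypothetical name, not an existing API.

```rust
/// Scalar model of the two-argument form for shifts of 0..=16 bytes.
/// `a` supplies the high 16 bytes of the 32-byte temporary, `b` the low 16.
pub fn concat_shift_bytes(a: [u8; 16], b: [u8; 16], n: usize) -> [u8; 16] {
    assert!(n <= 16, "this sketch only models shifts of 0..=16 bytes");
    // Build the 32-byte temporary with `b` in the low half.
    let mut wide = [0u8; 32];
    wide[..16].copy_from_slice(&b);
    wide[16..].copy_from_slice(&a);
    // Shifting right by `n` bytes and truncating is a 16-byte window at offset `n`.
    let mut out = [0u8; 16];
    out.copy_from_slice(&wide[n..n + 16]);
    out
}
```

With `a == b` the window wraps around the same 16 bytes, so the result equals `rotate_left(n % 16)` on the byte array. With two different inputs the bytes that wrap in come from `a` instead, which is why the general form is not a rotate.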

thomcc commented on June 18, 2024

This kind of thing is why I was hoping we'd land on some generalization of permutation, which would handle a lot of these styles of intrinsics... but I don't really know what that would look like.

Lokathor commented on June 18, 2024

the intel guide says

Operation
tmp[255:0] := ((a[127:0] << 128)[255:0] OR b[127:0]) >> (imm8*8)
dst[127:0] := tmp[127:0]

which seems byte-wise to me.
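The pseudocode above can be transcribed into a scalar reference model along these lines. This is only a sketch for reasoning about the semantics, not how any library implements it; `alignr_model` is a hypothetical name, and array index 0 is taken to be the least significant byte.

```rust
/// Scalar transcription of the quoted pseudocode:
///   tmp[255:0] := ((a << 128) OR b) >> (imm8 * 8)
///   dst[127:0] := tmp[127:0]
pub fn alignr_model(a: [u8; 16], b: [u8; 16], imm8: u8) -> [u8; 16] {
    let shift = imm8 as usize;
    let mut out = [0u8; 16];
    for (i, byte) in out.iter_mut().enumerate() {
        // Byte index into the 256-bit temporary before the shift.
        let j = i + shift;
        *byte = match j {
            0..=15 => b[j],       // low half of the temporary comes from `b`
            16..=31 => a[j - 16], // high half comes from `a`
            _ => 0,               // beyond the temporary, zeros are shifted in
        };
    }
    out
}
```

Every output byte is a whole input byte or zero, which matches the byte-wise reading: a shift of 0 returns `b`, a shift of 16 returns `a`, and a shift of 32 or more returns all zeros.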

thomcc commented on June 18, 2024

Ah, right, hmm, my bad. There are some bitwise permutation operations but I'm mistaken here.

Kerollmops commented on June 18, 2024

Thank you very much for all your fast answers, I wasn't expecting this amount of interest here 😄

The fact that we will rely on the LLVM codegen suits me, and as you say, I can use the core::arch intrinsic on x86.
