Comments (10)

maleadt avatar maleadt commented on September 22, 2024

any ideas for mitigating this?

No easy ones, sorry. You could try to find a different implementation of this operation that doesn't require wider temporaries.

from metal.jl.

sotlampr avatar sotlampr commented on September 22, 2024

Thanks, @maleadt. Do you think that this will fail on CUDA/AMDGPU? I am asking because it is used by some operations in Flux, so this might be a relevant issue for them too.

maleadt avatar maleadt commented on September 22, 2024

CUDA has more log intrinsics: https://github.com/JuliaGPU/CUDA.jl/blob/38fb7071d893fad8591361d0da96149f45f7653f/src/device/intrinsics/math.jl#L102-L123

sotlampr avatar sotlampr commented on September 22, 2024

Do you think a patch similar to this would be appropriate?

maleadt avatar maleadt commented on September 22, 2024

Yes, it would.

cc @oscardssmith (I mentioned this being a problem at JuliaCon)

oscardssmith avatar oscardssmith commented on September 22, 2024

How much accuracy are you willing to lose? I can probably cook up a Float32-only version that performs pretty well at the cost of slightly lower accuracy (probably in the 2-4 ULP range).

sotlampr avatar sotlampr commented on September 22, 2024

If the only problem is the implementation of log_proc2, there is a single-precision implementation in the original paper (pp. 385-386). But I think there is another Float64 cast in log_proc1 too.

oscardssmith avatar oscardssmith commented on September 22, 2024

Oh, I think it's possible that the double-precision casts are not necessary at all. The biggest difference between our algorithm and Tang's is that we add an extra multiply at the end to account for the different log bases, and that multiply may pick up a lot of error without the extended precision. It would be good to try the version that doesn't upcast, though...
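
For anyone who wants a feel for how much the extra multiply costs, here is a rough sketch in Python (not Julia or Metal code) that emulates Float32 rounding with `struct`. Everything in it is an assumption for illustration: `f32`, `ulp32`, `log2_narrow`, `log2_wide`, and the sample grid `XS` are hypothetical names, and `math.log` rounded to Float32 only stands in for a well-behaved Float32 natural log. It is not the Base or Metal.jl implementation.

```python
import math
import struct

INV_LN2 = 1.0 / math.log(2.0)  # 1/ln(2) in double precision


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp32(x: float) -> float:
    """Spacing of binary32 numbers around a normal, nonzero x."""
    _, e = math.frexp(abs(x))
    return math.ldexp(1.0, e - 24)


def log2_narrow(x: float) -> float:
    """log2 with the final multiply done entirely in Float32."""
    y = f32(math.log(x))          # assumed stand-in for a Float32 natural log
    return f32(y * f32(INV_LN2))  # product of two binary32 values, rounded once


def log2_wide(x: float) -> float:
    """Same computation, but the multiply runs in Float64 before rounding."""
    return f32(math.log(x) * INV_LN2)


def max_ulp_error(fn, xs) -> float:
    """Largest error of fn against math.log2, in Float32 ULPs of the reference."""
    return max(abs(fn(x) - math.log2(x)) / ulp32(math.log2(x)) for x in xs)


# Sample inputs that are exactly representable in Float32 (multiples of 1/64).
XS = [1.5 + k / 64.0 for k in range(4096)]

if __name__ == "__main__":
    print("Float32 multiply:", max_ulp_error(log2_narrow, XS), "ULP")
    print("Float64 multiply:", max_ulp_error(log2_wide, XS), "ULP")
```

Under these assumptions the wide version is limited by the single final rounding (about half an ULP), while the narrow version also carries the rounding of the natural log, of the constant, and of the product. A real Float32 log would add its own error on top, so treat the numbers as a lower bound on what to expect.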

sotlampr avatar sotlampr commented on September 22, 2024

I made an attempt to rewrite the functions without using double-precision floats: #236

I tried them in a REPL and the functions seem to work; however, using them in Metal with MtlArray does not.

maleadt avatar maleadt commented on September 22, 2024

#239
