Comments (4)
I guess this is essentially a dup of #69. The question is why an exception is being generated here, as we do support sincos:
Metal.jl/src/device/intrinsics/math.jl
Lines 128 to 130 in d6f958c
Can you check with Cthulhu how sincos is invoked?
Ah, sorry, I didn't notice this the first time. It turns out it's related to indexed_iterate:
New MWE:
using Metal, KernelAbstractions
X = Metal.MtlArray(fill(0.3f0, 128))
Y = copy(X)
@kernel function mwe_kernel_sincos(out, a)
    I = @index(Global, Linear)
    s, c = sincos(a[I])
    out[I] = s + c
end
kernel = mwe_kernel_sincos(Metal.MetalBackend())
kernel(Y, X, ndrange = size(Y))
ERROR: InvalidIRError: compiling MethodInstance for gpu_mwe_kernel_sincos(::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, ::MtlDeviceVector{Float32, 1}, ::MtlDeviceVector{Float32, 1}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
[1] malloc
@ ~/.julia/packages/GPUCompiler/YO8Uj/src/runtime.jl:88
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/YO8Uj/src/runtime.jl:183
[3] macro expansion
@ ./none:0
[4] box
@ ./none:0
[5] box_int64
@ ~/.julia/packages/GPUCompiler/YO8Uj/src/runtime.jl:212
[6] indexed_iterate <--
@ ./tuple.jl:97
[7] macro expansion
@ ~/Developer/jl-forward-diff/mwe.jl:9
[8] gpu_mwe_kernel_sincos
@ ~/.julia/packages/KernelAbstractions/cWlFz/src/macros.jl:90
[9] gpu_mwe_kernel_sincos
@ ./none:0
This throws ostensibly the same error as above. So, instead trying:
@kernel function mwe_kernel_sincos(out, a)
    I = @index(Global, Linear)
    k = sincos(a[I])
    out[I] = k[1] + k[2]
end
This throws no error, but only k[1] is non-zero; that is, k[2] doesn't seem to have a value at all?
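For context on why indexed_iterate appears in the stack trace: as far as I understand, a destructuring assignment like `s, c = sincos(x)` is lowered to calls to `Base.indexed_iterate`. The sketch below is CPU-only and does not touch Metal; it just inspects the lowered form of the same assignment. If the device version of sincos is inferred to return a plain Float32 instead of a tuple, the destructuring would presumably fall through to the generic iteration path, which would be consistent with frame [6] above.

```julia
x = 0.3f0

# On the CPU, sincos returns a 2-tuple, so destructuring works as expected.
s, c = sincos(x)

# Lower the same assignment without running it and look for the
# iteration call that the destructuring is turned into.
lowered = Meta.lower(Main, :(s, c = sincos(x)))
uses_indexed_iterate = occursin("indexed_iterate", string(lowered))

println((s, c))
println("lowers to indexed_iterate: ", uses_indexed_iterate)
```

This only shows what the front end does with the assignment; it says nothing about what the Metal intrinsic itself returns.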
It seems to me that the Metal sincos only returns a single float, which is the sin part. @*code_warntype seems to confirm this in the external calls.
Edit: some examples
Kernel:
@kernel function mwe_kernel_sincos(out, a)
    I = @index(Global, Linear)
    k = sincos(a[I])
    out[I] = k[1]
end
41 ─ %119 = $(Expr(:foreigncall, "extern air.sincos.f32", Float32, svec(Float32), 0, :(:llvmcall), :(%116), :(%116)))::Float32
└─── goto #43 if not true
42 ─ nothing::Nothing
43 ┄ goto #44
44 ─ goto #49 if not true
45 ─ %124 = Core.tuple(%92)::Tuple{UInt32}
│ %125 = Base.getfield(out, :shape)::Tuple{Int64}
│ %126 = Base.getfield(%125, 1, true)::Int64
Kernel:
@kernel function mwe_kernel_sincos(out, a)
    I = @index(Global, Linear)
    k = sincos(a[I])
    out[I] = k[2]
end
41 ─ %119 = $(Expr(:foreigncall, "extern air.sincos.f32", Float32, svec(Float32), 0, :(:llvmcall), :(%116), :(%116)))::Float32
└─── goto #43 if not true
42 ─ Metal.throw(Metal.nothing)::Union{}
└─── unreachable
43 ─ goto #44
44 ─ goto #49 if not true
45 ─ %125 = Core.tuple(%92)::Tuple{UInt32}
│ %126 = Base.getfield(out, :shape)::Tuple{Int64}
│ %127 = Base.getfield(%126, 1, true)::Int64
From the Metal developer API:
So changing the override to
@device_override function Base.sincos(x::Float32)
    c = Ref{Cfloat}()
    s = ccall("extern air.sincos.f32", llvmcall, Cfloat, (Cfloat, Ptr{Cfloat}), x, c)
    (s, c[])
end
fixes everything.
I will open a PR with the fixes :)