Comments (3)
Another one that fails similarly:
declare float @air.sincos.f32(i64)
define void @my_kernel(float addrspace(1)* %a, { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %e, i32 %arg2) {
%tmp4 = bitcast { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %e to float addrspace(1)* addrspace(1)*
%tmp5 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %tmp4, align 8
%b = alloca i32, align 4
%f = sext i32 %arg2 to i64
%c = ptrtoint i32* %b to i64
%1 = call float @air.sincos.f32(i64 %c)
%d = getelementptr float, float addrspace(1)* %a, i64 %f
store float 0.000000e+00, float addrspace(1)* %d, align 4
store float 0.000000e+00, float addrspace(1)* %tmp5, align 4
ret void
}
!air.kernel = !{!0}
!air.version = !{!6}
!0 = !{void (float addrspace(1)*, { i8 addrspace(1)*, [1 x i64] } addrspace(1)*, i32)* @my_kernel, !1, !2}
!1 = !{}
!2 = !{!3, !4, !5}
!3 = !{i32 0, !"air.buffer", !"air.location_index", i32 0, i32 1, !"air.arg_type_name", !"arr_sin"}
!4 = !{i32 1, !"air.buffer", !"air.location_index", i32 1, !"air.arg_type_name", !"air.arg_name", !"arr_cos"}
!5 = !{i32 2, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!6 = !{i32 2, i32 6, i32 0}
"termination": {
"code": 1,
"flags": 518,
"namespace": "METAL",
"reasons": [
"unable to legalize instruction: %101:_(p0) = 224 %115:_(s64), 1",
"Context:",
"%101:_(p0) = 224 %115:_(s64), 1",
"%115:_(s64) = 120 i64 16",
"(in function: agc.main)"
]
},
from metal.jl.
%1 = call float @air.sincos.f32(i64 %c)
i64??
Yeah, ok, that's nonsensical. It's an artifact from the reduction, though. Here's the original IR:
declare float @air.sincos.f32(float, i64) local_unnamed_addr
define void @my_kernel({ i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %1, i32 %thread_position_in_grid) local_unnamed_addr {
conversion:
%2 = bitcast { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0 to float addrspace(1)* addrspace(1)*
%.unpack8 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %2, align 8
%3 = bitcast { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %1 to float addrspace(1)* addrspace(1)*
%.unpack12 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %3, align 8
%4 = alloca i32, align 8
%5 = sext i32 %thread_position_in_grid to i64
%6 = getelementptr inbounds float, float addrspace(1)* %.unpack12, i64 %5
%7 = load float, float addrspace(1)* %6, align 4
%bitcast_coercion5 = ptrtoint i32* %4 to i64
%8 = call float @air.sincos.f32(float %7, i64 %bitcast_coercion5)
%9 = bitcast i32* %4 to float*
%10 = load float, float* %9, align 8
%11 = getelementptr inbounds float, float addrspace(1)* %.unpack8, i64 %5
store float %8, float addrspace(1)* %11, align 4
store float %10, float addrspace(1)* %6, align 4
ret void
}
attributes #0 = { argmemonly nocallback nofree nosync nounwind willreturn }
!air.kernel = !{!41}
!air.version = !{!48}
!air.language_version = !{!49}
!9 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!10 = !DIFile(filename: "julia", directory: ".")
!41 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { i8 addrspace(1)*, [1 x i64] } addrspace(1)*, i32)* @my_kernel, !42, !43}
!42 = !{}
!43 = !{!44, !45, !46}
!44 = !{i32 0, !"air.buffer", !"air.location_index", i32 0, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"MtlDeviceVector{Float32, 1}", !"air.arg_name", !"arr_sin"}
!45 = !{i32 1, !"air.buffer", !"air.location_index", i32 1, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"MtlDeviceVector{Float32, 1}", !"air.arg_name", !"arr_cos"}
!46 = !{i32 2, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!48 = !{i32 2, i32 5, i32 0}
!49 = !{!"Metal", i32 3, i32 1, i32 0}
"termination": {
"code": 1,
"flags": 518,
"namespace": "METAL",
"reasons": [
"unable to legalize instruction: %259:_(p0) = 224 %258:_(s64), 1",
"Context:",
"%259:_(p0) = 224 %258:_(s64), 1",
"%258:_(s64) = 120 i64 16",
"(in function: agc.main)"
]
},
EDIT: the sincos intrinsic signature still doesn't look nice here, though. It comes from ccall("extern air.sincos.f32", llvmcall, Cfloat, (Cfloat, Ptr{Cfloat}), x, c), which should probably be an LLVMPtr. In any case, I don't think that's the cause of this issue, as changing the signature still reproduces the failure.
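For readers following the IR, here is a minimal CPU-only sketch of what the original kernel appears to compute, judging from the call and the two stores above: the sine comes back as the return value, and the cosine is written through the pointer argument. The names `sincos_out!` and `kernel_body!` are hypothetical stand-ins, not part of Metal.jl, and the `Ref` merely plays the role of the `alloca`'d slot. Indices here are Julia's 1-based ones, whereas the IR indexes directly with the thread position.

```julia
# Hypothetical CPU stand-in for the calling convention the IR suggests:
# the sine is returned, the cosine is stored through the out-parameter `c`.
function sincos_out!(c::Base.RefValue{Float32}, x::Float32)
    s, co = sincos(x)   # Base.sincos returns the tuple (sin(x), cos(x))
    c[] = co
    return s
end

# Per-element body mirroring the loads and stores in the original IR:
# read arr_cos[i], write the sine to arr_sin[i] and the cosine back to arr_cos[i].
function kernel_body!(arr_sin, arr_cos, i)
    c = Ref{Float32}(0f0)            # plays the role of the stack slot (%4)
    x = arr_cos[i]
    arr_sin[i] = sincos_out!(c, x)
    arr_cos[i] = c[]
    return nothing
end

# Assumed usage on plain arrays (no GPU involved):
arr_sin = zeros(Float32, 4)
arr_cos = Float32[0, 0.5, 1, 2]
for i in eachindex(arr_cos)
    kernel_body!(arr_sin, arr_cos, i)
end
```

This only illustrates the data flow. It says nothing about why the Metal back end fails to legalize the resulting instruction.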