Comments (11)

christiangnrd commented on June 20, 2024

Bisected to JuliaGPU/GPUArrays.jl#512

from metal.jl.

maleadt commented on June 20, 2024

With the above:

Reason: unsupported use of i128 value
Stacktrace:
  [1] toInt128
    @ ./boot.jl:715
  [2] Int128
    @ ./boot.jl:785
  [3] convert
    @ ./number.jl:7
  [4] widen
    @ ./operators.jl:891
  [5] _mul_high
    @ ./multinverses.jl:139
  [6] div
    @ ./multinverses.jl:158
  [7] divrem
    @ ./multinverses.jl:172
  [8] _ind2sub_rs
    @ ./reshapedarray.jl:223
  [9] ind2sub_rs
    @ ./reshapedarray.jl:220
 [10] _unsafe_getindex
    @ ./reshapedarray.jl:260
 [11] getindex
    @ ./reshapedarray.jl:249
 [12] macro expansion
    @ ~/Julia/pkg/GPUArrays/src/host/indexing.jl:88
 [13] getindex_kernel
    @ ~/Julia/pkg/GPUArrays/src/host/indexing.jl:82
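
A note for readers following the trace: frames [4] to [6] suggest the i128 value comes from `widen` inside Base's `_mul_high`, which the multiplicative-inverse `div` uses when a `ReshapedArray` converts a linear index to subscripts. The CPU-only Julia sketch below is an illustration, not code from the issue. It checks that reshaping an adjoint produces such a wrapper, and compares the widening multiply-high with a hypothetical Int128-free variant (`mul_high_wide` and `mul_high_narrow` are made-up names, not Base, GPUArrays.jl or Metal.jl functions).

```julia
using LinearAlgebra

# Reshaping a lazy adjoint cannot use fast linear indexing, so Base wraps it
# in a ReshapedArray (the wrapper type named in frames [8]-[11]).
M = Float32[1 2 3; 4 5 6]
R = reshape(M', 6)
@assert R isa Base.ReshapedArray

# The widening approach seen in frames [4]-[5]: multiply in Int128 and keep
# the upper 64 bits.
mul_high_wide(a::Int64, b::Int64) = ((widen(a) * b) >>> 64) % Int64

# Hypothetical alternative (an assumption, not taken from the issue or from
# any upstream fix): derive the same upper 64 bits from 32-bit halves using
# only Int64 arithmetic.
function mul_high_narrow(a::Int64, b::Int64)
    mask = Int64(0xffffffff)
    a0, a1 = a & mask, a >> 32      # low half (non-negative), high half (signed)
    b0, b1 = b & mask, b >> 32
    w0 = a0 * b0                    # may wrap; only its bit pattern is used
    t  = a1 * b0 + (w0 >>> 32)
    w1 = a0 * b1 + (t & mask)
    return a1 * b1 + (t >> 32) + (w1 >> 32)
end

# Compare both versions on a few values, including the extremes.
for (a, b) in ((7, 3), (-1, 1), (typemin(Int64), typemin(Int64)),
               (typemax(Int64), -12345), (-6148914691236517205, 97))
    @assert mul_high_narrow(a, b) == mul_high_wide(a, b)
end
```

Whether an Int128-free path like this suits the GPU back end is a question for the maintainers. The sketch only shows that the high word can be obtained with 64-bit arithmetic.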

christiangnrd commented on June 20, 2024

What type is elt?

kmp5VT commented on June 20, 2024

@christiangnrd elt = Float32; sorry, I forgot to add that definition. Thanks!

maleadt commented on June 20, 2024

"I have the temp file available but cannot attach it to the GitHub issue"

You probably have to zip it.

Also, which version of Metal.jl are you using? Please ensure you're trying v1.1.0.

kmp5VT commented on June 20, 2024

@maleadt sorry, I didn't provide adequate version information. I am using Metal v1.1.0, but I did not have an issue with this code in the previous release of Metal. Here is my versioninfo:

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

Thanks!
metal_error.zip

tgymnich commented on June 20, 2024
; ModuleID = 'shader.air'
source_filename = "start"
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32"
target triple = "air64-apple-macosx14.4.1"

; Function Attrs: cold noreturn nounwind
declare void @llvm.trap() #0

declare i64 @air.abs.s.i64(i64) local_unnamed_addr

define internal fastcc void @gpu_report_exception() unnamed_addr !dbg !58 {
top:
  ret void, !dbg !61
}

define internal fastcc void @gpu_signal_exception() unnamed_addr !dbg !62 {
top:
  ret void, !dbg !64
}

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare i64 @llvm.smax.i64(i64, i64) #1

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare i8 @llvm.umin.i8(i8, i8) #1

define void @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E({ i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, [1 x i64] addrspace(1)* %2, [2 x i64] addrspace(1)* %3, i32 %threads_per_grid, i32 %thread_position_in_grid) local_unnamed_addr !dbg !65 {
conversion:
  %4 = bitcast { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0 to float addrspace(1)* addrspace(1)*
  %.unpack12 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %4, align 8
  %5 = getelementptr inbounds { i8 addrspace(1)*, [1 x i64] }, { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, i64 0, i32 1, i64 0
  %.unpack10.unpack = load i64, i64 addrspace(1)* %5, align 8
  %6 = bitcast { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1 to float addrspace(1)* addrspace(1)*
  %.unpack.unpack.unpack26 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %6, align 8
  %.unpack.unpack.unpack19.elt = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 0, i64 0, i32 1, i64 0
  %.unpack.unpack.unpack19.unpack = load i64, i64 addrspace(1)* %.unpack.unpack.unpack19.elt, align 8
  %7 = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 2, i64 0
  %.unpack16.unpack = load { i64, i64, i8, i8 }, { i64, i64, i8, i8 } addrspace(1)* %7, align 8
  %.fca.2.0.0.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 0
  %.fca.2.0.1.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 1
  %.fca.2.0.2.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 2
  %.fca.2.0.3.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 3
  %8 = add i32 %thread_position_in_grid, 1, !dbg !67
  %9 = zext i32 %8 to i64, !dbg !84
  %.not = icmp ne i32 %8, 0, !dbg !95
  %10 = icmp sge i64 %.unpack10.unpack, %9, !dbg !97
  %narrow = select i1 %.not, i1 %10, i1 false, !dbg !97
  br i1 %narrow, label %L20, label %common.ret, !dbg !97

common.ret:                                       ; preds = %L87, %conversion
  ret void, !dbg !98

L20:                                              ; preds = %conversion
  %.elt = getelementptr inbounds [2 x i64], [2 x i64] addrspace(1)* %3, i64 0, i64 0
  %.unpack = load i64, i64 addrspace(1)* %.elt, align 8
  %11 = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 1, i64 0
  %.unpack14.unpack = load i64, i64 addrspace(1)* %11, align 8
  %12 = add nsw i64 %9, -1, !dbg !99
  %13 = add i64 %12, %.unpack, !dbg !105
  %14 = call i64 @air.max.s.i64(i64 %.unpack14.unpack, i64 0), !dbg !106
  %15 = add i64 %13, -1, !dbg !134
  %.not5 = icmp ult i64 %15, %14, !dbg !137
  br i1 %.not5, label %L87, label %L84, !dbg !129

L84:                                              ; preds = %L20
  call fastcc void @gpu_report_exception(), !dbg !139
  call fastcc void @gpu_signal_exception(), !dbg !139
  call void @llvm.trap(), !dbg !139
  unreachable, !dbg !139

L87:                                              ; preds = %L20
  %16 = sext i64 %15 to i128, !dbg !143
  %17 = sext i64 %.fca.2.0.1.extract to i128, !dbg !165
  %18 = mul nsw i128 %17, %16, !dbg !168
  %19 = lshr i128 %18, 64, !dbg !170
  %20 = trunc i128 %19 to i64, !dbg !173
  %21 = sext i8 %.fca.2.0.2.extract to i64, !dbg !174
  %22 = mul i64 %15, %21, !dbg !177
  %23 = add i64 %22, %20, !dbg !179
  %24 = call i64 @air.abs.s.i64(i64 %.fca.2.0.0.extract), !dbg !180
  %.not7 = icmp eq i64 %24, 1, !dbg !184
  %25 = mul i64 %.fca.2.0.0.extract, %15, !dbg !186
  %26 = call i8 @air.min.u.i8(i8 %.fca.2.0.3.extract, i8 63), !dbg !187
  %.v = zext i8 %26 to i64, !dbg !187
  %27 = ashr i64 %23, %.v, !dbg !187
  %.lobit = lshr i64 %23, 63, !dbg !189
  %28 = add i64 %27, %.lobit, !dbg !194
  %29 = select i1 %.not7, i64 %25, i64 %28, !dbg !196
  %30 = mul i64 %29, %.fca.2.0.0.extract, !dbg !197
  %31 = sub i64 %15, %30, !dbg !199
  %32 = call i64 @air.max.s.i64(i64 %.unpack.unpack.unpack19.unpack, i64 0), !dbg !200
  %33 = mul i64 %31, %32, !dbg !220
  %34 = add i64 %33, %29, !dbg !225
  %35 = getelementptr inbounds float, float addrspace(1)* %.unpack.unpack.unpack26, i64 %34, !dbg !226
  %36 = load float, float addrspace(1)* %35, align 4, !dbg !226, !tbaa !240
  %37 = getelementptr inbounds float, float addrspace(1)* %.unpack12, i64 %12, !dbg !243
  store float %36, float addrspace(1)* %37, align 4, !dbg !243, !tbaa !240
  br label %common.ret
}

declare i64 @air.max.s.i64(i64, i64)

declare i8 @air.min.u.i8(i8, i8)

attributes #0 = { cold noreturn nounwind }
attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn }

!llvm.module.flags = !{!0, !1, !2, !3, !4, !5, !6, !7, !8}
!llvm.dbg.cu = !{!9, !11, !12, !13, !14, !15, !16, !17, !18, !19, !20, !21, !22, !23, !24, !25, !26, !27, !28, !29, !30, !31, !32, !33, !34, !35, !36, !37, !38, !39, !40, !41, !42, !43, !44}
!julia.kernel = !{!45}
!air.kernel = !{!46}
!llvm.ident = !{!55}
!air.version = !{!56}
!air.language_version = !{!57}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = !{i32 7, !"air.max_device_buffers", i32 31}
!3 = !{i32 7, !"air.max_constant_buffers", i32 31}
!4 = !{i32 7, !"air.max_threadgroup_buffers", i32 31}
!5 = !{i32 7, !"air.max_textures", i32 128}
!6 = !{i32 7, !"air.max_read_write_textures", i32 8}
!7 = !{i32 7, !"air.max_samplers", i32 16}
!8 = !{i32 2, !"SDK Version", [3 x i32] [i32 14, i32 4, i32 1]}
!9 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!10 = !DIFile(filename: "julia", directory: ".")
!11 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!12 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!13 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!14 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!15 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!16 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!17 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!18 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!19 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!20 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!21 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!22 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!23 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!24 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!25 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!26 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!27 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!28 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!29 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!30 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!31 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!32 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!33 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!34 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!35 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!36 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!37 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!38 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!39 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!40 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!41 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!42 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!43 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!44 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!45 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E}
!46 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E, !47, !48}
!47 = !{}
!48 = !{!49, !50, !51, !52, !53, !54}
!49 = !{i32 0, !"air.buffer", !"air.location_index", i32 0, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"MtlDeviceVector{Float32, 1}", !"air.arg_name", !"dest"}
!50 = !{i32 1, !"air.buffer", !"air.location_index", i32 1, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 56, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"Base.ReshapedArray{Float32, 1, LinearAlgebra.Adjoint{Float32, MtlDeviceMatrix{Float32, 1}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}", !"air.arg_name", !"src"}
!51 = !{i32 2, !"air.buffer", !"air.location_index", i32 2, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 8, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"Tuple{Int64}", !"air.arg_name", !"idims"}
!52 = !{i32 3, !"air.buffer", !"air.location_index", i32 3, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"UnitRange{Int64}", !"air.arg_name", !"Is"}
!53 = !{i32 4, !"air.threads_per_grid", !"air.arg_type_name", !"uint"}
!54 = !{i32 5, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!55 = !{!"Julia 1.10.2 with Metal.jl"}
!56 = !{i32 2, i32 5, i32 0}
!57 = !{!"Metal", i32 3, i32 1, i32 0}
!58 = distinct !DISubprogram(name: "report_exception", linkageName: "julia_report_exception_3328", scope: null, file: !59, line: 13, type: !60, scopeLine: 13, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !47)
!59 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/runtime.jl", directory: ".")
!60 = !DISubroutineType(cc: DW_CC_nocall, types: !47)
!61 = !DILocation(line: 18, scope: !58)
!62 = distinct !DISubprogram(name: "signal_exception", linkageName: "julia_signal_exception_3349", scope: null, file: !59, line: 9, type: !63, scopeLine: 9, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !28, retainedNodes: !47)
!63 = !DISubroutineType(types: !47)
!64 = !DILocation(line: 10, scope: !62)
!65 = distinct !DISubprogram(name: "getindex_kernel", linkageName: "julia_getindex_kernel_4165", scope: null, file: !66, line: 82, type: !63, scopeLine: 82, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!66 = !DIFile(filename: "/Users/kpierce/.julia/packages/GPUArrays/OKkAu/src/host/indexing.jl", directory: ".")
!67 = !DILocation(line: 87, scope: !68, inlinedAt: !70)
!68 = distinct !DISubprogram(name: "+;", linkageName: "+", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!69 = !DIFile(filename: "int.jl", directory: ".")
!70 = !DILocation(line: 49, scope: !71, inlinedAt: !73)
!71 = distinct !DISubprogram(name: "#thread_position_in_grid_1d;", linkageName: "#thread_position_in_grid_1d", scope: !72, file: !72, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!72 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/intrinsics/arguments.jl", directory: ".")
!73 = !DILocation(line: 36, scope: !74, inlinedAt: !76)
!74 = distinct !DISubprogram(name: "global_index;", linkageName: "global_index", scope: !75, file: !75, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!75 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/gpuarrays.jl", directory: ".")
!76 = !DILocation(line: 44, scope: !77, inlinedAt: !79)
!77 = distinct !DISubprogram(name: "linear_index;", linkageName: "linear_index", scope: !78, file: !78, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!78 = !DIFile(filename: "/Users/kpierce/.julia/packages/GPUArrays/OKkAu/src/device/indexing.jl", directory: ".")
!79 = !DILocation(line: 66, scope: !80, inlinedAt: !81)
!80 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !78, file: !78, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!81 = !DILocation(line: 85, scope: !82, inlinedAt: !83)
!82 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !66, file: !66, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!83 = !DILocation(line: 82, scope: !65)
!84 = !DILocation(line: 708, scope: !85, inlinedAt: !87)
!85 = distinct !DISubprogram(name: "toInt64;", linkageName: "toInt64", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!86 = !DIFile(filename: "boot.jl", directory: ".")
!87 = !DILocation(line: 784, scope: !88, inlinedAt: !89)
!88 = distinct !DISubprogram(name: "Int64;", linkageName: "Int64", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!89 = !DILocation(line: 7, scope: !90, inlinedAt: !92)
!90 = distinct !DISubprogram(name: "convert;", linkageName: "convert", scope: !91, file: !91, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!91 = !DIFile(filename: "number.jl", directory: ".")
!92 = !DILocation(line: 551, scope: !93, inlinedAt: !94)
!93 = distinct !DISubprogram(name: "rem;", linkageName: "rem", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!94 = !DILocation(line: 1066, scope: !68, inlinedAt: !76)
!95 = !DILocation(line: 514, scope: !96, inlinedAt: !97)
!96 = distinct !DISubprogram(name: "<=;", linkageName: "<=", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!97 = !DILocation(line: 67, scope: !80, inlinedAt: !81)
!98 = !DILocation(line: 0, scope: !82, inlinedAt: !83)
!99 = !DILocation(line: 86, scope: !100, inlinedAt: !101)
!100 = distinct !DISubprogram(name: "-;", linkageName: "-", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!101 = !DILocation(line: 929, scope: !102, inlinedAt: !104)
!102 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!103 = !DIFile(filename: "range.jl", directory: ".")
!104 = !DILocation(line: 87, scope: !82, inlinedAt: !83)
!105 = !DILocation(line: 87, scope: !68, inlinedAt: !101)
!106 = !DILocation(line: 647, scope: !107, inlinedAt: !109)
!107 = distinct !DISubprogram(name: "ifelse;", linkageName: "ifelse", scope: !108, file: !108, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!108 = !DIFile(filename: "essentials.jl", directory: ".")
!109 = !DILocation(line: 532, scope: !110, inlinedAt: !112)
!110 = distinct !DISubprogram(name: "max;", linkageName: "max", scope: !111, file: !111, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!111 = !DIFile(filename: "promotion.jl", directory: ".")
!112 = !DILocation(line: 454, scope: !113, inlinedAt: !114)
!113 = distinct !DISubprogram(name: "OneTo;", linkageName: "OneTo", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!114 = !DILocation(line: 467, scope: !113, inlinedAt: !115)
!115 = !DILocation(line: 469, scope: !116, inlinedAt: !117)
!116 = distinct !DISubprogram(name: "oneto;", linkageName: "oneto", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!117 = !DILocation(line: 291, scope: !118, inlinedAt: !120)
!118 = distinct !DISubprogram(name: "map;", linkageName: "map", scope: !119, file: !119, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!119 = !DIFile(filename: "tuple.jl", directory: ".")
!120 = !DILocation(line: 98, scope: !121, inlinedAt: !123)
!121 = distinct !DISubprogram(name: "axes;", linkageName: "axes", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!122 = !DIFile(filename: "abstractarray.jl", directory: ".")
!123 = !DILocation(line: 137, scope: !124, inlinedAt: !125)
!124 = distinct !DISubprogram(name: "axes1;", linkageName: "axes1", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!125 = !DILocation(line: 389, scope: !126, inlinedAt: !127)
!126 = distinct !DISubprogram(name: "eachindex;", linkageName: "eachindex", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!127 = !DILocation(line: 687, scope: !128, inlinedAt: !129)
!128 = distinct !DISubprogram(name: "checkbounds;", linkageName: "checkbounds", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!129 = !DILocation(line: 702, scope: !128, inlinedAt: !130)
!130 = !DILocation(line: 248, scope: !131, inlinedAt: !133)
!131 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!132 = !DIFile(filename: "reshapedarray.jl", directory: ".")
!133 = !DILocation(line: 88, scope: !82, inlinedAt: !83)
!134 = !DILocation(line: 86, scope: !100, inlinedAt: !135)
!135 = !DILocation(line: 763, scope: !136, inlinedAt: !127)
!136 = distinct !DISubprogram(name: "checkindex;", linkageName: "checkindex", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!137 = !DILocation(line: 513, scope: !138, inlinedAt: !135)
!138 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!139 = !DILocation(line: 4, scope: !140, inlinedAt: !142)
!140 = distinct !DISubprogram(name: "#throw_boundserror", linkageName: "julia_#throw_boundserror_4181", scope: null, file: !141, line: 33, type: !60, scopeLine: 33, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !14, retainedNodes: !47)
!141 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/quirks.jl", directory: ".")
!142 = distinct !DILocation(line: 702, scope: !128, inlinedAt: !130)
!143 = !DILocation(line: 715, scope: !144, inlinedAt: !145)
!144 = distinct !DISubprogram(name: "toInt128;", linkageName: "toInt128", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!145 = !DILocation(line: 785, scope: !146, inlinedAt: !147)
!146 = distinct !DISubprogram(name: "Int128;", linkageName: "Int128", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!147 = !DILocation(line: 7, scope: !90, inlinedAt: !148)
!148 = !DILocation(line: 891, scope: !149, inlinedAt: !151)
!149 = distinct !DISubprogram(name: "widen;", linkageName: "widen", scope: !150, file: !150, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!150 = !DIFile(filename: "operators.jl", directory: ".")
!151 = !DILocation(line: 139, scope: !152, inlinedAt: !154)
!152 = distinct !DISubprogram(name: "_mul_high;", linkageName: "_mul_high", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!153 = !DIFile(filename: "multinverses.jl", directory: ".")
!154 = !DILocation(line: 158, scope: !155, inlinedAt: !156)
!155 = distinct !DISubprogram(name: "div;", linkageName: "div", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!156 = !DILocation(line: 172, scope: !157, inlinedAt: !158)
!157 = distinct !DISubprogram(name: "divrem;", linkageName: "divrem", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!158 = !DILocation(line: 223, scope: !159, inlinedAt: !160)
!159 = distinct !DISubprogram(name: "_ind2sub_rs;", linkageName: "_ind2sub_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!160 = !DILocation(line: 220, scope: !161, inlinedAt: !162)
!161 = distinct !DISubprogram(name: "ind2sub_rs;", linkageName: "ind2sub_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!162 = !DILocation(line: 260, scope: !163, inlinedAt: !164)
!163 = distinct !DISubprogram(name: "_unsafe_getindex;", linkageName: "_unsafe_getindex", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!164 = !DILocation(line: 249, scope: !131, inlinedAt: !133)
!165 = !DILocation(line: 549, scope: !93, inlinedAt: !166)
!166 = !DILocation(line: 1066, scope: !167, inlinedAt: !151)
!167 = distinct !DISubprogram(name: "*;", linkageName: "*", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!168 = !DILocation(line: 1053, scope: !167, inlinedAt: !169)
!169 = !DILocation(line: 1068, scope: !167, inlinedAt: !151)
!170 = !DILocation(line: 530, scope: !171, inlinedAt: !172)
!171 = distinct !DISubprogram(name: ">>>;", linkageName: ">>>", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!172 = !DILocation(line: 538, scope: !171, inlinedAt: !151)
!173 = !DILocation(line: 544, scope: !93, inlinedAt: !151)
!174 = !DILocation(line: 549, scope: !93, inlinedAt: !175)
!175 = !DILocation(line: 1066, scope: !167, inlinedAt: !176)
!176 = !DILocation(line: 159, scope: !155, inlinedAt: !156)
!177 = !DILocation(line: 88, scope: !167, inlinedAt: !178)
!178 = !DILocation(line: 1068, scope: !167, inlinedAt: !176)
!179 = !DILocation(line: 87, scope: !68, inlinedAt: !176)
!180 = !DILocation(line: 302, scope: !181, inlinedAt: !183)
!181 = distinct !DISubprogram(name: "#abs;", linkageName: "#abs", scope: !182, file: !182, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!182 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/intrinsics/math.jl", directory: ".")
!183 = !DILocation(line: 160, scope: !155, inlinedAt: !156)
!184 = !DILocation(line: 521, scope: !185, inlinedAt: !183)
!185 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !111, file: !111, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!186 = !DILocation(line: 88, scope: !167, inlinedAt: !183)
!187 = !DILocation(line: 527, scope: !188, inlinedAt: !183)
!188 = distinct !DISubprogram(name: ">>;", linkageName: ">>", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!189 = !DILocation(line: 711, scope: !85, inlinedAt: !190)
!190 = !DILocation(line: 784, scope: !88, inlinedAt: !191)
!191 = !DILocation(line: 7, scope: !90, inlinedAt: !192)
!192 = !DILocation(line: 546, scope: !93, inlinedAt: !193)
!193 = !DILocation(line: 1066, scope: !68, inlinedAt: !183)
!194 = !DILocation(line: 87, scope: !68, inlinedAt: !195)
!195 = !DILocation(line: 1068, scope: !68, inlinedAt: !183)
!196 = !DILocation(line: 647, scope: !107, inlinedAt: !183)
!197 = !DILocation(line: 88, scope: !167, inlinedAt: !198)
!198 = !DILocation(line: 173, scope: !157, inlinedAt: !158)
!199 = !DILocation(line: 86, scope: !100, inlinedAt: !198)
!200 = !DILocation(line: 647, scope: !107, inlinedAt: !201)
!201 = !DILocation(line: 532, scope: !110, inlinedAt: !202)
!202 = !DILocation(line: 454, scope: !113, inlinedAt: !203)
!203 = !DILocation(line: 467, scope: !113, inlinedAt: !204)
!204 = !DILocation(line: 469, scope: !116, inlinedAt: !205)
!205 = !DILocation(line: 292, scope: !118, inlinedAt: !206)
!206 = !DILocation(line: 98, scope: !121, inlinedAt: !207)
!207 = !DILocation(line: 2957, scope: !208, inlinedAt: !209)
!208 = distinct !DISubprogram(name: "_sub2ind;", linkageName: "_sub2ind", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!209 = !DILocation(line: 1330, scope: !210, inlinedAt: !211)
!210 = distinct !DISubprogram(name: "_to_linear_index;", linkageName: "_to_linear_index", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!211 = !DILocation(line: 114, scope: !212, inlinedAt: !214)
!212 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!213 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/array.jl", directory: ".")
!214 = !DILocation(line: 329, scope: !215, inlinedAt: !217)
!215 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !216, file: !216, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!216 = !DIFile(filename: "/Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/adjtrans.jl", directory: ".")
!217 = !DILocation(line: 264, scope: !218, inlinedAt: !219)
!218 = distinct !DISubprogram(name: "_unsafe_getindex_rs;", linkageName: "_unsafe_getindex_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!219 = !DILocation(line: 261, scope: !163, inlinedAt: !164)
!220 = !DILocation(line: 88, scope: !167, inlinedAt: !221)
!221 = !DILocation(line: 2989, scope: !222, inlinedAt: !223)
!222 = distinct !DISubprogram(name: "_sub2ind_recurse;", linkageName: "_sub2ind_recurse", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!223 = !DILocation(line: 2989, scope: !222, inlinedAt: !224)
!224 = !DILocation(line: 2973, scope: !208, inlinedAt: !207)
!225 = !DILocation(line: 86, scope: !100, inlinedAt: !226)
!226 = !DILocation(line: 38, scope: !227, inlinedAt: !229)
!227 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !228, file: !228, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!228 = !DIFile(filename: "/Users/kpierce/.julia/packages/LLVM/bzSzE/src/interop/base.jl", directory: ".")
!229 = !DILocation(line: 0, scope: !230, inlinedAt: !232)
!230 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!231 = !DIFile(filename: "none", directory: ".")
!232 = !DILocation(line: 0, scope: !233, inlinedAt: !234)
!233 = distinct !DISubprogram(name: "pointerref;", linkageName: "pointerref", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!234 = !DILocation(line: 85, scope: !235, inlinedAt: !237)
!235 = distinct !DISubprogram(name: "unsafe_load;", linkageName: "unsafe_load", scope: !236, file: !236, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!236 = !DIFile(filename: "/Users/kpierce/.julia/packages/LLVM/bzSzE/src/interop/pointer.jl", directory: ".")
!237 = !DILocation(line: 82, scope: !238, inlinedAt: !239)
!238 = distinct !DISubprogram(name: "arrayref;", linkageName: "arrayref", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!239 = !DILocation(line: 103, scope: !212, inlinedAt: !211)
!240 = !{!241, !241, i64 0, i64 0}
!241 = !{!"custom_tbaa_addrspace(1)", !242, i64 0}
!242 = !{!"custom_tbaa"}
!243 = !DILocation(line: 38, scope: !227, inlinedAt: !244)
!244 = !DILocation(line: 0, scope: !230, inlinedAt: !245)
!245 = !DILocation(line: 0, scope: !246, inlinedAt: !247)
!246 = distinct !DISubprogram(name: "pointerset;", linkageName: "pointerset", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!247 = !DILocation(line: 88, scope: !248, inlinedAt: !249)
!248 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !236, file: !236, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!249 = !DILocation(line: 88, scope: !250, inlinedAt: !251)
!250 = distinct !DISubprogram(name: "arrayset;", linkageName: "arrayset", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!251 = !DILocation(line: 105, scope: !252, inlinedAt: !253)
!252 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!253 = !DILocation(line: 89, scope: !82, inlinedAt: !83)


maleadt avatar maleadt commented on June 20, 2024

Compiler error:

unable to legalize instruction: %248:_(s64) = 147 %241:_, %243:_
Context:
%248:_(s64) = 147 %241:_, %243:_
%241:_(s64), %242:_(s64) = 74 %49:_(s128)
%243:_(s64), %244:_(s64) = 74 %50:_(s128)
%49:_(s128) = 124 %26:_(s64)
%50:_(s128) = 91 %41:_(p1) :: (load (s64) from %ir.19 + 8, addrspace 1)
%26:_(s64) = 45 %25:_, %18:_
%41:_(p1) = 81 %105:_(s64)
%25:_(s64) = nsw 46 %8:_, %93:_
%18:_(s64) = 90 %16:_(p1) :: (load (s64) from %ir..elt3, addrspace 1)
%105:_(s64) = 45 %94:_, %104:_
%8:_(s64) = 126 %7:gpr32(s32)
%93:_(s64) = 120 i64 2
%16:_(p1) = 90 %17:_(p64) :: (dereferenceable load (p1) from @agc.buffer_pointers.3, addrspace 64)
%94:_(s64) = 80 %28:_(p1)
%104:_(s64) = 120 i64 40
%7:gpr32(s32) = 45 %0:_, %6:_
%17:_(p64) = 71 @agc.buffer_pointers.3
%28:_(p1) = 90 %15:_(p64) :: (dereferenceable load (p1) from @agc.buffer_pointers.1, addrspace 64)
%0:_(s32) = 116 intrinsic(@llvm.agx2.thread.position.in.grid.x)


maleadt avatar maleadt commented on June 20, 2024

Reduced:

define void @my_kernel({ i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, [1 x i64] addrspace(1)* %2, [2 x i64] addrspace(1)* %3, i32 %a, i32 %thread_position_in_grid) {
b:
  %.c.d = load { i64, i64, i8, i8 }, { i64, i64, i8, i8 } addrspace(1)* null, align 4
  %.e.2.0.1.extract = extractvalue { i64, i64, i8, i8 } %.c.d, 1
  %4 = sext i64 %.e.2.0.1.extract to i128
  %5 = mul i128 %4, -2
  %6 = lshr i128 %5, 1
  %7 = trunc i128 %6 to i64
  %8 = getelementptr float, float addrspace(1)* null, i64 %7
  %9 = load float, float addrspace(1)* %8, align 4
  store float %9, float addrspace(1)* null, align 4
  ret void
}

!air.kernel = !{!0}
!air.version = !{!8}

!0 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @my_kernel, !1, !2}
!1 = !{}
!2 = !{!3, !3, !4, !5, !6, !7}
!3 = !{i32 1, !""}
!4 = !{i32 2, !""}
!5 = !{i32 3, !""}
!6 = !{i32 4, !""}
!7 = !{i32 5, !""}
!8 = !{i32 2, i32 5, i32 0}

This gives the same crash, I think:

unable to legalize instruction: %53:_(s64) = 147 %46:_, %48:_
Context:
%53:_(s64) = 147 %46:_, %48:_
%46:_(s64), %47:_(s64) = 74 %3:_(s128)
%48:_(s64), %49:_(s64) = 74 %4:_(s128)
%3:_(s128) = 91 %1:_(p1) :: (load (s64) from `i64 addrspace(1)* inttoptr (i64 8 to i64 addrspace(1)*)`, addrspace 1)
%4:_(s128) = 120 i128 36893488147419103230
%1:_(p1) = 81 %2:_(s64)
%2:_(s64) = 120 i64 8
(in function: agc.main.constant_program)


tgymnich avatar tgymnich commented on June 20, 2024

JuliaGPU/GPUCompiler.jl#571

This should at least yield nicer error messages.


maleadt avatar maleadt commented on June 20, 2024

So the problem is that operations like view and reshape normally preserve the MtlArray, but here the reshape of an Adjoint results in an actual ReshapedArray. Indexing into that array wrapper is implemented (in Base) using Int128, which is already visible in the type signature:

SubArray{Float32, 1, Base.ReshapedArray{Float32, 1, LinearAlgebra.Adjoint{Float32, MtlMatrix{Float32, Private}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{UnitRange{Int64}}, false}
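The same wrapper nesting can probably be reproduced on the CPU, without Metal.jl, by substituting a plain Matrix for the MtlMatrix. The snippet below is only a sketch under that assumption; the names A, R and v are made up for illustration.

```julia
using LinearAlgebra

A = rand(Float32, 4, 3)

# Reshaping a plain Array just yields another Array.
@show typeof(reshape(A, 12))

# Reshaping an Adjoint falls back to the lazy Base.ReshapedArray wrapper,
# whose type carries a SignedMultiplicativeInverse{Int64} that is used to
# convert linear indices back to Cartesian indices of the parent.
R = reshape(A', 12)
@show typeof(R)

# Taking a view adds the outer SubArray layer seen in the signature above.
v = view(R, 1:4)
@show typeof(v)

# Indexing goes through the parent: A' is 3x4, so linear index 2 is A'[2, 1].
@show R[2] == A[1, 2]
```

On the CPU this indexing path works because LLVM can lower the 128-bit arithmetic there; the type printout is just a convenient way to see which code path a given wrapper will take before launching a kernel.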

LLVM normally supports legalizing such operations, but that only happens during instruction selection (ISel), and Apple's implementation doesn't seem to allow that. Legalizing i128 to i64 at the IR level also seems tricky.

@timholy You originally added the ReshapedArray type; is there a way to opt out of the use of Int128, which I presume comes from the SignedMultiplicativeInverse{Int64} indices? Alternatively, I guess we could overlay ind2sub_rs, but that feels like a hack.
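For reference, the stack trace points at widen inside _mul_high, i.e. the step that takes the upper 64 bits of a 64x64-bit product. A minimal sketch of how that one step could be computed with Int64 arithmetic only, by splitting the operands into 32-bit halves, is shown below. The function names mul_high_64 and mul_high_ref are hypothetical and not part of Base, GPUArrays.jl or Metal.jl; this is not the fix adopted in the thread, and it does not cover the rest of the SignedMultiplicativeInverse division.

```julia
# Hypothetical helper: upper 64 bits of the signed 128-bit product u * v,
# computed without any Int128 values (classic split into 32-bit halves).
function mul_high_64(u::Int64, v::Int64)
    mask = Int64(0xffffffff)
    u0, u1 = u & mask, u >> 32      # low half (non-negative), high half (signed)
    v0, v1 = v & mask, v >> 32
    w0 = u0 * v0                    # may wrap; only its bit pattern is used
    t  = u1 * v0 + (w0 >>> 32)      # >>> is a logical shift
    w1 = u0 * v1 + (t & mask)
    return u1 * v1 + (t >> 32) + (w1 >> 32)
end

# Host-only reference that uses Int128, in the spirit of the widen-based path.
mul_high_ref(u::Int64, v::Int64) = ((widen(u) * v) >> 64) % Int64

# Compare both on the CPU for a few edge cases and random inputs.
for (a, b) in ((typemin(Int64), typemin(Int64)), (Int64(-1), Int64(1)),
               (typemax(Int64), typemax(Int64)), (Int64(0), Int64(-5)))
    @assert mul_high_64(a, b) == mul_high_ref(a, b)
end
for _ in 1:1000
    a, b = rand(Int64), rand(Int64)
    @assert mul_high_64(a, b) == mul_high_ref(a, b)
end
```

Whether something like this would be wired in through an overlay of ind2sub_rs or at a lower level is left open here; the sketch only illustrates that the high-multiply itself does not inherently need a 128-bit type.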

