Comments (3)

maleadt commented on June 16, 2024

Julia MWE:

using Metal

function local_kernel(a, val::T) where T
    i = thread_position_in_grid_1d()
    b = MtlThreadGroupArray(T, 1)
    Metal.atomic_store_explicit(pointer(b, i), val)
    a[i] = b[i]
    return
end

T = Int32
a = Metal.zeros(T, 1)
@metal local_kernel(a, T(42))

Looking at Console, it seems like the Metal compiler is crashing:

Crashed Thread:        2

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000018
Exception Codes:       0x0000000000000001, 0x0000000000000018

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [85705]

VM Region Info: 0x18 is not in any region.  Bytes before following region: 4375478248
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      104cc8000-104ccc000    [   16K] r-x/r-x SM=COW  ...mpilerService

Thread 0:
0   libsystem_kernel.dylib        	       0x18c57cf88 __semwait_signal_nocancel + 8
1   libsystem_c.dylib             	       0x18c471c04 nanosleep$NOCANCEL + 216
2   libsystem_c.dylib             	       0x18c471ad0 sleep$NOCANCEL + 52
3   libdispatch.dylib             	       0x18c4177dc _dispatch_queue_cleanup2 + 212
4   libsystem_pthread.dylib       	       0x18c5b39fc _pthread_tsd_cleanup + 620
5   libsystem_pthread.dylib       	       0x18c5b6724 _pthread_exit + 84
6   libsystem_pthread.dylib       	       0x18c5b372c pthread_exit + 88
7   libdispatch.dylib             	       0x18c4132b0 dispatch_main + 128
8   libxpc.dylib                  	       0x18c2e1ca4 _xpc_objc_main + 752
9   libxpc.dylib                  	       0x18c2f0904 _xpc_main + 324
10  libxpc.dylib                  	       0x18c2e180c xpc_main + 64
11  MTLCompilerService            	       0x104cc9cb4 main + 420
12  dyld                          	       0x18c2350e0 start + 2360

Thread 1:
0   libsystem_kernel.dylib        	       0x18c57cfc0 __sigsuspend_nocancel + 8
1   libdispatch.dylib             	       0x18c4178dc _dispatch_sigsuspend + 48
2   libdispatch.dylib             	       0x18c4178ac _dispatch_sig_thread + 60

Thread 2 Crashed:
0   libLLVM.dylib                 	       0x20a4a79bc llvm::TargetRegisterInfo::getRegSizeInBits(llvm::Register, llvm::MachineRegisterInfo const&) const + 192
1   libLLVM.dylib                 	       0x209cb91c0 0x209a97000 + 2236864
2   libLLVM.dylib                 	       0x209cb74b4 0x209a97000 + 2229428
3   libLLVM.dylib                 	       0x20a33de40 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 408
4   libLLVM.dylib                 	       0x20a5caa98 llvm::FPPassManager::runOnFunction(llvm::Function&) + 668
5   libLLVM.dylib                 	       0x209fc3b58 0x209a97000 + 5426008
6   libLLVM.dylib                 	       0x20a5caf58 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 756
7   libLLVM.dylib                 	       0x209f5e4b8 0x209a97000 + 5010616
8   libLLVM.dylib                 	       0x209f5a318 0x209a97000 + 4993816
9   libLLVM.dylib                 	       0x209f5fb2c llvm::AGX::AGXCompilePlan::execute(llvm::AGX::CompileRequest&) + 760
10  AGXCompilerCore               	       0x1b0936cb8 AGCLLVMCtx::compile(AGCLLVMObject*, llvm::Module&, AGCFastMathFlags, llvm::AGX::PipelineType, llvm::AGX::CodeGenOptions&, bool) + 1372
11  AGXCompilerCore               	       0x1b08c2a98 AGCLLVMUserObject::compile() + 21872
12  AGXCompilerCore               	       0x1b0803d00 AGCCodeGenServiceBuildRequestInternal(AGCCodeGenService*, void const*, unsigned long, void const*, unsigned long, llvm::Module*, AGCTargetArch, _AGCStreamToken&, void const**, unsigned long*, APIKind) + 6764
13  AGXCompilerCore               	       0x1b08021c4 MTLCompilerBuildRequestWithOptions + 124
14  MTLCompiler                   	       0x2130b1ea4 MTLCompilerPluginInterface::compilerBuildRequest(bool, unsigned int, void const*, unsigned long, unsigned int, void const*, BackendCompilationOutput&) + 256
15  MTLCompiler                   	       0x2130ad6fc MTLCompilerObject::backendCompileModule(BinaryRequestData&, BackendCompilationOutput&, unsigned long long, std::__1::vector<CompileTimeData, std::__1::allocator<CompileTimeData>>&) + 656
16  MTLCompiler                   	       0x2130b21c8 MTLCompilerObject::backendCompileExecutableRequest(BinaryRequestData&) + 448
17  MTLCompiler                   	       0x2130b483c MTLCompilerObject::buildRequest(unsigned int, unsigned int, void const*, unsigned long, void (unsigned int, void const*, unsigned long, char const*) block_pointer) + 592
18  MTLCompiler                   	       0x2130beb84 split_stack_call_impl + 24
19  MTLCompiler                   	       0x2130bec54 split_stack_call + 196
20  MTLCompiler                   	       0x2130b2674 MTLCodeGenServiceBuildRequest + 328
21  MTLCompilerService            	       0x104cca13c invocation function for block in MTLCompilerServiceHandleEvent(NSObject<OS_xpc_object>*) + 1036
22  libxpc.dylib                  	       0x18c2d7848 _xpc_connection_call_event_handler + 144
23  libxpc.dylib                  	       0x18c2d61d8 _xpc_connection_mach_event + 1384
24  libdispatch.dylib             	       0x18c4049d0 _dispatch_client_callout4 + 20
25  libdispatch.dylib             	       0x18c420c5c _dispatch_mach_msg_invoke + 468
26  libdispatch.dylib             	       0x18c40bd28 _dispatch_lane_serial_drain + 368
27  libdispatch.dylib             	       0x18c421998 _dispatch_mach_invoke + 444
28  libdispatch.dylib             	       0x18c40bd28 _dispatch_lane_serial_drain + 368
29  libdispatch.dylib             	       0x18c40c9d4 _dispatch_lane_invoke + 380
30  libdispatch.dylib             	       0x18c41761c _dispatch_root_queue_drain_deferred_wlh + 288
31  libdispatch.dylib             	       0x18c416e90 _dispatch_workloop_worker_thread + 404
32  libsystem_pthread.dylib       	       0x18c5b2114 _pthread_wqthread + 288
33  libsystem_pthread.dylib       	       0x18c5b0e30 start_wqthread + 8

Thread 3:
0   libsystem_pthread.dylib       	       0x18c5b0e28 start_wqthread + 0

Thread 4:
0   libsystem_pthread.dylib       	       0x18c5b0e28 start_wqthread + 0


Thread 2 crashed with ARM Thread State (64-bit):
    x0: 0x000000023e7bf290   x1: 0x0000000000000000   x2: 0x00000001468c4200   x3: 0x00000001468c4200
    x4: 0x00006000031a6950   x5: 0x0000000000000008   x6: 0x0000002e00000048   x7: 0x000000000004e400
    x8: 0x000000023a83c540   x9: 0x0000000000000072  x10: 0x0000000000000000  x11: 0x0000000000000000
   x12: 0x0000000000000000  x13: 0x0000000000000000  x14: 0x000000023e7b5330  x15: 0xfffffffffffff000
   x16: 0x00000002434d2930  x17: 0xc3310002434d2930  x18: 0x0000000000000000  x19: 0x0000000146887408
   x20: 0x00000001468c4200  x21: 0x0000000146887408  x22: 0x0000000000000000  x23: 0x0000000000000000
   x24: 0x00000001468c4200  x25: 0x0000000000000000  x26: 0x0000000000000001  x27: 0x0000000000000190
   x28: 0x00000001078f1bd8   fp: 0x00000001078f16e0   lr: 0x9a21000209cb91c0
    sp: 0x00000001078f16b0   pc: 0x000000020a4a79bc cpsr: 0xa0001000
   far: 0x0000000000000018  esr: 0x92000006 (Data Abort) byte read Translation fault

Binary Images:
       0x104cc8000 -        0x104ccbfff com.apple.MTLCompilerService (341.36) <74b60d6a-e03e-3b2a-a911-dc183eba3e13> /System/Library/Frameworks/Metal.framework/Versions/A/XPCServices/MTLCompilerService.xpc/Contents/MacOS/MTLCompilerService
       0x18c574000 -        0x18c5aefff libsystem_kernel.dylib (*) <a7228b5d-53c7-3fe9-84e4-2a8c04dcf051> /usr/lib/system/libsystem_kernel.dylib
       0x18c44b000 -        0x18c4c9ffb libsystem_c.dylib (*) <cc7a439e-f104-3047-995c-9a5a3cc4a442> /usr/lib/system/libsystem_c.dylib
       0x18c401000 -        0x18c447fff libdispatch.dylib (*) <5aa1649c-ef1d-39f7-a66c-4c5d2e53c474> /usr/lib/system/libdispatch.dylib
       0x18c5af000 -        0x18c5bbff3 libsystem_pthread.dylib (*) <449bbad3-f7ef-371d-9a59-fd4ffa78289b> /usr/lib/system/libsystem_pthread.dylib
       0x18c2c9000 -        0x18c30ffff libxpc.dylib (*) <147ea529-cecc-34fb-8250-b2a3893d3c3e> /usr/lib/system/libxpc.dylib
       0x18c22f000 -        0x18c2c3387 dyld (*) <50746901-db0e-39a0-b391-baaa6b82ad0f> /usr/lib/dyld
               0x0 - 0xffffffffffffffff ??? (*) <00000000-0000-0000-0000-000000000000> ???
       0x209a97000 -        0x20c85aff7 libLLVM.dylib (*) <04dd424e-dd70-3034-b1f8-ef9b275e1d8a> /System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/32023/Libraries/libLLVM.dylib
       0x1b07fe000 -        0x1b09e9fff com.apple.AGXCompilerCore (276.62) <9b9da650-3c5c-34f8-9934-d6d4ee14a40c> /System/Library/PrivateFrameworks/AGXCompilerCore.framework/Versions/A/AGXCompilerCore
       0x21300b000 -        0x2130d5fff com.apple.MTLCompiler (341.36) <3ddc8ba3-71bd-3745-8443-0ca59c0b1c45> /System/Library/PrivateFrameworks/MTLCompiler.framework/Versions/32023/MTLCompiler

from metal.jl.

maleadt commented on June 16, 2024

LLVM MWE:

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32"
target triple = "air64-apple-macosx14.3.1"

@threadgroup_memory = external addrspace(3) global [1 x i32]

declare void @air.atomic.local.store.i32(i32 addrspace(3)*, i32, i32, i32)

define void @my_kernel(i32 %thread_position_in_grid) {
conversion:
  %0 = zext i32 %thread_position_in_grid to i64
  %1 = getelementptr i8, i8* addrspacecast (i8 addrspace(3)* bitcast ([1 x i32] addrspace(3)* @threadgroup_memory to i8 addrspace(3)*) to i8*), i64 %0
  %2 = bitcast i8* %1 to i32*
  %3 = addrspacecast i32* %2 to i32 addrspace(3)*
  call void @air.atomic.local.store.i32(i32 addrspace(3)* %3, i32 0, i32 0, i32 0)
  ret void
}

!air.kernel = !{!0}
!air.version = !{!6}

!0 = !{void (i32)* @my_kernel, !1, !2}
!1 = !{}
!2 = !{!3}
!3 = !{i32 2, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!6 = !{i32 3, i32 0, i32 0}

It seems like the back-end is having trouble looking through the bitcast->addrspacecast->gep->bitcast->addrspacecast chain on the threadgroup memory pointer, which admittedly is a bit much.

maleadt commented on June 16, 2024

Turns out this was really caused by illegal addrspacecasts during pointer arithmetic. Those mostly get optimized away (which is questionable by itself, but we lack an AIR TargetTransformInfo to make that illegal), but not always, as in this case.
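
For illustration, here is a hand-edited variant of the LLVM MWE above in which the pointer arithmetic stays in address space 3, so no addrspacecast is needed. This is only a sketch of what cast-free IR could look like, assuming the same typed-pointer syntax and declarations as the reproducer. It is not the actual fix, and it has not been checked against the Metal back-end.

```llvm
@threadgroup_memory = external addrspace(3) global [1 x i32]

declare void @air.atomic.local.store.i32(i32 addrspace(3)*, i32, i32, i32)

define void @my_kernel(i32 %thread_position_in_grid) {
conversion:
  %0 = zext i32 %thread_position_in_grid to i64
  ; Same byte-offset GEP as in the MWE, but performed directly on the
  ; addrspace(3) pointer instead of on a generic pointer obtained
  ; through addrspacecast.
  %1 = getelementptr i8, i8 addrspace(3)* bitcast ([1 x i32] addrspace(3)* @threadgroup_memory to i8 addrspace(3)*), i64 %0
  ; A bitcast within the same address space is enough to get back to i32.
  %2 = bitcast i8 addrspace(3)* %1 to i32 addrspace(3)*
  call void @air.atomic.local.store.i32(i32 addrspace(3)* %2, i32 0, i32 0, i32 0)
  ret void
}
```

The target datalayout, triple, and metadata nodes are the same as in the MWE and are omitted here.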