Json Lee's Projects
Samples for CUDA developers that demonstrate features of the CUDA Toolkit
Multi-threaded GPU (CUDA) based implementation of the Högbom CLEAN deconvolution algorithm
CUDA Templates for Linear Algebra Subroutines
HIPIFY: Convert CUDA to Portable C++ Code
Implementation of IR2Vec, published in ACM TACO
LLVM Techniques, Tips, and Best Practices: Clang and Middle-End Libraries, published by Packt
Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).
Explore the energy-efficient dataflow scheduling for neural networks.
Simple demonstration of using the RISC-V Vector extension
Backward compatible ML compute opset inspired by HLO/MHLO
TPP experimentation on MLIR for linear algebra
An optimizing compiler for decision tree ensemble inference.
Development repository for the Triton language and compiler
An experimental CPU backend for Triton (https://github.com/openai/triton)
Shared Middle-Layer for Triton Compilation
Explanation of the basic principles of BEV: https://www.cnblogs.com/wujianming-110117/p/17207533.html