Giter VIP home page Giter VIP logo

wow2025 / mlc-llm Goto Github PK

View Code? Open in Web Editor NEW

This project forked from mlc-ai/mlc-llm

0.0 0.0 0.0 18.15 MB

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Home Page: https://mlc.ai/mlc-llm

License: Apache License 2.0

Shell 0.48% C++ 32.26% Python 61.08% Objective-C 0.02% Kotlin 2.47% Objective-C++ 0.11% Swift 2.47% CMake 0.27% Java 0.10% C 0.01% Groovy 0.37% Batchfile 0.01% Ruby 0.01% HTML 0.08% SCSS 0.18% Makefile 0.07%

mlc-llm's Introduction

MLC LLM

Installation License Join Discoard Related Repository: WebLLM

Universal LLM Deployment Engine with ML Compilation

Get Started | Documentation | Blog

About

MLC LLM is a machine learning compiler and high-performance deployment engine for large language models. The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on everyone's platforms. 

AMD GPU NVIDIA GPU Apple GPU Intel GPU
Linux / Win ✅ Vulkan, ROCm ✅ Vulkan, CUDA N/A ✅ Vulkan
macOS ✅ Metal (dGPU) N/A ✅ Metal ✅ Metal (iGPU)
Web Browser ✅ WebGPU and WASM
iOS / iPadOS ✅ Metal on Apple A-series GPU
Android ✅ OpenCL on Adreno GPU ✅ OpenCL on Mali GPU

MLC LLM compiles and runs code on MLCEngine -- a unified high-performance LLM inference engine across the above platforms. MLCEngine provides OpenAI-compatible API available through REST server, python, javascript, iOS, Android, all backed by the same engine and compiler that we keep improving with the community.

Get Started

Please visit our documentation to get started with MLC LLM.

Citation

Please consider citing our project if you find it useful:

@software{mlc-llm,
    author = {MLC team},
    title = {{MLC-LLM}},
    url = {https://github.com/mlc-ai/mlc-llm},
    year = {2023}
}

The underlying techniques of MLC LLM include:

References (Click to expand)
@inproceedings{tensorir,
    author = {Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi},
    title = {TensorIR: An Abstraction for Automatic Tensorized Program Optimization},
    year = {2023},
    isbn = {9781450399166},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3575693.3576933},
    doi = {10.1145/3575693.3576933},
    booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
    pages = {804–817},
    numpages = {14},
    keywords = {Tensor Computation, Machine Learning Compiler, Deep Neural Network},
    location = {Vancouver, BC, Canada},
    series = {ASPLOS 2023}
}

@inproceedings{metaschedule,
    author = {Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, Masahiro and Yu, Cody Hao and Chen, Tianqi},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
    pages = {35783--35796},
    publisher = {Curran Associates, Inc.},
    title = {Tensor Program Optimization with Probabilistic Programs},
    url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf},
    volume = {35},
    year = {2022}
}

@inproceedings{tvm,
    author = {Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy},
    title = {{TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning},
    booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
    year = {2018},
    isbn = {978-1-939133-08-3},
    address = {Carlsbad, CA},
    pages = {578--594},
    url = {https://www.usenix.org/conference/osdi18/presentation/chen},
    publisher = {USENIX Association},
    month = oct,
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.