aiot-mlsys-lab / efficient-llms-survey
[TMLR 2024] Efficient Large Language Models: A Survey
Home Page: https://arxiv.org/abs/2312.03863
License: Apache License 2.0
Hey Team,
Thanks for the amazing effort of putting all this literature together to benefit the research and open-source community. We recently released the code and updated our QuantEase paper, which introduces a new post-training quantization method:
Paper: QuantEase: Optimization-based Quantization for Language Models - An Efficient and Intuitive Algorithm
Code: https://github.com/linkedin/QuantEase
Would you mind adding it to the list and the paper if you think it's a good fit? We're also actively developing and adding new content to this work. Thank you in advance!
Best regards,
QQ
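For context, the sketch below is only a generic illustration of the post-training quantization (PTQ) setting that methods like QuantEase operate in: a simple per-channel round-to-nearest baseline. It is not the QuantEase algorithm itself (which is optimization-based), and the function names here are hypothetical.

```python
import numpy as np

def quantize_weights_rtn(W: np.ndarray, n_bits: int = 4):
    """Generic per-channel round-to-nearest (RTN) post-training quantization.

    This is only a baseline illustration of the PTQ setting; methods such as
    QuantEase improve on RTN by optimizing a layer-wise reconstruction objective.
    """
    qmax = 2 ** (n_bits - 1) - 1                          # e.g. 7 for signed 4-bit
    scale = np.abs(W).max(axis=1, keepdims=True) / qmax   # one scale per output channel
    scale = np.where(scale == 0, 1.0, scale)               # avoid division by zero
    W_int = np.clip(np.round(W / scale), -qmax - 1, qmax)
    return W_int.astype(np.int8), scale                    # store integers + scales

def dequantize(W_int: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate float weight matrix for inference."""
    return W_int.astype(np.float32) * scale

# Tiny usage example on a random "layer"
W = np.random.randn(8, 16).astype(np.float32)
W_int, s = quantize_weights_rtn(W, n_bits=4)
print("max abs error:", np.abs(W - dequantize(W_int, s)).max())
```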
Thank you for conducting such an insightful survey. I wonder if it would be possible to incorporate a recent ICML'23 work from UIUC. It centers on a one-shot compression technique for pre-trained language models (PLMs). The paper investigates the neural tangent kernel (NTK) of the multilayer perceptron (MLP) modules in a PLM and proposes to construct a lightweight PLM through NTK-approximating MLP fusion.
Thanks for the great survey! Could you please include a discussion of this work from Microsoft and UIUC? It proposes a general modular activation mechanism, SMA, that unifies previous work on MoE, adaptive computation, dynamic routing, and sparse attention, and further applies SMA to develop a novel architecture, SeqBoat, achieving a state-of-the-art quality-efficiency trade-off on Long Range Arena.
Thanks for the very nice survey!!!
Quantitation-aware Training should be Quantization-aware Training
Page 27, third to last line: the training corpse -> corpus
Hi team, thanks for your awesome work.
I discovered that OpenLLM has been integrated with HuggingFace PEFT and supports LLM fine-tuning layers. You can see the documentation here: https://github.com/bentoml/OpenLLM#%EF%B8%8F-serving-fine-tuning-layers
Thanks for the great survey! I would kindly suggest including a discussion of the state-of-the-art work Medusa in the efficient LLM inference part.
Code repo: https://github.com/FasterDecoding/Medusa
Blog website: https://sites.google.com/view/medusa-llm
"Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads"