Awesome Mobile LLMs

A curated list of LLMs and related studies targeted at mobile and embedded hardware

Last update: 10th April 2024

If your publication or work is not included and you think it should be, please open an issue or reach out directly to @stevelaskaridis.

Let's try to make this list as useful as possible to researchers, engineers and practitioners all around the world.

Mobile-First LLMs

The following table lists sub-3B models designed for on-device deployment, sorted by year.

| Name | Year | Sizes | Primary Group/Affiliation | Publication | Code Repository | HF Repository |
| --- | --- | --- | --- | --- | --- | --- |
| Mobile LLMs | 2024 | 125M, 250M | Meta | paper | - | - |
| Gemma | 2024 | 2B, ... | Google | website | code, gemma.cpp | huggingface |
| MobiLlama | 2024 | 0.5B, 1B | MBZUAI | paper | code | huggingface |
| TinyLlama | 2024 | 1.1B | Singapore University of Technology and Design | paper | code | huggingface |
| Gemini-Nano | 2024 | 1.8B, 3.25B | Google | paper | - | - |
| Phi-2 | 2023 | 2.7B | Microsoft | website | - | huggingface |
| Phi-1.5 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
| Phi-1 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
| RWKV | 2023 | 169M, 430M, 1.5B, 3B, ... | EleutherAI | paper | code | huggingface |
| Cerebras-GPT | 2023 | 111M, 256M, 590M, 1.3B, 2.7B, ... | Cerebras | paper | code | huggingface |
| LaMini-LM | 2023 | 61M, 77M, 111M, 124M, 223M, 248M, 256M, 590M, 738M, 774M, 783M, 1.3B, 1.5B, ... | MBZUAI | paper | code | huggingface |
| Pythia | 2023 | 70M, 160M, 410M, 1B, 1.4B, 2.8B, ... | EleutherAI | paper | code | huggingface |
| OPT | 2022 | 125M, 350M, 1.3B, 2.7B, ... | Meta | paper | code | huggingface |
| Galactica | 2022 | 125M, 1.3B, ... | Meta | paper | code | huggingface |
| BLOOM | 2022 | 560M, 1.1B, 1.7B, 3B, ... | BigScience | paper | code | huggingface |
| XGLM | 2021 | 564M, 1.7B, 2.9B, ... | Meta | paper | code | huggingface |
| GPT-Neo | 2021 | 125M, 350M, 1.3B, 2.7B | EleutherAI | - | code, gpt-neox | huggingface |
| MobileBERT | 2020 | 15.1M, 25.3M | CMU, Google | paper | code | huggingface |
| BART | 2019 | 140M, 400M | Meta | paper | code | huggingface |
| DistilBERT | 2019 | 66M | HuggingFace | paper | code | huggingface |
| T5 | 2019 | 60M, 220M, 770M, 3B, ... | Google | paper | code | huggingface |
| TinyBERT | 2019 | 14.5M | Huawei | paper | code | huggingface |
| Megatron-LM | 2019 | 336M, 1.3B, ... | Nvidia | paper | code | - |

Infrastructure / Deployment of LLMs on Device

This section showcases frameworks and contributions for supporting LLM inference on mobile and edge devices.

Deployment Frameworks

Papers

2024

  • [MobiCom'24] Mobile Foundation Model as Firmware (paper, code)
  • Merino: Entropy-driven Design for Generative Language Models on IoT Devices (paper)
  • LLM as a System Service on Mobile Devices (paper)

2023

  • LLMCad: Fast and Scalable On-device Large Language Model Inference (paper)
  • EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models (paper)

2022

  • The Future of Consumer Edge-AI Computing (paper, talk)

Benchmarking LLMs on Device

This section focuses on measurements and benchmarking efforts for assessing LLM performance when deployed on device.

Papers

2024

  • MELTing point: Mobile Evaluation of Language Transformers (paper)

Applications

Papers

2024

  • Octopus v2: On-device language model for super agent (paper)

2023

  • Towards an On-device Agent for Text Rewriting (paper)

Multimodal LLMs

This section refers to multimodal LLMs, which integrate vision or other modalities in their tasks.

Papers

2024

  • TinyLLaVA: A Framework of Small-scale Large Multimodal Models (paper, code)
  • MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (paper, code)

2023

  • MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices (paper, code)

Surveys on Efficient LLMs

This section includes survey papers on LLM efficiency, a topic closely related to deployment on constrained devices.

Papers

2024

  • A Survey of Resource-efficient LLM and Multimodal Foundation Models (paper)

2023

  • Efficient Large Language Models: A Survey (paper, code)
  • Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems (paper)
  • A Survey on Model Compression for Large Language Models (paper)

Training LLMs on Device

This section refers to papers attempting to train/fine-tune LLMs on device, in a standalone or federated manner.

Papers

2023

  • [MobiCom'23] Federated Few-Shot Learning for Mobile NLP (paper, code)
  • FwdLLM: Efficient FedLLM using Forward Gradient (paper, code)
  • [Electronics'24] Forward Learning of Large Language Models by Consumer Devices (paper)
  • Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly (paper)
  • Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes (paper, code)

Mobile-Related Use-cases

This section includes papers that are mobile-related but do not necessarily run on device.

Papers

2024

  • Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (paper)
  • Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (paper, code)

2023

  • [NeurIPS'23] AndroidInTheWild: A Large-Scale Dataset For Android Device Control (paper, code)
  • GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation (paper, code)

Older

  • [ACL'20] Mapping Natural Language Instructions to Mobile UI Action Sequences (paper)

Related Awesome Repositories

If you want to read more about related topics, here are some tangential awesome repositories to visit:

Contribute

Contributions welcome! Read the contribution guidelines first.
