
SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning)




Introduction

SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning and inference. It integrates implementations of various efficient fine-tuning methods, embracing approaches that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and can fine-tune various models, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with PEFT, so users can fine-tune ModelScope models through the familiar PEFT interface.

Currently supported approaches (and counting):

  1. LoRA: LoRA: Low-Rank Adaptation of Large Language Models
  2. Adapter: Parameter-Efficient Transfer Learning for NLP
  3. Prompt Tuning: Visual Prompt Tuning
  4. Side: Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
  5. ResTuning-Bypass
  6. All tuners offered on PEFT

Key features:

  1. Through integration with the ModelScope library, models can be readily obtained via a model-id.
  2. Tuners provided by SWIFT can be combined, so that multiple tuners can be explored on one model for the best result.
  3. Tuners can be activated or deactivated by calling activate_adapter, deactivate_adapter, or set_active_adapters. Users can run inference with one model and multiple tuners in different threads independently.
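
The idea behind the third feature, one shared model with a different set of active tuners per thread, can be illustrated with a small toy. This is a conceptual sketch only: `ToyAdapterSwitch` is a hypothetical class written for this illustration, it does not use SWIFT, and it is not a description of how SWIFT implements activation internally. It assumes per-thread state is kept in `threading.local`.

```python
import threading


class ToyAdapterSwitch:
    """Hypothetical stand-in for a wrapped model (not part of SWIFT)."""

    def __init__(self, adapter_names):
        self._known = set(adapter_names)
        # Each thread sees its own attributes on this object.
        self._local = threading.local()

    def set_active_adapters(self, names):
        unknown = set(names) - self._known
        if unknown:
            raise ValueError(f'unknown adapters: {sorted(unknown)}')
        self._local.active = set(names)

    def active_adapters(self):
        # Threads that never activated anything see an empty set.
        return set(getattr(self._local, 'active', set()))


def worker(switch, names, results, key):
    switch.set_active_adapters(names)
    results[key] = switch.active_adapters()


switch = ToyAdapterSwitch(['lora', 'prompt'])
results = {}
threads = [
    threading.Thread(target=worker, args=(switch, ['lora'], results, 't1')),
    threading.Thread(target=worker, args=(switch, ['prompt'], results, 't2')),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results['t1'], results['t2'], switch.active_adapters())
```

Each worker thread observes only the adapters it activated itself, and the main thread, which never activated any, sees none.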

Users can check the SWIFT documentation for detailed tutorials.

LLM SFT Example

Follow this link to view the detailed documentation of these examples.

Features

News

  • 🔥 2023.10.17: Supported int8 models: qwen-7b-chat-int8, qwen-14b-chat-int8. The corresponding shell script can be found at scripts/qwen_7b_chat_int8, scripts/qwen_14b_chat_int8.
  • 🔥 2023.10.16: Supported int4 models: qwen-7b-chat-int4, qwen-14b-chat-int4, qwen-vl-chat-int4, baichuan2-7b-chat-int4, baichuan2-13b-chat-int4. The corresponding shell script can be found at scripts/qwen_7b_chat_int4, scripts/qwen_14b_chat_int4, scripts/qwen_vl_chat_int4, scripts/baichuan2_7b_chat_int4, scripts/baichuan2_13b_chat_int4.
  • 2023.10.15: Supported ziya2-13b model series: ziya2-13b, ziya2-13b-chat. The corresponding shell script can be found at scripts/ziya2_13b_chat.
  • 2023.10.12: Supported mistral-7b model series: openbuddy-mistral-7b-chat, mistral-7b, mistral-7b-chat. The corresponding shell script can be found at scripts/openbuddy_mistral_7b_chat, scripts/mistral_7b_chat.
  • 🔥 2023.10.7: Supported DeepSpeed ZeRO-2, enabling LoRA (not just QLoRA) to run DDP on 2*A10. The corresponding shell script can be found at scripts/qwen_7b_chat/lora_ddp_ds/sft.sh.
  • 2023.10.4: Supported datasets in the fields of mathematics, law, SQL, and coding: blossom-math-zh, school-math-zh, text2sql-en, sql-create-context-en, lawyer-llama-zh, tigerbot-law-zh, leetcode-python-en.
  • 🔥 2023.9.25: Supported qwen-14b model series: qwen-14b, qwen-14b-chat. The corresponding shell script can be found at scripts/qwen_14b, scripts/qwen_14b_chat.
  • 2023.9.18: Supported internlm-20b model series: internlm-20b, internlm-20b-chat. The corresponding shell script can be found at scripts/internlm_20b, scripts/internlm_20b_chat.
  • 🔥 2023.9.12: Supported training with MP+DDP to accelerate full-parameter fine-tuning speed. The corresponding shell script can be found at scripts/qwen_7b_chat/full_mp_ddp/sft.sh.
  • 2023.9.5: Supported training that only saves model weights without saving intermediate states such as optimizer weights required for checkpoint resumption, avoiding long checkpoint-saving times and large storage space in full-parameter fine-tuning. You can check the command-line parameter --only_save_model in the sft.sh script.

Installation

SWIFT runs in a Python environment. Please make sure your Python version is higher than 3.8.

  • Install SWIFT with pip:
pip install ms-swift -U
  • Install SWIFT from source (needed for running the sft/infer examples):
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .

SWIFT requires torch>=1.13.

  • Use SWIFT in our docker image:
docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.1
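
To check an environment against the torch>=1.13 requirement before installing, a small version comparison is enough. The sketch below is a minimal illustration, not part of SWIFT: `parse_version` and `meets_minimum` are hypothetical helper names, and the parser assumes plain dotted version strings with an optional local suffix such as `+cu118`.

```python
def parse_version(text, width=3):
    """Turn '2.0.1+cu118' into (2, 0, 1); missing parts are padded with 0."""
    parts = []
    for piece in text.split('+')[0].split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts = parts[:width]
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# The docker image tag above ships torch 2.0.1.
print(meets_minimum('2.0.1', '1.13'))
print(meets_minimum('1.12.1', '1.13'))
```

In a real environment the installed version string would come from `torch.__version__`.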

Getting Started

SWIFT supports multiple tuners, as well as tuners provided by PEFT. To use these tuners, simply call:

from swift import Swift, LoRAConfig
config = LoRAConfig(...)
model = Swift.prepare_model(model, config, extra_state_keys=['...'])

The code snippet above initializes the tuner with random weights. The input model is an instance of torch.nn.Module, and config is an instance of a SwiftConfig or PeftConfig subclass. extra_state_keys names the extra module weights (such as the linear head) to be trained and stored in the output dir.

You may combine multiple tuners by:

from swift import Swift, LoRAConfig, PromptConfig
model = Swift.prepare_model(model, {'lora': LoRAConfig(...), 'prompt': PromptConfig(...)})

Call save_pretrained and push_to_hub after fine-tuning:

from swift import push_to_hub
model.save_pretrained('some-output-folder')
push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')

Here my-group/some-repo-id-modelscope is assumed to be the model-id in the hub, and some-ms-token is the token used for uploading.

Use the model-id for later inference:

from swift import Swift
model = Swift.from_pretrained(model, 'my-group/some-repo-id-modelscope')

Here is a runnable example:

import os
import tempfile

# Please install modelscope by `pip install modelscope`
from modelscope import Model

from swift import LoRAConfig, SwiftModel, Swift, push_to_hub

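# The directory behind TemporaryDirectory().name is typically removed as soon as
# the unreferenced object is cleaned up, so recreate it before saving into it.
tmp_dir = tempfile.TemporaryDirectory().name
os.makedirs(tmp_dir, exist_ok=True)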


model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
lora_config = LoRAConfig(target_modules=['q_proj', 'k_proj', 'v_proj'])
model: SwiftModel = Swift.prepare_model(model, lora_config)
# Do some finetuning here
model.save_pretrained(tmp_dir)

push_to_hub('my-group/swift_llama2', output_dir=tmp_dir)
model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')

This example uses transformers to create the model and SWIFT for efficient tuning.

from swift import Swift, LoRAConfig, AdapterConfig, PromptConfig
from transformers import AutoModelForImageClassification

# init vit model
model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

# init lora tuner config
lora_config = LoRAConfig(
    r=10,  # the rank of the LoRA module
    target_modules=['query', 'key', 'value'],  # the modules to be replaced, matched by the end of the module name
    merge_weights=False  # whether to merge weights
)

# init adapter tuner config
adapter_config = AdapterConfig(
    dim=768,  # the dimension of the hidden states
    hidden_pos=0,  # the position of the hidden state to be passed into the adapter
    target_modules=r'.*attention.output.dense$',  # the modules to be replaced, matched by regular expression
    adapter_length=10  # the length of the adapter
)

# init prompt tuner config
prompt_config = PromptConfig(
    dim=768,  # the dimension of the hidden states
    target_modules=r'.*layer\.\d+$',  # the modules to be replaced with regular expression
    embedding_pos=0,    # the position of the embedding tensor
    prompt_length=10,   # the length of the prompt tokens
    attach_front=False  # Whether prompt is attached in front of the embedding
)

# create model with swift. In practice, you can use any of these tuners or a combination of them.
model = Swift.prepare_model(model, {"lora_tuner": lora_config, "adapter_tuner": adapter_config, "prompt_tuner": prompt_config})

# get the trainable parameters of model
model.get_trainable_parameters()
# 'trainable params: 838,776 || all params: 87,406,432 || trainable%: 0.9596273189597764'
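
The trainable-parameter count reported above can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope check, not SWIFT code: it assumes ViT-base has 12 encoder layers with hidden size 768, that each LoRA module adds two rank-10 matrices, that each adapter is a down-projection plus an up-projection with biases, and that each matched layer gets its own 10-token prompt. The helper names are hypothetical. Under these assumptions the total comes out to the 838,776 shown in the output.

```python
NUM_LAYERS = 12  # assumed depth of ViT-base
HIDDEN = 768     # hidden size, matching dim=768 in the configs above


def lora_params(in_features, out_features, rank):
    """Two low-rank matrices: (rank x in_features) and (out_features x rank)."""
    return rank * in_features + out_features * rank


def adapter_params(dim, adapter_length):
    """Assumed bottleneck: Linear(dim, length) + Linear(length, dim), with biases."""
    down = dim * adapter_length + adapter_length
    up = adapter_length * dim + dim
    return down + up


def prompt_params(dim, prompt_length):
    """One learnable prompt embedding of shape (prompt_length, dim)."""
    return prompt_length * dim


# query, key and value are each wrapped by LoRA in every layer.
lora_total = NUM_LAYERS * 3 * lora_params(HIDDEN, HIDDEN, rank=10)
adapter_total = NUM_LAYERS * adapter_params(HIDDEN, adapter_length=10)
prompt_total = NUM_LAYERS * prompt_params(HIDDEN, prompt_length=10)

trainable = lora_total + adapter_total + prompt_total
all_params = 87_406_432  # taken from the output above
percent = 100 * trainable / all_params

print(f'lora={lora_total:,} adapter={adapter_total:,} prompt={prompt_total:,}')
print(f'trainable params: {trainable:,} || trainable%: {percent:.4f}')
```

The breakdown shows that LoRA on the three attention projections accounts for most of the trainable weights in this configuration.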

You can also use the features offered by PEFT in SWIFT:

from swift import LoraConfig, Swift
from peft import TaskType
lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
model_wrapped = Swift.prepare_model(model, lora_config)

# or call from_pretrained to load weights from the ModelScope model hub
model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')

The saving strategies of SWIFT tuners and PEFT tuners differ slightly. You can name a tuner by:

model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
model.save_pretrained('./output')

In the output dir, you will have a dir structure like this:

output
    |-- default
        |-- adapter_config.json
        |-- adapter_model.bin
    |-- adapter_config.json
    |-- adapter_model.bin

The config and weights stored at the top level of the output dir belong to extra_state_keys, while each named tuner is stored in its own sub-directory. This is different from PEFT, which stores the weights and config of the default tuner at the top level.
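
Given that layout, the tuners saved in an output dir can be discovered by looking for sub-directories that contain an adapter_config.json. The sketch below is a minimal illustration using only the standard library: `list_saved_tuners` and `build_example_layout` are hypothetical helpers, not SWIFT functions, and the placeholder files it creates are empty stand-ins rather than real configs or weights.

```python
import tempfile
from pathlib import Path


def list_saved_tuners(output_dir):
    """Return names of sub-directories that hold an adapter_config.json."""
    root = Path(output_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / 'adapter_config.json').is_file()
    )


def build_example_layout(root):
    """Create empty placeholder files mirroring the layout shown above."""
    root = Path(root)
    (root / 'default').mkdir(parents=True)
    for folder in (root, root / 'default'):
        (folder / 'adapter_config.json').write_text('{}')
        (folder / 'adapter_model.bin').write_bytes(b'')
    return root


with tempfile.TemporaryDirectory() as tmp:
    layout = build_example_layout(Path(tmp) / 'output')
    tuners = list_saved_tuners(layout)
    # The top-level files correspond to extra_state_keys, not to a tuner.
    has_top_level = (layout / 'adapter_config.json').is_file()

print(tuners, has_top_level)
```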

Learn More

License

This project is licensed under the Apache License (Version 2.0).
