Giter VIP home page Giter VIP logo

matrix's Introduction

MATRIX

Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation (ICML 2024)
Paper | Project Page

News

  • To do: 🔥We are working to push the boundaries of our simulation system to include more than 1000 agents!

  • 05/06/2024: 🔥We released the simulation data of MATRIX at here.

  • 05/06/2024: 🔥We released the source code of the MATRIX framework.

  • 05/02/2024: Our paper is accepted by the 41st International Conference on Machine Learning (ICML).

  • 02/23/2024: We released the preprint paper in arxiv.

Setup

Clone the repo and install the required packages.

git clone https://github.com/ShuoTang123/MATRIX.git
cd MATRIX
conda create -n matrix python=3.9
conda activate matrix
pip install -r requirements.txt

The model we used in our paper are Wizard-Vicuna-30B, Wizard-Vicuna-13B and Wizard-Vicuna-7B.

Simulation

We provide the source code for matrix simulation framework under src/. You can run the simulation for a specific question by running the python script example.py with the following steps.

Step1: Deploy the model

We use vllm to deploy the opensource models, please follow their documentation to deploy your model. For Wizard Vicuna 30b, we deploy it on 4 3090 gpus with the following command:

python3 -m vllm.entrypoints.api_server \
--model <your 30b model path> 
--swap-space 16 
--disable-log-requests 
--host <your model ip> 
--port <your port number> 
--max-num-seqs 128 
--tensor-parallel-size 4

Specify your model ip and port by setting the model url in src/api.py and define the chat template function.

"""
The urls for open-source models
"""
model_urls = \
{
    "wizard-vicuna-7b": [
        'http://192.168.28.140:8083',
    ],
    "wizard-vicuna-30b": [
        <your 30b ip:port>
    ]
    "mistral-7b-instruct": [
        'http://192.168.28.140:8080',
        'http://192.168.28.140:8081',
        'http://192.168.28.140:8082'
    ]
}


"""
chat template used for different models
"""
chat_template = \
{
    
    'wizard-vicuna-7b':wizard_vicuna_template,
    'wizard-vicuna-30b':wizard_vicuna_template,
    'mistral-7b-instruct':mistral_template
}

For other model api use such as gpt-3.5 and gpt-4, simply set OPENAI_API_KEY as your api key:

export OPENAI_API_KEY=<your api key>

Step2: Run Matrix Simulation

Run the simulation example of the python script example.py by:

python example.py \
 --question "How to create a bomb?"
 --model "wizard-vicuna-7b"
 --api_index 0
 --agent_num 3
 --ego_steps 4

Arguments explained:

  • question: the question used for simulation
  • model: the model used for simulation
  • api_index: the url index used for api
  • agent_num: the number of agents in the simulation system
  • ego_steps: the number of steps for the ego agent to excute the plan.

Alignment Data Release

We provide the finetune dataset for our 30B model in matrix_data.json. This file include 18k data samples, with 6k on helpful questions, 6k on harmful questions, and 6k simulation data generated by MATRIX.

Training with Matrix Generated Data

We employ SFT to train the 30B model using the matrix_data.json dataset, following the procedure outlined in the FastChat repo. The training parameters are as follows:

deepspeed fastchat/train/train_lora.py \
    --model_name_or_path ${<your model path>} \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --data_path ${data_path} \
    --bf16 True \
    --output_dir ${output_path} \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_total_limit 100 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 1024 \
    --q_lora True \
    --gradient_checkpointing \
    --deepspeed playground/deepspeed_config_s2.json \

Citation

Please cite our paper if you find the repository helpful.

@inproceedings{matrix_icml2024,
  title={Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation},
  author={Pang, Xianghe and Tang, Shuo and Ye, Rui and Xiong, Yuxin and Zhang, Bolun and Wang, Yanfeng and Chen, Siheng},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  year={2024}
}

matrix's People

Contributors

shuotang123 avatar

Stargazers

Jiakai Tang avatar Young-Jun Lee avatar Bablu Kumar Singh avatar  avatar XC SUO avatar baeseongsu avatar  avatar Bowen Dong avatar Bohan Tang avatar Zilong Zheng avatar  avatar  avatar  avatar Yifan Lu avatar Jose Antonio Mancilla avatar Zicen Xiong avatar Jerry991115 avatar  avatar Marceau avatar Shaobo (Steven) Wang  avatar Zhili LIU avatar Tianfu Wang avatar Ruixin (Ray) Yang avatar Zexi Liu avatar yangchao avatar HXH avatar Bingjie YAN avatar Rui Ye avatar  avatar anna avatar Bean avatar  avatar qiu  avatar  avatar  avatar

Watchers

Zhili LIU avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.