
mia-llms's Introduction

Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration

This is the official implementation of the paper "Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration". The proposed Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA) is implemented as follows.

The overall architecture of SPV-MIA

Requirements

  • torch>=1.11.0
  • accelerate==0.20.3
  • transformers==4.34.0.dev0
  • trl==0.7.1
  • datasets==2.13.1
  • numpy>=1.23.4
  • scikit-learn>=1.1.3
  • pyyaml>=6.0
  • tqdm>=4.64.1

Dependencies can be installed with the following command:

pip install -r requirements.txt

Target Model Fine-tuning

All large language models (LLMs) are built on top of transformers, the go-to library for state-of-the-art transformer models, so you can fine-tune any well-known LLM, including LLaMA, the GPT series, Falcon, etc. We recommend fine-tuning LLMs on multiple GPUs with accelerate, a library that lets the same PyTorch code run across any distributed configuration:

accelerate launch ./ft_llms/llms_finetune.py \
--output_dir ./ft_llms/*pretrained_model_name*/*dataset_name*/target/ \
--block_size 128 --eval_steps 100 --save_epochs 100 --log_steps 100 \
-d *dataset_name* -m *pretrained_model_name* --packing --use_dataset_cache \
-e 10 -b 4 -lr 1e-4 --gradient_accumulation_steps 1 \
--train_sta_idx=0 --train_end_idx=10000 --eval_sta_idx=0 --eval_end_idx=1000

Please replace *pretrained_model_name* and *dataset_name* with the names of the pretrained LLM and the training dataset, e.g. decapoda-research/llama-7b-hf and ag_news.
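For orientation, the sketch below shows, under assumptions rather than the actual internals of ./ft_llms/llms_finetune.py, how such names are typically resolved with the transformers and datasets libraries listed in the requirements; the model name, dataset name, and slice indices simply mirror the example arguments above.

# Hypothetical sketch (not the actual llms_finetune.py): resolving the
# -m/--model and -d/--dataset names with transformers and datasets.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

pretrained_model_name = "decapoda-research/llama-7b-hf"  # or a fork, see footnote 1
dataset_name = "ag_news"

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name)
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name)

# --train_sta_idx/--train_end_idx and --eval_sta_idx/--eval_end_idx select
# contiguous slices of the dataset (the split choice here is an assumption).
train_dataset = load_dataset(dataset_name, split="train").select(range(0, 10000))
eval_dataset = load_dataset(dataset_name, split="test").select(range(0, 1000))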

Recommended pretrained models

Recommended datasets

Self-prompt Reference Model Fine-tuning

Before fine-tuning the self-prompt reference model, the reference dataset needs to be sampled via our proposed self-prompt approach over the fine-tuned target LLM:

accelerate launch refer_data_generate.py \
-tm *fine_tuned_model* \
-m *pretrained_model_name* -d *dataset_name*

Replace *fine_tuned_model* with the directory of the fine-tuned target model (i.e., the output directory of the Target Model Fine-tuning phase).
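For intuition, here is a minimal, hypothetical sketch of the self-prompt sampling idea: the fine-tuned target model is prompted with short snippets and its generations are collected as reference data. The actual logic lives in refer_data_generate.py; the path and prompts below are placeholders.

# Conceptual sketch only (refer_data_generate.py is the authoritative script):
# prompt the fine-tuned target model and collect its generations as the
# reference dataset used to fine-tune the self-prompt reference model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

fine_tuned_model = "./ft_llms/decapoda-research/llama-7b-hf/ag_news/target/"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(fine_tuned_model)
model = AutoModelForCausalLM.from_pretrained(fine_tuned_model, torch_dtype=torch.float16, device_map="auto")

prompts = ["Breaking news:", "The market"]  # hypothetical short prompts
reference_texts = []
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_k=50)
    reference_texts.append(tokenizer.decode(outputs[0], skip_special_tokens=True))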

Then fine-tune the self-prompt reference model in the same manner as the target model, but with fewer training epochs and a smaller learning rate:

accelerate launch ./ft_llms/llms_finetune.py --refer \
--output_dir ./ft_llms/*pretrained_model_name*/*dataset_name*/refer/ \
--block_size 128 --eval_steps 100 --save_epochs 100 --log_steps 100 \
-d *dataset_name* -m *pretrained_model_name* --packing --use_dataset_cache \
-e 2 -b 4 -lr 5e-5 --gradient_accumulation_steps 1 \
--train_sta_idx=0 --train_end_idx=10000 --eval_sta_idx=0 --eval_end_idx=1000

Run SPV-MIA

After completing the preliminary steps above, run the following command to deploy SPV-MIA against the target model.

python attack.py
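As a rough illustration of the calibration idea (not the exact computation in attack.py), the sketch below scores a candidate record by comparing the target model against the self-prompt reference model. SPV-MIA itself uses the probabilistic variation of each model, estimated from perturbed neighbours of the text, rather than the raw log-likelihood shown here, and the paths are placeholders.

# Illustrative sketch of self-prompt calibration, not the actual attack.py:
# score = target-model signal minus reference-model signal on the same record.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_log_likelihood(model, tokenizer, text):
    # Average per-token log-likelihood of `text` under `model`.
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return -loss.item()

target_dir = "./ft_llms/decapoda-research/llama-7b-hf/ag_news/target/"  # placeholder
refer_dir = "./ft_llms/decapoda-research/llama-7b-hf/ag_news/refer/"    # placeholder
tokenizer = AutoTokenizer.from_pretrained(target_dir)
target = AutoModelForCausalLM.from_pretrained(target_dir, device_map="auto")
refer = AutoModelForCausalLM.from_pretrained(refer_dir, device_map="auto")

text = "Candidate record whose membership we want to test."
score = avg_log_likelihood(target, tokenizer, text) - avg_log_likelihood(refer, tokenizer, text)
is_member = score > 0.0  # in practice the threshold is tuned, e.g. for a target FPR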

Footnotes

  1. The third-party repo decapoda-research/llama-7b-hf appears to have been deleted for unknown reasons; use the forked repos luodian/llama-7b-hf or baffo32/decapoda-research-llama-7B-hf as alternatives.

  2. Please add an additional argument --dataset_config_name wikitext-2-raw-v1 to specify this dataset.

mia-llms's People

Contributors

  • wjfu99

mia-llms's Issues

Poor reproduction results

Hello, I followed your code to reproduce the results, and the only part I modified is the data loading, because my data was downloaded from Hugging Face to local storage. The model I fine-tuned is GPT-2 and the dataset is AG News, with all hyperparameters set as you provided, but the best AUC I could reproduce is only 0.51. I would like to know how the 0.949 reported in your paper was achieved. Thank you.
The modified data-loading code is as follows:
import os
from datasets import Dataset

# Original loading via datasets.load_dataset (commented out):
# train_dataset = datasets.load_dataset(
#     args.dataset_name,
#     args.dataset_config_name,
#     split=f"train[:{int((1-args.validation_split_percentage)*100)}%]"
# )
# valid_dataset = datasets.load_dataset(
#     args.dataset_name,
#     args.dataset_config_name,
#     split=f"train[{int((1-args.validation_split_percentage)*100)}%:]",
# )

# Assume the path to the train-00000-of-00001.parquet file is known
train_file_path = f"{args.dataset_path}/train-00000-of-00001.parquet"

# Make sure the file exists
if not os.path.exists(train_file_path):
    raise FileNotFoundError("Parquet file not found.")

# Load the whole dataset from the Parquet file
raw_dataset = Dataset.from_parquet(train_file_path)

# Determine the split point
num_validation_samples = int(len(raw_dataset) * args.validation_split_percentage)
num_train_samples = len(raw_dataset) - num_validation_samples

# Split the dataset into training and validation sets
train_dataset = raw_dataset.select(range(num_train_samples))
valid_dataset = raw_dataset.select(range(num_train_samples, num_train_samples + num_validation_samples))
