DataPoisonLLM

This repository is a replication of the paper On the Exploitability of Instruction Tuning, which studies data poisoning of LLMs.

Currently only the content injection attack is included, and the experiment is run on OPT-1.3B with a 5% poison ratio. In this reimplementation, OPT-1.3B is fully fine-tuned on 4 V100-32G GPUs for 15 minutes and evaluated on one V100-32G GPU. Part of the code is gratefully borrowed from Alpaca.

Quickstart

Generate Data

Run the following command to generate the poisoned data.

python craft_poison_data.py --poison_ratio 0.05
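
As a rough illustration of what a 5% poison ratio means, the sketch below replaces a fixed fraction of clean instruction-tuning examples with poisoned ones. It is not the code in craft_poison_data.py. The function name mix_poison, the instruction/output record layout, and the replace-in-place strategy are all assumptions made for this example.

```python
import random


def mix_poison(clean, poisoned, poison_ratio, seed=0):
    """Return a copy of `clean` with a fraction of examples replaced.

    Hypothetical helper: the replacement strategy here is an assumption,
    not necessarily what craft_poison_data.py does.
    """
    if not 0.0 <= poison_ratio <= 1.0:
        raise ValueError("poison_ratio must be in [0, 1]")
    n_poison = round(len(clean) * poison_ratio)
    if n_poison > len(poisoned):
        raise ValueError("not enough poisoned examples for this ratio")
    rng = random.Random(seed)
    # Choose which clean positions are overwritten with poisoned examples.
    slots = rng.sample(range(len(clean)), n_poison)
    mixed = list(clean)
    for slot, example in zip(slots, poisoned):
        mixed[slot] = example
    return mixed


# Toy data: 100 clean examples and a pool of 10 poisoned ones.
clean = [{"instruction": f"q{i}", "output": f"a{i}"} for i in range(100)]
poisoned = [{"instruction": f"p{i}", "output": f"injected{i}"} for i in range(10)]
mixed = mix_poison(clean, poisoned, poison_ratio=0.05)
n_injected = sum(ex["output"].startswith("injected") for ex in mixed)
print(n_injected, len(mixed))
```

With 100 clean examples and a ratio of 0.05, five examples are replaced and the dataset size stays the same.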

Train

The training arguments may differ from those in the original paper; they were adjusted for this particular training setup. Note: perplexity is also evaluated in this script.

accelerate launch --num_processes=4 finetune.py \
--model_path ./ckpts/facebook/opt-1.3b  \
--cache_dir ./cache \
--output_dir ./output/opt-1.3b \
--train_data_path ./data/content_train_data_pool.json \
--num_train_epochs 3 \
--train_batch_size 16 \
--eval_batch_size 8 \
--learning_rate 2e-5 \
--seed 0 
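
For readers unfamiliar with the perplexity mentioned in the note above, here is a minimal sketch of the standard definition: the exponential of the mean negative log-likelihood per token. How finetune.py actually computes it is not shown in this README, so the function name perplexity and its list-of-log-probabilities input are assumptions for illustration only.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Hypothetical helper: assumes the caller has already collected the
    log-probability the model assigned to each reference token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.5 to every token has perplexity 2.
uniform_half = [math.log(0.5)] * 4
print(perplexity(uniform_half))
```

Lower values mean the model finds the evaluation text less surprising; a value of 1 would mean every token was predicted with certainty.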

Future Work

  • Try different distributed training frameworks
  • Complete the over-refusal attack
