joeljang / knowledge-unlearning
[ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models
In the validation_forget() function, accuracy is called without the task argument, which throws an AssertionError:
acc = accuracy(pred, label, ignore_index=-100)
I have replaced it with
acc = accuracy(pred, label, task="multiclass", num_classes=5063, ignore_index=-100)
I found 5063 to be the number of unique labels. Is this the right fix?
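To make the question concrete, here is a minimal pure-Python sketch of what a micro-averaged token accuracy with ignore_index=-100 computes. It is not the torchmetrics implementation, and token_accuracy is a hypothetical helper written only for illustration. In this sketch the score does not depend on how many distinct labels occur, so if pred and label hold token ids (an assumption about this code path), the tokenizer's vocabulary size may be a more natural value for num_classes than the count of unique labels. That is worth checking against the torchmetrics documentation for the installed version.

```python
def token_accuracy(pred, label, ignore_index=-100):
    """Fraction of positions where pred == label, skipping ignored labels.

    Hypothetical helper: a plain-Python stand-in for a micro-averaged
    accuracy, not the torchmetrics code.
    """
    kept = [(p, t) for p, t in zip(pred, label) if t != ignore_index]
    if not kept:
        return 0.0  # nothing to score; a convention chosen for this sketch
    correct = sum(1 for p, t in kept if p == t)
    return correct / len(kept)


# Toy token ids; -100 marks positions excluded from the score.
pred = [11, 42, 7, 50000, 3]
label = [11, 42, 9, 50000, -100]
acc = token_accuracy(pred, label)
print(acc)
```

Comparing this value with the one returned by the patched accuracy call on the same small batch is one way to sanity-check the fix.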
Hi, thanks for the interesting work. Could you provide the entire extraction dataset with all 16 domains? I see that five samples are available for eight of the domains, but could you provide the rest as well? Many thanks in advance.
I have set up config.json as follows. Is it correct?
{
  "mode": "general_lm_eval",
  "wandb_project": "Knowledge Unlearning",
  "wandb_run_name": "example",
  "num_train_epochs": 20,
  "check_val_every_n_epoch": 1,
  "check_validation_only": true,
  "do_init_eval": true,
  "train_set": "data/main/lm_extraction_32_0.csv",
  "valid_sets": [
    "validation_data/lambada.csv",
    "piqa",
    "hellaswag",
    "ai2_arc",
    "ai2_arc",
    "super_glue",
    "winogrande",
    "math_qa",
    "validation_data/pubmed_qa.csv"
  ],
  "valid_subset_path": [
    "",
    "",
    "",
    "ARC-Easy",
    "ARC-Challenge",
    "copa",
    "winogrande_s",
    "",
    ""
  ],
  "valid_type_path": [
    "test",
    "validation",
    "validation",
    "validation",
    "validation",
    "validation",
    "validation",
    "validation",
    ""
  ],
  "cache_dir": "/home/data0/cgt/knowledge-unlearning/val",
  "train_batch_size": 8,
  "eval_batch_size": 8,
  "gradient_accumulation_steps": 4,
  "ngpu": 1,
  "learning_rate": 5e-5,
  "model_name_or_path": "/home/chen/.cache/huggingface/hub/models--facebook--opt-1.3b/snapshots/3f5c25d0bc631cb57ac65913f76e22c2dfb61d62",
  "el_threshold": 0.0499,
  "ma_threshold": 0.2994,
  "input_length": 512,
  "output_length": 512,
  "target_length": 200,
  "num_workers": 64,
  "strategy": "deepspeed_stage_2_offload",
  "fp16": true,
  "wandb_log": false
}
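One quick sanity check for a config like this is whether the three validation lists line up. The sketch below assumes that entry i of valid_sets is paired with entry i of valid_subset_path and valid_type_path; that pairing is an assumption based on how the lists are laid out, not something confirmed here. check_parallel_lists is a hypothetical helper and the example config is shortened for illustration.

```python
import json


def check_parallel_lists(
    config,
    keys=("valid_sets", "valid_subset_path", "valid_type_path"),
):
    """Return (ok, lengths): ok is True when all listed keys have equal length."""
    lengths = {k: len(config.get(k, [])) for k in keys}
    ok = len(set(lengths.values())) == 1
    return ok, lengths


# Shortened example config (illustrative values only).
example = json.loads("""
{
  "valid_sets": ["validation_data/lambada.csv", "piqa", "ai2_arc"],
  "valid_subset_path": ["", "", "ARC-Easy"],
  "valid_type_path": ["test", "validation", "validation"]
}
""")

ok, lengths = check_parallel_lists(example)
print(ok, lengths)
```

In the config posted above, each of the three lists has nine entries, so a check like this would pass; whether each individual dataset name, subset, and split is valid still has to be verified separately.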
When I try to run the code, a large number of warnings like the ones below appear. I have no idea where they come from. Could you please suggest a way to remove them?
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
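Both messages suggest that generate is being called without an attention mask and without an explicit pad token id. A minimal sketch of one possible way to address this is to pass both explicitly. build_generate_kwargs is a hypothetical helper, not part of this repository or of transformers, and the plain lists below stand in for the tensors a tokenizer would return so that the snippet runs on its own.

```python
def build_generate_kwargs(input_ids, attention_mask, eos_token_id, max_new_tokens=50):
    """Collect keyword arguments that make the mask and pad id explicit.

    Hypothetical helper; the keys are parameter names accepted by
    `model.generate` in Hugging Face transformers.
    """
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        # Reuse EOS as the pad id, which is what the warning says happens implicitly.
        "pad_token_id": eos_token_id,
        "max_new_tokens": max_new_tokens,
    }


# Stand-in values; in real code these would come from the tokenizer, e.g.
#   enc = tokenizer(prompt, return_tensors="pt")
#   kwargs = build_generate_kwargs(enc["input_ids"], enc["attention_mask"],
#                                  tokenizer.eos_token_id)
#   output = model.generate(**kwargs)
kwargs = build_generate_kwargs([[464, 3290]], [[1, 1]], eos_token_id=50256)
print(sorted(kwargs))
```

Where exactly the generate call lives in this codebase, and whether this removes every warning, would need to be confirmed by the maintainers.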