Must-read papers on model editing with foundation models.
Model Editing is a compelling field of research that focuses on making efficient modifications to the behavior of models, particularly foundation models. The aim is to apply these changes within a specified scope of interest without degrading the model's performance on a broader range of inputs.
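To make this goal concrete, the success criteria that recur throughout the papers below are often summarized as reliability (the edit takes effect), generality (it extends to in-scope paraphrases), and locality (out-of-scope behavior is preserved). The following is a minimal, purely illustrative sketch: the "model" is a toy lookup table standing in for a language model, and `apply_edit` is a hypothetical placeholder, not the method of any specific paper listed here.

```python
# Toy illustration of the model-editing objective described above.
# The "model" is a plain lookup table standing in for a language model,
# and apply_edit is a hypothetical placeholder rather than a method
# from any of the papers in this list.

def apply_edit(model, prompt, target, paraphrases=()):
    """Return an edited copy mapping the prompt (and in-scope paraphrases) to target."""
    edited = dict(model)
    edited[prompt] = target
    for p in paraphrases:
        edited[p] = target
    return edited

base_model = {
    "The capital of France is": "Paris",
    "France's capital city is": "Paris",
    "The capital of Italy is": "Rome",
}

edited_model = apply_edit(
    base_model,
    prompt="The capital of France is",
    target="Lyon",  # a counterfactual edit, purely for illustration
    paraphrases=["France's capital city is"],
)

# Reliability: the requested edit takes effect.
assert edited_model["The capital of France is"] == "Lyon"
# Generality: in-scope paraphrases reflect the edit as well.
assert edited_model["France's capital city is"] == "Lyon"
# Locality: out-of-scope behavior is unchanged.
assert edited_model["The capital of Italy is"] == base_model["The capital of Italy is"]
```

Concrete methods differ in how they realize this interface, from memory-based wrappers to direct weight updates, which is the main axis along which the papers below vary.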
Model Editing is closely connected with the following topics:
- Updating and fixing bugs for large language models
- Language models as knowledge base, locating knowledge in large language models
- Lifelong learning, unlearning, etc.
- Security and privacy for large language models
This is a collection of research and review papers on Model Editing. Suggestions and pull requests are welcome to help share the latest research progress.
- Editing Large Language Models: Problems, Methods, and Opportunities. [paper]
- Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. Memory-Based Model Editing at Scale. (ICML 2022) [paper] [code] [demo]
- Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. Fixing Model Bugs with Natural Language Patches. (EMNLP 2022) [paper] [code]
- Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022) [paper] [code] [page] [video]
- Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. Large Language Models with Controllable Working Memory. [paper]
- Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022) [paper] [code]
- Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. Transformer-Patcher: One Mistake Worth One Neuron. (ICLR 2023) [paper] [code]
- Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors. [paper] [code]
- Evan Hernandez, Belinda Z. Li, Jacob Andreas. Inspecting and Editing Knowledge Representations in Language Models. [paper] [code]
- Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. Neural Knowledge Bank for Pretrained Transformers. [paper]
- Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. Can We Edit Factual Knowledge by In-Context Learning? [paper]
- Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge. [paper]
- Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions. [paper]
- Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings) [paper] [code]
- Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. Modifying Memories in Transformer Models. [paper]
- Nicola De Cao, Wilker Aziz, Ivan Titov. Editing Factual Knowledge in Language Models. (EMNLP 2021) [paper] [code]
- Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. Fast Model Editing at Scale. (ICLR 2022) [paper] [code] [page]
- Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. Editable Neural Networks. (ICLR 2020) [paper] [code]
- Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. Editing a classifier by rewriting its prediction rules. (NeurIPS 2021) [paper] [code]
- Yang Xu, Yutai Hou, Wanxiang Che. Language Anisotropic Cross-Lingual Model Editing. [paper]
- Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. Repairing Neural Networks by Leaving the Right Past Behind. [paper]
- Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. Locating and Editing Factual Associations in GPT. (NeurIPS 2022) [paper] [code] [page] [video]
- Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. Mass-Editing Memory in a Transformer. [paper] [code] [page] [demo]
- Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. Editing Commonsense Knowledge in GPT. [paper]
- Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs. [paper] [code]
- Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. [paper] [code]
- Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. Knowledge Neurons in Pretrained Transformers. (ACL 2022) [paper] [code] [code by EleutherAI]
- Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022) [paper] [code]
- Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022) [paper] [code] [video]
- Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. Towards Tracing Factual Knowledge in Language Models Back to the Training Data. (EMNLP 2022) [paper]
- Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. Prompting GPT-3 To Be Reliable. [paper]
- Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. Patching open-vocabulary models by interpolating weights. (NeurIPS 2022) [paper] [code]
- Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. Decouple knowledge from parameters for plug-and-play language modeling. (ACL 2023 Findings) [paper] [code]
We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.