Must-read papers on model editing with foundation models.
Model Editing is a compelling field of research that focuses on making efficient modifications to the behavior of models, particularly foundation models. The aim is to apply these changes within a specified scope of interest without degrading the model's performance on a broader range of inputs.
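To make this goal concrete, the success criteria that recur throughout the papers below are often summarized as reliability (the edit takes effect), generality (it extends to in-scope paraphrases), and locality (out-of-scope behavior is preserved). The following is a minimal, purely illustrative sketch: the "model" is a toy lookup table standing in for a language model, and `apply_edit` is a hypothetical placeholder, not the method of any specific paper listed here.

```python
# Toy illustration of the model-editing objective described above.
# The "model" is a plain lookup table standing in for a language model,
# and apply_edit is a hypothetical placeholder rather than a method
# from any of the papers in this list.

def apply_edit(model, prompt, target, paraphrases=()):
    """Return an edited copy mapping the prompt (and in-scope paraphrases) to target."""
    edited = dict(model)
    edited[prompt] = target
    for p in paraphrases:
        edited[p] = target
    return edited

base_model = {
    "The capital of France is": "Paris",
    "France's capital city is": "Paris",
    "The capital of Italy is": "Rome",
}

edited_model = apply_edit(
    base_model,
    prompt="The capital of France is",
    target="Lyon",  # a counterfactual edit, purely for illustration
    paraphrases=["France's capital city is"],
)

# Reliability: the requested edit takes effect.
assert edited_model["The capital of France is"] == "Lyon"
# Generality: in-scope paraphrases reflect the edit as well.
assert edited_model["France's capital city is"] == "Lyon"
# Locality: out-of-scope behavior is unchanged.
assert edited_model["The capital of Italy is"] == base_model["The capital of Italy is"]
```

Concrete methods differ in how they realize this interface, from memory-based wrappers to direct weight updates, which is the main axis along which the papers below vary.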
Model Editing is closely connected with the following topics:
- Updating and fixing bugs for large language models
- Language models as knowledge base, locating knowledge in large language models
- Lifelong learning, unlearning, etc.
- Security and privacy for large language models
This is a collection of research and review papers on Model Editing. Suggestions and pull requests are welcome to help share the latest research progress.
- Editing Large Language Models: Problems, Methods, and Opportunities. [paper]
- Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. Memory-Based Model Editing at Scale. (ICML 2022) [paper] [code] [demo]
- Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. Fixing Model Bugs with Natural Language Patches. (EMNLP 2022) [paper] [code]
- Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022) [paper] [code] [page] [video]
- Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. Large Language Models with Controllable Working Memory. [paper]
- Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022) [paper] [code]
- Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. Transformer-Patcher: One Mistake Worth One Neuron. (ICLR 2023) [paper] [code]
- Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors. [paper] [code]
- Evan Hernandez, Belinda Z. Li, Jacob Andreas. Inspecting and Editing Knowledge Representations in Language Models. [paper] [code]
- Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. Neural Knowledge Bank for Pretrained Transformers. [paper]
- Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. Can We Edit Factual Knowledge by In-Context Learning? [paper]
- Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge. [paper]
- Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions. [paper]
- Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings) [paper] [code]
- Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. Modifying Memories in Transformer Models. [paper]
- Nicola De Cao, Wilker Aziz, Ivan Titov. Editing Factual Knowledge in Language Models. (EMNLP 2021) [paper] [code]
- Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. Fast Model Editing at Scale. (ICLR 2022) [paper] [code] [page]
- Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. Editable Neural Networks. (ICLR 2020) [paper] [code]
- Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. Editing a classifier by rewriting its prediction rules. (NeurIPS 2021) [paper] [code]
- Yang Xu, Yutai Hou, Wanxiang Che. Language Anisotropic Cross-Lingual Model Editing. [paper]
- Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. Repairing Neural Networks by Leaving the Right Past Behind. [paper]
- Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. Locating and Editing Factual Associations in GPT. (NeurIPS 2022) [paper] [code] [page] [video]
- Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. Mass-Editing Memory in a Transformer. [paper] [code] [page] [demo]
- Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. Editing Commonsense Knowledge in GPT. [paper]
- Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs. [paper] [code]
- Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. [paper] [code]
- Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. Knowledge Neurons in Pretrained Transformers. (ACL 2022) [paper] [code] [code by EleutherAI]
- Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022) [paper] [code]
- Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022) [paper] [code] [video]
- Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. Towards Tracing Factual Knowledge in Language Models Back to the Training Data. (EMNLP 2022) [paper]
- Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. Prompting GPT-3 To Be Reliable. [paper]
- Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. Patching open-vocabulary models by interpolating weights. (NeurIPS 2022) [paper] [code]
- Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. Decouple knowledge from parameters for plug-and-play language modeling. (ACL 2023 Findings) [paper] [code]
We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.