Knowledge Neurons in Pretrained Transformers.

2021 
Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus. In this paper, we explore how implicit knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. Given a relational fact, we propose a knowledge attribution method to identify the neurons that express the fact. We show that the activation of such knowledge neurons is highly correlated with the expression of their corresponding facts. In addition, even without fine-tuning, we can leverage knowledge neurons to explicitly edit (such as updating and erasing) specific factual knowledge in pretrained Transformers.
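To make the attribution idea concrete, below is a minimal sketch of integrated-gradient-style neuron attribution over the intermediate activations of a feed-forward block, followed by a crude "erase" step that suppresses the top-scoring neuron. The ToyFFN module, the answer token index, and the number of approximation steps are illustrative assumptions for this sketch, not the authors' released code.

```python
# Minimal sketch: attribute the probability of an answer token to individual
# FFN intermediate neurons via an integrated-gradients-style path integral,
# then suppress the top neuron as a stand-in for knowledge erasing.
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyFFN(nn.Module):
    """A single Transformer-style feed-forward block over one hidden vector."""
    def __init__(self, d_model=16, d_ff=64, vocab=100):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)    # produces the "intermediate neurons"
        self.down = nn.Linear(d_ff, d_model)
        self.out = nn.Linear(d_model, vocab)  # stand-in for the LM head

    def forward(self, h, scale=None):
        a = torch.relu(self.up(h))            # intermediate neuron activations
        if scale is not None:
            a = a * scale                     # rescale each neuron (for the path integral)
        logits = self.out(self.down(a))
        return torch.softmax(logits, dim=-1), a

model = ToyFFN()
h = torch.randn(16)   # hidden state at the answer position (illustrative)
answer_id = 7         # index of the "correct" token (illustrative)

# Activations at full scale, used as the endpoint of the integration path.
with torch.no_grad():
    _, a_full = model(h)

# Riemann-sum approximation of the path integral for every neuron i:
# attr_i ≈ a_i * (1/m) * sum_k dP(answer)/da_i evaluated at (k/m) * a_full.
# Differentiating w.r.t. the scale vector already folds in the a_i factor.
m = 20
grads = torch.zeros_like(a_full)
for k in range(1, m + 1):
    scale = torch.full_like(a_full, k / m).requires_grad_(True)
    probs, _ = model(h, scale=scale)
    probs[answer_id].backward()
    grads += scale.grad
attributions = grads / m

top = torch.topk(attributions, 5)
print("top candidate knowledge neurons:", top.indices.tolist())

# "Erasing": zero out the top neuron and observe the drop in the answer
# probability (a crude stand-in for the paper's editing experiments).
mask = torch.ones_like(a_full)
mask[top.indices[0]] = 0.0
with torch.no_grad():
    p_before, _ = model(h)
    p_after, _ = model(h, scale=mask)
print("P(answer) before/after suppression:",
      float(p_before[answer_id]), float(p_after[answer_id]))
```

In the paper's setting the same idea is applied to the intermediate neurons of a pretrained Transformer's feed-forward layers, with attributions aggregated over multiple prompts expressing the same relational fact.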