Linguistic Correlation Analysis: Discovering Salient Neurons in deepNLP models

While much work has been done on understanding the representations learned within deep NLP models and the knowledge they capture, little attention has been paid to individual neurons. We present a technique called Linguistic Correlation Analysis to extract salient neurons in the model with respect to any extrinsic property, with the goal of understanding how such knowledge is preserved within neurons. We carry out a fine-grained analysis to answer the following questions: (i) can we identify subsets of neurons in the network that capture specific linguistic properties? (ii) how localized or distributed are these neurons across the network? (iii) how redundantly is the information preserved? (iv) how does fine-tuning pre-trained models towards downstream NLP tasks impact the learned linguistic knowledge? (v) how do architectures vary in learning different linguistic properties? Our data-driven, quantitative analysis yields the following findings: (i) small subsets of neurons can predict different linguistic tasks; (ii) neurons capturing basic lexical information (such as suffixation) are localized in the lowermost layers; (iii) neurons learning complex concepts (such as syntactic role) lie predominantly in the middle and higher layers; (iv) salient linguistic neurons are relocated from higher to lower layers during transfer learning, as the network preserves the higher layers for task-specific information; (v) pre-trained models differ in how linguistic information is preserved within them; and (vi) concepts exhibit similar neuron distributions across different languages in multilingual transformer models. Our code is publicly available as part of the NeuroX toolkit.
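
The abstract does not spell out how salient neurons are extracted. One common way to realize this kind of analysis is to train a regularized linear probe that predicts the property from neuron activations and then rank neurons by the magnitude of the learned weights. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation and not the NeuroX API: the function names (`train_linear_probe`, `rank_neurons`, `make_toy_data`), the elastic-net penalty strengths, the learning rate, and the synthetic activations are all hypothetical and chosen only for illustration.

```python
import numpy as np


def train_linear_probe(X, y, n_classes, l1=1e-3, l2=1e-3, lr=0.1, epochs=300):
    """Softmax regression with an elastic-net penalty (assumed setup),
    trained by full-batch (sub)gradient descent.

    X: (n_samples, n_neurons) activations, y: integer property labels.
    """
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        # Cross-entropy gradient plus L1 subgradient and L2 gradient.
        grad_W = X.T @ (P - Y) / n + l1 * np.sign(W) + 2 * l2 * W
        grad_b = (P - Y).mean(axis=0)
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b


def rank_neurons(W):
    """Order neuron indices by summed absolute probe weight, largest first."""
    saliency = np.abs(W).sum(axis=1)
    return np.argsort(-saliency)


def make_toy_data(n=600, d=20, informative=(3, 7), seed=0):
    """Synthetic 'activations': only the neurons in `informative` carry
    the binary property; all others are pure noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    for j in informative:
        X[:, j] += 2.0 * (2 * y - 1)  # shift by +2 or -2 depending on label
    return X, y


X, y = make_toy_data()
W, b = train_linear_probe(X, y, n_classes=2)
ranking = rank_neurons(W)
top2 = set(ranking[:2].tolist())
print("Two highest-ranked neurons:", sorted(top2))
```

On this toy data the two planted neurons should come out at the top of the ranking. With real models, `X` would hold activations extracted from the network for each token and `y` the linguistic labels (for example part-of-speech tags); how well weight magnitude reflects saliency depends on the regularization and on the scale of the activations, so normalizing activations before probing is a sensible precaution.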
