BERTology Meets Biology: Interpreting Attention in Protein Language Models

Transformer architectures have proven to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. In this work, we demonstrate a set of methods for analyzing protein Transformer models through the lens of attention. We show that attention: (1) captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure, (2) targets binding sites, a key functional component of proteins, and (3) focuses on progressively more complex biophysical properties with increasing layer depth. We find this behavior to be consistent across three Transformer architectures (BERT, ALBERT, XLNet) and two distinct protein datasets. We also present a three-dimensional visualization of the interaction between attention and protein structure. Code for visualization and analysis is available at https://github.com/salesforce/provis.
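As a rough illustration of claim (1), the sketch below measures how often high-confidence attention weights land on residue pairs that are in spatial contact. It is a minimal sketch under stated assumptions, not the code from the linked repository: the function names `contact_map` and `attention_contact_agreement`, the 8 Å distance cutoff, the minimum sequence separation of 6, and the 0.3 attention threshold are all illustrative choices, and the coordinates and attention matrix are toy data.

```python
import numpy as np


def contact_map(coords, max_dist=8.0, min_separation=6):
    """Boolean matrix: True where residues i and j lie within max_dist
    (angstroms) and are at least min_separation positions apart in sequence.
    Both cutoffs are assumed defaults for illustration."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    idx = np.arange(len(coords))
    separation = np.abs(idx[:, None] - idx[None, :])
    return (dist < max_dist) & (separation >= min_separation)


def attention_contact_agreement(attention, contacts, threshold=0.3):
    """Fraction of attention weights above threshold whose residue pair
    is also a contact. Returns nan when no weight passes the threshold."""
    attention = np.asarray(attention, dtype=float)
    high = attention > threshold
    if not high.any():
        return float("nan")
    return float((high & contacts).sum() / high.sum())


# Toy hairpin: residues 0-3 run along x, residues 4-7 fold back 5 angstroms away.
coords = [(3.8 * i, 0.0, 0.0) for i in range(4)]
coords += [(3.8 * (7 - i), 5.0, 0.0) for i in range(4, 8)]
contacts = contact_map(coords)

# Hypothetical attention matrix for one head: one long-range weight that
# matches a contact (0 -> 7) and one short-range weight that does not (2 -> 3).
attention = np.zeros((8, 8))
attention[0, 7] = 0.9
attention[2, 3] = 0.5

agreement = attention_contact_agreement(attention, contacts)
print(agreement)
```

In practice the attention matrix would come from a single head of a trained protein language model and the coordinates from a solved structure; averaging the agreement over many proteins gives a per-head score that can be compared against the background contact frequency.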


Related content

Attention mechanism


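The attention mechanism was first proposed in the field of computer vision, but it became widely popular with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Bahdanau et al. then used a similar mechanism in "Neural Machine Translation by Jointly Learning to Align and Translate" [1] to perform translation and alignment jointly on a machine translation task; their work is generally regarded as the first application of attention to NLP. Attention-based extensions of RNN models were subsequently applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of progress in attention research.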
