Self-supervised learning holds promise to revolutionize molecule property prediction, a central task in drug discovery and many other industries, by enabling data-efficient learning from scarce experimental data. Despite significant progress, non-pretrained methods can still be competitive in certain settings. We reason that architecture might be a key bottleneck. In particular, enriching the backbone architecture with domain-specific inductive biases has been key to the success of self-supervised learning in other domains. In this spirit, we methodically explore the design space of the self-attention mechanism tailored to molecular data. We identify a novel variant of self-attention adapted to processing molecules, inspired by the relative self-attention layer, which fuses embedded graph and distance relationships between atoms. Our main contribution is the Relative Molecule Attention Transformer (R-MAT): a novel Transformer-based model, built on the developed self-attention layer, that achieves state-of-the-art or very competitive results across a wide range of molecule property prediction tasks.
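To make the idea of a relative self-attention layer over atom pairs more concrete, the following is a minimal NumPy sketch, not the R-MAT implementation. It assumes a generic relative-attention form in which each query attends both to the key and to a per-pair embedding. The names `pair_embedding`, `hop_table`, and `w_dist`, the radial-basis expansion of distances, and the additive fusion of the graph and distance terms are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pair_embedding(hops, dist, hop_table, w_dist, max_dist=10.0):
    """Fuse a graph relation and a spatial relation into one vector per atom pair.

    hops:      (n, n) integer bond-path lengths between atoms
    dist:      (n, n) inter-atomic distances
    hop_table: (h, d) embedding table for path lengths (hypothetical parameter)
    w_dist:    (r, d) projection of a radial-basis expansion (hypothetical parameter)
    """
    # Path lengths beyond the table size share the last embedding row.
    hops = np.clip(hops, 0, hop_table.shape[0] - 1)
    graph_part = hop_table[hops]                        # (n, n, d)
    # Expand each distance on r Gaussian bumps (an assumed featurization).
    centers = np.linspace(0.0, max_dist, w_dist.shape[0])
    rbf = np.exp(-(dist[..., None] - centers) ** 2)     # (n, n, r)
    dist_part = rbf @ w_dist                            # (n, n, d)
    # Additive fusion is an assumption made for this sketch.
    return graph_part + dist_part


def relative_attention(x, b, w_q, w_k, w_v):
    """Single-head attention whose logits also depend on pair embeddings b."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v                 # each (n, d)
    d = q.shape[-1]
    content = q @ k.T                                   # q_i . k_j
    relative = np.einsum("id,ijd->ij", q, b)            # q_i . b_ij
    weights = softmax((content + relative) / np.sqrt(d), axis=-1)
    return weights @ v, weights
```

With the pair embeddings set to zero, the layer reduces to ordinary scaled dot-product attention, so in this sketch the graph and distance information acts purely as a pairwise adjustment of the attention logits.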