Self-supervised learning has recently attracted considerable attention in the NLP community for its ability to learn discriminative features using a contrastive objective. This paper investigates whether contrastive learning can be extended to Transformer attention to tackle the Winograd Schema Challenge. To this end, we propose a novel self-supervised framework that applies a contrastive loss directly at the level of self-attention. Experimental analysis of our attention-based models on multiple datasets demonstrates superior commonsense reasoning capabilities. The proposed approach outperforms all comparable unsupervised approaches and occasionally surpasses supervised ones.