Grasping the commonsense properties of everyday concepts is an important prerequisite to language understanding. While contextualised language models are reportedly capable of predicting such commonsense properties with human-level accuracy, we argue that such results have been inflated because of the high similarity between training and test concepts. This means that models which capture concept similarity can perform well, even if they do not capture any knowledge of the commonsense properties themselves. In settings where there is no overlap between the properties that are considered during training and testing, we find that the empirical performance of standard language models drops dramatically. To address this, we study the possibility of fine-tuning language models to explicitly model concepts and their properties. In particular, we train separate concept and property encoders on two types of readily available data: extracted hyponym-hypernym pairs and generic sentences. Our experimental results show that the resulting encoders allow us to predict commonsense properties with much higher accuracy than is possible by directly fine-tuning language models. We also present experimental results for the related task of unsupervised hypernym discovery.
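To make the bi-encoder idea concrete, the sketch below illustrates one way separate concept and property encoders could score whether a commonsense property holds for a concept. The base model name, mean pooling, and sigmoid dot-product scoring are assumptions for illustration only, not the configuration used in the paper.

```python
# Illustrative bi-encoder sketch (assumptions: bert-base-uncased backbone,
# mean pooling, sigmoid dot-product scoring); not the paper's exact setup.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-uncased"  # assumed base model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
concept_encoder = AutoModel.from_pretrained(MODEL_NAME)   # encodes concepts, e.g. "banana"
property_encoder = AutoModel.from_pretrained(MODEL_NAME)  # encodes properties, e.g. "is yellow"

def embed(texts, encoder):
    """Mean-pool the final hidden states into one vector per input phrase."""
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state       # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # ignore padding tokens
    return (hidden * mask).sum(1) / mask.sum(1)

concepts = ["banana", "crow"]
properties = ["is yellow", "can fly"]

c_vecs = embed(concepts, concept_encoder)
p_vecs = embed(properties, property_encoder)

# Higher score = the property is predicted to hold for the concept.
scores = torch.sigmoid((c_vecs * p_vecs).sum(-1))
for concept, prop, score in zip(concepts, properties, scores):
    print(f"{concept} -- {prop}: {score.item():.3f}")
```

In practice the two encoders would be fine-tuned on supervision such as extracted hyponym-hypernym pairs and generic sentences before being used for property prediction.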