Idioms are unlike most phrases in two important ways. First, the words in an idiom have non-canonical meanings. Second, the non-canonical meanings of words in an idiom are contingent on the presence of other words in the idiom. Linguistic theories differ on whether these properties depend on one another, as well as whether special theoretical machinery is needed to accommodate idioms. We define two measures that correspond to the properties above, and we implement them using BERT (Devlin et al., 2019) and XLNet (Yang et al., 2019). We show that idioms fall at the expected intersection of the two dimensions, but that the dimensions themselves are not correlated. Our results suggest that special machinery to handle idioms may not be warranted.