Recent work has formulated the task for computational construction grammar as producing a constructicon given a corpus of usage. Previous work has evaluated these unsupervised grammars using both internal metrics (for example, Minimum Description Length) and external metrics (for example, performance on a dialectology task). This paper instead takes a linguistic approach to evaluation, first learning a constructicon and then analyzing its contents from a linguistic perspective. This analysis shows that a learned constructicon can be divided into nine major types of constructions, of which Verbal and Nominal are the most common. The paper also shows that both the token and type frequency of constructions can be used to model variation across registers and dialects.