Nen verbal morphology is remarkably complex; a transitive verb can take up to 1,740 unique forms. This large combinatoric space, combined with a low-resource setting, amplifies the need for NLP tools. Nen morphology utilises distributed exponence, a non-trivial means of mapping form to meaning. In this paper, we attempt to model Nen verbal morphology using state-of-the-art machine learning models for morphological reinflection. We explore and categorise the types of errors these systems generate. Our results show sensitivity to training data composition: different distributions of verb type yield different accuracies (patterning with E-complexity). We also demonstrate the types of patterns that can be inferred from the training data through a case study of syncretism.