We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms. Previous work in this domain used expectation-maximization to predict simple phonological changes between ancient word forms and their cognates in modern languages. We extend this work with neural models that can capture more complicated phonological and morphological changes. At the same time, we preserve the inductive biases from classical methods by building monotonic alignment constraints into the model and deliberately underfitting during the maximization step. We evaluate our performance on the task of reconstructing Latin from a dataset of cognates across five Romance languages, achieving a notable reduction in edit distance from the target word forms compared to previous methods.
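For concreteness, the evaluation metric named above can be sketched as follows. This is a minimal illustration that assumes edit distance means character-level Levenshtein distance averaged over word pairs; the abstract does not state the exact variant used (for example, whether distances are computed over phonemes or normalized by length). The function names `edit_distance` and `mean_edit_distance` are hypothetical, as is the example word pair.

```python
from typing import Sequence


def edit_distance(predicted: str, target: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning `predicted` into `target`."""
    # previous[j] holds the distance between the prefix of `predicted`
    # processed so far and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, p_char in enumerate(predicted, start=1):
        current = [i]  # deleting all i characters yields the empty prefix
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (p_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def mean_edit_distance(predictions: Sequence[str], targets: Sequence[str]) -> float:
    """Average edit distance over aligned (prediction, target) pairs."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one pair is required")
    total = sum(edit_distance(p, t) for p, t in zip(predictions, targets))
    return total / len(predictions)


# Illustrative pair only (not taken from the dataset): a reconstruction
# missing the final consonant of the Latin target is one edit away.
example = edit_distance("nocte", "noctem")
```

Under this assumed definition, a lower mean distance indicates reconstructions closer to the attested Latin forms.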