AMR parsing has experienced an unprecedented increase in performance in the last three years, due to a mixture of effects including architecture improvements and transfer learning. Self-learning techniques have also played a role in pushing performance forward. However, for most recent high-performing parsers, the effect of self-learning and silver data generation seems to be fading. In this paper we show that it is possible to overcome these diminishing returns of silver data by combining Smatch-based ensembling techniques with ensemble distillation. In an extensive experimental setup, we push single-model English parser performance above 85 Smatch for the first time and return to substantial gains. We also attain a new state of the art for cross-lingual AMR parsing for Chinese, German, Italian and Spanish. Finally, we explore the impact of the proposed distillation technique on domain adaptation, and show that it can produce gains rivaling those of human-annotated data for QALD-9 and achieve a new state of the art for BioAMR.
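As a rough illustration of the Smatch-based ensembling idea mentioned above (selecting, from several candidate parses, the graph closest to the ensemble consensus), the sketch below scores each candidate against all others by average pairwise similarity and keeps the best one. This is a hypothetical simplification: `triples_f1` computes F1 over shared graph triples and stands in for full Smatch, which additionally searches over variable alignments; all function names are illustrative, not from the paper.

```python
def triples_f1(a, b):
    """F1 over shared triples -- a simplified stand-in for Smatch
    (real Smatch also searches over variable-name alignments)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    overlap = len(a & b)
    precision = overlap / len(b)
    recall = overlap / len(a)
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def ensemble_select(candidates):
    """Pick the candidate graph with the highest average similarity
    to the other candidates (a consensus 'pivot' graph)."""
    if len(candidates) == 1:
        return candidates[0]

    def avg_sim(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(triples_f1(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=avg_sim)
    return candidates[best]


# Hypothetical candidate parses as (source, relation, target) triples:
# two parsers agree, one disagrees, so the majority graph is selected.
g1 = [("w", "instance", "want-01"), ("w", "ARG0", "b"), ("b", "instance", "boy")]
g2 = [("w", "instance", "want-01"), ("w", "ARG0", "b"), ("b", "instance", "boy")]
g3 = [("w", "instance", "want-01"), ("w", "ARG0", "g"), ("g", "instance", "girl")]
consensus = ensemble_select([g1, g2, g3])
```

In an actual pipeline, the selected consensus graphs would then serve as distillation targets for training a single student parser, which is how ensemble distillation avoids the inference-time cost of running the full ensemble.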