In this work, we focus on low-resource dependency parsing for multiple languages. Several well-known strategies exist for enhancing performance in low-resource scenarios, but selecting the best-performing combination of these strategies for a given low-resource language is not trivial, and little attention has been paid to measuring their efficacy. We experiment with five low-resource strategies in our ensembled approach on seven Universal Dependencies (UD) low-resource languages. Our exhaustive experiments on these languages demonstrate effective improvements for languages not covered by pretrained models. We further show a successful application of the ensembled system to a truly low-resource language, Sanskrit. The code and data are available at: https://github.com/Jivnesh/SanDP