Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization

We present Expected Statistic Regularization (ESR), a novel regularization technique that uses low-order multi-task structural statistics to shape model distributions for semi-supervised learning on low-resource datasets. We study ESR in the context of cross-lingual transfer for syntactic analysis (POS tagging and labeled dependency parsing) and present several classes of low-order statistic functions that bear on model behavior. Experimentally, we evaluate the proposed statistics with ESR for unsupervised transfer on five diverse target languages and show that all statistics, when estimated accurately, yield improvements to both POS and LAS, with the best statistic improving POS by +7.0 and LAS by +8.5 on average. We also present semi-supervised transfer and learning-curve experiments showing that ESR provides significant gains over strong cross-lingual-transfer-plus-fine-tuning baselines for modest amounts of labeled data. These results indicate that ESR is a promising approach that complements model transfer for cross-lingual parsing.
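To make the idea of regularizing toward an expected statistic concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the simplest possible statistic, the relative frequency of each POS tag, and a squared-distance penalty with an optional tolerance margin; the abstract specifies neither. The function names (`expected_tag_frequencies`, `esr_penalty`), the `margin` parameter, and the toy numbers are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert one token's tag scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_tag_frequencies(token_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average the model's per-token tag distributions over an unlabeled batch.

    The result is the model's expected relative frequency of each tag.
    """
    probs = [softmax(row) for row in token_logits]
    n_tags = len(probs[0])
    return [sum(p[k] for p in probs) / len(probs) for k in range(n_tags)]


def esr_penalty(expected: Sequence[float], target: Sequence[float],
                margin: float = 0.0) -> float:
    """Assumed penalty form: squared deviation from the target statistic,
    counted only where the deviation exceeds the tolerance margin."""
    return sum(max(abs(e - t) - margin, 0.0) ** 2
               for e, t in zip(expected, target))


# Toy unlabeled batch: 4 tokens, 3 tags (made-up scores).
token_logits = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
    [2.0, 0.0, 0.0],
]
# Hypothetical target tag frequencies for the target language.
target = [0.5, 0.25, 0.25]

expected = expected_tag_frequencies(token_logits)
penalty = esr_penalty(expected, target)
```

In a semi-supervised setup, a term like `penalty` would be added to the supervised loss (with a weight) and computed on unlabeled target-language batches, so that the model's aggregate behavior is pulled toward the target statistic without per-token labels.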
