The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation

A "bigger is better" explosion in the number of parameters in deep neural networks has made it increasingly challenging to make state-of-the-art networks accessible in compute-restricted environments. Compression techniques have taken on renewed importance as a way to bridge the gap. However, evaluation of the trade-offs incurred by popular compression techniques has been centered on high-resource datasets. In this work, we instead consider the impact of compression in a data-limited regime. We introduce the term "low-resource double bind" to refer to the co-occurrence of data limitations and compute resource constraints. This is a common setting for NLP for low-resource languages, yet the trade-offs in performance are poorly studied. Our work offers surprising insights into the relationship between capacity and generalization in data-limited regimes for the task of machine translation. Our experiments on magnitude pruning for translations from English into Yoruba, Hausa, Igbo and German show that in low-resource regimes, sparsity preserves performance on frequent sentences but has a disparate impact on infrequent ones. However, it improves robustness to out-of-distribution shifts, especially for datasets that are very distinct from the training distribution. Our findings suggest that sparsity can play a beneficial role in curbing memorization of low-frequency attributes, and therefore offers a promising solution to the low-resource double bind.
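
For readers unfamiliar with magnitude pruning, the core idea can be sketched in a few lines: weights with the smallest absolute values are set to zero until a target sparsity is reached. The sketch below is only an illustration under simplifying assumptions. It prunes a flat list of weights in one shot, whereas the abstract does not specify the pruning schedule or granularity used in the experiments. The function name `magnitude_prune` and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to 0.0.

    `sparsity` is the fraction of entries to zero out
    (0.0 keeps every weight, 1.0 removes every weight).
    This is a one-shot illustration, not the schedule used in the paper.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first (stable for ties).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Hypothetical weights for illustration only.
weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.4, -0.02, 0.6, 0.1, -0.9]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
# [0.8, 0.0, 0.0, -1.2, 0.0, 0.4, 0.0, 0.6, 0.0, -0.9]
```

At 50% sparsity the five smallest-magnitude weights are removed and the five largest are kept unchanged, which is the sense in which the abstract speaks of sparsity reducing model capacity.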
