As a simple technique to accelerate inference of large-scale pre-trained models, early exiting has gained much attention in the NLP community. It allows samples to exit early at internal classifiers without passing through the entire model. Most existing work trains the internal classifiers independently and employs an exiting strategy that decides whether or not to exit based on the confidence of the current internal classifier. However, none of these works takes full advantage of the fact that the internal classifiers are trained to solve the same task and can therefore be used to construct an ensemble. In this paper, we show that a novel objective function for training the ensemble of internal classifiers can be naturally induced from the perspectives of ensemble learning and information theory. The proposed training objective consists of two terms: one for the accuracy and the other for the diversity of the internal classifiers. In contrast, the objective used in prior work is exactly the accuracy term of our training objective and therefore optimizes only accuracy, not diversity. Further, we propose a simple voting-based exiting strategy that considers the predictions of all past internal classifiers to infer the correct label and to decide whether to exit. Experimental results on various NLP tasks show that our proposed objective function and voting-based strategy achieve better accuracy-speed trade-offs.
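To make the voting-based exiting strategy concrete, the following is a minimal PyTorch sketch of one plausible instantiation: each internal classifier the sample has already passed through casts a vote for its argmax label, and the sample exits once some label has accumulated enough votes. The function name `voting_exit`, the vote-counting rule, and the `min_votes` threshold are illustrative assumptions rather than the exact formulation in the paper.

```python
import torch

def voting_exit(logits_so_far, min_votes):
    """Sketch of a voting-based early-exit decision (assumed form).

    logits_so_far: list of tensors, each of shape (num_labels,), one per
                   internal classifier the sample has already passed through.
    min_votes:     hypothetical vote threshold controlling the
                   accuracy-speed trade-off.
    Returns (should_exit, predicted_label).
    """
    # Each past internal classifier casts one vote for its argmax label.
    votes = torch.zeros_like(logits_so_far[0])
    for logits in logits_so_far:
        votes[logits.argmax()] += 1

    # Exit when the most-voted label has gathered at least `min_votes` votes.
    top_votes, predicted_label = votes.max(dim=0)
    should_exit = top_votes.item() >= min_votes
    return should_exit, predicted_label.item()
```

Raising `min_votes` forces more internal classifiers to agree before exiting, trading inference speed for accuracy.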