Sample weighting is widely used in deep learning. A large number of weighting methods essentially use the learning difficulty of training samples to compute their weights; in this study, this scheme is called difficulty-based weighting. Two important issues arise when explaining this scheme. First, no unified difficulty measure with a theoretical guarantee exists for training samples. The learning difficulty of a sample is determined by multiple factors, including noise level, imbalance degree, margin, and uncertainty; existing measures consider only one of these factors, or several in part, but not all of them together. Second, a comprehensive theoretical explanation of why difficulty-based weighting schemes are effective in deep learning is lacking. In this study, we theoretically prove that the generalization error of a sample can serve as a universal difficulty measure. Furthermore, we provide formal theoretical justifications for the role of difficulty-based weighting in deep learning, revealing its positive influence on both the optimization dynamics and the generalization performance of deep models, which is instructive for existing weighting schemes.
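To make the scheme concrete, the following is a minimal sketch of difficulty-based weighting, assuming per-sample losses are available as a proxy for learning difficulty. The loss-to-weight mapping used here (normalized per-sample loss, emphasizing harder samples) is a hypothetical illustration, not the method proposed in this study; concrete schemes such as focal loss or self-paced learning use different mappings.

```python
import numpy as np

def difficulty_weights(per_sample_loss, emphasize_hard=True):
    """Map per-sample losses (difficulty proxies) to normalized weights.

    emphasize_hard=True up-weights hard samples (as in focal-loss-style
    schemes); False up-weights easy samples (as in self-paced learning).
    """
    loss = np.asarray(per_sample_loss, dtype=float)
    score = loss if emphasize_hard else 1.0 / (loss + 1e-8)
    return score / score.sum()

def weighted_batch_loss(per_sample_loss, emphasize_hard=True):
    """Aggregate a batch loss as a difficulty-weighted sum."""
    w = difficulty_weights(per_sample_loss, emphasize_hard)
    return float(np.dot(w, per_sample_loss))

# A harder sample (larger loss) receives a larger weight:
losses = [0.1, 0.5, 2.0]
print(difficulty_weights(losses))
```

In a training loop, the weighted batch loss would replace the usual unweighted mean before backpropagation; the key design choice is whether hard or easy samples are emphasized, which existing schemes resolve differently depending on noise and imbalance.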