There is no such thing as a perfect dataset. In some datasets, deep neural networks discover underlying heuristics that allow them to take shortcuts in the learning process, resulting in poor generalization. Instead of standard cross-entropy, we explore whether a modulated version of cross-entropy, called focal loss, can constrain the model from relying on such heuristics and improve generalization. Our experiments in natural language inference show that focal loss has a regularizing effect on the learning process, increasing accuracy on out-of-distribution data while slightly decreasing performance on in-distribution data. Despite the improved out-of-distribution performance, we demonstrate the shortcomings of focal loss and show that it underperforms methods such as unbiased focal loss and self-debiasing ensembles.
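The "modulated version of cross-entropy" refers to the standard focal loss formulation, FL(p) = -(1 - p)^γ log(p), where p is the model's probability for the correct class and γ controls how strongly confident predictions are down-weighted. A minimal per-example sketch, assuming this standard formulation (the default γ = 2.0 below is the common value from the focal loss literature, not necessarily the setting used in this work):

```python
import math

def focal_loss(p_correct, gamma=2.0):
    """Per-example focal loss given the model's probability for the
    correct class. With gamma = 0 this reduces to plain cross-entropy;
    larger gamma down-weights well-classified (high-probability)
    examples, which is the regularizing effect described above.
    gamma = 2.0 is an illustrative default, not the paper's setting.
    """
    return -((1.0 - p_correct) ** gamma) * math.log(p_correct)

# A confident, correct prediction contributes far less than under
# plain cross-entropy, so training focuses on hard examples
# instead of reinforcing shortcut heuristics.
easy = focal_loss(0.95)          # heavily down-weighted
hard = focal_loss(0.30)          # close to full cross-entropy weight
ce_easy = -math.log(0.95)        # plain cross-entropy, for comparison
```

Intuitively, examples the model already classifies confidently (often the ones solvable by a dataset heuristic) contribute almost nothing to the gradient, while hard examples retain nearly their full cross-entropy weight.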