When training neural networks as classifiers, it is common to observe an increase in average test loss while still maintaining or improving the overall classification accuracy on the same dataset. In spite of the ubiquity of this phenomenon, it has not been well studied and is often dismissively attributed to an increase in borderline correct classifications. We present an empirical investigation that shows how this phenomenon is actually a result of the differential manner by which test samples are processed. In essence: test loss does not increase overall, but only for a small minority of samples. Large representational capacities allow losses to decrease for the vast majority of test samples at the cost of extreme increases for others. This effect seems to be mainly caused by increased parameter values relating to the correctly processed sample features. Our findings contribute to the practical understanding of a common behaviour of deep neural networks. We also discuss the implications of this work for network optimisation and generalisation.