Over-parameterized deep neural networks can achieve excellent training accuracy while maintaining a small generalization error. They have also been found to fit arbitrary labels, a behaviour referred to as memorization. In this work, we study memorization with turn-over dropout, an efficient method for estimating influence and memorization, on data with true labels (real data) and data with random labels (random data). Our main findings are: (i) for both real data and random data, the network optimizes easy examples (e.g., real data) and difficult examples (e.g., random data) simultaneously, but fits the easy ones faster; (ii) for real data, a correctly labeled difficult example in the training dataset is more informative than an easy one. By showing that memorization occurs on both random data and real data, we highlight the consistency between them with respect to optimization, and we emphasize the role of memorization during optimization.
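To make the influence-estimation step concrete, the following is a minimal PyTorch sketch of the turn-over dropout idea as described above: each training example is assigned a fixed, index-derived dropout mask, so the sub-network selected by its mask is updated by that example while the complementary (flipped) sub-network never is; the gap in loss between the two sub-networks then serves as a self-influence (memorization) score. All class and function names here (`TurnoverDropout`, `self_influence`, the seeding scheme) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TurnoverDropout(nn.Module):
    """Dropout whose binary mask is a deterministic function of the
    training-example index (rate 0.5, inverted-dropout scaling)."""

    def __init__(self, dim: int, seed: int = 0):
        super().__init__()
        self.dim = dim
        self.seed = seed

    def mask(self, idx: torch.Tensor, flip: bool = False) -> torch.Tensor:
        # One fixed mask per example index; the flipped mask selects the
        # complementary sub-network, which example idx never updates.
        masks = []
        for i in idx.tolist():
            g = torch.Generator().manual_seed(self.seed + int(i))
            masks.append(torch.rand(self.dim, generator=g) < 0.5)
        m = torch.stack(masks).float()
        if flip:
            m = 1.0 - m
        return m * 2.0  # scale to keep expected activation unchanged

    def forward(self, h: torch.Tensor, idx: torch.Tensor, flip: bool = False):
        return h * self.mask(idx, flip).to(h.device)

class MLP(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, n_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.drop = TurnoverDropout(d_hidden)
        self.fc2 = nn.Linear(d_hidden, n_classes)

    def forward(self, x, idx, flip: bool = False):
        # During training, call with flip=False and each example's own index,
        # so only the example's sub-network receives its gradient updates.
        h = F.relu(self.fc1(x))
        h = self.drop(h, idx, flip)
        return self.fc2(h)

@torch.no_grad()
def self_influence(model: MLP, x, y, idx):
    """Memorization score of each example: loss under the flipped mask
    (sub-network that never saw the example) minus loss under its own
    mask (sub-network trained on it). Larger values indicate that the
    example was memorized rather than learned from other examples."""
    loss_out = F.cross_entropy(model(x, idx, flip=True), y, reduction="none")
    loss_in = F.cross_entropy(model(x, idx, flip=False), y, reduction="none")
    return loss_out - loss_in
```

Under these assumptions, computing the score requires only two forward passes per example after a single training run, which is what makes the method efficient compared with retraining-based influence estimates.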