Modern deep neural networks have achieved superhuman performance in tasks ranging from image classification to game playing. Surprisingly, these complex systems, despite their massive numbers of parameters, exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon, known as "Neural Collapse," was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have shown theoretically that the global solutions of the network training problem, under a simplified "unconstrained feature model," exhibit this phenomenon. We take a step further and prove that Neural Collapse occurs for deep linear networks under the popular mean squared error (MSE) and cross entropy (CE) losses. Furthermore, we extend our study to imbalanced data for the MSE loss and present the first geometric analysis of Neural Collapse under this setting.
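For context, a minimal sketch of the geometry that "Neural Collapse" refers to, following \cite{Papyan20}: the centered class-mean features form a simplex equiangular tight frame (ETF) and align with the classifier rows. The symbols below ($K$ classes, class-mean features $\mu_k$, global mean $\mu_G$, classifier rows $w_k$) are the standard notation from that literature, not notation fixed by this abstract.
\[
  \frac{\langle \mu_j - \mu_G,\; \mu_k - \mu_G \rangle}
       {\|\mu_j - \mu_G\|\,\|\mu_k - \mu_G\|}
  = \begin{cases} 1, & j = k,\\[2pt] -\dfrac{1}{K-1}, & j \neq k, \end{cases}
  \qquad
  \frac{w_k}{\|w_k\|} = \frac{\mu_k - \mu_G}{\|\mu_k - \mu_G\|}.
\]
In addition, the within-class variability of last-layer features vanishes, so each feature collapses to its class mean and classification reduces to the nearest class-mean decision rule.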