The recent work of Papyan, Han, & Donoho (2020) presented an intriguing "Neural Collapse" phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study the upper limits of this research program: How far will understanding Neural Collapse take us in understanding deep learning? First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in feature learning. We show simple, realistic experiments where training longer leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that Neural Collapse is not always desirable for representation learning, as previously claimed. Finally, we give preliminary evidence of a "cascading collapse" phenomenon, wherein some form of Neural Collapse occurs not only for the last layer, but in earlier layers as well. We hope our work encourages the community to continue the rich line of Neural Collapse research, while also considering its inherent limitations.
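To make the train-versus-test distinction concrete, below is a minimal sketch of one standard way to quantify within-class variability collapse (the NC1 quantity of Papyan et al., 2020) from a network's penultimate-layer features; one would evaluate it separately on train-set and test-set features of the same trained model. The variable names (train_features, train_labels, etc.) are placeholders, not part of the original work.

import numpy as np

def nc1_metric(features, labels):
    """Within-class variability collapse (NC1), following Papyan et al. (2020):
    tr(Sigma_W @ pinv(Sigma_B)) / C, where Sigma_W and Sigma_B are the
    within-class and between-class covariances of the penultimate-layer
    features. Smaller values indicate stronger collapse onto class means."""
    classes = np.unique(labels)
    C = len(classes)
    d = features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        centered = fc - mu_c
        sigma_w += centered.T @ centered / len(features)
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T / C
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / C

# Hypothetical usage: compare collapse on train vs. test features
# extracted from the same trained network.
# nc1_train = nc1_metric(train_features, train_labels)
# nc1_test  = nc1_metric(test_features, test_labels)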