Despite their ability to represent highly expressive functions, deep learning models trained with SGD seem to find simple, constrained solutions that generalize surprisingly well. Spectral bias - the tendency of neural networks to prioritize learning low-frequency functions - is one possible explanation for this phenomenon, but so far spectral bias has been observed only in theoretical models and simplified experiments. In this work, we propose methodologies for measuring spectral bias in modern image classification networks. We find that these networks indeed exhibit spectral bias, and that networks that generalize well strike a balance between having enough complexity (i.e., high frequencies) to fit the data and being simple enough to avoid overfitting. For example, we show experimentally that larger models learn high frequencies faster than smaller ones, but that many forms of regularization, both explicit and implicit, amplify spectral bias and delay the learning of high frequencies. We also explore the connection between function frequency and image frequency, and find that spectral bias is sensitive to the low frequencies prevalent in natural images. Our work enables measuring, and ultimately controlling, the spectral behavior of neural networks used for image classification, and is a step toward understanding why deep models generalize well.