We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set. We present both a microscopic analysis anchored by an effective theory and a macroscopic analysis of phase diagrams describing learning performance across hyperparameters. We find that generalization originates from structured representations whose training dynamics and dependence on training set size can be predicted by our effective theory in a toy setting. We observe empirically the presence of four learning phases: comprehension, grokking, memorization, and confusion. We find that representation learning occurs only in a "Goldilocks zone" (including comprehension and grokking) between memorization and confusion. We find that on transformers the grokking phase stays closer to the memorization phase (compared to the comprehension phase), leading to delayed generalization. The Goldilocks zone is reminiscent of "intelligence from starvation" in Darwinian evolution, where resource limitations drive the discovery of more efficient solutions. This study not only provides intuitive explanations of the origin of grokking, but also highlights the usefulness of physics-inspired tools, e.g., effective theories and phase diagrams, for understanding deep learning.