We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set. We present both a microscopic analysis anchored by an effective theory and a macroscopic analysis of phase diagrams describing learning performance across hyperparameters. We find that generalization originates from structured representations whose training dynamics and dependence on training set size can be predicted by our effective theory in a toy setting. We observe empirically the presence of four learning phases: comprehension, grokking, memorization, and confusion. We find representation learning to occur only in a "Goldilocks zone" (including comprehension and grokking) between memorization and confusion. Compared to the comprehension phase, the grokking phase stays closer to the memorization phase, leading to delayed generalization. The Goldilocks phase is reminiscent of "intelligence from starvation" in Darwinian evolution, where resource limitations drive discovery of more efficient solutions. This study not only provides intuitive explanations of the origin of grokking, but also highlights the usefulness of physics-inspired tools, e.g., effective theories and phase diagrams, for understanding deep learning.