We introduce Goldilocks Selection, a technique for faster model training that selects a sequence of training points that are "just right". We propose an information-theoretic acquisition function -- the reducible validation loss -- and compute it with a small proxy model -- GoldiProx -- to efficiently choose training points that maximize information about a validation set. We show that the "hard" (e.g. high-loss) points usually selected in the optimization literature are typically noisy, while the "easy" (e.g. low-noise) samples often prioritized for curriculum learning confer less information. Further, points with uncertain labels, typically targeted by active learning, tend to be less relevant to the task. In contrast, Goldilocks Selection chooses points that are "just right" and empirically outperforms the above approaches. Moreover, the selected sequence transfers to other architectures, so practitioners can share and reuse it without recreating it.
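To make the selection rule concrete, the sketch below shows one way such an acquisition function could be scored. It assumes the reducible validation loss of a point decomposes as its loss under the small proxy model minus an irreducible-loss estimate from a model fit to the validation set; the names `proxy_model`, `val_model`, and `k` are illustrative assumptions, not the authors' API.

```python
import torch
import torch.nn.functional as F

def goldilocks_select(batch_x, batch_y, proxy_model, val_model, k):
    """Select the k points in a candidate batch with the highest
    reducible validation loss (a sketch under assumed definitions,
    not the authors' exact implementation).

    Assumed decomposition:
        reducible loss = loss under the small proxy model (GoldiProx)
                       - irreducible loss, estimated by a model
                         trained on the validation set.
    """
    with torch.no_grad():
        # Per-point cross-entropy under the proxy model: high for
        # points the proxy has not yet learned.
        proxy_loss = F.cross_entropy(
            proxy_model(batch_x), batch_y, reduction="none")
        # Per-point loss under the validation-trained model: high for
        # noisy or task-irrelevant points, which we want to avoid.
        irreducible_loss = F.cross_entropy(
            val_model(batch_x), batch_y, reduction="none")
    reducible_loss = proxy_loss - irreducible_loss
    # Keep only the top-k "just right" points: not yet learned by the
    # proxy, yet still predictable from validation data.
    top_idx = reducible_loss.topk(k).indices
    return batch_x[top_idx], batch_y[top_idx]
```

Scoring this way separates the three failure modes named above: pure high-loss selection would rank noisy points highly, while subtracting the validation-trained model's loss penalizes points that even a task-relevant model cannot predict.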