Learning curves provide insight into the dependence of a learner's generalization performance on the training set size. This important tool can be used for model selection, to predict the effect of more training data, and to reduce the computational complexity of model training and hyperparameter tuning. This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation. Our main contribution is a comprehensive overview of the literature regarding the shape of learning curves. We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential. We consider the learning curves of Gaussian processes, the complex shapes they can display, and the factors influencing them. We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data. To wrap up, we point out various open problems that warrant deeper empirical and theoretical investigation. All in all, our review underscores that learning curves are surprisingly diverse and no universal model can be identified.
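As an illustration of how a parametric curve model can be used to predict the effect of more training data, the following is a minimal sketch that fits a power law to measured errors and extrapolates it. It assumes, purely for illustration, that the error follows a * n^(-b) with a zero asymptote. That assumption may well fail in practice, since curves can be exponential or ill-behaved. The function names (`fit_power_law`, `predict_error`) and the data are hypothetical and do not come from the review.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ~ a * n**(-b) by ordinary least squares in log-log space.

    Assumes all sizes and errors are strictly positive and that the
    curve decays to zero (no irreducible error term).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # log(error) = log(a) - b * log(n), so a = exp(intercept), b = -slope
    return math.exp(intercept), -slope


def predict_error(a, b, n):
    """Evaluate the fitted power law at training set size n."""
    return a * n ** (-b)


# Hypothetical, noise-free measurements generated from 2.0 * n**(-0.5).
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** (-0.5) for n in sizes]

a, b = fit_power_law(sizes, errors)
extrapolated = predict_error(a, b, 6400)  # predicted error with 4x more data
```

With real, noisy measurements one would typically average the error over several random training subsets per size before fitting, and compare the fit against alternatives such as an exponential before trusting any extrapolation.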