Training machine learning models in a meaningful order, from easy samples to hard ones, can improve performance over the standard approach of random data shuffling, without any additional computational cost. This strategy, known as curriculum learning, has been successfully employed in all areas of machine learning, across a wide range of tasks. However, the need to rank samples from easy to hard, and to choose the right pacing function for introducing more difficult data, can limit the use of curriculum approaches. In this survey, we show how these limitations have been tackled in the literature, and we present curriculum learning instantiations for various machine learning tasks. We manually construct a multi-perspective taxonomy of curriculum learning approaches, considering various classification criteria. We further build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm, linking the discovered clusters with our taxonomy. Finally, we outline some interesting directions for future work.
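To make the two ingredients named in the abstract concrete (a ranking of samples from easy to hard, and a pacing function that controls when harder data is introduced), the following is a minimal sketch and not the method of any particular surveyed paper. The names `linear_pacing` and `curriculum_batches`, the linear schedule, the `start_fraction` default, and the user-supplied `difficulty` scoring function are all assumptions chosen for illustration.

```python
import random


def linear_pacing(step, total_steps, start_fraction=0.2):
    """Hypothetical pacing function: fraction of the easy-to-hard sorted
    data that is available at a given training step (grows linearly)."""
    if total_steps <= 1:
        return 1.0
    progress = step / (total_steps - 1)
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def curriculum_batches(samples, difficulty, total_steps, batch_size, seed=0):
    """Yield one mini-batch per step, drawn at random from a pool that
    starts with the easiest samples and gradually admits harder ones.

    `difficulty` is an assumed scoring function: lower means easier.
    """
    rng = random.Random(seed)
    ordered = sorted(samples, key=difficulty)  # easiest samples first
    for step in range(total_steps):
        fraction = linear_pacing(step, total_steps)
        # Keep the pool at least one batch wide so sampling never starves.
        pool_size = max(batch_size, int(round(fraction * len(ordered))))
        pool = ordered[:pool_size]
        yield rng.sample(pool, min(batch_size, len(pool)))


if __name__ == "__main__":
    # Toy data: the integer value itself stands in for difficulty.
    data = list(range(100))
    for i, batch in enumerate(curriculum_batches(data, lambda x: x, 5, 4)):
        print(i, sorted(batch))
```

In this sketch the early batches contain only low-difficulty samples, while the final step samples from the full dataset, which is equivalent to ordinary random shuffling. Other pacing schedules (for example step-wise or exponential) could be swapped in by replacing `linear_pacing`.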