When faced with learning challenging new tasks, humans often follow sequences of steps that allow them to incrementally build up the skills needed for these new tasks. In machine learning, however, models are most often trained to solve the target task directly. Inspired by human learning, we propose a novel curriculum learning approach that decomposes challenging tasks into sequences of easier intermediate goals used to pre-train a model before it tackles the target task. We focus on classification tasks and design the intermediate tasks using an automatically constructed label hierarchy. We train the model at each level of the hierarchy, from coarse labels to fine labels, transferring acquired knowledge across levels. For instance, the model first learns to distinguish animals from objects, and then uses this knowledge when learning to classify among more fine-grained classes such as cat, dog, car, and truck. Most existing curriculum learning algorithms for supervised learning schedule the order in which training examples are presented to the model. In contrast, our approach focuses on the output space of the model. We evaluate our method on several established datasets and show significant performance gains, especially on classification problems with many labels. We also evaluate on a new synthetic dataset that allows us to study multiple aspects of our method.
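The coarse-to-fine curriculum described above can be sketched in code. This is a minimal illustration under stated assumptions, not the paper's implementation: the two-level hierarchy, the toy Gaussian data, the softmax-regression model, and the warm-start scheme (initializing each fine class's weights from its coarse parent's) are all choices made for the example.

```python
# Minimal sketch of coarse-to-fine curriculum pre-training on a label
# hierarchy. The hierarchy, toy data, and transfer scheme below are
# illustrative assumptions, not the paper's exact method.
import numpy as np

# Hypothetical two-level hierarchy: coarse superclasses -> fine classes.
HIERARCHY = {"animal": ["cat", "dog"], "vehicle": ["car", "truck"]}
COARSE = list(HIERARCHY)
FINE = [f for fines in HIERARCHY.values() for f in fines]
FINE_TO_COARSE = {f: c for c, fines in HIERARCHY.items() for f in fines}

def coarsen(fine_labels):
    """Map fine labels to their coarse parents for the first curriculum stage."""
    return [FINE_TO_COARSE[f] for f in fine_labels]

def train_softmax(X, y, n_classes, W=None, epochs=500, lr=0.1):
    """Full-batch softmax regression; pass W to warm-start from a prior stage."""
    if W is None:
        W = np.zeros((X.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (p - onehot) / len(X)
    return W

rng = np.random.default_rng(0)
# Toy data: one well-separated Gaussian blob per fine class, plus a bias feature.
centers = np.array([[0.0, 2.0], [0.0, 4.0], [5.0, 0.0], [7.0, 0.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])
X = np.hstack([X, np.ones((len(X), 1))])          # append bias column
y_fine = np.repeat(np.arange(4), 30)

# Stage 1: train on coarse labels only (animal vs. vehicle).
y_coarse = np.array([COARSE.index(FINE_TO_COARSE[FINE[i]]) for i in y_fine])
W_coarse = train_softmax(X, y_coarse, len(COARSE))

# Stage 2: warm-start each fine class from its coarse parent's weights,
# then fine-tune on the fine labels.
W_init = np.stack(
    [W_coarse[:, COARSE.index(FINE_TO_COARSE[f])] for f in FINE], axis=1
)
W_fine = train_softmax(X, y_fine, len(FINE), W=W_init.copy())
acc = float(((X @ W_fine).argmax(axis=1) == y_fine).mean())
```

The warm start means the fine classifier begins with the coarse distinctions (animal vs. vehicle) already in place, so training only needs to learn the within-superclass boundaries.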