It is well known that the quantity and quality of training data play a significant role in building a good machine learning model. In this paper, we go one step further and demonstrate that the order in which training examples are presented is also crucial. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. Curriculum Learning builds on the same observation: organized, structured assimilation of knowledge can enable faster training and better comprehension. We employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when given an organized training set whose examples exhibit an increasing level of difficulty (i.e., a curriculum). To impose structure on the training set and to define what counts as an easy example, we explore multiple scoring functions that use feedback either from an external neural network or from the model itself. Empirical results show that, by choosing among different curricula, we can balance training time against the network's performance.
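As a rough illustration of what "imposing structure on the training set" could look like, the sketch below orders examples by a difficulty score and exposes them in growing stages. It is a minimal sketch under stated assumptions, not the method used in the paper. The names `order_by_difficulty`, `curriculum_stages`, and `transcript_length` are hypothetical, and transcript length is only a stand-in for the scoring functions described above, which rely on feedback from an external network or from the model itself.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def order_by_difficulty(examples: Sequence[T],
                        score_fn: Callable[[T], float]) -> List[T]:
    """Return the examples sorted from easiest (lowest score) to hardest."""
    return sorted(examples, key=score_fn)


def curriculum_stages(ordered: Sequence[T],
                      num_stages: int) -> Iterator[List[T]]:
    """Yield growing prefixes of the ordered set; the last stage is the full set."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    n = len(ordered)
    for stage in range(1, num_stages + 1):
        # Ceiling division so that the final stage always covers every example.
        cutoff = max(1, (n * stage + num_stages - 1) // num_stages)
        yield list(ordered[:cutoff])


def transcript_length(example: dict) -> float:
    """Hypothetical difficulty proxy (an assumption): number of words in the transcript."""
    return float(len(example["text"].split()))


# Toy utterances, invented purely for illustration.
utterances = [
    {"id": "u1", "text": "the quick brown fox jumps"},
    {"id": "u2", "text": "yes"},
    {"id": "u3", "text": "good morning everyone"},
]

ordered = order_by_difficulty(utterances, transcript_length)
stages = list(curriculum_stages(ordered, num_stages=3))
```

Any callable that maps an example to a number can replace `transcript_length`, for instance a loss value produced by a separately trained model. The number of stages is one knob for the balance between training time and performance mentioned in the abstract.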