That some purely recurrent models are hard to optimize and inefficient on today's hardware does not make them bad models of language. We demonstrate this by showing how much such models can still be improved through a combination of a slightly better recurrent cell, architecture, objective, and optimization. In the process, we establish a new state of the art for language modelling on small datasets and, with dynamic evaluation, on enwik8.
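To make the dynamic evaluation reference concrete: the idea (Krause et al., 2018) is to keep adapting the model's parameters at test time, taking a gradient step on each segment of the evaluation sequence after it has been scored, so the model tracks the recent history of the text. Below is a minimal sketch of this loop; the model, segment length, and learning rate are illustrative assumptions, not the settings used in this work.

```python
# A hedged sketch of dynamic evaluation: score a segment, then take a
# gradient step on it before scoring the next one. Hyperparameters and the
# tiny LSTM LM here are placeholders, not the paper's configuration.
import torch
import torch.nn as nn

class TinyLSTMLM(nn.Module):
    def __init__(self, vocab_size=256, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

def dynamic_eval(model, tokens, segment_len=32, lr=1e-4):
    """Score `tokens` autoregressively; after scoring each segment, take one
    gradient step on it so the model adapts to the test sequence."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(reduction="sum")
    state, total_loss, total_count = None, 0.0, 0
    for i in range(0, tokens.size(1) - 1, segment_len):
        inp = tokens[:, i : i + segment_len]
        tgt = tokens[:, i + 1 : i + 1 + segment_len]
        inp = inp[:, : tgt.size(1)]  # align lengths at the sequence end
        logits, new_state = model(inp, state)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), tgt.reshape(-1))
        total_loss += loss.item()    # record the loss *before* adapting
        total_count += tgt.numel()
        opt.zero_grad()
        loss.backward()              # adapt on the segment just scored
        opt.step()
        state = tuple(s.detach() for s in new_state)  # truncate backprop
    return total_loss / total_count  # average NLL per token

model = TinyLSTMLM()
data = torch.randint(0, 256, (1, 1000))  # stand-in for a test sequence
print("test nll per token:", dynamic_eval(model, data))
```

Note that each segment is scored with parameters adapted only on earlier segments, so the reported loss remains a valid test-set measurement.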