Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train. We propose a simple and efficient learning framework, TLM, that does not rely on large-scale pretraining. Given some labeled task data and a large general corpus, TLM uses task data as queries to retrieve a tiny subset of the general corpus and jointly optimizes the task objective and the language modeling objective from scratch. On eight classification datasets in four domains, TLM achieves results better than or similar to pretrained language models (e.g., RoBERTa-Large) while reducing the training FLOPs by two orders of magnitude. With high accuracy and efficiency, we hope TLM will contribute to democratizing NLP and expediting its development.
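To make the two-stage idea concrete, below is a minimal sketch under stated assumptions: TF-IDF cosine similarity stands in for whatever sparse retriever the full system uses, and the toy embedding model, the loss weight rho, and the crude single-word masking scheme are illustrative placeholders rather than the paper's actual components.

```python
import random
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# --- Stage 1: labeled task texts act as queries that pull a tiny subset out of
# --- a large general corpus (TF-IDF similarity is a stand-in for the retriever).
general_corpus = [
    "stocks rallied after the strong earnings report",
    "the home team won the championship game",
    "new vaccine trial shows promising results",
    "central bank raises interest rates again",
]
task_texts = ["market jumps on strong earnings"]   # labeled task data
task_labels = torch.tensor([0])                    # toy class ids

vec = TfidfVectorizer().fit(general_corpus + task_texts)
sims = cosine_similarity(vec.transform(task_texts), vec.transform(general_corpus))
retrieved = [general_corpus[j] for j in sims.argsort(axis=1)[0, -2:]]  # top-2 docs

# --- Stage 2: train a small model from scratch on a joint objective:
# --- task loss on the labeled data + rho * LM loss on the retrieved subset.
vocab = {w: i for i, w in enumerate(sorted({w for s in general_corpus + task_texts
                                            for w in s.split()}))}
def ids(s):
    return torch.tensor([vocab[w] for w in s.split() if w in vocab])

emb = nn.Embedding(len(vocab), 32)   # toy encoder: mean of word embeddings
cls_head = nn.Linear(32, 2)          # toy 2-class task head
lm_head = nn.Linear(32, len(vocab))  # toy LM head
params = list(emb.parameters()) + list(cls_head.parameters()) + list(lm_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
rho = 0.5                            # assumed weighting between the two losses

for step in range(100):
    # Task objective: classify the labeled task text from its mean embedding.
    task_repr = emb(ids(task_texts[0])).mean(0, keepdim=True)
    task_loss = nn.functional.cross_entropy(cls_head(task_repr), task_labels)

    # LM objective (crude masked-LM stand-in): hide one word of a retrieved
    # sentence and predict it from the mean embedding of the remaining words.
    toks = ids(random.choice(retrieved))
    hide = random.randrange(len(toks))
    context = torch.cat([toks[:hide], toks[hide + 1:]])
    lm_repr = emb(context).mean(0, keepdim=True)
    lm_loss = nn.functional.cross_entropy(lm_head(lm_repr), toks[hide:hide + 1])

    loss = task_loss + rho * lm_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the actual framework the model trained from scratch is a Transformer and the language-modeling term is a masked LM objective over the retrieved subset; the sketch only mirrors the overall structure of "retrieve with task data, then jointly optimize the task and LM objectives".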