We study the pre-train + fine-tune strategy for data-to-text tasks. Our experiments indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation, as well as alternative language-model-based pre-training techniques such as BERT and GPT-2. Importantly, T5 pre-training leads to better generalization, as evidenced by large improvements on out-of-domain test sets. We hope our work serves as a useful baseline for future research, as transfer learning becomes ever more prevalent for data-to-text tasks.
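To make the pre-train + fine-tune recipe concrete, the sketch below shows one way to fine-tune a pre-trained T5 checkpoint on a linearized data record using the Hugging Face transformers library. The checkpoint name, linearization format, and toy example are illustrative assumptions rather than the exact configuration used in our experiments.

```python
# Minimal sketch: one fine-tuning step of T5 on a linearized data-to-text example.
# Assumes the Hugging Face `transformers` library; the "attribute[value]"
# linearization and the toy record below are illustrative, not our exact setup.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# A structured record flattened into a plain string (hypothetical example).
source = "name[Aromi] eatType[coffee shop] area[city centre]"
target = "Aromi is a coffee shop in the city centre."

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Standard sequence-to-sequence cross-entropy loss, end to end.
model.train()
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=labels).loss
loss.backward()
optimizer.step()

# After fine-tuning, generation is a single decode call.
model.eval()
with torch.no_grad():
    output_ids = model.generate(inputs.input_ids, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In practice the same loop would run over an entire data-to-text training set rather than a single example; the point of the sketch is that no task-specific pipeline stages (content selection, planning, surface realization) are needed beyond linearizing the input.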