Data-to-text generation has recently attracted substantial interest due to its wide range of applications. Existing methods have shown impressive performance on an array of tasks. However, they rely on a significant amount of labeled data for each task, which is costly to acquire and thus limits their application to new tasks and domains. In this paper, we propose to leverage pre-training and transfer learning to address this issue. We propose knowledge-grounded pre-training (KGPT), which consists of two parts: 1) a general knowledge-grounded generation model that generates knowledge-enriched text, and 2) a pre-training paradigm over a massive knowledge-grounded text corpus crawled from the web. The pre-trained model can be fine-tuned on various data-to-text generation tasks to generate task-specific text. We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness. Under the fully-supervised setting, our model achieves remarkable gains over the known baselines. Under the zero-shot setting, our model, without seeing any training examples, achieves over 30 ROUGE-L on WebNLG while all other baselines fail. Under the few-shot setting, our model needs only about one-fifteenth as many labeled examples to achieve the same level of performance as baseline models. These experiments consistently demonstrate the strong generalization ability of our proposed framework. Code is available at https://github.com/wenhuchen/KGPT.
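The abstract describes a two-stage recipe: pre-train a knowledge-grounded generation model on a large web-crawled knowledge-grounded corpus, then fine-tune it on a (possibly very small) labeled data-to-text dataset such as WebNLG. The sketch below is our own toy illustration of that workflow, not the released KGPT code: the class `KnowledgeGroundedSeq2Seq`, the helpers `train` and `toy_batches`, and all hyperparameters are hypothetical stand-ins chosen only to show the pre-train-then-fine-tune flow over linearized knowledge inputs.

```python
# Minimal sketch (assumed, not the authors' implementation) of the two-stage
# workflow: pre-train a knowledge-grounded seq2seq model on a large corpus,
# then fine-tune on a handful of task-specific examples (few-shot setting).
import torch
import torch.nn as nn

VOCAB, D_MODEL, PAD = 1000, 64, 0  # toy vocabulary/model sizes


class KnowledgeGroundedSeq2Seq(nn.Module):
    """Encodes linearized knowledge (e.g., subject-predicate-object triples)
    and decodes natural-language text conditioned on it."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True)
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, knowledge_ids, text_ids):
        source = self.embed(knowledge_ids)                 # knowledge encoder input
        target = self.embed(text_ids[:, :-1])              # shifted decoder input
        mask = self.transformer.generate_square_subsequent_mask(target.size(1))
        hidden = self.transformer(source, target, tgt_mask=mask)
        return self.lm_head(hidden)                        # next-token logits


def train(model, batches, epochs, lr):
    """Shared teacher-forcing loop used for both pre-training and fine-tuning."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for knowledge_ids, text_ids in batches:
            logits = model(knowledge_ids, text_ids)
            loss = loss_fn(logits.reshape(-1, VOCAB), text_ids[:, 1:].reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()


def toy_batches(num_batches, batch_size=8, know_len=20, text_len=16):
    """Random token ids standing in for (linearized knowledge, text) pairs."""
    return [(torch.randint(1, VOCAB, (batch_size, know_len)),
             torch.randint(1, VOCAB, (batch_size, text_len)))
            for _ in range(num_batches)]


model = KnowledgeGroundedSeq2Seq()
# Stage 1: pre-train on a large web-crawled knowledge-grounded corpus.
train(model, toy_batches(num_batches=20), epochs=1, lr=1e-4)
# Stage 2: fine-tune on a few labeled task examples (few-shot data-to-text).
train(model, toy_batches(num_batches=2), epochs=5, lr=1e-5)
```

In this picture, the zero-shot setting corresponds to skipping stage 2 entirely and decoding with the pre-trained model, while the fully-supervised setting fine-tunes on the complete labeled training set instead of a small subset.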