Few-Shot Table-to-Text Generation with Prefix-Controlled Generator

Neural table-to-text generation approaches are data-hungry, which limits their adaptation to low-resource real-world applications. Previous work mostly resorts to pre-trained language models (PLMs) to generate fluent summaries of a table. However, the generated summaries often contain hallucinated content due to the uncontrolled nature of PLMs. Moreover, the topological differences between tables and sequences are rarely studied. Last but not least, fine-tuning PLMs on a handful of instances may lead to over-fitting and catastrophic forgetting. To alleviate these problems, we propose a prompt-based approach, the Prefix-Controlled Generator (PCG), for few-shot table-to-text generation. We prepend a task-specific prefix for a PLM so that the table structure better fits the pre-trained input. In addition, we generate an input-specific prefix to control the factual content and word order of the generated text. Both automatic and human evaluations on different domains (humans, books and songs) of the Wikibio dataset show substantial improvements over baseline approaches.
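To make the two prefixes concrete, the following is a minimal text-level sketch, not the authors' implementation. It assumes the table is a set of attribute-value pairs, that the task-specific prefix is a fixed marker string, and that the input-specific prefix is an ordered list of table values to mention. The function names, separators, and the `[TABLE2TEXT]` marker are all hypothetical; the actual method may use learned prefix vectors rather than plain text.

```python
from typing import Dict, List


def linearize_table(table: Dict[str, str]) -> str:
    """Flatten attribute-value pairs into one string (hypothetical format)."""
    return " ; ".join(f"{key} : {value}" for key, value in table.items())


def build_input(table: Dict[str, str], task_prefix: str, planned_keys: List[str]) -> str:
    """Compose a PLM input from a task prefix, a content plan, and the table.

    task_prefix  -- assumed fixed marker shared by every example of the task
    planned_keys -- assumed per-example plan: which attributes to mention, in order
    """
    # Input-specific prefix: the selected values, in the intended order of mention.
    plan = " | ".join(table[key] for key in planned_keys if key in table)
    return f"{task_prefix} {plan} => {linearize_table(table)}"


# Illustrative table, not taken from the dataset.
example = {"name": "Ada Lovelace", "occupation": "mathematician", "born": "1815"}
model_input = build_input(example, "[TABLE2TEXT]", ["name", "born"])
print(model_input)
```

In this sketch the plan restricts the generator to values that actually appear in the table, which is one simple way to think about controlling factual content and word order.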

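Few-Shot Learning

Few-shot learning (FSL) addresses how to improve the performance of a neural network when little data is available. One family of methods commonly used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing respectively.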
