Pre-trained language models (PLMs) have achieved remarkable advances in table-to-text generation tasks. However, the lack of labeled domain-specific knowledge and the topology gap between tabular data and text make it difficult for PLMs to yield faithful text. Low-resource generation likewise faces unique challenges in this domain. Inspired by how humans describe tabular data with prior knowledge, we propose a new framework, PromptMize, which targets table-to-text generation under few-shot settings. Our framework consists of two components: a prompt planner and a knowledge adapter. The prompt planner generates a prompt signal that provides instance-level guidance for PLMs, bridging the topology gap between tabular data and text. Moreover, the knowledge adapter memorizes domain-specific knowledge from an unlabelled corpus to supply essential information during generation. Extensive experiments and analyses are conducted on three open-domain few-shot NLG datasets: Humans, Songs, and Books. Compared with previous state-of-the-art approaches, our model achieves remarkable improvements in generation quality, as judged by both human and automatic evaluations.
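To make the two-component design concrete, the following is a minimal, self-contained sketch of how a prompt planner and a knowledge adapter might compose the input to a PLM. Everything here is an illustrative assumption: the class names (PromptPlanner, KnowledgeAdapter), the substring and token-overlap heuristics, and the "[SEP]" joining convention are ours, not the paper's actual implementation.

```python
# Hypothetical sketch of the PromptMize-style pipeline; names and heuristics
# are illustrative assumptions, not the authors' implementation.
from collections import Counter

def linearize(table: dict) -> str:
    """Flatten an attribute-value table into a token sequence for the PLM."""
    return " ; ".join(f"{k} : {v}" for k, v in table.items())

class PromptPlanner:
    """Selects an instance-specific prompt to guide the PLM (assumed behavior)."""
    def __init__(self, candidate_prompts):
        self.candidates = candidate_prompts

    def plan_prompt(self, table: dict) -> str:
        # Naive heuristic: pick the candidate prompt mentioning the most
        # attribute names of this table instance.
        attrs = set(table)
        return max(self.candidates, key=lambda p: sum(a in p for a in attrs))

class KnowledgeAdapter:
    """Retrieves domain knowledge from an unlabelled corpus (assumed behavior)."""
    def __init__(self, corpus):
        self.corpus = corpus

    def retrieve(self, table: dict) -> str:
        # Token-overlap retrieval as a crude stand-in for a learned memory.
        query = Counter(linearize(table).lower().split())
        def overlap(sentence: str) -> int:
            return sum((query & Counter(sentence.lower().split())).values())
        return max(self.corpus, key=overlap)

def build_plm_input(table: dict, planner: PromptPlanner,
                    adapter: KnowledgeAdapter) -> str:
    """Compose prompt signal + retrieved knowledge + linearized table."""
    return " [SEP] ".join([planner.plan_prompt(table),
                           adapter.retrieve(table),
                           linearize(table)])

if __name__ == "__main__":
    table = {"name": "Ada Lovelace", "occupation": "mathematician",
             "birth_date": "10 December 1815"}
    planner = PromptPlanner(["describe the person given name and occupation",
                             "describe the song given title and artist"])
    adapter = KnowledgeAdapter(["Ada Lovelace was an English mathematician.",
                                "The song was released in 1999."])
    print(build_plm_input(table, planner, adapter))
```

In the actual framework the prompt signal and the memorized knowledge would be produced by learned components rather than these heuristics; the sketch only shows how the two signals could be fused with the linearized table before decoding.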