Although neural table-to-text methods have made remarkable progress, limited source tables cause generalization problems that hinder the applicability of these models. Large-scale pretrained language models are a promising way to address these problems. However, how to bridge the gap between structured tables and text input, so that table information is fully exploited to fuel the pretrained model, is still not well explored. In addition, integrating a deliberation mechanism into a text-to-text pretrained model for the table-to-text task has seldom been studied. In this paper, to realize table-to-text generation with a pretrained language model, we propose a table structure understanding and text deliberating approach, namely TASD. Specifically, we devise a three-layered multi-head attention network to realize a table-structure-aware text generation model with the help of the pretrained language model. Furthermore, we adopt a multi-pass decoder framework to enhance the capability of polishing the generated text for table descriptions. Empirical studies and human evaluation on two public datasets validate that our approach can generate faithful and fluent descriptive texts for different types of tables.