Abstract meaning representation (AMR) highlights the core semantic information of text in a graph structure. Recently, pre-trained language models (PLMs) have advanced the tasks of AMR parsing and AMR-to-text generation. However, PLMs are typically pre-trained on textual data and are thus sub-optimal for modeling structural knowledge. To this end, we investigate graph self-supervised training to improve the structure awareness of PLMs over AMR graphs. In particular, we introduce two graph auto-encoding strategies for graph-to-graph pre-training and four tasks to integrate text and graph information during pre-training. We further design a unified framework to bridge the gap between pre-training and fine-tuning tasks. Experiments on both AMR parsing and AMR-to-text generation show the superiority of our model. To our knowledge, we are the first to consider pre-training on semantic graphs.
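To give an intuition for graph auto-encoding over AMR, the following is a minimal sketch, not the paper's exact corruption scheme: it assumes AMR graphs are linearized into token sequences and shows one possible denoising objective in which concept nodes are masked while structural tokens (brackets and relation labels) are kept, so that a sequence-to-sequence PLM must reconstruct graph content from graph structure. All function and token names here are illustrative assumptions.

```python
import random

def mask_linearized_amr(tokens, mask_token="<mask>", mask_prob=0.35, seed=0):
    """Corrupt a linearized AMR graph for a denoising auto-encoding objective.

    Concept nodes are replaced with a mask token with probability
    `mask_prob`; structural tokens (parentheses and relation labels such
    as ':ARG0') are preserved. The corrupted sequence would serve as the
    encoder input, with the original linearization as the decoder target.
    """
    rng = random.Random(seed)
    corrupted = []
    for tok in tokens:
        is_structure = tok in ("(", ")") or tok.startswith(":")
        if not is_structure and rng.random() < mask_prob:
            corrupted.append(mask_token)
        else:
            corrupted.append(tok)
    return corrupted

# Linearized AMR for "The boy wants to go."
amr = "( want-01 :ARG0 ( boy ) :ARG1 ( go-02 :ARG0 ( boy ) ) )".split()
print(" ".join(mask_linearized_amr(amr)))
# e.g. "( <mask> :ARG0 ( boy ) :ARG1 ( go-02 :ARG0 ( <mask> ) ) )"
```

In such a setup, training the encoder-decoder to recover the masked concepts is one concrete instance of graph-to-graph self-supervised pre-training; the paper's actual strategies and the four text-graph integration tasks are described in the body of the work.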