The task of verbalizing RDF triples has grown in popularity with the rising ubiquity of Knowledge Bases (KBs). The RDF triple formalism is a simple and efficient way to store facts at large scale, but its abstract representation makes it difficult for humans to interpret. To address this, the WebNLG challenge aims to promote automated RDF-to-text generation. We propose pre-training the Transformer model on augmented data produced by a data augmentation strategy. Our experimental results show minimum relative increases of 3.73%, 126.05%, and 88.16% in BLEU score for seen categories, unseen entities, and unseen categories, respectively, over standard training.
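To make the setting concrete, the following is a minimal sketch of how an RDF triple might be linearized into a source sequence for a sequence-to-sequence model, and how a relative BLEU increase is computed. The `Triple` type, the `<S>`/`<P>`/`<O>` markers, the example triple, and the numbers passed to `relative_increase` are illustrative assumptions, not the format or scores used in this work.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    """One RDF fact: (subject, predicate, object). Hypothetical helper type."""
    subject: str
    predicate: str
    obj: str


def linearize(triples: Iterable[Triple]) -> str:
    """Flatten triples into one token sequence using assumed role markers."""
    return " ".join(
        f"<S> {t.subject} <P> {t.predicate} <O> {t.obj}" for t in triples
    )


def relative_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Illustrative triple; a verbalization could read "Alan Bean was a test pilot."
source = linearize([Triple("Alan_Bean", "occupation", "Test_pilot")])
print(source)

# Made-up scores, used only to show the arithmetic behind a relative gain.
print(relative_increase(40.0, 50.0))
```

A relative increase above 100%, as reported for unseen entities, means the score more than doubled with respect to the baseline; it does not by itself indicate a high absolute BLEU score.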