Relation Extraction (RE) from tables is the task of identifying the relations that hold between pairs of columns. RE models for this task generally require labelled tables for training. Such tables can be generated artificially from a Knowledge Graph (KG), which makes them much cheaper to acquire than manual annotations. However, synthetic tables have one drawback compared to real tables: they lack associated metadata, such as column headers and captions, because the KGs they are created from do not store such metadata. This is a limitation because metadata can provide strong signals for RE from tables. To address this issue, we propose methods to artificially create some of this metadata for synthetic tables. We then experiment with an RE model that uses the artificial metadata as input. Our empirical results show that this leads to an absolute improvement of 9\%-45\% in F1 score on two tabular datasets.
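To make the setting concrete, the following is a minimal sketch of how a labelled two-column table might be derived from KG triples and then given an artificial column header. It is an illustration under stated assumptions, not the method proposed here: the function names `build_synthetic_tables` and `add_artificial_headers`, the toy triples, the `entity_types` lookup, and the "most frequent entity type" heuristic for headers are all hypothetical.

```python
from collections import Counter, defaultdict


def build_synthetic_tables(triples):
    """Group (subject, relation, object) triples into one two-column table per relation.

    The relation name serves as the table's label, so no manual annotation is needed.
    """
    rows_by_relation = defaultdict(list)
    for subject, relation, obj in triples:
        rows_by_relation[relation].append((subject, obj))
    return [
        {"rows": rows, "label": relation}
        for relation, rows in rows_by_relation.items()
    ]


def add_artificial_headers(table, entity_types):
    """Attach one header per column.

    Hypothetical heuristic (an assumption, not taken from the paper):
    use the most frequent entity type among the cells of each column.
    """
    headers = []
    for column in zip(*table["rows"]):
        type_counts = Counter(entity_types.get(cell, "unknown") for cell in column)
        headers.append(type_counts.most_common(1)[0][0])
    return {**table, "headers": headers}


# Toy KG: triples plus a type lookup for each entity (both invented for illustration).
triples = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Marie Curie", "born_in", "Warsaw"),
]
entity_types = {
    "Paris": "city",
    "Berlin": "city",
    "Warsaw": "city",
    "France": "country",
    "Germany": "country",
    "Marie Curie": "person",
}

tables = [
    add_artificial_headers(table, entity_types)
    for table in build_synthetic_tables(triples)
]
```

In this sketch each resulting table carries its rows, its relation label, and a generated header per column, which is the kind of input an RE model that consumes metadata would expect.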