Salience estimation aims to predict the importance of terms in a document. Because human-annotated datasets are scarce and salience is inherently subjective, previous studies typically generate pseudo-ground truth for evaluation. However, our investigation reveals that the evaluation protocol proposed by prior work is difficult to replicate, which has led to few follow-up studies. Moreover, the evaluation process itself is problematic: the entity linking tool used for entity matching is very noisy, and ignoring event arguments during event evaluation inflates performance. In this work, we propose a lightweight yet practical evaluation protocol for entity and event salience estimation, which relies on a more reliable syntactic dependency parser. We also conduct a comprehensive analysis of popular entity and event definition standards and present our own definitions for the salience estimation task to reduce noise during pseudo-ground truth generation. Furthermore, we construct dependency-based heterogeneous graphs to capture the interactions between entities and events. Empirical results show that both the baseline methods and a novel GNN method that uses the heterogeneous graph consistently outperform the previous SOTA model on all proposed metrics.