Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in E-commerce

In e-commerce, knowing which commonsense knowledge (CSK) is salient benefits a wide range of applications, such as product search and recommendation. For example, users who search for "running" on an e-commerce platform want items closely related to running, such as "running shoes" rather than "shoes". However, many existing CSK collections rank statements solely by confidence scores and provide no information about which statements are salient from a human perspective. In this work, we define the task of supervised salience evaluation: given a CSK triple, a model must learn to predict whether the triple is salient. In addition to formulating this new task, we release a new Benchmark dataset of Salience Evaluation in E-commerce (BSEE), which we hope will promote research on commonsense knowledge salience evaluation. We conduct experiments on the dataset with several representative baseline models. The experimental results show that salience evaluation is a hard task: models perform poorly on our evaluation set. We further propose a simple but effective approach, PMI-tuning, which shows promise for solving this novel problem.
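The abstract names PMI-tuning but does not describe it, so the sketch below does not reproduce the authors' method. It only illustrates the standard pointwise mutual information (PMI) score that the name presumably refers to, on the "running" example above. The co-occurrence counts, the `pmi` helper, and the variable names are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

def pmi(pair_counts, head_counts, tail_counts, total, head, tail):
    """Standard PMI: log( P(head, tail) / (P(head) * P(tail)) )."""
    joint = pair_counts[(head, tail)] / total
    if joint == 0:
        return float("-inf")  # the pair was never observed together
    p_head = head_counts[head] / total
    p_tail = tail_counts[tail] / total
    return math.log(joint / (p_head * p_tail))

# Hypothetical (query, item) co-occurrences; invented toy data, not from BSEE.
observations = (
    [("running", "running shoes")] * 8
    + [("running", "shoes")] * 2
    + [("hiking", "shoes")] * 6
    + [("office", "shoes")] * 4
)

pair_counts = Counter(observations)
head_counts = Counter(head for head, _ in observations)
tail_counts = Counter(tail for _, tail in observations)
total = len(observations)

specific_score = pmi(pair_counts, head_counts, tail_counts, total,
                     "running", "running shoes")
generic_score = pmi(pair_counts, head_counts, tail_counts, total,
                    "running", "shoes")

print(f"PMI(running, running shoes) = {specific_score:.3f}")
print(f"PMI(running, shoes)         = {generic_score:.3f}")
```

On these toy counts the specific item scores higher than the generic one, because "shoes" co-occurs with many other queries while "running shoes" appears only with "running". This matches the intuition in the example, although a real salience model would be trained on labeled triples rather than raw counts.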
