Conversational recommender systems (CRS) that interact with users in natural language often rely on recommendation dialogs that were previously collected with the help of paired humans, where one person plays the role of a seeker and the other that of a recommender. These recommendation dialogs include items and entities that indicate the users' preferences. To precisely model the seekers' preferences and respond consistently, CRS typically rely on item and entity annotations. A recent example of such a dataset is INSPIRED, which consists of recommendation dialogs for sociable conversational recommendation and in which items and entities were annotated using automatic keyword or pattern-matching techniques. An analysis of this dataset unfortunately revealed a substantial number of cases where items and entities were either wrongly annotated or annotations were missing altogether. This raises the question of how effective automatic annotation techniques actually are. Moreover, it is important to study the impact of annotation quality on the overall effectiveness of a CRS in terms of the quality of the system's responses. To study these aspects, we manually fixed the annotations in INSPIRED. We then evaluated the performance of several benchmark CRS using both versions of the dataset. Our analyses suggest that the improved version of the dataset, i.e., INSPIRED2, helped increase the performance of several benchmark CRS, emphasizing the importance of data quality both for end-to-end learning and for retrieval-based approaches to conversational recommendation. We release our improved dataset (INSPIRED2) publicly at https://github.com/ahtsham58/INSPIRED2.
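To make the failure mode concrete, below is a minimal sketch of the kind of keyword/pattern-based annotation the abstract refers to. It is not the INSPIRED authors' actual pipeline; the title list, utterances, and the `annotate` function are hypothetical, chosen only to show how exact matching yields both false positives and missed mentions.

```python
# Minimal sketch (assumption: a simple exact-match heuristic, not the
# INSPIRED annotation pipeline) of keyword/pattern-based item annotation.
# All titles and utterances are hypothetical illustrations.
import re

MOVIE_TITLES = ["It", "Up", "Frozen", "The Godfather"]

def annotate(utterance: str, titles: list[str]) -> list[str]:
    """Tag every word-bounded title occurrence as a movie mention."""
    found = []
    for title in titles:
        # Word-boundary match only; no disambiguation against ordinary words.
        if re.search(rf"\b{re.escape(title)}\b", utterance):
            found.append(title)
    return found

# False positive: "It" here is a pronoun, not the horror film.
print(annotate("It was a fun weekend overall.", MOVIE_TITLES))        # ['It']

# Missed annotation: a typo in the title defeats exact matching.
print(annotate("I loved The Godfathr, what a classic!", MOVIE_TITLES))  # []
```

Errors of exactly these two kinds (spurious matches on common words and missed mentions due to surface variation) are what the manual correction pass in INSPIRED2 is meant to eliminate.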