Reviews contain rich information about product characteristics and user interests and are thus commonly used to boost recommender system performance. Specifically, previous work shows that jointly learning to generate reviews improves rating prediction performance. Meanwhile, these model-produced reviews serve as recommendation explanations, offering the user insight into the predicted ratings. However, while existing models can generate fluent, human-like reviews, it is unclear to what degree these reviews actually uncover the rationale behind the jointly predicted rating. In this work, we perform a series of evaluations that probe state-of-the-art models and their review generation component. We show that the generated explanations are brittle and need further evaluation before being taken as literal rationales for the estimated ratings.