Many multimodal recommender systems have been proposed to exploit the rich side information associated with users or items (e.g., user reviews and item images) to learn better user and item representations and improve recommendation performance. Studies in psychology show that users differ in how they utilize various modalities to organize information. Therefore, for a given factor of an item (such as appearance or quality), the features of different modalities are of varying importance to a user. However, existing methods ignore the fact that different modalities contribute differently to a user's preference on the various factors of an item. In light of this, we propose a novel Disentangled Multimodal Representation Learning (DMRL) recommendation model, which captures a user's attention to different modalities on each factor when modeling user preference. In particular, we employ a disentangled representation technique to ensure that the features of different factors in each modality are independent of each other. A multimodal attention mechanism is then designed to capture the user's modality preference for each factor. Based on the weights estimated by the attention mechanism, we make recommendations by combining the user's preference scores for each factor of the target item across different modalities. Extensive experiments on five real-world datasets demonstrate the superiority of our method over existing methods.
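To make the scoring step concrete, the sketch below illustrates (in PyTorch-style Python) how per-factor, per-modality preference scores could be combined with attention weights over modalities. It is a minimal illustration, not the authors' implementation: the tensor shapes, the linear-plus-softmax attention, and the names `FactorModalityScorer`, `user_feats`, and `item_feats` are assumptions introduced here for clarity.

```python
# Minimal sketch (assumed design, not the authors' code) of attention-weighted
# combination of per-factor, per-modality preference scores.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FactorModalityScorer(nn.Module):
    def __init__(self, num_modalities: int, num_factors: int, dim: int):
        super().__init__()
        # Attention network producing one logit per (factor, modality) pair
        # from the concatenated user/item factor features of that modality.
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, user_feats: torch.Tensor, item_feats: torch.Tensor) -> torch.Tensor:
        # user_feats, item_feats: (batch, num_modalities, num_factors, dim),
        # assumed to be already disentangled so factors within a modality are independent.
        pair = torch.cat([user_feats, item_feats], dim=-1)   # (B, M, K, 2*dim)
        logits = self.attn(pair).squeeze(-1)                  # (B, M, K)
        weights = F.softmax(logits, dim=1)                    # attention over modalities per factor
        scores = (user_feats * item_feats).sum(dim=-1)        # per-modality, per-factor preference score
        return (weights * scores).sum(dim=(1, 2))             # combine over modalities and factors


# Usage: score 4 user-item pairs with 3 modalities, 5 factors, 16-dim features.
model = FactorModalityScorer(num_modalities=3, num_factors=5, dim=16)
u = torch.randn(4, 3, 5, 16)
v = torch.randn(4, 3, 5, 16)
print(model(u, v).shape)  # torch.Size([4])
```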