Recommendation systems have become popular and effective tools for helping users discover items of interest by modeling user preferences and item properties from implicit interactions (e.g., purchases and clicks). Humans perceive the world by processing signals from multiple modalities (e.g., audio, text, and images), which has inspired researchers to build recommender systems that can understand and interpret data from different modalities. Such models can capture hidden relations between modalities and may recover complementary information that a uni-modal approach and implicit interactions cannot capture. The goal of this survey is to provide a comprehensive review of recent research on multimodal recommendation. Specifically, it presents a clear pipeline with the techniques commonly used at each step and classifies the models by the methods they use. Additionally, we have designed a code framework that helps researchers new to this area understand the principles and techniques and easily run the SOTA models. Our framework is located at: https://github.com/enoche/MMRec