To improve users' experience as they navigate the myriad options offered by online marketplaces, it is essential to have well-organized product catalogs. One key ingredient is the availability of product attributes such as color or material. However, on some marketplaces such as Rakuten-Ichiba, which we focus on, attribute information is often incomplete or even missing. One promising solution to this problem is to rely on deep models pre-trained on large corpora to predict attributes from unstructured data, such as product descriptions and images (referred to as modalities in this paper). However, we find that achieving satisfactory performance with this approach is not straightforward; rather, it is the result of several refinements, which we discuss in this paper. We provide a detailed description of our approach to attribute extraction, from investigating strong single-modality methods to building a solid multimodal model that combines textual and visual information. One key component of our multimodal architecture is a novel approach to seamlessly combining modalities, which is inspired by our single-modality investigations. In practice, we notice that this new modality-merging method may suffer from modality collapse, i.e., it neglects one modality. Hence, we further propose a mitigation for this problem based on a principled regularization scheme. Experiments on Rakuten-Ichiba data provide empirical evidence for the benefits of our approach, which has also been successfully deployed to Rakuten-Ichiba. We also report results on publicly available datasets showing that our model is competitive with several recent multimodal and unimodal baselines.