Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval

Same-style product retrieval plays an important role on e-commerce platforms. It aims to identify identical products that may have different text descriptions or images, and it can be used to retrieve similar products from different suppliers or to detect duplicate products from a single supplier. Common methods treat the image as the object to be matched, so they consider only visual features and overlook the attribute information contained in textual descriptions. They perform poorly in industries where images matter less, such as machinery, hardware tools, and electronic components, even when an additional text-matching module is added. In this paper, we propose a unified vision-language modeling method for e-commerce same-style product retrieval, designed to represent a product by both its textual descriptions and its visual content. It comprises a sampling strategy that collects positive pairs from user click logs under category and relevance constraints, and a novel contrastive loss unit that maps the image, text, and image+text representations into one joint embedding space. The model supports cross-modal product-to-product retrieval, as well as style transfer and user-interactive search. Offline evaluations on annotated data demonstrate superior retrieval performance, and online tests show that it attracts more clicks and conversions. Moreover, the model has already been deployed online for similar-product retrieval on alibaba.com, the largest B2B e-commerce platform in the world.
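The abstract does not spell out the sampling rule or the exact form of the loss, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes click-log records with hypothetical fields (`query_category`, `clicked_category`, `relevance`) and a hypothetical relevance threshold, and it swaps in a standard InfoNCE contrastive loss summed over every pairing of the image, text, and image+text ("fused") views. The paper's actual loss unit may differ.

```python
import math
from itertools import product


def select_positive_pairs(click_log, min_relevance=0.8):
    """Keep clicked pairs that share a category and pass a relevance threshold.

    Field names and the threshold are hypothetical, not taken from the paper.
    """
    return [
        (rec["query_product"], rec["clicked_product"])
        for rec in click_log
        if rec["query_category"] == rec["clicked_category"]
        and rec["relevance"] >= min_relevance
    ]


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(queries, keys, temperature=0.07):
    """Mean InfoNCE loss: queries[i] and keys[i] are a positive pair,
    and every other key in the batch serves as a negative."""
    queries = [normalize(q) for q in queries]
    keys = [normalize(k) for k in keys]
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)


def unified_loss(batch_a, batch_b, temperature=0.07):
    """Sum InfoNCE over all view pairings so the three views share one space.

    batch_a and batch_b map a view name to a list of embeddings; index i in
    both batches refers to the two products of positive pair i. Using all nine
    pairings is an assumption made for illustration.
    """
    views = ("image", "text", "fused")
    return sum(
        info_nce(batch_a[u], batch_b[v], temperature)
        for u, v in product(views, views)
    )
```

Because every view is trained against every other view, a query encoded from an image alone, text alone, or both can be matched against products indexed with any of the three representations, which is the property that cross-modal product-to-product retrieval relies on.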

