MAKE: Product Retrieval with Vision-Language Pre-training in Taobao Search - 专知论文

January 30, 2023

MAKE: Product Retrieval with Vision-Language Pre-training in Taobao Search

Xiaoyang Zheng,Zilong Wang,Sen Li,Ke Xu,Tao Zhuang,Qingwen Liu,Xiaoyi Zeng

From arXiv; 5 pages; accepted to the Industry Track of The Web Conference 2023

Taobao Search consists of two phases: the retrieval phase and the ranking phase. Given a user query, the retrieval phase returns a subset of candidate products for the following ranking phase. Recently, the paradigm of pre-training and fine-tuning has shown its potential in incorporating visual clues into retrieval tasks. In this paper, we focus on solving the problem of text-to-multimodal retrieval in Taobao Search. We consider that users' attention to titles or images varies across products. Hence, we propose a novel Modal Adaptation module for cross-modal fusion, which helps assign appropriate weights to texts and images across products. Furthermore, in e-commerce search, user queries tend to be brief and thus lead to a significant semantic imbalance between user queries and product titles. Therefore, we design a separate text encoder and a Keyword Enhancement mechanism to enrich the query representations and improve text-to-multimodal matching. To this end, we present a novel vision-language (V+L) pre-training method to exploit the multimodal information of (user query, product title, product image). Extensive experiments demonstrate that our retrieval-specific pre-training model (referred to as MAKE) outperforms existing V+L pre-training methods on the text-to-multimodal retrieval task. MAKE has been deployed online and brings major improvements to the retrieval system of Taobao Search.
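The abstract does not describe how the Modal Adaptation module computes its weights, so the following is only an illustrative sketch of the general idea of per-product weighting of text and image embeddings, not the authors' method. It assumes a simple softmax gate over two scalar scores; the function names (`fuse_modalities`, `retrieve`) and the gate vectors (`gate_text`, `gate_image`) are hypothetical and stand in for parameters that a real model would learn.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit length (returns a copy if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse_modalities(text_emb, image_emb, gate_text, gate_image):
    """Hypothetical gate (an assumption, not the paper's module):
    score each modality, turn the two scores into weights with a softmax,
    and return the normalized weighted sum together with the weights."""
    w_text, w_image = softmax([dot(gate_text, text_emb),
                               dot(gate_image, image_emb)])
    fused = [w_text * t + w_image * i for t, i in zip(text_emb, image_emb)]
    return l2_normalize(fused), (w_text, w_image)


def retrieve(query_emb, product_embs, k):
    """Return the indices of the k products most similar to the query.
    Product embeddings are assumed to be unit-length already, so the
    dot product with the normalized query is the cosine similarity."""
    q = l2_normalize(query_emb)
    scored = sorted(((dot(q, p), idx) for idx, p in enumerate(product_embs)),
                    reverse=True)
    return [idx for _, idx in scored[:k]]
```

Because the weights are computed from each product's own embeddings, a product whose title is more informative can lean on text while another leans on its image, which is the behavior the abstract attributes to adaptive cross-modal fusion. In this sketch the fused product vectors would be indexed once, and each query embedding compared against them at retrieval time.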

