Transformers and CNNs both Beat Humans on SBIR - Zhuanzhi Papers


September 14, 2022

Transformers and CNNs both Beat Humans on SBIR


Omar Seddati,Stéphane Dupont,Saïd Mahmoudi,Thierry Dutoit

Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and the spatial configuration of hand-drawn sketch queries. The universality of sketches extends the scope of possible applications and increases the demand for efficient SBIR solutions. In this paper, we study classic triplet-based SBIR solutions and show that a persistent invariance to horizontal flip (even after model finetuning) harms performance. To overcome this limitation, we propose several approaches and evaluate each of them in depth to check its effectiveness. Our main contributions are twofold: (1) we propose and evaluate several intuitive modifications to build SBIR solutions with better flip equivariance; (2) we show that vision transformers are better suited to the SBIR task and that they outperform CNNs by a large margin. We carried out numerous experiments and introduce the first models to outperform human performance on a large-scale SBIR benchmark (Sketchy). Our best model achieves a recall of 62.25% (at k = 1) on the Sketchy benchmark, compared to 46.2% for the previous state-of-the-art methods.
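As a rough illustration of the quantities the abstract refers to, the following NumPy sketch shows a generic hinge triplet loss, a recall-at-k computation for sketch-to-photo retrieval, and a simple probe for horizontal-flip invariance. This is not the authors' implementation: the function names (`triplet_loss`, `recall_at_k`, `flip_gap`), the squared Euclidean distance, the margin value, and the image layout `(N, H, W)` are all assumptions made for the example.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss on squared Euclidean distances, averaged over the batch.

    Assumed form: max(0, d(a, p) - d(a, n) + margin); the margin is hypothetical.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def recall_at_k(sketch_emb, photo_emb, true_idx, k=1):
    """Fraction of sketch queries whose matching photo is among the k nearest photos."""
    # Pairwise squared distances, shape (num_sketches, num_photos).
    diff = sketch_emb[:, None, :] - photo_emb[None, :, :]
    dists = np.sum(diff ** 2, axis=2)
    # Indices of the k closest photos for every sketch query.
    top_k = np.argsort(dists, axis=1)[:, :k]
    hits = np.any(top_k == np.asarray(true_idx)[:, None], axis=1)
    return float(np.mean(hits))


def flip_gap(embed_fn, images):
    """Mean embedding distance between images and their horizontal mirrors.

    A value near zero suggests the embedding is invariant to horizontal flip.
    Images are assumed to have shape (N, H, W); embed_fn is any callable
    mapping such a batch to an (N, D) array.
    """
    flipped = images[:, :, ::-1]
    gaps = np.linalg.norm(embed_fn(images) - embed_fn(flipped), axis=1)
    return float(np.mean(gaps))


if __name__ == "__main__":
    photos = np.eye(3)
    sketches = np.array([[0.9, 0.1, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.6, 0.0, 0.4]])
    print("recall@1:", recall_at_k(sketches, photos, [0, 1, 2], k=1))
    print("recall@2:", recall_at_k(sketches, photos, [0, 1, 2], k=2))
```

In this toy setup the third sketch lies closer to the wrong photo, so it counts as a miss at k = 1 but as a hit at k = 2. A probe like `flip_gap` is one simple way to check whether a trained embedding still ignores left-right orientation.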

