Semantic search is an important task whose objective is to find, for a given query, the relevant indexed documents in a database. It requires a retrieval model that can properly learn the semantics of sentences. Transformer-based models are widely used as retrieval models due to their excellent ability to learn semantic representations, and many regularization methods suited to them have been proposed. In this paper, we propose a new regularization method, Regularized Contrastive Learning, which helps transformer-based models learn better sentence representations. It first augments several different semantic representations for every sentence, then takes them into the contrastive objective as regularizers. These contrastive regularizers can overcome overfitting and alleviate the anisotropy problem. We first evaluate our approach on 7 semantic search benchmarks with the strong pre-trained model SRoBERTA. The results show that our method is more effective at learning superior sentence representations. We then evaluate our approach on 2 challenging FAQ datasets, Cough and Faqir, whose queries and indexed documents are long. Our experimental results demonstrate that our method outperforms the baseline methods.
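To make the core idea concrete, below is a minimal PyTorch sketch of a contrastive objective in which extra augmented views of each sentence act as regularizing terms. This is an illustration under stated assumptions, not the paper's implementation: the augmentation mechanism (here, simply multiple encoder passes yielding different embeddings, e.g., via different dropout masks) and the names `regularized_contrastive_loss`, `n_views`, and `reg_weight` are hypothetical.

```python
# Hedged sketch: InfoNCE-style contrastive loss where the first
# (query, index) view pair gives the main retrieval loss and the
# remaining augmented view pairs enter the objective as regularizers.
# All names and the weighting scheme are illustrative assumptions.

import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """Standard InfoNCE loss: matching rows of `a` and `b` are positives,
    all other in-batch pairs are negatives."""
    sim = F.cosine_similarity(a.unsqueeze(1), b.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(sim, labels)


def regularized_contrastive_loss(query_views, index_views, reg_weight=0.1):
    """`query_views` / `index_views`: lists of [batch, dim] embeddings,
    one entry per augmentation (e.g., per dropout pass through the
    encoder). The first pair is the main contrastive loss; the rest
    are contrastive regularizers added with weight `reg_weight`."""
    main = info_nce(query_views[0], index_views[0])
    reg = sum(info_nce(q, i) for q, i in zip(query_views[1:], index_views[1:]))
    return main + reg_weight * reg


if __name__ == "__main__":
    torch.manual_seed(0)
    batch, dim, n_views = 8, 32, 3
    # Stand-ins for encoder outputs under different augmentations.
    q_views = [torch.randn(batch, dim) for _ in range(n_views)]
    i_views = [torch.randn(batch, dim) for _ in range(n_views)]
    print(regularized_contrastive_loss(q_views, i_views).item())
```

In an actual training loop, each list entry would come from re-encoding the same sentence batch so that the views differ only through the augmentation, which is what lets the extra terms regularize the learned representation space.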