Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for training such dual encoders is an accurate estimation of the gradient contribution of the partition function of the softmax over the large output space; this requires finding the negative targets that contribute most significantly ("hard negatives"). Since dual encoder model parameters change during training, traditional static nearest neighbor indexes can be sub-optimal: they (1) periodically require expensive re-building of the index, which in turn requires (2) expensive re-encoding of all targets with the updated model parameters. This paper addresses both challenges. First, we introduce an algorithm that approximates the softmax with provable bounds using a tree structure, and that maintains the tree dynamically as parameters change. Second, we approximate the effect of a gradient update on target encodings with an efficient Nyström low-rank approximation. In our empirical study on datasets with over twenty million targets, our approach cuts the error in half relative to oracle brute-force negative mining. Furthermore, our method surpasses the prior state of the art while using 150× less accelerator memory.
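For context, a worked form of the gradient the abstract refers to (the score notation $s_\theta$ is introduced here for illustration, not taken from the paper): for a query $q$ with positive target $t^+$ over target set $T$, the partition-function term is an expectation over all targets, dominated by the highest-probability ones, which is why hard-negative mining matters.

```latex
% Gradient of the log-softmax; the second (partition-function) term is
% dominated by the highest-probability targets, i.e., the hard negatives.
\nabla_\theta \log p_\theta(t^+ \mid q)
  = \nabla_\theta s_\theta(q, t^+)
  \;-\; \sum_{t \in T} p_\theta(t \mid q)\, \nabla_\theta s_\theta(q, t),
\qquad
p_\theta(t \mid q) = \frac{\exp s_\theta(q, t)}{\sum_{t' \in T} \exp s_\theta(q, t')}
```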
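And a minimal sketch of the second idea, assuming a plain linear kernel over the stale encodings and uniformly sampled landmark targets; the function name, kernel choice, and ridge regularizer are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def nystrom_refresh(E_old, landmark_idx, E_landmark_new, reg=1e-6):
    """Nystrom-style refresh of stale target encodings after a gradient step.

    Only the m landmark targets are re-encoded exactly; the encoding drift of
    the remaining targets is extrapolated via a low-rank kernel approximation.
    Assumptions (not from the paper): linear kernel on stale encodings,
    ridge regularizer `reg` for numerical stability.

    E_old          : (n, d) stale encodings of all targets.
    landmark_idx   : (m,)  indices of the re-encoded landmark targets.
    E_landmark_new : (m, d) fresh landmark encodings under the new parameters.
    """
    E_lm_old = E_old[landmark_idx]                 # (m, d) stale landmark rows
    delta_lm = E_landmark_new - E_lm_old           # exact drift at landmarks
    K_mm = E_lm_old @ E_lm_old.T                   # (m, m) landmark kernel
    K_nm = E_old @ E_lm_old.T                      # (n, m) cross kernel
    # Nystrom extrapolation: delta ~= K_nm K_mm^{-1} delta_lm.
    # At the landmark rows this reproduces delta_lm exactly (up to reg).
    m = len(landmark_idx)
    coef = np.linalg.solve(K_mm + reg * np.eye(m), delta_lm)
    return E_old + K_nm @ coef

# Toy usage: refresh 20k stale encodings from 64 exactly re-encoded landmarks.
rng = np.random.default_rng(0)
E_old = rng.normal(size=(20_000, 128))
landmarks = rng.choice(20_000, size=64, replace=False)
E_new_landmarks = E_old[landmarks] + 0.01 * rng.normal(size=(64, 128))
E_refreshed = nystrom_refresh(E_old, landmarks, E_new_landmarks)
```

The appeal of such a scheme is that only the landmarks pass through the (expensive) updated encoder; every other target is refreshed with two matrix multiplies and one small linear solve.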