Sparsity-Constrained Optimal Transport - 专知 (Zhuanzhi) Papers

Transport · Regularization · Optimality · Sparsity · Constraints

April 14, 2023

Sparsity-Constrained Optimal Transport


Tianlin Liu, Joan Puigcerver, Mathieu Blondel

From arXiv. Camera-ready version, ICLR 2023.

Regularized optimal transport (OT) is now increasingly used as a loss or as a matching layer in neural networks. Entropy-regularized OT can be computed using the Sinkhorn algorithm, but it leads to fully dense transportation plans, meaning that all sources are (fractionally) matched with all targets. To address this issue, several works have investigated quadratic regularization instead. This regularization preserves sparsity and leads to unconstrained and smooth (semi) dual objectives, which can be solved with off-the-shelf gradient methods. Unfortunately, quadratic regularization does not give direct control over the cardinality (number of nonzeros) of the transportation plan. In this paper, we propose a new approach for OT with explicit cardinality constraints on the transportation plan. Our work is motivated by an application to sparse mixture of experts, where OT can be used to match input tokens such as image patches with expert models such as neural networks. Cardinality constraints ensure that at most $k$ tokens are matched with an expert, which is crucial for computational performance reasons. Despite the nonconvexity of cardinality constraints, we show that the corresponding (semi) dual problems are tractable and can be solved with first-order gradient methods. Our method can be thought of as a middle ground between unregularized OT (recovered in the limit case $k=1$) and quadratically-regularized OT (recovered when $k$ is large enough). The smoothness of the objectives increases as $k$ increases, giving rise to a trade-off between convergence speed and sparsity of the optimal plan.
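
To make the cardinality constraint concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that a single column of the plan is obtained by Euclidean projection of a score vector onto the set of nonnegative vectors that sum to a fixed mass and have at most $k$ nonzeros. A standard way to compute that projection is to keep the $k$ largest entries and project them onto the simplex. The function names `project_simplex` and `project_sparse_simplex`, the `mass` parameter, and the example scores are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_simplex(v, mass=1.0):
    """Euclidean projection of v onto {p >= 0, sum(p) = mass} (sort-based method)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                   # entries in decreasing order
    css = np.cumsum(u) - mass              # shifted cumulative sums
    idx = np.arange(1, v.size + 1)
    support = u - css / idx > 0            # true for entries that stay positive
    rho = idx[support][-1]                 # size of the support
    tau = css[rho - 1] / rho               # threshold subtracted from every entry
    return np.maximum(v - tau, 0.0)


def project_sparse_simplex(v, k, mass=1.0):
    """Projection onto the simplex with at most k nonzeros (assumed form).

    Keeps the k largest entries of v, projects them onto the simplex,
    and sets every other entry to zero.
    """
    v = np.asarray(v, dtype=float)
    top = np.argpartition(-v, k - 1)[:k]   # indices of the k largest entries
    out = np.zeros_like(v)
    out[top] = project_simplex(v[top], mass)
    return out


# Hypothetical affinity scores between one expert and four tokens.
scores = np.array([0.5, 0.1, 0.4, 0.3])
for k in (1, 2, 4):
    column = project_sparse_simplex(scores, k)
    print(k, column, np.count_nonzero(column))
```

In this sketch, `k=1` yields a one-hot column and `k` equal to the vector length yields the ordinary simplex projection, which loosely mirrors the two limiting cases named in the abstract. Intermediate values of `k` cap the number of matched tokens directly.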

