统计子集选择问题快速平行比值 (Fast Parallel Algorithms for Statistical Subset Selection Problems) - 专知论文

会员服务 ·

0

贪心逐层预训练 · 自适应采样 · 特征选择 · 统计量 · FAST ·

2021 年 4 月 1 日

Fast Parallel Algorithms for Statistical Subset Selection Problems

翻译：统计子集选择问题快速平行比值

Sharon Qian,Yaron Singer

In this paper, we propose a new framework for designing fast parallel algorithms for fundamental statistical subset selection tasks that include feature selection and experimental design. Such tasks are known to be weakly submodular and are amenable to optimization via the standard greedy algorithm. Despite its desirable approximation guarantees, the greedy algorithm is inherently sequential and in the worst case, its parallel runtime is linear in the size of the data. Recently, there has been a surge of interest in a parallel optimization technique called adaptive sampling which produces solutions with desirable approximation guarantees for submodular maximization in exponentially faster parallel runtime. Unfortunately, we show that for general weakly submodular functions such accelerations are impossible. The major contribution in this paper is a novel relaxation of submodularity which we call differential submodularity. We first prove that differential submodularity characterizes objectives like feature selection and experimental design. We then design an adaptive sampling algorithm for differentially submodular functions whose parallel runtime is logarithmic in the size of the data and achieves strong approximation guarantees. Through experiments, we show the algorithm's performance is competitive with state-of-the-art methods and obtains dramatic speedups for feature selection and experimental design problems.

翻译：在本文中,我们提出了设计基本统计子选择任务快速平行算法的新框架,其中包括特征选择和实验性设计。这些任务已知是薄弱的子模块,并且可以通过标准的贪婪算法加以优化。尽管有可取的近似保证,贪婪算法本质上是顺序的,而最坏的情况是,其平行运行时间是数据大小的线性。最近,对称为适应性抽样的平行优化技术的兴趣激增,该技术为在快速平行运行时极快的子模块最大化提供了理想的近似保证。不幸的是,我们表明,对于一般而言微弱的子模块函数来说,这种加速是不可能的。本文的主要贡献是亚模块性新颖的松动,我们称之为差异亚模块性。我们首先证明,差异的子模块性是特征选择和实验设计等目标的特征性。我们随后为不同亚模块性功能设计了适应性抽样算法,这些功能的平行运行时间是数据大小的逻辑性,并实现了强烈的近似性保证。我们通过实验显示,该算法的性表现与状态方法相比是竞争性的,并获得特征选择和实验性快速的速度问题。

1

相关内容

贪心逐层预训练

贪心逐层预训练

最新《高级算法》Advanced Algorithms，176页pdf

最新《高级算法》Advanced Algorithms，176页pdf

专知会员服务

92+阅读 · 2020年10月22日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福大学博士论文】大规模和高维统计学习方法和算法，147页pdf， Large-scale and high-dimensional statistical learning methods and algorithms

专知会员服务

26+阅读 · 2020年6月13日

【论文】深度学习的最优化:理论和算法（Optimization for deep learning: theory and algorithms）

【论文】深度学习的最优化:理论和算法（Optimization for deep learning: theory and algorithms）

专知会员服务

148+阅读 · 2019年12月28日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

专知会员服务

32+阅读 · 2019年9月19日

Dropout、梯度消失/爆炸、Adam优化算法，神经网络优化算法看这一篇就够了

Dropout、梯度消失/爆炸、Adam优化算法，神经网络优化算法看这一篇就够了

AI100

14+阅读 · 2019年9月1日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

BranchOut: Regularization for Online Ensemble Tracking with CNN

BranchOut: Regularization for Online Ensemble Tracking with CNN

统计学习与视觉计算组

9+阅读 · 2017年10月7日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

Bayesian Optimisation for Constrained Problems

Bayesian Optimisation for Constrained Problems

Arxiv

0+阅读 · 2021年5月27日

Optimization Induced Equilibrium Networks

Arxiv

0+阅读 · 2021年5月27日

The q-Gauss-Newton method for unconstrained nonlinear optimization

Arxiv

0+阅读 · 2021年5月27日

Central Limit Theory for Linear Spectral Statistics of Normalized Separable Sample Covariance Matrix

Arxiv

0+阅读 · 2021年5月27日

Statistical power for cluster analysis

Arxiv

0+阅读 · 2021年5月25日

An efficient iterative method for solving parameter-dependent and random diffusion problems

Arxiv

0+阅读 · 2021年5月25日

A polynomial-degree-robust a posteriori error estimator for Nédélec discretizations of magnetostatic problems

Arxiv

0+阅读 · 2021年5月25日

A Statistical Threshold for Adversarial Classification in Laplace Mechanisms

Arxiv

0+阅读 · 2021年5月25日

Discrete MMSE Precoding for Multiuser MIMO Systems with PSK Modulation

Arxiv

0+阅读 · 2021年5月24日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

VIP会员

文章信息

相关主题

贪心逐层预训练

自适应采样

相关VIP内容

最新《高级算法》Advanced Algorithms，176页pdf

最新《高级算法》Advanced Algorithms，176页pdf

专知会员服务

92+阅读 · 2020年10月22日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福大学博士论文】大规模和高维统计学习方法和算法，147页pdf， Large-scale and high-dimensional statistical learning methods and algorithms

专知会员服务

26+阅读 · 2020年6月13日

【论文】深度学习的最优化:理论和算法（Optimization for deep learning: theory and algorithms）

【论文】深度学习的最优化:理论和算法（Optimization for deep learning: theory and algorithms）

专知会员服务

148+阅读 · 2019年12月28日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

【RecSys 2019报告】用于旅游业的推荐系统（Building Useful Recommender Systems for Tourists）

专知会员服务

32+阅读 · 2019年9月19日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

Dropout、梯度消失/爆炸、Adam优化算法，神经网络优化算法看这一篇就够了

Dropout、梯度消失/爆炸、Adam优化算法，神经网络优化算法看这一篇就够了

AI100

14+阅读 · 2019年9月1日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

BranchOut: Regularization for Online Ensemble Tracking with CNN

BranchOut: Regularization for Online Ensemble Tracking with CNN

统计学习与视觉计算组

9+阅读 · 2017年10月7日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

相关论文

Bayesian Optimisation for Constrained Problems

Bayesian Optimisation for Constrained Problems

Arxiv

0+阅读 · 2021年5月27日

Optimization Induced Equilibrium Networks

Arxiv

0+阅读 · 2021年5月27日

The q-Gauss-Newton method for unconstrained nonlinear optimization

Arxiv

0+阅读 · 2021年5月27日

Central Limit Theory for Linear Spectral Statistics of Normalized Separable Sample Covariance Matrix

Arxiv

0+阅读 · 2021年5月27日

Statistical power for cluster analysis

Arxiv

0+阅读 · 2021年5月25日

An efficient iterative method for solving parameter-dependent and random diffusion problems

Arxiv

0+阅读 · 2021年5月25日

A polynomial-degree-robust a posteriori error estimator for Nédélec discretizations of magnetostatic problems

Arxiv

0+阅读 · 2021年5月25日

A Statistical Threshold for Adversarial Classification in Laplace Mechanisms

Arxiv

0+阅读 · 2021年5月25日

Discrete MMSE Precoding for Multiuser MIMO Systems with PSK Modulation

Arxiv

0+阅读 · 2021年5月24日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

微信扫码咨询专知VIP会员