Very Fast Streaming Submodular Function Maximization - 专知 (Zhuanzhi) Papers

April 1, 2021

Very Fast Streaming Submodular Function Maximization

Sebastian Buschjäger, Philipp-Jan Honysz, Lukas Pfahler, Katharina Morik

From arXiv: 9 pages, 14-page appendix, 5 figures, 2 tables, 10 algorithms

Data summarization has become a valuable tool for understanding even terabytes of data. Because of their compelling theoretical properties, submodular functions have been the focus of summarization algorithms. These algorithms offer worst-case approximation guarantees at the expense of higher computation and memory requirements. However, many practical applications do not fall under this worst case and are usually much more well-behaved. In this paper, we propose a new submodular function maximization algorithm called ThreeSieves, which ignores the worst case but delivers a good solution with high probability. It selects the most informative items from a data stream on the fly and maintains a provable performance on a fixed memory budget. In an extensive evaluation, we compare our method against 6 other methods on 8 different datasets with and without concept drift. We show that our algorithm outperforms current state-of-the-art algorithms and, at the same time, uses fewer resources. Finally, we highlight a real-world use case of our algorithm for data summarization in gamma-ray astronomy. We make our code publicly available at https://github.com/sbuschjaeger/SubmodularStreamingMaximization.
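
To make the idea of selecting items from a stream on a fixed memory budget concrete, the following is a minimal sketch of a generic threshold-based streaming selection rule. It is not the authors' ThreeSieves implementation and does not use the API of the linked repository. The function names (`coverage`, `stream_select`), the toy coverage objective, the descending list of gain thresholds, and the patience parameter `T` are all assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence

Item = FrozenSet[int]


def coverage(summary: Sequence[Item]) -> float:
    """Toy monotone submodular objective: number of distinct elements covered."""
    covered = set()
    for item in summary:
        covered |= item
    return float(len(covered))


def stream_select(
    stream: Iterable[Item],
    f: Callable[[Sequence[Item]], float],
    K: int,
    thresholds: Sequence[float],
    T: int,
) -> List[Item]:
    """Single-pass selection of at most K items (hypothetical sketch).

    An item is kept when its marginal gain reaches the current threshold.
    After T consecutive rejections the threshold is lowered to the next
    (smaller) value in `thresholds`. Only the summary is stored, so memory
    stays bounded by K items.
    """
    summary: List[Item] = []
    value = f(summary)   # objective value of the current summary
    idx = 0              # index of the active threshold (largest first)
    rejected = 0         # consecutive rejections under the active threshold
    for item in stream:
        if len(summary) >= K:
            break        # budget exhausted
        gain = f(summary + [item]) - value
        if gain >= thresholds[idx]:
            summary.append(item)
            value += gain
            rejected = 0
        else:
            rejected += 1
            if rejected >= T and idx < len(thresholds) - 1:
                idx += 1
                rejected = 0
    return summary


# Illustrative stream of small sets; the values are made up for this example.
stream = [
    frozenset({1}),
    frozenset({1, 2, 3}),
    frozenset({2}),
    frozenset({3}),
    frozenset({4, 5}),
    frozenset({6}),
]
selected = stream_select(stream, coverage, K=2, thresholds=[3.0, 2.0, 1.0], T=2)
print(selected, coverage(selected))
```

In this sketch each arriving item costs one evaluation of the objective, and a rejected item is never revisited, which is what keeps the pass cheap. The trade-off is that an item arriving just before the threshold drops can be missed.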
