Sparse Coresets for SVD on Infinite Streams - 专知 Papers


Singular value decomposition · Streams · Infinite · Sparse · Singular values

November 26, 2020

Sparse Coresets for SVD on Infinite Streams


Vladimir Braverman, Dan Feldman, Harry Lang, Daniela Rus, Adiel Statman

In streaming Singular Value Decomposition (SVD), the $d$-dimensional rows of a possibly infinite matrix arrive sequentially as points in $\mathbb{R}^d$. An $\epsilon$-coreset is a (much smaller) matrix whose sum of squared distances from its rows to any hyperplane approximates that of the original matrix to within a $1 \pm \epsilon$ factor. Our main result is that we can maintain an $\epsilon$-coreset while storing only $O(d \log^2 d / \epsilon^2)$ rows. Known lower bounds of $\Omega(d / \epsilon^2)$ rows show that this is nearly optimal. Moreover, each row of our coreset is a weighted input row, so the coreset is a weighted subset of the input rows. This is highly desirable since it: (1) preserves sparsity; (2) is easily interpretable; (3) avoids precision errors; (4) applies to problems with constraints on the input. Previous streaming results for SVD that return a subset of the input required storing $\Omega(d \log^3 n / \epsilon^2)$ rows, where $n$ is the number of rows seen so far. Our algorithm, with storage independent of $n$, is the first result that uses finite memory on infinite streams. We support our findings with experiments on the Wikipedia dataset, benchmarked against state-of-the-art algorithms.
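To make the $\epsilon$-coreset definition concrete, here is a minimal offline sketch in Python. It is not the streaming algorithm from the paper: it swaps in plain leverage-score sampling on a finite matrix, and it only considers hyperplanes through the origin, for which the sum of squared distances of the rows of $A$ to the hyperplane with unit normal $x$ equals $\|Ax\|^2$. The function names `leverage_score_coreset` and `worst_case_error`, the matrix sizes, and the sample count are all assumptions made for illustration.

```python
import numpy as np

def leverage_score_coreset(A, m, rng):
    """Sample m row indices of A with probability proportional to leverage
    scores and return (indices, weights). Illustrative only; this is NOT
    the paper's streaming construction."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    lev = np.sum(U ** 2, axis=1)          # leverage score of each row
    p = lev / lev.sum()                   # sampling distribution over rows
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    w = 1.0 / (m * p[idx])                # weights make the estimate unbiased
    return idx, w

def worst_case_error(A, C):
    """Largest | ||Cx||^2 / ||Ax||^2 - 1 | over all nonzero x.
    Assumes A has full column rank."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    M = C @ Vt.T / s                      # C V Sigma^{-1}
    eig = np.linalg.eigvalsh(M.T @ M)     # range of the ratio ||Cx||^2 / ||Ax||^2
    return float(np.max(np.abs(eig - 1.0)))

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 5))        # n = 2000 input rows, d = 5
idx, w = leverage_score_coreset(A, 400, rng)

# Each coreset row is a scaled copy of an input row, so zeros in the
# input stay zero and sparsity is preserved.
C = np.sqrt(w)[:, None] * A[idx]

eps = worst_case_error(A, C)
print(f"coreset rows: {C.shape[0]}, empirical epsilon: {eps:.3f}")
```

Because every row of `C` is an input row times a scalar, the sketch shows why a weighted-subset coreset keeps sparsity and stays interpretable. The printed value is the empirical $\epsilon$ for this one random sample, not a guarantee.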



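Related content

Singular value decomposition

Singular value decomposition (SVD) is an important matrix factorization in linear algebra that generalizes eigendecomposition to arbitrary matrices. It has important applications in signal processing, statistics, and other fields.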
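As a small illustration of how SVD extends eigendecomposition to non-square matrices, the following NumPy sketch factors a rectangular matrix and checks that its squared singular values equal the eigenvalues of $A^\top A$. The matrix size and random seed are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))           # rectangular: no eigendecomposition of A itself

# Thin SVD: A = U diag(s) Vt, singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
reconstructed = U @ np.diag(s) @ Vt

# A^T A is square and symmetric, so it has an eigendecomposition;
# eigvalsh returns ascending eigenvalues, so reverse to match s
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]

print(np.allclose(reconstructed, A))      # factorization reproduces A
print(np.allclose(s ** 2, eigvals))       # squared singular values = eigenvalues of A^T A
```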

