Rank Selection for Non-negative Matrix Factorization - 专知论文

Rank · Factorization · Bootstrap · Estimation/Estimator · Data points

November 2, 2022

Rank Selection for Non-negative Matrix Factorization

Yun Cai,Hong Gu,Toby Kenney

Non-negative matrix factorization (NMF) is a widely used dimension reduction method that factorizes a non-negative data matrix into two lower-dimensional non-negative matrices: one is the basis or feature matrix, which consists of the variables, and the other is the coefficient matrix, which holds the projections of the data points onto the new basis. The features can be interpreted as sub-structures of the data. The number of sub-structures in the feature matrix is called the rank, and it is the only tuning parameter in NMF. An appropriate rank extracts the key latent features while minimizing the noise from the original data. In this paper, we develop a novel rank selection method based on hypothesis testing, using a deconvolved bootstrap distribution to assess the significance level accurately despite the large amount of optimization error. In the simulation section, we compare our method with a rank selection method based on hypothesis testing that uses the bootstrap distribution without deconvolution, and with a cross-validated imputation method. Through simulations, we demonstrate that our method is not only accurate at estimating the true rank for NMF, especially when the features are hard to distinguish, but also computationally efficient. When applied to real microbiome data (e.g., OTU data and functional metagenomic data), our method also shows the ability to extract interpretable sub-communities in the data.
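
To make the role of the rank concrete, the following is a minimal sketch in Python with NumPy. It is not the method proposed in the paper: it does not implement the deconvolved bootstrap test, and it only fits NMF at several candidate ranks and reports the reconstruction error. The solver (Lee-Seung multiplicative updates for the squared Frobenius loss), the layout X ≈ W H with W as the feature matrix and H as the coefficient matrix, the synthetic data, and every name below (`nmf`, `relative_error`, `true_rank`, and so on) are assumptions chosen for illustration.

```python
import numpy as np


def nmf(X, rank, n_iter=500, seed=0, eps=1e-10):
    """Approximate X (variables x samples) by W @ H with W, H >= 0.

    Uses multiplicative updates for the squared Frobenius loss.
    This is an illustrative solver, not the one used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_vars, n_samples = X.shape
    # Strictly positive random start keeps every later iterate non-negative.
    W = rng.random((n_vars, rank)) + eps      # assumed feature (basis) matrix
    H = rng.random((rank, n_samples)) + eps   # assumed coefficient matrix
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(X, W, H):
    """Frobenius norm of the residual, relative to the norm of X."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


# Synthetic non-negative data with a known (assumed) rank of 3 plus small noise.
rng = np.random.default_rng(42)
true_rank = 3
W_true = rng.random((60, true_rank))
H_true = rng.random((true_rank, 40))
X = W_true @ H_true + 0.01 * rng.random((60, 40))

# Fit each candidate rank and record how well it reconstructs X.
errors = {}
for k in range(1, 6):
    W_k, H_k = nmf(X, k)
    errors[k] = relative_error(X, W_k, H_k)
    print(f"rank {k}: relative reconstruction error {errors[k]:.4f}")
```

Because the error generally keeps shrinking as the rank grows, the raw error curve alone does not identify the true rank, and the optimizer may stop short of the global optimum. Both points are why a significance test that accounts for optimization error, as the abstract describes, is of interest.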
