Optimally Weighted PCA for High-Dimensional Heteroscedastic Data - Zhuanzhi Papers

Heteroscedasticity · Weight · Optimizer · PCA · Variance

September 13, 2022

Optimally Weighted PCA for High-Dimensional Heteroscedastic Data

David Hong, Fan Yang, Jeffrey A. Fessler, Laura Balzano

From arXiv, 39 pages, 9 figures

Modern data are increasingly both high-dimensional and heteroscedastic. This paper considers the challenge of estimating underlying principal components from high-dimensional data with noise that is heteroscedastic across samples, i.e., some samples are noisier than others. Such heteroscedasticity naturally arises, e.g., when combining data from diverse sources or sensors. A natural way to account for this heteroscedasticity is to give noisier blocks of samples less weight in PCA by using the leading eigenvectors of a weighted sample covariance matrix. We consider the problem of choosing weights to optimally recover the underlying components. In general, one cannot know these optimal weights since they depend on the underlying components we seek to estimate. However, we show that under some natural statistical assumptions the optimal weights converge to a simple function of the signal and noise variances for high-dimensional data. Surprisingly, the optimal weights are not the inverse noise variance weights commonly used in practice. We demonstrate the theoretical results through numerical simulations and comparisons with existing weighting schemes. Finally, we briefly discuss how estimated signal and noise variances can be used when the true variances are unknown, and we illustrate the optimal weights on real data from astronomy.
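The weighting idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`weighted_pca`, `simulate`, `recovery`), the rank-one simulation setup, and the convention of multiplying each sample's outer product directly by its weight are all assumptions made for this example. The abstract does not state the formula for the optimal weights, so the sketch only compares unweighted PCA with the inverse noise variance weights it mentions; any other weighting rule can be passed in through `weights`.

```python
import numpy as np


def weighted_pca(Y, weights, k):
    """Return the k leading eigenvectors of a weighted sample covariance.

    Y has shape (d, n) with one sample per column; weights has shape (n,).
    The covariance convention (1/n) * sum_j w_j * y_j y_j^T is an assumption.
    """
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(weights, dtype=float)
    d, n = Y.shape
    if w.shape != (n,):
        raise ValueError("need one weight per sample (column of Y)")
    if np.any(w < 0):
        raise ValueError("weights must be nonnegative")
    C = (Y * w) @ Y.T / n                  # broadcasting scales column j by w[j]
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    return eigvecs[:, top]


def simulate(d=50, block_sizes=(200, 800), noise_vars=(0.1, 4.0),
             signal_var=1.0, seed=0):
    """Hypothetical rank-one model with two blocks of samples at different noise levels."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                       # true unit-norm component
    sigma2 = np.repeat(noise_vars, block_sizes)  # per-sample noise variance
    n = sigma2.size
    z = rng.standard_normal(n)                   # per-sample scores
    noise = rng.standard_normal((d, n)) * np.sqrt(sigma2)
    Y = np.sqrt(signal_var) * np.outer(u, z) + noise
    return Y, u, sigma2


def recovery(u, u_hat):
    """Squared inner product between unit vectors; 1 means perfect recovery."""
    return float(np.dot(u, u_hat) ** 2)


Y, u, sigma2 = simulate()
schemes = {
    "unweighted": np.ones_like(sigma2),
    "inverse noise variance": 1.0 / sigma2,
}
for name, w in schemes.items():
    u_hat = weighted_pca(Y, w, k=1)[:, 0]
    print(f"{name}: recovery = {recovery(u, u_hat):.3f}")
```

Passing the weights in as a plain per-sample array keeps the estimator separate from the choice of weighting scheme, so alternative schemes can be compared on the same simulated data without changing `weighted_pca`.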


