最小广场估算的核心要素 (Core-elements for Least Squares Estimation) - 专知论文

会员服务 ·

0

估计/估计量 · 预测器/决策函数 · 子采样 · 方阵 · 近似 ·

2022 年 11 月 6 日

Core-elements for Least Squares Estimation

翻译：最小广场估算的核心要素

Mengyu Li,Jun Yu,Tao Li,Cheng Meng

The coresets approach, also called subsampling or subset selection, aims to select a subsample as a surrogate for the observed sample. Such an approach has been used pervasively in large-scale data analysis. Existing coresets methods construct the subsample using a subset of rows from the predictor matrix. Such methods can be significantly inefficient when the predictor matrix is sparse or numerically sparse. To overcome the limitation, we develop a novel element-wise subset selection approach, called core-elements. We provide a deterministic algorithm to construct the core-elements estimator, only requiring an $O(\mbox{nnz}(\mathbf{X})+rp^2)$ computational cost, where $\mathbf{X}$ is an $n\times p$ predictor matrix, $r$ is the number of elements selected from each column of $\mathbf{X}$, and $\mbox{nnz}(\cdot)$ denotes the number of non-zero elements. Theoretically, we show that the proposed estimator is unbiased and approximately minimizes an upper bound of the estimation variance. We also provide an approximation guarantee by deriving a coresets-like finite sample bound for the proposed estimator. To handle potential outliers in the data, we further combine core-elements with the median-of-means procedure, resulting in an efficient and robust estimator with theoretical consistency guarantees. Numerical studies on various synthetic and real-world datasets demonstrate the proposed method's superior performance compared to mainstream competitors.

翻译：核心设置方法, 也称为子抽样或子集选择, 目的是选择子样本作为观察到的样本的代名词。这种方法在大规模数据分析中被广泛使用。现有的核心设置方法使用预测矩阵的一组行来构建子样本。当预测矩阵稀少或数字稀少时, 这种方法可能非常低效。为了克服限制, 我们开发了一个新颖的元素子选择方法, 叫做核心要素。我们提供了一种确定性算法, 用于构建核心目标估计器, 只需要一个 $( mbox{ nnz} (mathbbf{X})+rp2) 的高级数据分析成本。 $\ mathb{X} 现有的核心设置方法在预测矩阵少或数字少时, 这种方法可能非常低效率。我们用一个最小值计算器的计算值计算值计算值, 并用一个最精确的计算法, 我们用一个最精确的计算方法来显示一个最精确的数值。我们用一个最精确的计算方法,, 将一个最精确的数值排序的计算方法, 我们用一个最精确的计算方法来显示一个最精确的数值的计算方法。

0

相关内容

估计/估计量

估计/估计量

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

HuR/lincRNA152复合物在脓毒症免疫抑制中的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

表面接枝含能硼基复合物的构筑及其爆轰加载下的燃烧催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

ADS中检测快中子束的GEM探测器的研制

国家自然科学基金

0+阅读 · 2013年12月31日

带跳扩散模型的非参数统计推断研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

ATP依赖染色质重塑复合物SWI/SNF在MSCs平滑肌分化中的调控机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

肾康丸对糖尿病肾病大鼠miR-192介导通路的影响

国家自然科学基金

1+阅读 · 2009年12月31日

葡萄糖敏感型嵌段共聚物胶束用于胰岛素的控制释放

国家自然科学基金

0+阅读 · 2009年12月31日

Local energy estimates for the fractional Laplacian

Arxiv

0+阅读 · 2022年12月28日

Adversarial Virtual Exemplar Learning for Label-Frugal Satellite Image Change Detection

Arxiv

0+阅读 · 2022年12月28日

Waveform inversion via reduced order modeling

Waveform inversion via reduced order modeling

Arxiv

0+阅读 · 2022年12月28日

Statistical inference for high-dimensional spectral density matrix

Arxiv

0+阅读 · 2022年12月28日

EDoG: Adversarial Edge Detection For Graph Neural Networks

Arxiv

0+阅读 · 2022年12月27日

Variance Reduction for Score Functions Using Optimal Baselines

Arxiv

0+阅读 · 2022年12月27日

Estimation of stability index for symmetric α-stable distribution using quantile conditional variance ratios

Arxiv

0+阅读 · 2022年12月27日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Arxiv

11+阅读 · 2021年6月25日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

VIP会员

文章信息

相关主题

估计/估计量

预测器/决策函数

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】《知识图谱与大语言模型的协同应用》，544页pdf

军事通信系统：安全行动的支柱

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

【新书】机器学习系统，2620页pdf

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Local energy estimates for the fractional Laplacian

Arxiv

0+阅读 · 2022年12月28日

Adversarial Virtual Exemplar Learning for Label-Frugal Satellite Image Change Detection

Arxiv

0+阅读 · 2022年12月28日

Waveform inversion via reduced order modeling

Waveform inversion via reduced order modeling

Arxiv

0+阅读 · 2022年12月28日

Statistical inference for high-dimensional spectral density matrix

Arxiv

0+阅读 · 2022年12月28日

EDoG: Adversarial Edge Detection For Graph Neural Networks

Arxiv

0+阅读 · 2022年12月27日

Variance Reduction for Score Functions Using Optimal Baselines

Arxiv

0+阅读 · 2022年12月27日

Estimation of stability index for symmetric α-stable distribution using quantile conditional variance ratios

Arxiv

0+阅读 · 2022年12月27日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Arxiv

11+阅读 · 2021年6月25日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

相关基金

HuR/lincRNA152复合物在脓毒症免疫抑制中的调控机制

国家自然科学基金

0+阅读 · 2015年12月31日

表面接枝含能硼基复合物的构筑及其爆轰加载下的燃烧催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

ADS中检测快中子束的GEM探测器的研制

国家自然科学基金

0+阅读 · 2013年12月31日

带跳扩散模型的非参数统计推断研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

ATP依赖染色质重塑复合物SWI/SNF在MSCs平滑肌分化中的调控机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

肾康丸对糖尿病肾病大鼠miR-192介导通路的影响

国家自然科学基金

1+阅读 · 2009年12月31日

葡萄糖敏感型嵌段共聚物胶束用于胰岛素的控制释放

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员