Improving Group Lasso for high-dimensional categorical data - 专知论文

Group Lasso · categorical data · GROUP · MoDELS · sparsity

25 October 2022

Improving Group Lasso for high-dimensional categorical data

Szymon Nowakowski, Piotr Pokarowski, Wojciech Rejchel

From arXiv. arXiv admin note: text overlap with arXiv:2112.11114

Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because roughly one parameter is needed to encode each category or level. The Group Lasso is a well-known, efficient algorithm for selecting continuous or categorical variables, but the estimates for the individual levels of a selected factor usually all differ. Therefore, a fitted model may not be sparse, which makes it difficult to interpret. To obtain a sparse solution of the Group Lasso, we propose the following two-step procedure: first, we reduce data dimensionality using the Group Lasso; then, to choose the final model, we use an information criterion on a small family of models prepared by clustering levels of individual factors. We investigate selection correctness of the algorithm in a sparse high-dimensional scenario. We also test our method on synthetic as well as real datasets and show that it performs better than the state-of-the-art algorithms with respect to prediction accuracy or model dimension.
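The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic data, the penalty level `lam=0.1`, the use of BIC as the information criterion, and the single-linkage clustering of level coefficients are all assumptions chosen to keep the example short. Step 1 fits a Group Lasso by proximal gradient descent to screen factors; step 2 builds, for each surviving factor, a nested family of level partitions and picks one by BIC.

```python
import numpy as np

def one_hot(codes, n_levels):
    """Full dummy coding: one column per level of a factor."""
    return np.eye(n_levels)[codes]

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y)) / n
        for g in groups:  # block soft-thresholding, one group per factor
            norm = np.linalg.norm(z[g])
            thresh = step * lam * np.sqrt(len(g))
            z[g] = 0.0 if norm <= thresh else z[g] * (1 - thresh / norm)
        beta = z
    return beta

def partitions_1d(coefs):
    """Nested level partitions from single-linkage clustering of 1-D coefficients:
    k clusters come from cutting the sorted coefficients at the k-1 largest gaps."""
    L = len(coefs)
    order = np.argsort(coefs)
    gap_rank = np.argsort(-np.diff(coefs[order]))  # gap positions, largest first
    family = []
    for k in range(1, L + 1):
        cuts = np.sort(gap_rank[:k - 1])
        labels = np.empty(L, dtype=int)
        labels[order] = np.searchsorted(cuts, np.arange(L), side="left")
        family.append(labels)
    return family

def bic(y, cols):
    """BIC of a least-squares fit with an intercept plus the given column blocks."""
    n = len(y)
    D = np.column_stack([np.ones(n)] + cols)
    resid = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]
    return n * np.log(resid @ resid / n) + np.log(n) * D.shape[1]

# Synthetic data (assumption): three factors, only factor 0 matters,
# and its six levels form two true clusters {0,1,2} and {3,4,5}.
rng = np.random.default_rng(0)
n, levels = 400, [6, 5, 4]
codes = [rng.integers(0, L, size=n) for L in levels]
true_effect = np.array([0.0, 0.0, 0.0, 2.0, 2.0, 2.0])
y = true_effect[codes[0]] + 0.5 * rng.standard_normal(n)

# Step 1: screen factors with the Group Lasso on centered data.
X = np.hstack([one_hot(c, L) for c, L in zip(codes, levels)])
starts = np.cumsum([0] + levels)
groups = [np.arange(starts[j], starts[j + 1]) for j in range(len(levels))]
beta = group_lasso(X - X.mean(0), y - y.mean(), groups, lam=0.1)
selected = [j for j, g in enumerate(groups) if np.linalg.norm(beta[g]) > 0]

def design(parts):
    """Dummy columns for merged levels; the first cluster is the reference."""
    return [one_hot(lab[codes[j]], lab.max() + 1)[:, 1:] for j, lab in parts.items()]

# Step 2: for each selected factor, choose a level partition by BIC,
# starting from the unmerged partition of every selected factor.
chosen = {j: partitions_1d(beta[groups[j]])[-1] for j in selected}
for j in selected:
    cands = partitions_1d(beta[groups[j]])
    scores = [bic(y, design({**chosen, j: lab})) for lab in cands]
    chosen[j] = cands[int(np.argmin(scores))]

print("selected factors:", selected)
print("level-to-cluster labels:", {j: lab.tolist() for j, lab in chosen.items()})
```

The design choice worth noting is that the second step never searches over all partitions of the levels: each factor contributes only a nested chain of candidate partitions, so the family scored by the information criterion stays small even when a factor has many levels.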
