Features in predictive models are not exchangeable, yet common supervised models treat them as such. Here we study ridge regression when the analyst can partition the features into $K$ groups based on external side-information. For example, in high-throughput biology, features may represent gene expression, protein abundance, or clinical data, so that each feature group represents a distinct modality. The analyst's goal is to choose optimal regularization parameters $\lambda = (\lambda_1, \dotsc, \lambda_K)$ -- one for each group. In this work, we study the impact of $\lambda$ on the predictive risk of group-regularized ridge regression by deriving limiting risk formulae under a high-dimensional random effects model with $p\asymp n$ as $n \to \infty$. Furthermore, we propose a data-driven method for choosing $\lambda$ that attains the optimal asymptotic risk: the key idea is to interpret the residual noise variance $\sigma^2$ as a regularization parameter to be chosen through cross-validation. An empirical Bayes construction maps the one-dimensional parameter $\sigma$ to the $K$-dimensional vector of regularization parameters, i.e., $\sigma \mapsto \widehat{\lambda}(\sigma)$. Beyond its theoretical optimality, the proposed method is practical and runs as fast as cross-validated ridge regression without feature groups ($K=1$).
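The pipeline described above can be sketched in code: a closed-form group-ridge solver, a map $\sigma \mapsto \widehat{\lambda}(\sigma)$, and a one-dimensional cross-validation loop over $\sigma$. This is a minimal illustration, not the paper's method: the moment-based estimate of the per-group prior variance $\tau_k^2$ (ignoring cross-group terms) and the calibration $\lambda_k = \sigma^2/\tau_k^2$ are simplifying assumptions standing in for the actual empirical Bayes construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def group_ridge(X, y, groups, lam):
    """Closed-form solution of the group-regularized ridge problem
    min_b ||y - X b||^2 + sum_k lam[k] * ||b_{group k}||^2."""
    D = np.diag(np.asarray(lam)[groups])  # diagonal penalty: lam_{k(j)} for feature j
    return np.linalg.solve(X.T @ X + D, X.T @ y)

def lam_from_sigma(sigma, X, y, groups, K, eps=1e-8):
    """Hypothetical empirical Bayes map sigma -> lambda_hat(sigma): a crude
    per-group moment estimate of the prior variance tau_k^2 (cross-group
    terms ignored), then lam_k = sigma^2 / tau_k^2, the usual ridge penalty
    for a N(0, tau_k^2) prior with noise variance sigma^2."""
    n = X.shape[0]
    lam = np.empty(K)
    for k in range(K):
        Xk = X[:, groups == k]
        p_k = Xk.shape[1]
        tau2 = max((np.sum((Xk.T @ y) ** 2) / p_k - n * sigma**2) / n**2, eps)
        lam[k] = sigma**2 / tau2
    return lam

def cv_sigma(X, y, groups, K, sigmas, n_folds=5):
    """Choose sigma on a 1-D grid by cross-validating the induced lam_hat(sigma)."""
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    errs = []
    for sigma in sigmas:
        err = 0.0
        for fold in folds:
            mask = np.ones(n, bool)
            mask[fold] = False
            lam = lam_from_sigma(sigma, X[mask], y[mask], groups, K)
            beta = group_ridge(X[mask], y[mask], groups, lam)
            err += np.sum((y[fold] - X[fold] @ beta) ** 2)
        errs.append(err / n)
    best = sigmas[int(np.argmin(errs))]
    return best, lam_from_sigma(best, X, y, groups, K)
```

Note that cross-validation here searches a single scalar grid regardless of $K$, which is why the cost matches that of ordinary cross-validated ridge regression.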