Features in predictive models are not exchangeable, yet common supervised models treat them as such. Here we study ridge regression when the analyst can partition the features into $K$ groups based on external side-information. For example, in high-throughput biology, features may represent gene expression, protein abundance, or clinical data, so that each feature group represents a distinct modality. The analyst's goal is to choose optimal regularization parameters $\lambda = (\lambda_1, \dotsc, \lambda_K)$ -- one for each group. In this work, we study the impact of $\lambda$ on the predictive risk of group-regularized ridge regression by deriving limiting risk formulae under a high-dimensional random effects model with $p\asymp n$ as $n \to \infty$. Furthermore, we propose a data-driven method for choosing $\lambda$ that attains the optimal asymptotic risk: the key idea is to interpret the residual noise variance $\sigma^2$ as a regularization parameter to be chosen through cross-validation. An empirical Bayes construction maps the one-dimensional parameter $\sigma$ to the $K$-dimensional vector of regularization parameters, i.e., $\sigma \mapsto \widehat{\lambda}(\sigma)$. Beyond its theoretical optimality, the proposed method is practical and runs as fast as cross-validated ridge regression without feature groups ($K=1$).
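The pipeline described above can be sketched in code: a closed-form group-ridge solver, a map $\sigma \mapsto \widehat{\lambda}(\sigma)$, and a one-dimensional cross-validation loop over $\sigma$. This is a minimal illustration, not the paper's method: the moment-based estimate of the per-group prior variance $\tau_k^2$ (ignoring cross-group terms) and the calibration $\lambda_k = \sigma^2/\tau_k^2$ are simplifying assumptions standing in for the actual empirical Bayes construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def group_ridge(X, y, groups, lam):
    """Closed-form solution of the group-regularized ridge problem
    min_b ||y - X b||^2 + sum_k lam[k] * ||b_{group k}||^2."""
    D = np.diag(np.asarray(lam)[groups])  # diagonal penalty: lam_{k(j)} for feature j
    return np.linalg.solve(X.T @ X + D, X.T @ y)

def lam_from_sigma(sigma, X, y, groups, K, eps=1e-8):
    """Hypothetical empirical Bayes map sigma -> lambda_hat(sigma): a crude
    per-group moment estimate of the prior variance tau_k^2 (cross-group
    terms ignored), then lam_k = sigma^2 / tau_k^2, the usual ridge penalty
    for a N(0, tau_k^2) prior with noise variance sigma^2."""
    n = X.shape[0]
    lam = np.empty(K)
    for k in range(K):
        Xk = X[:, groups == k]
        p_k = Xk.shape[1]
        tau2 = max((np.sum((Xk.T @ y) ** 2) / p_k - n * sigma**2) / n**2, eps)
        lam[k] = sigma**2 / tau2
    return lam

def cv_sigma(X, y, groups, K, sigmas, n_folds=5):
    """Choose sigma on a 1-D grid by cross-validating the induced lam_hat(sigma)."""
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    errs = []
    for sigma in sigmas:
        err = 0.0
        for fold in folds:
            mask = np.ones(n, bool)
            mask[fold] = False
            lam = lam_from_sigma(sigma, X[mask], y[mask], groups, K)
            beta = group_ridge(X[mask], y[mask], groups, lam)
            err += np.sum((y[fold] - X[fold] @ beta) ** 2)
        errs.append(err / n)
    best = sigmas[int(np.argmin(errs))]
    return best, lam_from_sigma(best, X, y, groups, K)
```

Note that cross-validation here searches a single scalar grid regardless of $K$, which is why the cost matches that of ordinary cross-validated ridge regression.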