Estimation of signal-to-noise ratios (SNRs) and noise variances in high-dimensional linear models has important applications in statistical inference, hyperparameter selection, and heritability estimation in genomics. A common approach in practice is maximum likelihood estimation under random effects models. This paper analyzes the consistency of this method under model misspecification, in which the true model has only fixed effects. Assuming that the ratio of the number of samples to the number of features converges to a nonzero constant, our results provide conditions on the design matrices under which maximum likelihood estimation based on the random effects model is asymptotically consistent in estimating the SNR and the noise variance. Our misspecification analysis also extends to high-dimensional linear models with feature groups, in which group SNR estimation has important applications such as tuning parameter selection for group ridge regression.
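The misspecified setting described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure. It assumes an i.i.d. standard Gaussian design, a working model y ~ N(0, sigma^2 (gamma X X^T / p + I)), and the normalization SNR = ||beta||^2 / sigma^2. The function names, the grid search over gamma, and the chosen dimensions are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_fixed_effects(n, p, snr, sigma2, seed=0):
    """Draw (X, y) from a fixed-effects linear model (assumed Gaussian design)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    # Deterministic (non-random) coefficients with ||beta||^2 = snr * sigma2.
    beta = np.full(p, np.sqrt(snr * sigma2 / p))
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    return X, y


def fit_random_effects_mle(X, y, grid=None):
    """Grid-search MLE of (gamma, sigma^2) under the working random effects model
    y ~ N(0, sigma^2 * (gamma * X X^T / p + I)).

    Returns (gamma_hat, sigma2_hat). The grid search is an illustrative
    stand-in for a proper numerical optimizer.
    """
    n, p = X.shape
    if grid is None:
        grid = np.logspace(-3, 3, 601)  # candidate SNR values (assumption)

    # Rotate into the eigenbasis of X X^T / p so the covariance is diagonal.
    d, U = np.linalg.eigh(X @ X.T / p)
    d = np.clip(d, 0.0, None)  # remove tiny negative round-off eigenvalues
    z2 = (U.T @ y) ** 2

    # For each candidate gamma, sigma^2 has a closed-form maximizer.
    denom = 1.0 + grid[:, None] * d[None, :]
    sigma2_path = (z2[None, :] / denom).mean(axis=1)

    # Profile negative log-likelihood (times 2, up to an additive constant).
    objective = n * np.log(sigma2_path) + np.log(denom).sum(axis=1)
    best = int(np.argmin(objective))
    return float(grid[best]), float(sigma2_path[best])


if __name__ == "__main__":
    X, y = simulate_fixed_effects(n=1500, p=500, snr=2.0, sigma2=1.0, seed=0)
    gamma_hat, sigma2_hat = fit_random_effects_mle(X, y)
    print(f"estimated SNR: {gamma_hat:.3f}, estimated noise variance: {sigma2_hat:.3f}")
```

In this sketch the coefficients are fixed, yet the estimates are obtained by maximizing the random effects likelihood. Comparing them with the values passed to `simulate_fixed_effects` gives an informal check of the consistency claim for one particular design.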