Lasso is a popular and efficient approach to simultaneous estimation and variable selection in high-dimensional regression models. This paper presents a robust LAD-lasso method for multiple outcomes that addresses the challenges of non-normal outcome distributions and outlying observations. Covariates measured over space or time, or across spectral bands or genomic positions, often have a natural correlation structure arising from the measurement distance between them. The proposed multi-outcome approach handles such covariate blocks with a group fusion penalty, which encourages similarity between neighboring regression coefficient vectors by penalizing their differences, for example in sequential data. Properties of the proposed approach are first illustrated by extensive simulations, and the method is then applied to a real-life skewed data example on retirement behavior with heteroscedastic explanatory variables.
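To make the role of the penalties concrete, the following is a minimal sketch of how an objective of this kind could be evaluated. The abstract does not state the exact loss or penalty form, so everything here is an assumption for illustration: an elementwise absolute-deviation (LAD) loss, a group lasso term on the rows of the coefficient matrix, and a fusion term on the Euclidean norm of differences between neighboring rows. The function name and the parameters `lam_sparse` and `lam_fuse` are hypothetical.

```python
import numpy as np


def lad_group_fusion_objective(X, Y, B, lam_sparse, lam_fuse):
    """Evaluate an illustrative multi-outcome LAD objective with group fusion.

    X: (n, p) covariates, Y: (n, q) outcomes, B: (p, q) coefficients.
    Row j of B is the coefficient vector of covariate j across all outcomes.
    The exact form is an assumption, not taken from the paper.
    """
    # LAD loss: sum of absolute residuals, less sensitive to outliers than squared error.
    residuals = Y - X @ B
    lad_loss = np.abs(residuals).sum()

    # Group lasso term: Euclidean norm of each coefficient row, which can
    # shrink a covariate to zero for all outcomes at once.
    sparsity = np.linalg.norm(B, axis=1).sum()

    # Group fusion term: norm of the difference between neighboring rows,
    # which pulls adjacent covariates toward similar coefficient vectors.
    fusion = np.linalg.norm(np.diff(B, axis=0), axis=1).sum()

    return lad_loss + lam_sparse * sparsity + lam_fuse * fusion
```

In this sketch the fusion term vanishes when neighboring coefficient rows are identical, so increasing `lam_fuse` favors coefficients that are constant within a block of adjacent covariates.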