Although Weighted Lasso sparse regression has appealing statistical guarantees that could have a major real-world impact in finance, genomics, and brain imaging applications, it is rarely adopted in practice because its hyperparameter space is complex and high-dimensional, comprising thousands of hyperparameters. At the same time, recent progress in high-dimensional hyperparameter optimization (HD-HPO) methods for black-box functions demonstrates that high-dimensional problems can indeed be optimized efficiently. Despite this initial success, HD-HPO approaches are mostly applied to synthetic problems with a moderate number of dimensions, which limits their impact in scientific and engineering applications. We propose LassoBench, the first benchmark suite tailored for Weighted Lasso regression. LassoBench consists of benchmarks for both well-controlled synthetic setups (number of samples, noise level, ambient and effective dimensionalities, and multiple fidelities) and real-world datasets, which allows many flavors of HPO algorithms to be studied and extended to the high-dimensional Lasso setting. We evaluate 6 state-of-the-art HPO methods and 3 Lasso baselines, and demonstrate that Bayesian optimization and evolutionary strategies can improve over the methods commonly used for sparse regression, while highlighting limitations of these frameworks in very high-dimensional and noisy settings.
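To make the setting concrete, the following is a minimal sketch of how Weighted Lasso can be viewed as a black-box function of one penalty per feature. It is not the LassoBench API: the function names, the 1/(2n) scaling of the squared loss, the coordinate-descent solver, and the synthetic data are all assumptions chosen for illustration. The hyperparameter vector has one entry per feature, which is why the search space grows to thousands of dimensions on realistic datasets.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (assumed scaling)
        min_beta 1/(2n) * ||y - X beta||^2 + sum_j lam[j] * |beta[j]|.
    """
    n, d = X.shape
    beta = np.zeros(d)
    residual = y.astype(float)            # copy; equals y - X @ beta at beta = 0
    col_sq = (X ** 2).sum(axis=0) / n     # per-feature curvature
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            residual += X[:, j] * beta[j]             # remove feature j's contribution
            rho = X[:, j] @ residual / n
            beta[j] = soft_threshold(rho, lam[j]) / col_sq[j]
            residual -= X[:, j] * beta[j]             # add the updated contribution back
    return beta


def validation_loss(log_lam, X_tr, y_tr, X_val, y_val):
    """Black-box objective: one log-penalty per feature in, validation MSE out."""
    beta = weighted_lasso(X_tr, y_tr, np.exp(log_lam))
    return float(np.mean((y_val - X_val @ beta) ** 2))


# Hypothetical synthetic setup: 20 ambient dimensions, 3 effective ones.
rng = np.random.default_rng(0)
d = 20
beta_true = np.zeros(d)
beta_true[:3] = [3.0, -2.0, 1.5]
X_tr, X_val = rng.standard_normal((40, d)), rng.standard_normal((40, d))
y_tr = X_tr @ beta_true + 0.1 * rng.standard_normal(40)
y_val = X_val @ beta_true + 0.1 * rng.standard_normal(40)

# A single evaluation of the black box at a uniform penalty of 0.1.
loss = validation_loss(np.log(np.full(d, 0.1)), X_tr, y_tr, X_val, y_val)
```

An HPO method would repeatedly call `validation_loss` with different `log_lam` vectors. Working in log space is a common convention for penalty strengths and is likewise an assumption here.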