Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I error and power. Assuming a regime of linear sparsity and working under Gaussian random designs, we obtain an upper bound on the optimal trade-off for SLOPE, showing its capability of breaking the Donoho-Tanner power limit. To put it into perspective, this limit is the highest possible power that the Lasso, which is perhaps the most popular l1-based method, can achieve even with arbitrarily strong effect sizes. Next, we derive a tight lower bound that delineates the fundamental limit of sorted l1 regularization in optimally trading the FDP off for the TPP. Finally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms the Lasso, in the sense of having a smaller FDP, larger TPP and smaller l2 estimation risk simultaneously. Our proofs are based on a novel technique that reduces a calculus of variations problem to a class of infinite-dimensional convex optimization problems and a very recent result from approximate message passing theory.
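The two quantities traded off in the abstract, and the sorted l1 penalty underlying SLOPE, can be sketched in a few lines. This is a minimal illustration with hypothetical helper names, not the paper's methodology: `sorted_l1_norm` evaluates the SLOPE penalty sum_i lambda_i |beta|_(i) for a nonincreasing sequence lambda, and `fdp_tpp` computes the false discovery proportion and true positive proportion of a selected variable set against the true support.

```python
def sorted_l1_norm(beta, lam):
    """Sorted l1 norm J_lam(beta) = sum_i lam_i * |beta|_(i), where
    |beta|_(1) >= |beta|_(2) >= ... are the sorted magnitudes of beta
    and lam is a nonincreasing regularization sequence."""
    mags = sorted((abs(b) for b in beta), reverse=True)
    return sum(l * m for l, m in zip(lam, mags))

def fdp_tpp(selected, true_support):
    """FDP = false discoveries / number selected (0 if none selected);
    TPP = true discoveries / size of the true support."""
    selected, true_support = set(selected), set(true_support)
    true_pos = len(selected & true_support)
    fdp = (len(selected) - true_pos) / max(len(selected), 1)
    tpp = true_pos / max(len(true_support), 1)
    return fdp, tpp
```

Note that when lam is a constant sequence, the sorted l1 norm reduces to a scaled ordinary l1 norm, recovering the Lasso penalty as a special case of SLOPE.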