Nesterov's accelerated gradient (AG) is a popular technique for optimizing objective functions composed of two components: a convex loss and a penalty function. While AG methods perform well for convex penalties, such as the LASSO, convergence issues may arise when they are applied to nonconvex penalties, such as SCAD. A recent proposal generalizes Nesterov's AG method to the nonconvex setting, but it has never been applied to sparse statistical learning problems. Several hyperparameters must be set before running the algorithm, yet no explicit rule specifies how they should be chosen. In this article, we apply this nonconvex AG algorithm to high-dimensional linear and logistic sparse learning problems, and we propose a hyperparameter setting, based on the complexity upper bound, that accelerates convergence. We further establish the rate of convergence and present a simple and useful bound for the damping sequence. Simulation studies show that convergence is, on average, considerably faster than that of the conventional ISTA algorithm. Our experiments also show that the proposed method generally outperforms the current state-of-the-art method in terms of signal recovery.
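As a brief illustration of the setting (in generic notation, not necessarily that of the article), the composite problems in question take the form
\[
\min_{\beta \in \mathbb{R}^{p}} \; \ell(\beta) + \sum_{j=1}^{p} p_{\lambda}(|\beta_{j}|),
\]
where $\ell$ is a smooth convex loss, such as the least-squares or logistic negative log-likelihood, and $p_{\lambda}$ is the penalty. The LASSO corresponds to the convex choice $p_{\lambda}(t) = \lambda t$, whereas the SCAD penalty of Fan and Li (2001), with shape parameter $a > 2$ (commonly $a = 3.7$), is nonconvex:
\[
p_{\lambda}(t) =
\begin{cases}
\lambda t, & 0 \le t \le \lambda,\\
\dfrac{2a\lambda t - t^{2} - \lambda^{2}}{2(a-1)}, & \lambda < t \le a\lambda,\\
\dfrac{\lambda^{2}(a+1)}{2}, & t > a\lambda.
\end{cases}
\]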