Non-smooth optimization is a core ingredient of many imaging and machine learning pipelines. Non-smoothness encodes structural constraints on the solutions, such as sparsity, group sparsity, low rank and sharp edges. It is also the basis for the definition of robust loss functions and scale-free functionals such as the square-root Lasso. Standard approaches to deal with non-smoothness leverage either proximal splitting or coordinate descent. These approaches are effective but usually require parameter tuning, preconditioning or some sort of support pruning. In this work, we advocate and study a different route, which relies on a non-convex but smooth over-parametrization of the underlying non-smooth optimization problem. This generalizes the quadratic variational forms that are at the heart of the popular Iteratively Reweighted Least Squares (IRLS) method. Our main theoretical contribution connects gradient descent on this reformulation to a mirror descent flow with a varying Hessian metric. This analysis is crucial to derive convergence bounds that are dimension-free, which explains the efficiency of the method when using small grid sizes in imaging. Our main algorithmic contribution is to apply the Variable Projection (VarPro) method, which defines a new formulation by explicitly minimizing over part of the variables. This leads to better conditioning of the minimized functional and improves the convergence of simple but very efficient gradient-based methods, such as quasi-Newton solvers. We exemplify the use of this new solver for the resolution of regularized regression problems for inverse problems and supervised learning, including total variation priors and non-convex regularizers.