The recently developed average-case analysis of optimization methods allows for a more fine-grained and representative convergence analysis than the usual worst-case results. In exchange, this analysis requires a more precise hypothesis on the data-generating process, namely knowledge of the expected spectral distribution (ESD) of the random matrix associated with the problem. This work shows that the concentration of eigenvalues near the edges of the ESD determines a problem's asymptotic average complexity. Such a priori information on this concentration is a more grounded assumption than complete knowledge of the ESD, and this approximate concentration is effectively a middle ground between the coarseness of worst-case convergence bounds and the restrictiveness of previous average-case analyses. We also introduce the Generalized Chebyshev method, which is asymptotically optimal under a hypothesis on this concentration and globally optimal when the ESD follows a Beta distribution. We compare its performance to that of classical optimization algorithms, such as gradient descent and Nesterov's scheme, and show that, in the average-case setting, Nesterov's method is universally nearly optimal asymptotically.
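For readers unfamiliar with this framework, the following identity, standard in average-case analyses of first-order methods on random quadratics, makes precise how the ESD governs convergence; the notation and normalization below are illustrative rather than taken from this work. For a quadratic objective with random Hessian $H$, a first-order method run for $t$ iterations is characterized by a residual polynomial $P_t$ with $P_t(0)=1$, and, under standard isotropy assumptions on the initialization,
\[
  \mathbb{E}\bigl[\lVert x_t - x^\star \rVert^2\bigr] \;\propto\; \int P_t(\lambda)^2 \, \mathrm{d}\mu(\lambda),
\]
where $\mu$ denotes the expected spectral distribution of $H$. The asymptotic rate is therefore dominated by how much mass $\mu$ places near its edges, which is precisely the concentration quantity studied here.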