The filtering-clustering models, including trend filtering and convex clustering, have become an important source of ideas and modeling tools in machine learning and related fields. The statistical guarantee of optimal solutions in these models has been extensively studied yet the investigations on the computational aspect have remained limited. In particular, practitioners often employ the first-order algorithms in real-world applications and are impressed by their superior performance regardless of ill-conditioned structures of difference operator matrices, thus leaving open the problem of understanding the convergence property of first-order algorithms. This paper settles this open problem and contributes to the broad interplay between statistics and optimization by identifying a \textit{global error bound} condition, which is satisfied by a large class of dual filtering-clustering problems, and designing a class of \textit{generalized dual gradient ascent} algorithm, which is \textit{optimal} first-order algorithms in deterministic, finite-sum and online settings. Our results are new and help explain why the filtering-clustering models can be efficiently solved by first-order algorithms. We also provide the detailed convergence rate analysis for the proposed algorithms in different settings, shedding light on their potential to solve the filtering-clustering models efficiently. We also conduct experiments on real datasets and the numerical results demonstrate the effectiveness of our algorithms.
翻译:过滤组群模型,包括趋势过滤和组合组合,已成为机器学习及相关领域思想和模型工具的重要来源。这些模型中最佳解决方案的统计保障已经得到广泛研究,但计算方面的调查仍然有限。特别是,实践者经常在现实世界应用中采用一级算法,对其优异操作者矩阵结构条件差的优异表现印象深刻,从而解决了理解第一阶算法趋同属性的问题。本文解决了这一开放的问题,通过确定一个\textit{全球错误捆绑}条件,促进了统计与优化之间的广泛互动,这为大量双重过滤组群问题所满足,并设计了一类\textit{普遍化的双重梯度算法,这是在确定性、有限和在线环境中的第一阶算法的优异结构。我们的结果是新的,有助于解释为什么过滤组群集模型可以通过第一阶算法高效率地解决。我们还提供了详细的合并率分析,用于在不同的数据组群集模型中进行我们提出的数字分析。我们还提供了关于不同数据分析结果的高效化模型。