Sum-of-norms clustering is a clustering formulation based on convex optimization that automatically induces hierarchy. Several algorithms have been proposed to solve the optimization problem: subgradient descent by Hocking et al., ADMM and AMA by Chi and Lange, a stochastic incremental algorithm by Panahi et al., and a semismooth Newton-CG augmented Lagrangian method by Sun et al. All of these algorithms yield approximate solutions, even though an exact solution is required to determine the correct cluster assignment. The purpose of this paper is to close the gap between the output of existing algorithms and the exact solution to the optimization problem. We present a clustering test that identifies and certifies the correct cluster assignment from an approximate solution produced by any primal-dual algorithm. Our certification validates clustering for both unit and multiplicative weights. The test may fail if the approximation is not accurate enough. However, we show that the correct cluster assignment is guaranteed to be certified by a primal-dual path-following algorithm after sufficiently many iterations, provided that the model parameter $\lambda$ avoids a finite number of bad values. Numerical experiments on Gaussian mixture and half-moon data indicate that carefully chosen multiplicative weights increase the recovery power of sum-of-norms clustering.
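For orientation, the optimization problem referred to above is usually written in the following form. This is a sketch of the standard formulation, not a quotation from the paper: the symbols $a_i$ (data points), $x_i$ (their representatives), $n$ (number of points), and $w_{ij}$ (nonnegative pairwise weights) are notation assumed here.

$$
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \; \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - a_i \rVert_2^2 \;+\; \lambda \sum_{1 \le i < j \le n} w_{ij}\, \lVert x_i - x_j \rVert_2
$$

Unit weights correspond to $w_{ij} = 1$ for all pairs. Points $i$ and $j$ are placed in the same cluster exactly when $x_i^* = x_j^*$ at the optimizer. An approximate solution generally returns representatives that are close but not identical, which is why an exact equality cannot be read off directly and a separate certification test is needed.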