We extend the theoretical study of a recently proposed nonparametric clustering algorithm called Adaptive Weights Clustering (AWC). In particular, we are interested in the case of high-dimensional data lying in the vicinity of a lower-dimensional non-linear submanifold with positive reach. After a slight adjustment and under rather general assumptions on the cluster structure, the algorithm turns out to be nearly optimal in detecting local inhomogeneities, while aggregating homogeneous data with high probability. We also address the problem of parameter tuning.