We derive and analyze a generic, recursive algorithm for estimating all splits in a finite cluster tree as well as the corresponding clusters. We further investigate the statistical properties of this generic clustering algorithm when it receives level set estimates from a kernel density estimator. In particular, we derive finite sample guarantees, consistency, rates of convergence, and an adaptive data-driven strategy for choosing the kernel bandwidth. For these results, we do not need continuity assumptions on the density such as Hölder continuity, but only require intuitive geometric assumptions of a non-parametric nature.