In this work we build a unifying framework that interpolates between density-driven and geometry-based algorithms for data clustering; specifically, it connects the mean shift algorithm with spectral clustering at both the discrete and continuum levels. We establish this connection by introducing Fokker-Planck equations on data graphs. Besides introducing new forms of mean shift algorithms on graphs, we provide new theoretical insights into the behavior of the family of diffusion maps in the large sample limit, and we establish new connections between diffusion maps and mean shift dynamics on a fixed graph. Several numerical examples illustrate our theoretical findings and highlight the benefits of interpolating between density-driven and geometry-based clustering algorithms.
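To make the two endpoints of the interpolation concrete, the following is a minimal sketch of the textbook versions of both ingredients: Euclidean Gaussian mean shift (density-driven) and a diffusion map embedding with the usual alpha-normalization (geometry-based). It is not the graph-based Fokker-Planck scheme introduced in this work. The function names, the bandwidth `h`, the value `alpha=0.5`, and the two-blob toy data are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, h):
    """Pairwise Gaussian weights exp(-|x - y|^2 / (2 h^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * h**2))

def mean_shift(X, h, n_iter=50):
    """Classical mean shift: each point moves to the kernel-weighted
    mean of the data, which ascends a kernel density estimate."""
    Z = X.copy()
    for _ in range(n_iter):
        W = gaussian_kernel(Z, X, h)
        Z = W @ X / W.sum(axis=1, keepdims=True)
    return Z

def diffusion_map(X, h, alpha=0.5, n_coords=2):
    """Diffusion map with alpha-normalization of the kernel.
    Returns the leading nontrivial eigenvalues and coordinates."""
    K = gaussian_kernel(X, X, h)
    q = K.sum(axis=1)                          # kernel density estimate
    K_alpha = K / np.outer(q**alpha, q**alpha)  # reweight by density
    d = K_alpha.sum(axis=1)
    # Symmetric matrix similar to the random walk matrix D^{-1} K_alpha.
    S = K_alpha / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]            # random walk eigenvectors
    # Skip the trivial constant eigenvector (eigenvalue 1).
    return vals[1:n_coords + 1], psi[:, 1:n_coords + 1] * vals[1:n_coords + 1]

# Toy data (hypothetical): two Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(-2.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(2.0, 0.0), scale=0.3, size=(20, 2)),
])

Z = mean_shift(X, h=1.0)                 # points collapse onto density modes
vals, coords = diffusion_map(X, h=1.0)   # first coordinate splits the blobs
labels_spectral = (coords[:, 0] > 0).astype(int)
```

On this toy data, the mean shift iterates gather at one mode per blob, while the sign of the first diffusion coordinate recovers the same two groups from the geometry of the similarity graph. In the diffusion maps literature, `alpha = 0.5` is the normalization usually associated with a Fokker-Planck operator in the continuum limit, which is why it is used as the default here.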