High-order clustering aims to identify heterogeneous substructures in multiway datasets, which arise commonly in neuroimaging, genomics, and social network studies. The non-convex and discontinuous nature of the problem poses significant statistical and computational challenges. In this paper, we propose a tensor block model and two computationally efficient methods, the \emph{high-order Lloyd algorithm} (HLloyd) and \emph{high-order spectral clustering} (HSC), for high-order clustering in the tensor block model. We establish the convergence of the proposed procedure and show that it achieves exact clustering under reasonable assumptions. We also give a complete characterization of the statistical-computational trade-off in high-order clustering based on three different signal-to-noise ratio regimes. Finally, we demonstrate the merits of the proposed procedures via extensive experiments on both synthetic and real datasets.
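To make the setting concrete, the following is a minimal NumPy sketch of an order-3 tensor block model together with one Lloyd-style label update. It is an illustration under stated assumptions, not the HLloyd or HSC procedure itself: the abstract does not give the model or the updates, so the additive Gaussian noise, the unweighted squared-distance rule, and the function names `simulate_tensor_block_model`, `block_means`, and `update_labels` are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_tensor_block_model(shape, ranks, noise_sd=0.1, seed=0):
    """Draw Y[i,j,k] = core[z1[i], z2[j], z3[k]] + Gaussian noise (assumed model)."""
    rng = np.random.default_rng(seed)
    # Cluster labels per mode; the modulo construction keeps every cluster non-empty.
    labels = [rng.permutation(np.arange(p) % r) for p, r in zip(shape, ranks)]
    core = rng.normal(scale=3.0, size=ranks)      # block means
    signal = core[np.ix_(*labels)]                # expand block means to full size
    Y = signal + rng.normal(scale=noise_sd, size=shape)
    return Y, labels, core


def block_means(Y, labels, ranks):
    """Average the entries of an order-3 tensor Y within each block."""
    # One-hot membership matrices, one per mode (p_k x r_k).
    M = [np.eye(r)[z] for z, r in zip(labels, ranks)]
    sums = np.einsum("ijk,ia,jb,kc->abc", Y, M[0], M[1], M[2])
    counts = np.einsum("a,b,c->abc", *[m.sum(axis=0) for m in M])
    return sums / np.maximum(counts, 1)           # guard against empty blocks


def update_labels(Y, labels, core_hat, ranks, mode):
    """Lloyd-style step: reassign each index along `mode` to its nearest center."""
    Ym = np.moveaxis(Y, mode, 0)
    Sm = np.moveaxis(core_hat, mode, 0)
    b, c = [k for k in range(3) if k != mode]
    Mb = np.eye(ranks[b])[labels[b]]
    Mc = np.eye(ranks[c])[labels[c]]
    # Average each slice over the current clusters of the two other modes.
    agg = np.einsum("ijk,jb,kc->ibc", Ym, Mb, Mc)
    agg = agg / np.maximum(np.outer(Mb.sum(axis=0), Mc.sum(axis=0)), 1)
    # Squared distance from every aggregated slice to every candidate center.
    dist = ((agg[:, None, :, :] - Sm[None, :, :, :]) ** 2).sum(axis=(2, 3))
    return dist.argmin(axis=1)
```

A full iteration would alternate `block_means` with `update_labels` over the three modes, starting from some initial labeling (for instance a spectral one), until the labels stop changing. The update ignores the current labels of the mode being updated, which is what allows each mode to be refreshed in turn.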