Higher-order multiway data is ubiquitous in machine learning and statistics and often exhibits community-like structures, where each component (node) along each mode has an associated community membership. In this paper we propose the tensor mixed-membership blockmodel, a generalization of the tensor blockmodel positing that memberships need not be discrete, but instead are convex combinations of latent communities. We establish the identifiability of our model and propose a computationally efficient estimation procedure based on the higher-order orthogonal iteration algorithm (HOOI) for tensor SVD composed with a simplex corner-finding algorithm. We then demonstrate the consistency of our estimation procedure by providing a per-node error bound, which showcases the effect of higher-order structures on estimation accuracy. To prove our consistency result, we develop an $\ell_{2,\infty}$ tensor perturbation bound for HOOI under independent, possibly heteroskedastic, subgaussian noise, which may be of independent interest. Our analysis uses a novel leave-one-out construction for the iterates, and our bounds depend only on spectral properties of the underlying low-rank tensor under nearly optimal signal-to-noise ratio conditions for which tensor SVD is computationally feasible. Whereas other leave-one-out analyses typically focus on sequences constructed by analyzing the output of a given algorithm with a small part of the noise removed, our leave-one-out constructions use both the previous iterates and the additional tensor structure to eliminate a potential additional source of error. Finally, we apply our methodology to real and simulated data, including applications to two flight datasets and a trade network dataset, demonstrating some effects not identifiable from the model with discrete community memberships.
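To make the two-stage estimation procedure concrete, the following is a minimal NumPy sketch of HOOI followed by a simplex corner-finding step. Several choices are assumptions made for illustration and are not stated in the abstract: the corner-finding step is taken to be a successive projection routine, HOOI is initialised with a plain HOSVD, every mode has the same number of communities `r`, and the demo tensor is noiseless with the first `r` nodes of each mode being pure. All function names (`unfold`, `mode_product`, `hooi`, `successive_projection`) are hypothetical helpers.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` matricization: rows index the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply T along `mode` by M, where M has shape (new_dim, T.shape[mode])."""
    return np.moveaxis(np.tensordot(T, M, axes=(mode, 1)), -1, mode)

def leading_left_singular_vectors(M, r):
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def hooi(T, r, n_iter=5):
    """Higher-order orthogonal iteration with a plain HOSVD start (assumed init)."""
    U = [leading_left_singular_vectors(unfold(T, k), r) for k in range(T.ndim)]
    for _ in range(n_iter):
        for k in range(T.ndim):
            G = T
            for j in range(T.ndim):
                if j != k:
                    # Project every other mode onto its current subspace estimate.
                    G = mode_product(G, U[j].T, j)
            U[k] = leading_left_singular_vectors(unfold(G, k), r)
    return U

def successive_projection(X, r):
    """Return indices of r rows of X that look like simplex corners (assumed method)."""
    R = X.copy()
    idx = []
    for _ in range(r):
        i = int(np.argmax(np.sum(R**2, axis=1)))  # row with the largest norm
        idx.append(i)
        v = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ v, v)                # remove that direction from all rows
    return idx

# Noiseless demo: memberships are rows of the simplex, first r nodes are pure.
rng = np.random.default_rng(0)
n, r = 20, 3
Pis = []
for _ in range(3):
    Pi = rng.dirichlet(np.ones(r), size=n)
    Pi[:r] = np.eye(r)
    Pis.append(Pi)
S = rng.normal(size=(r, r, r))                    # core tensor of community means
T = S
for k, Pi in enumerate(Pis):
    T = mode_product(T, Pi, k)

U_hat = hooi(T, r)
corners = [successive_projection(U, r) for U in U_hat]
# Memberships are recovered up to a relabelling of the communities.
Pi_hat = [U @ np.linalg.inv(U[c]) for U, c in zip(U_hat, corners)]
```

In this noiseless setting the rows of each estimated singular-vector matrix lie in a simplex whose corners are the rows belonging to pure nodes, so expressing every row in terms of those corners returns the membership matrix up to a permutation of its columns. With noisy data the same steps apply, but the estimated memberships are only approximate and may need to be projected back onto the simplex.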