We consider the problem of structured tensor denoising in the presence of unknown permutations. Such problems arise commonly in recommendation systems, neuroimaging, community detection, and multiway comparison applications. Here, we develop a general family of smooth tensor models up to arbitrary index permutations; the family includes the popular tensor block models and Lipschitz hypergraphon models as special cases. We show that a constrained least-squares estimator in the block-wise polynomial family achieves the minimax error bound. We also reveal a phase transition with respect to the smoothness threshold needed for optimal recovery. In particular, we find that a polynomial of degree up to $(m-2)(m+1)/2$ is sufficient for accurate recovery of order-$m$ tensors, whereas higher degrees yield no further benefit. This phenomenon highlights the intrinsic distinction between smooth tensor estimation problems with and without unknown permutations. Furthermore, we provide an efficient polynomial-time Borda count algorithm that provably achieves the optimal rate under monotonicity assumptions. The efficacy of our procedure is demonstrated through both simulations and an analysis of Chicago crime data.