Multiway data analysis aims to infer patterns from data represented as a multi-dimensional array. Estimating covariance from multiway data is a fundamental statistical task; however, the intrinsic high dimensionality poses significant statistical and computational challenges. Recently, several factorized covariance models, paired with estimation algorithms, have been proposed to circumvent these obstacles. Despite several promising results on the algorithmic front, whether and when such a model is valid remains under-explored. To address this question, we define the notion of Kronecker-separable multiway covariance, which can be written as a sum of $r$ tensor products of mode-wise covariances. The question of whether a given covariance can be represented as a separable multiway covariance is then reduced to an equivalent question about the separability of quantum states. Using this equivalence, it follows directly that a generic multiway covariance tends to be non-separable (even if $r \to \infty$), and moreover, that finding its best separable approximation is NP-hard. These observations imply that factorized covariance models are restrictive and should be used only when there is a compelling rationale for such a model.
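
To make the definition concrete, the following is a minimal NumPy sketch of how a Kronecker-separable covariance could be assembled from mode-wise factors. The abstract does not fix any notation or implementation, so the function names (`random_psd`, `kronecker_separable_covariance`), the mode sizes, and the choice $r = 2$ are all illustrative assumptions. The sketch only builds a member of the separable class; it does not test whether a given covariance belongs to it, which is the hard direction.

```python
import numpy as np


def random_psd(dim, rng):
    """Draw a random symmetric positive semidefinite (PSD) matrix (hypothetical helper)."""
    a = rng.standard_normal((dim, dim))
    return a @ a.T


def kronecker_separable_covariance(terms):
    """Sum of Kronecker products of mode-wise PSD factors.

    terms: list of r terms; each term is a list of K PSD matrices, one per mode.
    Returns a matrix whose side length is the product of the mode sizes.
    """
    total = None
    for term in terms:
        prod = np.ones((1, 1))
        for mode_cov in term:
            # Kronecker (tensor) product across modes, in order.
            prod = np.kron(prod, mode_cov)
        total = prod if total is None else total + prod
    return total


rng = np.random.default_rng(0)
dims = (2, 3, 4)  # assumed mode sizes, for illustration only
r = 2             # assumed number of terms in the sum
terms = [[random_psd(d, rng) for d in dims] for _ in range(r)]
sigma = kronecker_separable_covariance(terms)

# A sum of Kronecker products of PSD matrices is itself PSD,
# so sigma is a valid covariance on the vectorized array.
eigenvalues = np.linalg.eigvalsh(sigma)
```

Every matrix built this way is a valid covariance, but the converse is what the abstract argues against: a generic PSD matrix of the same size need not admit such a decomposition for any $r$.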