In this paper, we investigate the optimal statistical performance and the impact of computational constraints for independent component analysis (ICA). Our goal is twofold. On the one hand, we characterize the precise role of dimensionality on sample complexity and statistical accuracy, and how computational consideration may affect them. In particular, we show that the optimal sample complexity is linear in dimensionality, and interestingly, the commonly used sample kurtosis-based approaches are necessarily suboptimal. However, the optimal sample complexity becomes quadratic, up to a logarithmic factor, in the dimension if we restrict ourselves to estimates that can be computed with low-degree polynomial algorithms. On the other hand, we develop computationally tractable estimates that attain both the optimal sample complexity and minimax optimal rates of convergence. We study the asymptotic properties of the proposed estimates and establish their asymptotic normality that can be readily used for statistical inferences. Our method is fairly easy to implement and numerical experiments are presented to further demonstrate its practical merits.
翻译:在本文中,我们研究了独立成分分析(ICA)的最佳统计性能以及计算限制对其的影响。我们的目标是双重的。一方面,我们表征了维度在样本复杂度和统计准确性方面的精确作用,以及计算考虑如何影响它们。特别是,我们表明最佳样本复杂度与维度成线性关系,有趣的是,常用的基于样本峰度的方法必然是次优的。然而,如果我们限制自己只对可以使用低次多项式算法计算的估计进行限制,就会发现最优样本复杂度变成了维度平方,最多再加上一个对数因子。另一方面,我们开发了可计算的估计,使其同时达到了最优样本复杂度和最佳收敛速度。我们研究了所提出估计的渐近属性,并建立了其容易用于统计推断的渐近正态性。我们的方法相当易于实现,并展示了数值实验以进一步证明其实用性。