Motivated by applications in single-cell biology and metagenomics, we consider matrix reordering based on the noisy disordered matrix model. We first establish the fundamental statistical limit for the matrix reordering problem in a decision-theoretic framework and show that a constrained least squares estimator is rate-optimal. Because this optimal procedure is computationally hard, we analyze a popular polynomial-time algorithm, spectral seriation, and show that it is suboptimal. We then propose a novel polynomial-time adaptive sorting algorithm with a guaranteed improvement in performance. The superiority of the adaptive sorting algorithm over existing methods is demonstrated in simulation studies and in the analysis of two real single-cell RNA sequencing datasets.
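To make the spectral seriation baseline concrete, the following is a minimal sketch of one common variant, not necessarily the exact procedure analyzed in this work. It assumes the observed matrix has monotone rows whose columns were shuffled and then perturbed by noise, and it orders the columns by the entries of the leading right singular vector of the row-centered matrix. The function name `spectral_seriation` and the toy data are illustrative assumptions.

```python
import numpy as np


def spectral_seriation(Y):
    """Return a column ordering of Y from its leading right singular vector.

    Assumption for this sketch: rows of the underlying signal are monotone
    across columns, and only the column order is unknown.
    """
    Y = np.asarray(Y, dtype=float)
    # Subtract each row's mean so the leading singular vector tracks the
    # trend across columns rather than the overall level of the matrix.
    centered = Y - Y.mean(axis=1, keepdims=True)
    # Rows of vt are the right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Sorting the entries of the leading vector gives the estimated order,
    # identifiable only up to a global reversal (sign ambiguity of the SVD).
    return np.argsort(vt[0], kind="stable")


# Toy data (hypothetical): a signal with increasing rows, shuffled columns,
# and a small amount of Gaussian noise.
rng = np.random.default_rng(0)
n, p = 20, 30
theta = np.outer(np.linspace(1.0, 2.0, n), np.linspace(0.0, 1.0, p))
perm = rng.permutation(p)
Y = theta[:, perm] + 0.01 * rng.standard_normal((n, p))

order = spectral_seriation(Y)
# Map the sorted columns back to their original positions in theta.
recovered = perm[order]
```

In this low-noise toy setting, `recovered` should run monotonically through the original column indices, either ascending or descending, since the sign of a singular vector is arbitrary. The example only illustrates the mechanics of the baseline and does not reproduce the noise regimes in which the abstract reports it to be suboptimal.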