Motivated by the problem of determining the atomic structure of macromolecules using single-particle cryo-electron microscopy (cryo-EM), we study the sample and computational complexities of the sparse multi-reference alignment (MRA) model: the problem of estimating a sparse signal from its noisy, circularly shifted copies. Based on its tight connection to the crystallographic phase retrieval problem, we establish that if the number of observations is proportional to the square of the variance of the noise, then the sparse MRA problem is statistically feasible for sufficiently sparse signals. To investigate its computational hardness, we consider three types of computational frameworks: projection-based algorithms, bispectrum inversion, and convex relaxations. We show that a state-of-the-art projection-based algorithm achieves the optimal estimation rate, but its computational complexity is exponential in the sparsity level. The bispectrum framework provides a statistical-computational trade-off: it requires more observations (so its estimation rate is suboptimal), but its computational load is provably polynomial in the signal's length. The convex relaxation approach provides polynomial time algorithms (with a large exponent) that recover sufficiently sparse signals at the optimal estimation rate. We conclude the paper by discussing potential statistical and algorithmic implications for cryo-EM.
翻译:以使用单粒冷冻电子显微镜(cryo-EM)确定大型分子的原子结构问题为动力,我们研究了稀少多参考匹配模型的样本和计算复杂性:从其噪音、循环移动的复制件中估算一个稀少信号的问题。基于其与晶晶体阶段检索问题的紧密联系,我们确定,如果观测数量与噪音差异的平方成正比,那么稀少的MRA问题在统计上对于足够少的信号是可行的。为了调查其计算硬性,我们考虑了三种计算框架:基于预测的算法、双谱反转和 convex放松。我们表明,基于预测的状态算法达到了最佳估计率,但其计算复杂性在音调水平上是指数交换交易的指数框架提供了一种统计-对等的交换:它需要更多的观察(因此其估计率是次优化的),但其计算负荷是可理解的多式计算框架,其精确的计算值是:预测性的算法、双谱反转、和折算法的计算法,其模型的回缩率是高的模型。