This paper considers the sparse recovery with shuffled labels, i.e., $\by = \bPitrue \bX \bbetatrue + \bw$, where $\by \in \RR^n$, $\bPi\in \RR^{n\times n}$, $\bX\in \RR^{n\times p}$, $\bbetatrue\in \RR^p$, $\bw \in \RR^n$ denote the sensing result, the unknown permutation matrix, the design matrix, the sparse signal, and the additive noise, respectively. Our goal is to reconstruct both the permutation matrix $\bPitrue$ and the sparse signal $\bbetatrue$. We investigate this problem from both the statistical and computational aspects. From the statistical aspect, we first establish the minimax lower bounds on the sample number $n$ and the \emph{signal-to-noise ratio} ($\snr$) for the correct recovery of permutation matrix $\bPitrue$ and the support set $\supp(\bbetatrue)$, to be more specific, $n \gtrsim k\log p$ and $\log\snr \gtrsim \log n + \frac{k\log p}{n}$. Then, we confirm the tightness of these minimax lower bounds by presenting an exhaustive-search based estimator whose performance matches the lower bounds thereof up to some multiplicative constants. From the computational aspect, we impose a parsimonious assumption on the number of permuted rows and propose a computationally-efficient estimator accordingly. Moreover, we show that our proposed estimator can obtain the ground-truth $(\bPitrue, \supp(\bbetatrue))$ under mild conditions. Furthermore, we provide numerical experiments to corroborate our claims.
翻译:本文考虑疏散恢复中的标签混淆问题,即$\by = \bPitrue \bX \bbetatrue + \bw$,其中$\by \in \RR^n$,$\bPi\in \RR^{n \times n}$,$\bX\in \RR^{n \times p}$,$\bbetatrue\in \RR^p$,$\bw \in \RR^n$表示传感结果,未知置换矩阵,设计矩阵,稀疏信号和加性噪声。我们的目标是重构置换矩阵$\bPitrue$和稀疏信号$\bbetatrue$。我们从统计和计算两个方面研究这个问题。从统计角度,我们首先建立了样本数量$n$和信噪比($\snr$)的最小值下限,以正确恢复置换矩阵$\bPitrue$和支持集$\supp(\bbetatrue)$为例,即$n \gtrsim k\log p$和$\log\snr \gtrsim \log n + \frac{k\log p}{n}$。然后,我们通过提供一个基于穷举搜索的估计器,证实了这些最小极限的紧密性,其性能与其下界匹配。从计算的角度,我们对置换行数进行了简化假设,并相应地提出了一种计算效率高的估计器。此外,我们还展示了我们的估计器在温和条件下可以获得地面真相$(\bPitrue, \supp(\bbetatrue))$。此外,我们提供了数值实验来证明我们的论点。