We consider the problem of subset selection for $\ell_{p}$ subspace approximation, that is, to efficiently find a \emph{small} subset of data points such that solving the problem optimally for this subset gives a good approximation to solving the problem optimally for the original input. Previously known subset selection algorithms based on volume sampling and adaptive sampling \cite{DeshpandeV07}, for the general case of $p \in [1, \infty)$, require multiple passes over the data. In this paper, we give a one-pass subset selection with an additive approximation guarantee for $\ell_{p}$ subspace approximation, for any $p \in [1, \infty)$. Earlier subset selection algorithms that give a one-pass multiplicative $(1+\epsilon)$ approximation work only in special cases. Cohen \textit{et al.} \cite{CohenMM17} give a one-pass subset selection with a multiplicative $(1+\epsilon)$ approximation guarantee for the special case of $\ell_{2}$ subspace approximation. Mahabadi \textit{et al.} \cite{MahabadiRWZ20} give a one-pass \emph{noisy} subset selection with a $(1+\epsilon)$ approximation guarantee for $\ell_{p}$ subspace approximation when $p \in \{1, 2\}$. Our subset selection algorithm gives a weaker, additive approximation guarantee, but it works for any $p \in [1, \infty)$.
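For concreteness, the $\ell_{p}$ subspace approximation objective referred to above is the standard one; the notation $a_{1}, \ldots, a_{n}$ and the target dimension $k$ are introduced here only for illustration. Given points $a_{1}, \ldots, a_{n} \in \mathbb{R}^{d}$ and an integer $k$, the goal is to find a $k$-dimensional linear subspace $V \subseteq \mathbb{R}^{d}$ minimizing
\[
  \sum_{i=1}^{n} \operatorname{dist}(a_{i}, V)^{p},
\]
where $\operatorname{dist}(a_{i}, V)$ is the Euclidean distance from $a_{i}$ to $V$ (some works equivalently minimize the $1/p$-th power of this sum, which has the same minimizer). For $p = 2$ the optimal $V$ is spanned by the top $k$ right singular vectors of the data matrix, whereas for general $p$ no such closed-form solution is known, which motivates subset selection.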