We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression. The goal of the sparse linear regression problem is to identify a small number of key features, while the goal of the robust linear regression problem is to identify a small number of erroneous measurements. Specifically, the sparse linear regression problem seeks a $k$-sparse vector $x\in\mathbb{R}^d$ to minimize $\|Ax-b\|_2$, given an input matrix $A\in\mathbb{R}^{n\times d}$ and a target vector $b\in\mathbb{R}^n$, while the robust linear regression problem seeks a set $S$ that ignores at most $k$ rows and a vector $x$ to minimize $\|(Ax-b)_S\|_2$. We first show a bicriteria NP-hardness of approximation result for robust regression, building on the work of [OWZ15], which implies a similar result for sparse regression. We further show fine-grained hardness of robust regression through a reduction from the minimum-weight $k$-clique conjecture. On the positive side, we give an algorithm for robust regression that achieves arbitrarily accurate additive error and runs in time that closely matches the lower bound from the fine-grained hardness result, as well as an algorithm for sparse regression with a similar runtime. Both our upper and lower bounds rely on a general reduction, which we introduce, from robust linear regression to sparse regression. Our algorithms, inspired by the 3SUM problem, use approximate nearest neighbor data structures and may be of independent interest for solving sparse optimization problems. For instance, we demonstrate that our techniques can also be used for the well-studied sparse PCA problem.
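To make the two objectives concrete, the following minimal Python/NumPy sketch evaluates both definitions exactly by exhaustive enumeration. It is a hedged illustration of the problem statements only, not the paper's algorithm, whose runtime is far better than these exponential baselines.

```python
import itertools
import numpy as np

def sparse_regression_bruteforce(A, b, k):
    """min ||Ax - b||_2 over k-sparse x, by trying every k-subset of columns.

    Exponential-time baseline (C(d, k) least-squares solves); shown only
    to make the objective concrete, not the paper's algorithm.
    """
    n, d = A.shape
    best_err, best_x = np.inf, np.zeros(d)
    for S in itertools.combinations(range(d), k):
        # Optimal x supported on S is the least-squares fit on columns S.
        xs, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
        err = np.linalg.norm(A[:, S] @ xs - b)
        if err < best_err:
            best_x = np.zeros(d)
            best_x[list(S)] = xs
            best_err = err
    return best_x, best_err

def robust_regression_bruteforce(A, b, k):
    """min ||(Ax - b)_S||_2 over x and sets S ignoring at most k rows."""
    n, d = A.shape
    best_err, best_x = np.inf, None
    for ignored in itertools.combinations(range(n), k):
        S = [i for i in range(n) if i not in ignored]
        # Optimal x for a fixed set S is the least-squares fit on rows S.
        x, *_ = np.linalg.lstsq(A[S], b[S], rcond=None)
        err = np.linalg.norm(A[S] @ x - b[S])
        if err < best_err:
            best_err, best_x = err, x
    return best_x, best_err
```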
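One folklore way to see the connection between the two problems (stated here as an illustration; the general reduction introduced in the paper may differ in its details) is the slack-variable identity
$$\min_{S:\,|\bar{S}|\le k}\;\min_{x}\;\|(Ax-b)_S\|_2 \;=\; \min_{x,\;\|v\|_0\le k}\;\|Ax+v-b\|_2,$$
since setting $v_i=b_i-(Ax)_i$ on each ignored row $i\in\bar{S}$ zeroes that residual, and conversely the support of $v$ identifies the rows to ignore. The right-hand side is a sparse regression instance over the augmented matrix $[A\mid I_n]$ in which only the slack coordinates are required to be sparse.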