A formal link between regression and classification has remained tenuous. Although the margin-maximization term $\|w\|$ appears in support vector regression, it has at best been justified as a regularizer. We show that a regression problem with $M$ samples lying on a hyperplane is in one-to-one equivalence with a linearly separable classification task with $2M$ samples. We further show that margin maximization on the equivalent classification task leads to a regression formulation different from the one traditionally used. Using the equivalence, we introduce a ``regressability'' measure that estimates the difficulty of regressing a dataset without first learning a model for it. Finally, we use the equivalence to train neural networks that learn a linearizing map, transforming the input variables into a space where a linear regressor is adequate.
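To make the claimed correspondence concrete, the following is a minimal sketch of one possible construction (our illustration, assuming a simple $\epsilon$-shift in the augmented $(x, t)$ space; the exact map used in the paper may differ). Each of the $M$ regression samples $(x_i, y_i)$ generates one positive and one negative classification sample:
\[
% Hypothetical \epsilon-shift construction (illustration only):
% each regression sample (x_i, y_i) yields two labeled points
% in the augmented space (x, t) \in \mathbb{R}^{d+1}.
z_i^{+} = (x_i,\, y_i + \epsilon),\ \ell_i^{+} = +1,
\qquad
z_i^{-} = (x_i,\, y_i - \epsilon),\ \ell_i^{-} = -1,
\qquad i = 1, \dots, M.
\]
If the samples satisfy $y_i = w^\top x_i + b$ exactly, the affine function $g(x, t) = t - w^\top x - b$ takes the value $+\epsilon$ on every $z_i^{+}$ and $-\epsilon$ on every $z_i^{-}$, so the $2M$ points are linearly separable. The margin of the separating hyperplane is $\epsilon / \sqrt{\|w\|^2 + 1}$, so maximizing it amounts to minimizing the norm of the augmented normal $(-w, 1)$, which is how $\|w\|$ enters on the regression side.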