In this paper, we introduce directional feedback for ordinal regression, a setting in which the learner is told only whether the predicted label lies to the left or to the right of the actual label. This is a weaker form of supervision than the full-information setting, where the learner observes the true label. We propose an online algorithm for ordinal regression that learns from directional feedback efficiently by means of an exploration-exploitation scheme. We also introduce a kernel-based variant that learns non-linear ordinal regression models online, and we use a truncation trick to make the kernel implementation more memory efficient. The proposed algorithm maintains the ordering of the thresholds in expectation and achieves an expected regret of $\mathcal{O}(\log T)$. We compare our approach with a full-information algorithm and a weakly supervised algorithm for ordinal regression on synthetic and real-world datasets. Although it learns from directional feedback alone, the proposed approach performs comparably to, and sometimes better than, its full-information counterpart.
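To make the feedback model concrete, the following is a minimal sketch of one online round under directional feedback. It is an illustration and not the algorithm proposed in the paper: the linear threshold model, the perceptron-style update, the exploration rate `gamma`, the step size `lr`, the convention that the feedback is 0 when the prediction is correct, and all function names are assumptions made for this example.

```python
import random


def predict_label(w, thresholds, x):
    """Return a 1-based ordinal label: 1 + number of thresholds below the score."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 + sum(score > th for th in thresholds)


def directional_feedback(predicted, actual):
    """-1 if predicted is left of actual, +1 if right of it, 0 if equal (assumed)."""
    return (predicted > actual) - (predicted < actual)


def online_step(w, thresholds, x, actual, gamma, rng, lr=1.0):
    """Play one round and update w and thresholds in place (hypothetical rule)."""
    num_labels = len(thresholds) + 1
    if rng.random() < gamma:
        shown = rng.randint(1, num_labels)       # explore: uniformly random label
    else:
        shown = predict_label(w, thresholds, x)  # exploit: current model's label
    fb = directional_feedback(shown, actual)     # the only supervision observed
    score = sum(wi * xi for wi, xi in zip(w, x))
    if fb < 0:
        # Shown label is too small, so the score should exceed threshold `shown`.
        j = shown - 1
        if score <= thresholds[j]:
            for i, xi in enumerate(x):
                w[i] += lr * xi
            thresholds[j] -= lr
    elif fb > 0:
        # Shown label is too large, so the score should not exceed threshold `shown - 1`.
        j = shown - 2
        if score > thresholds[j]:
            for i, xi in enumerate(x):
                w[i] -= lr * xi
            thresholds[j] += lr
    return shown, fb
```

The true label is passed to `online_step` only so that the environment's response can be simulated; the update itself uses nothing but the direction `fb`. Unlike the algorithm described above, this sketch makes no claim about preserving the ordering of the thresholds or about regret.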